diff --git "a/sf_log.txt" "b/sf_log.txt"
new file mode 100644
--- /dev/null
+++ "b/sf_log.txt"
@@ -0,0 +1,48128 @@
+[2024-06-17 21:58:39,292][12645] Saving configuration to /workspace/metta/train_dir/p2.dr4/config.json...
+[2024-06-17 21:58:39,308][12645] Rollout worker 0 uses device cpu
+[2024-06-17 21:58:39,309][12645] Rollout worker 1 uses device cpu
+[2024-06-17 21:58:39,309][12645] Rollout worker 2 uses device cpu
+[2024-06-17 21:58:39,310][12645] Rollout worker 3 uses device cpu
+[2024-06-17 21:58:39,310][12645] Rollout worker 4 uses device cpu
+[2024-06-17 21:58:39,310][12645] Rollout worker 5 uses device cpu
+[2024-06-17 21:58:39,310][12645] Rollout worker 6 uses device cpu
+[2024-06-17 21:58:39,311][12645] Rollout worker 7 uses device cpu
+[2024-06-17 21:58:39,311][12645] Rollout worker 8 uses device cpu
+[2024-06-17 21:58:39,311][12645] Rollout worker 9 uses device cpu
+[2024-06-17 21:58:39,311][12645] Rollout worker 10 uses device cpu
+[2024-06-17 21:58:39,312][12645] Rollout worker 11 uses device cpu
+[2024-06-17 21:58:39,312][12645] Rollout worker 12 uses device cpu
+[2024-06-17 21:58:39,312][12645] Rollout worker 13 uses device cpu
+[2024-06-17 21:58:39,313][12645] Rollout worker 14 uses device cpu
+[2024-06-17 21:58:39,313][12645] Rollout worker 15 uses device cpu
+[2024-06-17 21:58:39,313][12645] Rollout worker 16 uses device cpu
+[2024-06-17 21:58:39,313][12645] Rollout worker 17 uses device cpu
+[2024-06-17 21:58:39,313][12645] Rollout worker 18 uses device cpu
+[2024-06-17 21:58:39,313][12645] Rollout worker 19 uses device cpu
+[2024-06-17 21:58:39,314][12645] Rollout worker 20 uses device cpu
+[2024-06-17 21:58:39,314][12645] Rollout worker 21 uses device cpu
+[2024-06-17 21:58:39,314][12645] Rollout worker 22 uses device cpu
+[2024-06-17 21:58:39,314][12645] Rollout worker 23 uses device cpu
+[2024-06-17 21:58:39,314][12645] Rollout worker 24 uses device cpu
+[2024-06-17 21:58:39,314][12645] Rollout worker 25 uses device cpu
+[2024-06-17 21:58:39,315][12645] Rollout worker 26 uses device cpu
+[2024-06-17 21:58:39,315][12645] Rollout worker 27 uses device cpu
+[2024-06-17 21:58:39,315][12645] Rollout worker 28 uses device cpu
+[2024-06-17 21:58:39,315][12645] Rollout worker 29 uses device cpu
+[2024-06-17 21:58:39,315][12645] Rollout worker 30 uses device cpu
+[2024-06-17 21:58:39,315][12645] Rollout worker 31 uses device cpu
+[2024-06-17 21:58:39,889][12645] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2024-06-17 21:58:39,889][12645] InferenceWorker_p0-w0: min num requests: 10
+[2024-06-17 21:58:39,964][12645] Starting all processes...
+[2024-06-17 21:58:39,965][12645] Starting process learner_proc0
+[2024-06-17 21:58:40,196][12645] Starting all processes...
+[2024-06-17 21:58:40,199][12645] Starting process inference_proc0-0
+[2024-06-17 21:58:40,199][12645] Starting process rollout_proc0
+[2024-06-17 21:58:40,199][12645] Starting process rollout_proc1
+[2024-06-17 21:58:40,199][12645] Starting process rollout_proc2
+[2024-06-17 21:58:40,199][12645] Starting process rollout_proc3
+[2024-06-17 21:58:40,200][12645] Starting process rollout_proc4
+[2024-06-17 21:58:40,201][12645] Starting process rollout_proc5
+[2024-06-17 21:58:40,249][12645] Starting process rollout_proc6
+[2024-06-17 21:58:40,250][12645] Starting process rollout_proc7
+[2024-06-17 21:58:40,251][12645] Starting process rollout_proc8
+[2024-06-17 21:58:40,252][12645] Starting process rollout_proc9
+[2024-06-17 21:58:40,252][12645] Starting process rollout_proc10
+[2024-06-17 21:58:40,254][12645] Starting process rollout_proc11
+[2024-06-17 21:58:40,254][12645] Starting process rollout_proc12
+[2024-06-17 21:58:40,255][12645] Starting process rollout_proc13
+[2024-06-17 21:58:40,256][12645] Starting process rollout_proc14
+[2024-06-17 21:58:40,256][12645] Starting process rollout_proc15
+[2024-06-17 21:58:40,256][12645] Starting process rollout_proc16
+[2024-06-17 21:58:40,256][12645] Starting process rollout_proc17
+[2024-06-17 21:58:40,257][12645] Starting process rollout_proc18
+[2024-06-17 21:58:40,257][12645] Starting process rollout_proc19
+[2024-06-17 21:58:40,258][12645] Starting process rollout_proc20
+[2024-06-17 21:58:40,258][12645] Starting process rollout_proc21
+[2024-06-17 21:58:40,261][12645] Starting process rollout_proc22
+[2024-06-17 21:58:40,262][12645] Starting process rollout_proc23
+[2024-06-17 21:58:40,265][12645] Starting process rollout_proc24
+[2024-06-17 21:58:40,265][12645] Starting process rollout_proc25
+[2024-06-17 21:58:40,265][12645] Starting process rollout_proc26
+[2024-06-17 21:58:40,272][12645] Starting process rollout_proc27
+[2024-06-17 21:58:40,279][12645] Starting process rollout_proc28
+[2024-06-17 21:58:40,286][12645] Starting process rollout_proc29
+[2024-06-17 21:58:40,293][12645] Starting process rollout_proc30
+[2024-06-17 21:58:40,294][12645] Starting process rollout_proc31
+[2024-06-17 21:58:42,210][12932] Worker 18 uses CPU cores [18]
+[2024-06-17 21:58:42,472][12887] Worker 4 uses CPU cores [4]
+[2024-06-17 21:58:42,490][12862] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2024-06-17 21:58:42,491][12862] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
+[2024-06-17 21:58:42,499][12862] Num visible devices: 1
+[2024-06-17 21:58:42,508][12889] Worker 5 uses CPU cores [5]
+[2024-06-17 21:58:42,512][12862] Setting fixed seed 0
+[2024-06-17 21:58:42,513][12862] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2024-06-17 21:58:42,513][12862] Initializing actor-critic model on device cuda:0
+[2024-06-17 21:58:42,529][12884] Worker 2 uses CPU cores [2]
+[2024-06-17 21:58:42,532][12929] Worker 15 uses CPU cores [15]
+[2024-06-17 21:58:42,542][12883] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2024-06-17 21:58:42,543][12883] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
+[2024-06-17 21:58:42,553][12883] Num visible devices: 1
+[2024-06-17 21:58:42,561][12885] Worker 1 uses CPU cores [1]
+[2024-06-17 21:58:42,631][12891] Worker 8 uses CPU cores [8]
+[2024-06-17 21:58:42,640][13069] Worker 31 uses CPU cores [31]
+[2024-06-17 21:58:42,663][12927] Worker 11 uses CPU cores [11]
+[2024-06-17 21:58:42,671][13035] Worker 27 uses CPU cores [27]
+[2024-06-17 21:58:42,680][12937] Worker 23 uses CPU cores [23]
+[2024-06-17 21:58:42,685][12925] Worker 12 uses CPU cores [12]
+[2024-06-17 21:58:42,688][12936] Worker 22 uses CPU cores [22]
+[2024-06-17 21:58:42,708][12882] Worker 0 uses CPU cores [0]
+[2024-06-17 21:58:42,725][12892] Worker 10 uses CPU cores [10]
+[2024-06-17 21:58:42,744][12935] Worker 21 uses CPU cores [21]
+[2024-06-17 21:58:42,749][12928] Worker 14 uses CPU cores [14]
+[2024-06-17 21:58:42,760][12933] Worker 19 uses CPU cores [19]
+[2024-06-17 21:58:42,784][12967] Worker 25 uses CPU cores [25]
+[2024-06-17 21:58:42,848][13068] Worker 30 uses CPU cores [30]
+[2024-06-17 21:58:42,859][12926] Worker 13 uses CPU cores [13]
+[2024-06-17 21:58:42,868][13067] Worker 29 uses CPU cores [29]
+[2024-06-17 21:58:42,869][12893] Worker 9 uses CPU cores [9]
+[2024-06-17 21:58:42,888][12931] Worker 17 uses CPU cores [17]
+[2024-06-17 21:58:42,898][12934] Worker 20 uses CPU cores [20]
+[2024-06-17 21:58:42,916][12890] Worker 7 uses CPU cores [7]
+[2024-06-17 21:58:42,931][12886] Worker 3 uses CPU cores [3]
+[2024-06-17 21:58:42,941][12888] Worker 6 uses CPU cores [6]
+[2024-06-17 21:58:43,003][13033] Worker 26 uses CPU cores [26]
+[2024-06-17 21:58:43,011][12930] Worker 16 uses CPU cores [16]
+[2024-06-17 21:58:43,052][13034] Worker 28 uses CPU cores [28]
+[2024-06-17 21:58:43,086][12970] Worker 24 uses CPU cores [24]
+[2024-06-17 21:58:43,380][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,380][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,380][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,380][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,380][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,380][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,384][12862] RunningMeanStd input shape: (1,)
+[2024-06-17 21:58:43,385][12862] RunningMeanStd input shape: (1,)
+[2024-06-17 21:58:43,385][12862] RunningMeanStd input shape: (1,)
+[2024-06-17 21:58:43,385][12862] RunningMeanStd input shape: (1,)
+[2024-06-17 21:58:43,385][12862] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:43,425][12862] RunningMeanStd input shape: (1,)
+[2024-06-17 21:58:43,429][12862] Created Actor Critic model with architecture:
+[2024-06-17 21:58:43,429][12862] SampleFactoryAgentWrapper(
+  (obs_normalizer): ObservationNormalizer()
+  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
+  (agent): MettaAgent(
+    (_encoder): MultiFeatureSetEncoder(
+      (feature_set_encoders): ModuleDict(
+        (grid_obs): FeatureSetEncoder(
+          (_normalizer): FeatureListNormalizer(
+            (_norms_dict): ModuleDict(
+              (agent): RunningMeanStdInPlace()
+              (altar): RunningMeanStdInPlace()
+              (clock): RunningMeanStdInPlace()
+              (converter): RunningMeanStdInPlace()
+              (generator): RunningMeanStdInPlace()
+              (wall): RunningMeanStdInPlace()
+              (agent:dir): RunningMeanStdInPlace()
+              (agent:energy): RunningMeanStdInPlace()
+              (agent:frozen): RunningMeanStdInPlace()
+              (agent:hp): RunningMeanStdInPlace()
+              (agent:id): RunningMeanStdInPlace()
+              (agent:inv_r1): RunningMeanStdInPlace()
+              (agent:inv_r2): RunningMeanStdInPlace()
+              (agent:inv_r3): RunningMeanStdInPlace()
+              (agent:shield): RunningMeanStdInPlace()
+              (altar:hp): RunningMeanStdInPlace()
+              (altar:state): RunningMeanStdInPlace()
+              (converter:hp): RunningMeanStdInPlace()
+              (converter:state): RunningMeanStdInPlace()
+              (generator:amount): RunningMeanStdInPlace()
+              (generator:hp): RunningMeanStdInPlace()
+              (generator:state): RunningMeanStdInPlace()
+              (wall:hp): RunningMeanStdInPlace()
+            )
+          )
+          (embedding_net): Sequential(
+            (0): Linear(in_features=125, out_features=512, bias=True)
+            (1): ELU(alpha=1.0)
+            (2): Linear(in_features=512, out_features=512, bias=True)
+            (3): ELU(alpha=1.0)
+            (4): Linear(in_features=512, out_features=512, bias=True)
+            (5): ELU(alpha=1.0)
+            (6): Linear(in_features=512, out_features=512, bias=True)
+            (7): ELU(alpha=1.0)
+          )
+        )
+        (global_vars): FeatureSetEncoder(
+          (_normalizer): FeatureListNormalizer(
+            (_norms_dict): ModuleDict(
+              (_steps): RunningMeanStdInPlace()
+            )
+          )
+          (embedding_net): Sequential(
+            (0): Linear(in_features=5, out_features=8, bias=True)
+            (1): ELU(alpha=1.0)
+            (2): Linear(in_features=8, out_features=8, bias=True)
+            (3): ELU(alpha=1.0)
+          )
+        )
+        (last_action): FeatureSetEncoder(
+          (_normalizer): FeatureListNormalizer(
+            (_norms_dict): ModuleDict(
+              (last_action_id): RunningMeanStdInPlace()
+              (last_action_val): RunningMeanStdInPlace()
+            )
+          )
+          (embedding_net): Sequential(
+            (0): Linear(in_features=5, out_features=8, bias=True)
+            (1): ELU(alpha=1.0)
+            (2): Linear(in_features=8, out_features=8, bias=True)
+            (3): ELU(alpha=1.0)
+          )
+        )
+        (last_reward): FeatureSetEncoder(
+          (_normalizer): FeatureListNormalizer(
+            (_norms_dict): ModuleDict(
+              (last_reward): RunningMeanStdInPlace()
+            )
+          )
+          (embedding_net): Sequential(
+            (0): Linear(in_features=5, out_features=8, bias=True)
+            (1): ELU(alpha=1.0)
+            (2): Linear(in_features=8, out_features=8, bias=True)
+            (3): ELU(alpha=1.0)
+          )
+        )
+        (kinship): FeatureSetEncoder(
+          (_normalizer): FeatureListNormalizer(
+            (_norms_dict): ModuleDict(
+              (kinship): RunningMeanStdInPlace()
+            )
+          )
+          (embedding_net): Sequential(
+            (0): Linear(in_features=125, out_features=8, bias=True)
+            (1): ELU(alpha=1.0)
+            (2): Linear(in_features=8, out_features=8, bias=True)
+            (3): ELU(alpha=1.0)
+          )
+        )
+      )
+      (merged_encoder): Sequential(
+        (0): Linear(in_features=544, out_features=512, bias=True)
+        (1): ELU(alpha=1.0)
+        (2): Linear(in_features=512, out_features=512, bias=True)
+        (3): ELU(alpha=1.0)
+        (4): Linear(in_features=512, out_features=512, bias=True)
+        (5): ELU(alpha=1.0)
+      )
+    )
+    (_core): ModelCoreRNN(
+      (core): GRU(512, 512)
+    )
+    (_decoder): Decoder(
+      (mlp): Identity()
+    )
+    (_critic_linear): Linear(in_features=512, out_features=1, bias=True)
+    (_action_parameterization): ActionParameterizationDefault(
+      (distribution_linear): Linear(in_features=512, out_features=16, bias=True)
+    )
+  )
+)
+[2024-06-17 21:58:43,499][12862] Using optimizer
+[2024-06-17 21:58:43,684][12862] No checkpoints found
+[2024-06-17 21:58:43,685][12862] Did not load from checkpoint, starting from scratch!
+[2024-06-17 21:58:43,685][12862] Initialized policy 0 weights for model version 0
+[2024-06-17 21:58:43,686][12862] LearnerWorker_p0 finished initialization!
+[2024-06-17 21:58:43,686][12862] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2024-06-17 21:58:44,452][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,452][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,452][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,452][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,452][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,452][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,456][12883] RunningMeanStd input shape: (1,)
+[2024-06-17 21:58:44,457][12883] RunningMeanStd input shape: (1,)
+[2024-06-17 21:58:44,457][12883] RunningMeanStd input shape: (1,)
+[2024-06-17 21:58:44,457][12883] RunningMeanStd input shape: (1,)
+[2024-06-17 21:58:44,457][12883] RunningMeanStd input shape: (11, 11)
+[2024-06-17 21:58:44,497][12883] RunningMeanStd input shape: (1,)
+[2024-06-17 21:58:44,520][12645] Inference worker 0-0 is ready!
+[2024-06-17 21:58:44,520][12645] All inference workers are ready! Signal rollout workers to start!
+[2024-06-17 21:58:46,994][12645] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2024-06-17 21:58:47,270][12934] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,281][12970] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,282][12926] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,289][12937] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,301][12889] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,313][12885] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,332][12967] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,348][13067] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,353][12930] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,361][12887] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,361][13034] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,375][12884] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,375][12886] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,383][12888] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,385][12936] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,390][12935] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,392][13068] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,403][12929] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,404][12933] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,406][12891] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,433][12892] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,443][13033] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,453][12925] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,459][13069] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,457][13035] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,460][12928] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,468][12932] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,468][12893] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,475][12882] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,481][12927] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,488][12931] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:47,494][12890] Decorrelating experience for 0 frames...
+[2024-06-17 21:58:48,401][12934] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,468][12970] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,470][12926] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,525][12886] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,547][13034] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,547][13067] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,559][12888] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,618][12884] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,647][12885] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,663][13068] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,664][12967] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,670][12892] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,671][12937] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,699][12882] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,707][12889] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,712][12930] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,725][12893] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,726][12931] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,743][12891] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,744][12935] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,752][12936] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,764][12928] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,773][12887] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,776][13033] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,783][12925] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,802][12927] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,802][12932] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,803][12929] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,809][12933] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,823][13069] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,830][12890] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:48,852][13035] Decorrelating experience for 256 frames...
+[2024-06-17 21:58:51,994][12645] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 748.0. Samples: 3740. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2024-06-17 21:58:56,994][12645] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 31075.6. Samples: 310760. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2024-06-17 21:58:56,994][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 21:58:57,045][13069] Worker 31, sleep for 145.312 sec to decorrelate experience collection
+[2024-06-17 21:58:57,166][12892] Worker 10, sleep for 46.875 sec to decorrelate experience collection
+[2024-06-17 21:58:57,212][12926] Worker 13, sleep for 60.938 sec to decorrelate experience collection
+[2024-06-17 21:58:57,246][12884] Worker 2, sleep for 9.375 sec to decorrelate experience collection
+[2024-06-17 21:58:57,280][12886] Worker 3, sleep for 14.062 sec to decorrelate experience collection
+[2024-06-17 21:58:57,304][12885] Worker 1, sleep for 4.688 sec to decorrelate experience collection
+[2024-06-17 21:58:57,347][12893] Worker 9, sleep for 42.188 sec to decorrelate experience collection
+[2024-06-17 21:58:57,366][13034] Worker 28, sleep for 131.250 sec to decorrelate experience collection
+[2024-06-17 21:58:57,404][12862] Signal inference workers to stop experience collection...
+[2024-06-17 21:58:57,404][12925] Worker 12, sleep for 56.250 sec to decorrelate experience collection
+[2024-06-17 21:58:57,409][12887] Worker 4, sleep for 18.750 sec to decorrelate experience collection
+[2024-06-17 21:58:57,412][13068] Worker 30, sleep for 140.625 sec to decorrelate experience collection
+[2024-06-17 21:58:57,417][12883] InferenceWorker_p0-w0: stopping experience collection
+[2024-06-17 21:58:57,424][12929] Worker 15, sleep for 70.312 sec to decorrelate experience collection
+[2024-06-17 21:58:58,026][12862] Signal inference workers to resume experience collection...
+[2024-06-17 21:58:58,026][12883] InferenceWorker_p0-w0: resuming experience collection
+[2024-06-17 21:58:58,052][12889] Worker 5, sleep for 23.438 sec to decorrelate experience collection
+[2024-06-17 21:58:58,279][12927] Worker 11, sleep for 51.562 sec to decorrelate experience collection
+[2024-06-17 21:58:58,409][12970] Worker 24, sleep for 112.500 sec to decorrelate experience collection
+[2024-06-17 21:58:58,455][13067] Worker 29, sleep for 135.938 sec to decorrelate experience collection
+[2024-06-17 21:58:58,465][12891] Worker 8, sleep for 37.500 sec to decorrelate experience collection
+[2024-06-17 21:58:58,528][12888] Worker 6, sleep for 28.125 sec to decorrelate experience collection
+[2024-06-17 21:58:58,537][12934] Worker 20, sleep for 93.750 sec to decorrelate experience collection
+[2024-06-17 21:58:58,592][12928] Worker 14, sleep for 65.625 sec to decorrelate experience collection
+[2024-06-17 21:58:58,594][12931] Worker 17, sleep for 79.688 sec to decorrelate experience collection
+[2024-06-17 21:58:58,595][12930] Worker 16, sleep for 75.000 sec to decorrelate experience collection
+[2024-06-17 21:58:58,595][12935] Worker 21, sleep for 98.438 sec to decorrelate experience collection
+[2024-06-17 21:58:58,595][13033] Worker 26, sleep for 121.875 sec to decorrelate experience collection
+[2024-06-17 21:58:58,595][13035] Worker 27, sleep for 126.562 sec to decorrelate experience collection
+[2024-06-17 21:58:58,613][12967] Worker 25, sleep for 117.188 sec to decorrelate experience collection
+[2024-06-17 21:58:58,614][12936] Worker 22, sleep for 103.125 sec to decorrelate experience collection
+[2024-06-17 21:58:58,647][12890] Worker 7, sleep for 32.812 sec to decorrelate experience collection
+[2024-06-17 21:58:58,696][12932] Worker 18, sleep for 84.375 sec to decorrelate experience collection
+[2024-06-17 21:58:58,696][12937] Worker 23, sleep for 107.812 sec to decorrelate experience collection
+[2024-06-17 21:58:58,747][12933] Worker 19, sleep for 89.062 sec to decorrelate experience collection
+[2024-06-17 21:58:59,226][12883] Updated weights for policy 0, policy_version 10 (0.0013)
+[2024-06-17 21:58:59,886][12645] Heartbeat connected on Batcher_0
+[2024-06-17 21:58:59,888][12645] Heartbeat connected on LearnerWorker_p0
+[2024-06-17 21:58:59,902][12645] Heartbeat connected on RolloutWorker_w0
+[2024-06-17 21:58:59,953][12645] Heartbeat connected on InferenceWorker_p0-w0
+[2024-06-17 21:59:01,994][12645] Fps is (10 sec: 16383.6, 60 sec: 10922.5, 300 sec: 10922.5). Total num frames: 163840. Throughput: 0: 21925.0. Samples: 328880. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2024-06-17 21:59:01,995][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 21:59:01,996][12862] Saving new best policy, reward=0.000!
+[2024-06-17 21:59:02,015][12885] Worker 1 awakens!
+[2024-06-17 21:59:02,020][12645] Heartbeat connected on RolloutWorker_w1
+[2024-06-17 21:59:06,668][12884] Worker 2 awakens!
+[2024-06-17 21:59:06,673][12645] Heartbeat connected on RolloutWorker_w2
+[2024-06-17 21:59:06,994][12645] Fps is (10 sec: 16384.3, 60 sec: 8192.0, 300 sec: 8192.0). Total num frames: 163840. Throughput: 0: 17032.0. Samples: 340640. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2024-06-17 21:59:06,994][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 21:59:11,413][12886] Worker 3 awakens!
+[2024-06-17 21:59:11,423][12645] Heartbeat connected on RolloutWorker_w3
+[2024-06-17 21:59:11,994][12645] Fps is (10 sec: 3276.9, 60 sec: 7864.3, 300 sec: 7864.3). Total num frames: 196608. Throughput: 0: 14449.6. Samples: 361240. Policy #0 lag: (min: 0.0, avg: 1.1, max: 10.0)
+[2024-06-17 21:59:11,994][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 21:59:16,252][12887] Worker 4 awakens!
+[2024-06-17 21:59:16,258][12645] Heartbeat connected on RolloutWorker_w4
+[2024-06-17 21:59:16,994][12645] Fps is (10 sec: 6553.6, 60 sec: 7645.9, 300 sec: 7645.9). Total num frames: 229376. Throughput: 0: 12552.0. Samples: 376560. Policy #0 lag: (min: 0.0, avg: 4.4, max: 12.0)
+[2024-06-17 21:59:16,994][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 21:59:21,588][12889] Worker 5 awakens!
+[2024-06-17 21:59:21,594][12645] Heartbeat connected on RolloutWorker_w5
+[2024-06-17 21:59:21,994][12645] Fps is (10 sec: 8192.0, 60 sec: 7958.0, 300 sec: 7958.0). Total num frames: 278528. Throughput: 0: 12174.9. Samples: 426120. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0)
+[2024-06-17 21:59:22,001][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 21:59:26,578][12883] Updated weights for policy 0, policy_version 20 (0.0013)
+[2024-06-17 21:59:26,752][12888] Worker 6 awakens!
+[2024-06-17 21:59:26,758][12645] Heartbeat connected on RolloutWorker_w6
+[2024-06-17 21:59:26,994][12645] Fps is (10 sec: 9830.4, 60 sec: 8192.0, 300 sec: 8192.0). Total num frames: 327680. Throughput: 0: 12341.0. Samples: 493640. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0)
+[2024-06-17 21:59:26,994][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 21:59:31,559][12890] Worker 7 awakens!
+[2024-06-17 21:59:31,567][12645] Heartbeat connected on RolloutWorker_w7
+[2024-06-17 21:59:31,994][12645] Fps is (10 sec: 13107.0, 60 sec: 9102.2, 300 sec: 9102.2). Total num frames: 409600. Throughput: 0: 12029.3. Samples: 541320. Policy #0 lag: (min: 0.0, avg: 2.6, max: 5.0)
+[2024-06-17 21:59:31,994][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 21:59:36,064][12891] Worker 8 awakens!
+[2024-06-17 21:59:36,070][12645] Heartbeat connected on RolloutWorker_w8
+[2024-06-17 21:59:36,598][12883] Updated weights for policy 0, policy_version 30 (0.0012)
+[2024-06-17 21:59:36,994][12645] Fps is (10 sec: 16383.9, 60 sec: 9830.4, 300 sec: 9830.4). Total num frames: 491520. Throughput: 0: 14041.8. Samples: 635620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 29.0)
+[2024-06-17 21:59:36,994][12645] Avg episode reward: [(0, '0.001')]
+[2024-06-17 21:59:39,635][12893] Worker 9 awakens!
+[2024-06-17 21:59:39,644][12645] Heartbeat connected on RolloutWorker_w9
+[2024-06-17 21:59:41,994][12645] Fps is (10 sec: 18022.5, 60 sec: 10724.1, 300 sec: 10724.1). Total num frames: 589824. Throughput: 0: 9669.8. Samples: 745900. Policy #0 lag: (min: 0.0, avg: 3.4, max: 6.0)
+[2024-06-17 21:59:41,994][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 21:59:44,140][12892] Worker 10 awakens!
+[2024-06-17 21:59:44,146][12645] Heartbeat connected on RolloutWorker_w10
+[2024-06-17 21:59:45,349][12883] Updated weights for policy 0, policy_version 40 (0.0016)
+[2024-06-17 21:59:46,994][12645] Fps is (10 sec: 19660.6, 60 sec: 11468.8, 300 sec: 11468.8). Total num frames: 688128. Throughput: 0: 10766.7. Samples: 813380. Policy #0 lag: (min: 0.0, avg: 14.2, max: 37.0)
+[2024-06-17 21:59:46,994][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 21:59:49,862][12927] Worker 11 awakens!
+[2024-06-17 21:59:49,869][12645] Heartbeat connected on RolloutWorker_w11
+[2024-06-17 21:59:51,577][12883] Updated weights for policy 0, policy_version 50 (0.0015)
+[2024-06-17 21:59:51,994][12645] Fps is (10 sec: 22937.5, 60 sec: 13653.3, 300 sec: 12603.1). Total num frames: 819200. Throughput: 0: 13587.5. Samples: 952080. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0)
+[2024-06-17 21:59:51,994][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 21:59:53,662][12925] Worker 12 awakens!
+[2024-06-17 21:59:53,669][12645] Heartbeat connected on RolloutWorker_w12
+[2024-06-17 21:59:56,994][12645] Fps is (10 sec: 26214.4, 60 sec: 15837.9, 300 sec: 13575.3). Total num frames: 950272. Throughput: 0: 16641.7. Samples: 1110120. Policy #0 lag: (min: 0.0, avg: 4.8, max: 9.0)
+[2024-06-17 21:59:56,994][12645] Avg episode reward: [(0, '0.001')]
+[2024-06-17 21:59:58,250][12926] Worker 13 awakens!
+[2024-06-17 21:59:58,259][12645] Heartbeat connected on RolloutWorker_w13
+[2024-06-17 21:59:58,311][12883] Updated weights for policy 0, policy_version 60 (0.0019)
+[2024-06-17 22:00:01,994][12645] Fps is (10 sec: 26214.3, 60 sec: 15291.8, 300 sec: 14417.9). Total num frames: 1081344. Throughput: 0: 18011.9. Samples: 1187100. Policy #0 lag: (min: 0.0, avg: 21.6, max: 60.0)
+[2024-06-17 22:00:01,994][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 22:00:04,319][12928] Worker 14 awakens!
+[2024-06-17 22:00:04,327][12645] Heartbeat connected on RolloutWorker_w14
+[2024-06-17 22:00:04,766][12883] Updated weights for policy 0, policy_version 70 (0.0022)
+[2024-06-17 22:00:06,994][12645] Fps is (10 sec: 27852.6, 60 sec: 17749.3, 300 sec: 15360.0). Total num frames: 1228800. Throughput: 0: 20463.0. Samples: 1346960. Policy #0 lag: (min: 0.0, avg: 4.0, max: 10.0)
+[2024-06-17 22:00:06,994][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 22:00:07,836][12929] Worker 15 awakens!
+[2024-06-17 22:00:07,844][12645] Heartbeat connected on RolloutWorker_w15
+[2024-06-17 22:00:10,146][12883] Updated weights for policy 0, policy_version 80 (0.0020)
+[2024-06-17 22:00:11,994][12645] Fps is (10 sec: 27852.6, 60 sec: 19387.7, 300 sec: 15998.5). Total num frames: 1359872. Throughput: 0: 22864.8. Samples: 1522560. Policy #0 lag: (min: 0.0, avg: 4.5, max: 10.0)
+[2024-06-17 22:00:11,994][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 22:00:13,692][12930] Worker 16 awakens!
+[2024-06-17 22:00:13,703][12645] Heartbeat connected on RolloutWorker_w16
+[2024-06-17 22:00:15,942][12883] Updated weights for policy 0, policy_version 90 (0.0027)
+[2024-06-17 22:00:16,994][12645] Fps is (10 sec: 27852.9, 60 sec: 21299.1, 300 sec: 16748.1). Total num frames: 1507328. Throughput: 0: 23800.4. Samples: 1612340. Policy #0 lag: (min: 0.0, avg: 4.5, max: 12.0)
+[2024-06-17 22:00:16,994][12645] Avg episode reward: [(0, '0.000')]
+[2024-06-17 22:00:18,380][12931] Worker 17 awakens!
+[2024-06-17 22:00:18,391][12645] Heartbeat connected on RolloutWorker_w17 +[2024-06-17 22:00:21,278][12883] Updated weights for policy 0, policy_version 100 (0.0026) +[2024-06-17 22:00:21,994][12645] Fps is (10 sec: 29491.7, 60 sec: 22937.6, 300 sec: 17418.8). Total num frames: 1654784. Throughput: 0: 25725.8. Samples: 1793280. Policy #0 lag: (min: 0.0, avg: 6.2, max: 12.0) +[2024-06-17 22:00:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:00:23,169][12932] Worker 18 awakens! +[2024-06-17 22:00:23,179][12645] Heartbeat connected on RolloutWorker_w18 +[2024-06-17 22:00:26,576][12883] Updated weights for policy 0, policy_version 110 (0.0028) +[2024-06-17 22:00:26,994][12645] Fps is (10 sec: 29491.2, 60 sec: 24575.9, 300 sec: 18022.4). Total num frames: 1802240. Throughput: 0: 27466.2. Samples: 1981880. Policy #0 lag: (min: 0.0, avg: 7.0, max: 13.0) +[2024-06-17 22:00:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:00:27,908][12933] Worker 19 awakens! +[2024-06-17 22:00:27,930][12645] Heartbeat connected on RolloutWorker_w19 +[2024-06-17 22:00:31,211][12883] Updated weights for policy 0, policy_version 120 (0.0030) +[2024-06-17 22:00:31,994][12645] Fps is (10 sec: 32767.5, 60 sec: 26214.4, 300 sec: 18880.6). Total num frames: 1982464. Throughput: 0: 27990.6. Samples: 2072960. Policy #0 lag: (min: 0.0, avg: 7.0, max: 13.0) +[2024-06-17 22:00:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:00:31,995][12862] Saving new best policy, reward=0.001! +[2024-06-17 22:00:32,384][12934] Worker 20 awakens! +[2024-06-17 22:00:32,400][12645] Heartbeat connected on RolloutWorker_w20 +[2024-06-17 22:00:36,041][12883] Updated weights for policy 0, policy_version 130 (0.0036) +[2024-06-17 22:00:36,994][12645] Fps is (10 sec: 34406.7, 60 sec: 27579.7, 300 sec: 19511.9). Total num frames: 2146304. Throughput: 0: 29304.9. Samples: 2270800. 
Policy #0 lag: (min: 0.0, avg: 6.5, max: 13.0) +[2024-06-17 22:00:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:00:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000131_2146304.pth... +[2024-06-17 22:00:37,133][12935] Worker 21 awakens! +[2024-06-17 22:00:37,146][12645] Heartbeat connected on RolloutWorker_w21 +[2024-06-17 22:00:41,469][12883] Updated weights for policy 0, policy_version 140 (0.0032) +[2024-06-17 22:00:41,839][12936] Worker 22 awakens! +[2024-06-17 22:00:41,852][12645] Heartbeat connected on RolloutWorker_w22 +[2024-06-17 22:00:41,994][12645] Fps is (10 sec: 32767.9, 60 sec: 28671.9, 300 sec: 20088.2). Total num frames: 2310144. Throughput: 0: 30242.2. Samples: 2471020. Policy #0 lag: (min: 0.0, avg: 7.7, max: 14.0) +[2024-06-17 22:00:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:00:46,324][12883] Updated weights for policy 0, policy_version 150 (0.0032) +[2024-06-17 22:00:46,608][12937] Worker 23 awakens! +[2024-06-17 22:00:46,622][12645] Heartbeat connected on RolloutWorker_w23 +[2024-06-17 22:00:46,994][12645] Fps is (10 sec: 32767.9, 60 sec: 29764.3, 300 sec: 20616.5). Total num frames: 2473984. Throughput: 0: 30896.5. Samples: 2577440. Policy #0 lag: (min: 0.0, avg: 16.3, max: 150.0) +[2024-06-17 22:00:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:00:50,528][12883] Updated weights for policy 0, policy_version 160 (0.0037) +[2024-06-17 22:00:51,010][12970] Worker 24 awakens! +[2024-06-17 22:00:51,025][12645] Heartbeat connected on RolloutWorker_w24 +[2024-06-17 22:00:51,994][12645] Fps is (10 sec: 34407.0, 60 sec: 30583.5, 300 sec: 21233.7). Total num frames: 2654208. Throughput: 0: 32045.9. Samples: 2789020. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 17.0) +[2024-06-17 22:00:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:00:55,328][12883] Updated weights for policy 0, policy_version 170 (0.0027) +[2024-06-17 22:00:55,901][12967] Worker 25 awakens! +[2024-06-17 22:00:55,916][12645] Heartbeat connected on RolloutWorker_w25 +[2024-06-17 22:00:56,994][12645] Fps is (10 sec: 36044.4, 60 sec: 31402.6, 300 sec: 21803.3). Total num frames: 2834432. Throughput: 0: 32987.5. Samples: 3007000. Policy #0 lag: (min: 0.0, avg: 8.0, max: 17.0) +[2024-06-17 22:00:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:00:59,055][12883] Updated weights for policy 0, policy_version 180 (0.0038) +[2024-06-17 22:01:00,572][13033] Worker 26 awakens! +[2024-06-17 22:01:00,588][12645] Heartbeat connected on RolloutWorker_w26 +[2024-06-17 22:01:01,994][12645] Fps is (10 sec: 37683.1, 60 sec: 32495.0, 300 sec: 22452.2). Total num frames: 3031040. Throughput: 0: 33505.9. Samples: 3120100. Policy #0 lag: (min: 0.0, avg: 7.3, max: 16.0) +[2024-06-17 22:01:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:01:03,400][12883] Updated weights for policy 0, policy_version 190 (0.0035) +[2024-06-17 22:01:05,259][13035] Worker 27 awakens! +[2024-06-17 22:01:05,274][12645] Heartbeat connected on RolloutWorker_w27 +[2024-06-17 22:01:06,994][12645] Fps is (10 sec: 37683.5, 60 sec: 33041.1, 300 sec: 22937.6). Total num frames: 3211264. Throughput: 0: 34543.5. Samples: 3347740. Policy #0 lag: (min: 0.0, avg: 43.3, max: 192.0) +[2024-06-17 22:01:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:01:08,545][12883] Updated weights for policy 0, policy_version 200 (0.0033) +[2024-06-17 22:01:08,716][13034] Worker 28 awakens! +[2024-06-17 22:01:08,730][12645] Heartbeat connected on RolloutWorker_w28 +[2024-06-17 22:01:11,994][12645] Fps is (10 sec: 39321.1, 60 sec: 34406.4, 300 sec: 23615.5). Total num frames: 3424256. Throughput: 0: 35381.8. 
Samples: 3574060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 18.0) +[2024-06-17 22:01:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:01:12,805][12883] Updated weights for policy 0, policy_version 210 (0.0042) +[2024-06-17 22:01:14,492][13067] Worker 29 awakens! +[2024-06-17 22:01:14,508][12645] Heartbeat connected on RolloutWorker_w29 +[2024-06-17 22:01:16,994][12645] Fps is (10 sec: 36045.1, 60 sec: 34406.5, 300 sec: 23811.4). Total num frames: 3571712. Throughput: 0: 35926.8. Samples: 3689660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) +[2024-06-17 22:01:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:01:17,243][12883] Updated weights for policy 0, policy_version 220 (0.0039) +[2024-06-17 22:01:18,140][13068] Worker 30 awakens! +[2024-06-17 22:01:18,155][12645] Heartbeat connected on RolloutWorker_w30 +[2024-06-17 22:01:21,148][12883] Updated weights for policy 0, policy_version 230 (0.0043) +[2024-06-17 22:01:21,994][12645] Fps is (10 sec: 37683.8, 60 sec: 35771.7, 300 sec: 24523.2). Total num frames: 3801088. Throughput: 0: 36589.8. Samples: 3917340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 22:01:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:01:22,460][13069] Worker 31 awakens! +[2024-06-17 22:01:22,479][12645] Heartbeat connected on RolloutWorker_w31 +[2024-06-17 22:01:25,914][12883] Updated weights for policy 0, policy_version 240 (0.0035) +[2024-06-17 22:01:26,995][12645] Fps is (10 sec: 42592.2, 60 sec: 36590.1, 300 sec: 24985.4). Total num frames: 3997696. Throughput: 0: 37200.7. Samples: 4145100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-17 22:01:26,996][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:01:29,707][12883] Updated weights for policy 0, policy_version 250 (0.0038) +[2024-06-17 22:01:31,996][12645] Fps is (10 sec: 37674.7, 60 sec: 36589.6, 300 sec: 25320.4). Total num frames: 4177920. Throughput: 0: 37460.9. Samples: 4263260. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-17 22:01:31,996][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:01:34,206][12883] Updated weights for policy 0, policy_version 260 (0.0044) +[2024-06-17 22:01:36,994][12645] Fps is (10 sec: 37688.4, 60 sec: 37137.1, 300 sec: 25732.5). Total num frames: 4374528. Throughput: 0: 37856.8. Samples: 4492580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-17 22:01:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:01:37,021][12862] Saving new best policy, reward=0.003! +[2024-06-17 22:01:37,967][12883] Updated weights for policy 0, policy_version 270 (0.0032) +[2024-06-17 22:01:41,994][12645] Fps is (10 sec: 36052.7, 60 sec: 37137.1, 300 sec: 25933.5). Total num frames: 4538368. Throughput: 0: 38109.4. Samples: 4721920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 22:01:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:01:42,900][12883] Updated weights for policy 0, policy_version 280 (0.0036) +[2024-06-17 22:01:46,537][12883] Updated weights for policy 0, policy_version 290 (0.0035) +[2024-06-17 22:01:46,994][12645] Fps is (10 sec: 37683.1, 60 sec: 37956.3, 300 sec: 26396.4). Total num frames: 4751360. Throughput: 0: 38054.2. Samples: 4832540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-17 22:01:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:01:51,298][12883] Updated weights for policy 0, policy_version 300 (0.0033) +[2024-06-17 22:01:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 37956.3, 300 sec: 26657.2). Total num frames: 4931584. Throughput: 0: 38125.0. Samples: 5063360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 22:01:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:01:55,126][12883] Updated weights for policy 0, policy_version 310 (0.0031) +[2024-06-17 22:01:56,996][12645] Fps is (10 sec: 37674.9, 60 sec: 38228.0, 300 sec: 26990.2). Total num frames: 5128192. 
Throughput: 0: 38128.4. Samples: 5289920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-17 22:01:56,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:01:59,722][12883] Updated weights for policy 0, policy_version 320 (0.0040) +[2024-06-17 22:02:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 38502.4, 300 sec: 27390.7). Total num frames: 5341184. Throughput: 0: 38093.8. Samples: 5403880. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) +[2024-06-17 22:02:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:02:03,931][12883] Updated weights for policy 0, policy_version 330 (0.0028) +[2024-06-17 22:02:06,998][12645] Fps is (10 sec: 37675.6, 60 sec: 38226.7, 300 sec: 27524.5). Total num frames: 5505024. Throughput: 0: 38067.5. Samples: 5630540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-17 22:02:06,998][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:02:08,200][12883] Updated weights for policy 0, policy_version 340 (0.0025) +[2024-06-17 22:02:10,248][12862] Signal inference workers to stop experience collection... (50 times) +[2024-06-17 22:02:10,249][12862] Signal inference workers to resume experience collection... (50 times) +[2024-06-17 22:02:10,287][12883] InferenceWorker_p0-w0: stopping experience collection (50 times) +[2024-06-17 22:02:10,288][12883] InferenceWorker_p0-w0: resuming experience collection (50 times) +[2024-06-17 22:02:11,994][12645] Fps is (10 sec: 36044.6, 60 sec: 37956.3, 300 sec: 27812.8). Total num frames: 5701632. Throughput: 0: 38143.4. Samples: 5861500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-17 22:02:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:02:12,639][12883] Updated weights for policy 0, policy_version 350 (0.0047) +[2024-06-17 22:02:16,899][12883] Updated weights for policy 0, policy_version 360 (0.0028) +[2024-06-17 22:02:16,994][12645] Fps is (10 sec: 39338.2, 60 sec: 38775.4, 300 sec: 28086.9). Total num frames: 5898240. 
Throughput: 0: 38178.7. Samples: 5981220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 22:02:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:02:21,563][12883] Updated weights for policy 0, policy_version 370 (0.0040) +[2024-06-17 22:02:21,994][12645] Fps is (10 sec: 37683.6, 60 sec: 37956.3, 300 sec: 28271.9). Total num frames: 6078464. Throughput: 0: 38054.3. Samples: 6205020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-17 22:02:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:02:25,128][12883] Updated weights for policy 0, policy_version 380 (0.0039) +[2024-06-17 22:02:26,994][12645] Fps is (10 sec: 37683.1, 60 sec: 37957.1, 300 sec: 28523.0). Total num frames: 6275072. Throughput: 0: 38260.8. Samples: 6443660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-17 22:02:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:02:29,687][12883] Updated weights for policy 0, policy_version 390 (0.0042) +[2024-06-17 22:02:31,994][12645] Fps is (10 sec: 37683.2, 60 sec: 37957.7, 300 sec: 28690.2). Total num frames: 6455296. Throughput: 0: 38264.6. Samples: 6554440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-17 22:02:31,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:02:33,991][12883] Updated weights for policy 0, policy_version 400 (0.0042) +[2024-06-17 22:02:37,000][12645] Fps is (10 sec: 39295.5, 60 sec: 38225.1, 300 sec: 28991.7). Total num frames: 6668288. Throughput: 0: 38195.1. Samples: 6782400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-17 22:02:37,001][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:02:37,025][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000407_6668288.pth... +[2024-06-17 22:02:38,182][12883] Updated weights for policy 0, policy_version 410 (0.0035) +[2024-06-17 22:02:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 38775.5, 300 sec: 29212.3). Total num frames: 6864896. 
Throughput: 0: 38193.5. Samples: 7008540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 22:02:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:02:42,930][12883] Updated weights for policy 0, policy_version 420 (0.0029) +[2024-06-17 22:02:46,848][12883] Updated weights for policy 0, policy_version 430 (0.0051) +[2024-06-17 22:02:46,994][12645] Fps is (10 sec: 37708.2, 60 sec: 38229.3, 300 sec: 29354.7). Total num frames: 7045120. Throughput: 0: 38266.1. Samples: 7125860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-17 22:02:46,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:02:51,593][12883] Updated weights for policy 0, policy_version 440 (0.0037) +[2024-06-17 22:02:51,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38229.3, 300 sec: 29491.2). Total num frames: 7225344. Throughput: 0: 38185.3. Samples: 7348720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-17 22:02:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:02:55,338][12883] Updated weights for policy 0, policy_version 450 (0.0039) +[2024-06-17 22:02:56,994][12645] Fps is (10 sec: 37682.8, 60 sec: 38230.6, 300 sec: 29687.8). Total num frames: 7421952. Throughput: 0: 38292.3. Samples: 7584660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-17 22:02:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:03:00,296][12883] Updated weights for policy 0, policy_version 460 (0.0041) +[2024-06-17 22:03:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 38229.3, 300 sec: 29941.0). Total num frames: 7634944. Throughput: 0: 38268.0. Samples: 7703280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-17 22:03:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:03:03,552][12883] Updated weights for policy 0, policy_version 470 (0.0028) +[2024-06-17 22:03:06,994][12645] Fps is (10 sec: 37683.8, 60 sec: 38232.0, 300 sec: 29995.3). Total num frames: 7798784. Throughput: 0: 38314.1. Samples: 7929160. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-17 22:03:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:03:08,781][12883] Updated weights for policy 0, policy_version 480 (0.0037) +[2024-06-17 22:03:11,994][12645] Fps is (10 sec: 34406.3, 60 sec: 37956.2, 300 sec: 30109.5). Total num frames: 7979008. Throughput: 0: 38020.4. Samples: 8154580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-17 22:03:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:03:12,493][12883] Updated weights for policy 0, policy_version 490 (0.0043) +[2024-06-17 22:03:16,624][12883] Updated weights for policy 0, policy_version 500 (0.0039) +[2024-06-17 22:03:16,994][12645] Fps is (10 sec: 39321.2, 60 sec: 38229.3, 300 sec: 30340.7). Total num frames: 8192000. Throughput: 0: 38160.7. Samples: 8271680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-17 22:03:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:03:20,950][12883] Updated weights for policy 0, policy_version 510 (0.0034) +[2024-06-17 22:03:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 38502.3, 300 sec: 30504.0). Total num frames: 8388608. Throughput: 0: 38022.1. Samples: 8493140. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) +[2024-06-17 22:03:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:03:26,101][12883] Updated weights for policy 0, policy_version 520 (0.0037) +[2024-06-17 22:03:26,994][12645] Fps is (10 sec: 34407.0, 60 sec: 37683.3, 300 sec: 30485.9). Total num frames: 8536064. Throughput: 0: 38062.7. Samples: 8721360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-17 22:03:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:03:29,631][12883] Updated weights for policy 0, policy_version 530 (0.0041) +[2024-06-17 22:03:31,994][12645] Fps is (10 sec: 36044.9, 60 sec: 38229.3, 300 sec: 30698.4). Total num frames: 8749056. Throughput: 0: 37932.5. Samples: 8832820. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 22:03:31,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:03:33,993][12883] Updated weights for policy 0, policy_version 540 (0.0044) +[2024-06-17 22:03:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 37960.5, 300 sec: 30847.1). Total num frames: 8945664. Throughput: 0: 38247.2. Samples: 9069840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 22:03:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:03:37,800][12883] Updated weights for policy 0, policy_version 550 (0.0041) +[2024-06-17 22:03:41,994][12645] Fps is (10 sec: 37683.4, 60 sec: 37683.2, 300 sec: 30935.2). Total num frames: 9125888. Throughput: 0: 37891.7. Samples: 9289780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) +[2024-06-17 22:03:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:03:43,025][12883] Updated weights for policy 0, policy_version 560 (0.0042) +[2024-06-17 22:03:46,157][12862] Signal inference workers to stop experience collection... (100 times) +[2024-06-17 22:03:46,158][12862] Signal inference workers to resume experience collection... (100 times) +[2024-06-17 22:03:46,183][12883] InferenceWorker_p0-w0: stopping experience collection (100 times) +[2024-06-17 22:03:46,183][12883] InferenceWorker_p0-w0: resuming experience collection (100 times) +[2024-06-17 22:03:46,614][12883] Updated weights for policy 0, policy_version 570 (0.0037) +[2024-06-17 22:03:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 38229.4, 300 sec: 31657.2). Total num frames: 9338880. Throughput: 0: 37913.8. Samples: 9409400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 22:03:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:03:51,292][12883] Updated weights for policy 0, policy_version 580 (0.0031) +[2024-06-17 22:03:51,994][12645] Fps is (10 sec: 39321.1, 60 sec: 38229.3, 300 sec: 32268.1). Total num frames: 9519104. Throughput: 0: 37791.9. Samples: 9629800. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) +[2024-06-17 22:03:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:03:55,257][12883] Updated weights for policy 0, policy_version 590 (0.0046) +[2024-06-17 22:03:56,994][12645] Fps is (10 sec: 36045.1, 60 sec: 37956.4, 300 sec: 32323.7). Total num frames: 9699328. Throughput: 0: 37892.6. Samples: 9859740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-17 22:03:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:04:00,155][12883] Updated weights for policy 0, policy_version 600 (0.0026) +[2024-06-17 22:04:01,999][12645] Fps is (10 sec: 36027.8, 60 sec: 37407.1, 300 sec: 32934.1). Total num frames: 9879552. Throughput: 0: 37807.2. Samples: 9973180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 22:04:01,999][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:04:03,848][12883] Updated weights for policy 0, policy_version 610 (0.0043) +[2024-06-17 22:04:06,994][12645] Fps is (10 sec: 37682.5, 60 sec: 37956.2, 300 sec: 33490.0). Total num frames: 10076160. Throughput: 0: 37949.8. Samples: 10200880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-17 22:04:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:04:08,338][12883] Updated weights for policy 0, policy_version 620 (0.0037) +[2024-06-17 22:04:11,994][12645] Fps is (10 sec: 39340.9, 60 sec: 38229.4, 300 sec: 34045.4). Total num frames: 10272768. Throughput: 0: 37978.2. Samples: 10430380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-17 22:04:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:04:12,683][12883] Updated weights for policy 0, policy_version 630 (0.0044) +[2024-06-17 22:04:16,996][12645] Fps is (10 sec: 37675.0, 60 sec: 37681.9, 300 sec: 34489.4). Total num frames: 10452992. Throughput: 0: 37893.7. Samples: 10538120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 22:04:16,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:04:17,432][12883] Updated weights for policy 0, policy_version 640 (0.0039) +[2024-06-17 22:04:21,363][12883] Updated weights for policy 0, policy_version 650 (0.0038) +[2024-06-17 22:04:21,994][12645] Fps is (10 sec: 39320.9, 60 sec: 37956.2, 300 sec: 35045.1). Total num frames: 10665984. Throughput: 0: 37618.1. Samples: 10762660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-17 22:04:21,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:04:25,705][12883] Updated weights for policy 0, policy_version 660 (0.0042) +[2024-06-17 22:04:26,994][12645] Fps is (10 sec: 37691.7, 60 sec: 38229.3, 300 sec: 35322.8). Total num frames: 10829824. Throughput: 0: 37841.3. Samples: 10992640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-17 22:04:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:04:30,025][12883] Updated weights for policy 0, policy_version 670 (0.0053) +[2024-06-17 22:04:31,994][12645] Fps is (10 sec: 36045.1, 60 sec: 37956.3, 300 sec: 35711.6). Total num frames: 11026432. Throughput: 0: 37690.2. Samples: 11105460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-17 22:04:31,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:04:34,730][12883] Updated weights for policy 0, policy_version 680 (0.0041) +[2024-06-17 22:04:36,994][12645] Fps is (10 sec: 36044.5, 60 sec: 37410.1, 300 sec: 35933.7). Total num frames: 11190272. Throughput: 0: 37764.5. Samples: 11329200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-17 22:04:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:04:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000683_11190272.pth... 
+[2024-06-17 22:04:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000131_2146304.pth +[2024-06-17 22:04:38,696][12883] Updated weights for policy 0, policy_version 690 (0.0031) +[2024-06-17 22:04:41,994][12645] Fps is (10 sec: 36044.5, 60 sec: 37683.1, 300 sec: 36266.9). Total num frames: 11386880. Throughput: 0: 37885.6. Samples: 11564600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 22:04:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:04:43,031][12883] Updated weights for policy 0, policy_version 700 (0.0026) +[2024-06-17 22:04:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 37410.1, 300 sec: 36489.1). Total num frames: 11583488. Throughput: 0: 37850.3. Samples: 11676260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-17 22:04:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:04:47,411][12883] Updated weights for policy 0, policy_version 710 (0.0029) +[2024-06-17 22:04:51,842][12883] Updated weights for policy 0, policy_version 720 (0.0040) +[2024-06-17 22:04:51,996][12645] Fps is (10 sec: 40951.2, 60 sec: 37954.9, 300 sec: 36766.5). Total num frames: 11796480. Throughput: 0: 37769.3. Samples: 11900580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-17 22:04:52,005][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:04:56,156][12883] Updated weights for policy 0, policy_version 730 (0.0044) +[2024-06-17 22:04:56,994][12645] Fps is (10 sec: 39321.2, 60 sec: 37956.2, 300 sec: 36933.4). Total num frames: 11976704. Throughput: 0: 37727.4. Samples: 12128120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-17 22:04:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:05:00,355][12883] Updated weights for policy 0, policy_version 740 (0.0035) +[2024-06-17 22:05:01,994][12645] Fps is (10 sec: 37691.3, 60 sec: 38232.3, 300 sec: 37100.0). Total num frames: 12173312. Throughput: 0: 37967.6. Samples: 12246580. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-17 22:05:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:05:02,677][12862] Signal inference workers to stop experience collection... (150 times) +[2024-06-17 22:05:02,710][12883] InferenceWorker_p0-w0: stopping experience collection (150 times) +[2024-06-17 22:05:02,798][12862] Signal inference workers to resume experience collection... (150 times) +[2024-06-17 22:05:02,798][12883] InferenceWorker_p0-w0: resuming experience collection (150 times) +[2024-06-17 22:05:04,953][12883] Updated weights for policy 0, policy_version 750 (0.0048) +[2024-06-17 22:05:06,994][12645] Fps is (10 sec: 37683.0, 60 sec: 37956.2, 300 sec: 37266.7). Total num frames: 12353536. Throughput: 0: 37930.6. Samples: 12469540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-17 22:05:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:05:09,425][12883] Updated weights for policy 0, policy_version 760 (0.0028) +[2024-06-17 22:05:11,994][12645] Fps is (10 sec: 36045.0, 60 sec: 37683.1, 300 sec: 37377.7). Total num frames: 12533760. Throughput: 0: 37849.3. Samples: 12695860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-17 22:05:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:05:13,698][12883] Updated weights for policy 0, policy_version 770 (0.0034) +[2024-06-17 22:05:16,995][12645] Fps is (10 sec: 39317.9, 60 sec: 38230.1, 300 sec: 37599.7). Total num frames: 12746752. Throughput: 0: 38100.5. Samples: 12820020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-17 22:05:16,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:05:17,882][12883] Updated weights for policy 0, policy_version 780 (0.0024) +[2024-06-17 22:05:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 37683.2, 300 sec: 37711.0). Total num frames: 12926976. Throughput: 0: 38227.1. Samples: 13049420. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) +[2024-06-17 22:05:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:05:22,220][12883] Updated weights for policy 0, policy_version 790 (0.0033) +[2024-06-17 22:05:26,260][12883] Updated weights for policy 0, policy_version 800 (0.0046) +[2024-06-17 22:05:26,994][12645] Fps is (10 sec: 36048.3, 60 sec: 37956.2, 300 sec: 37711.0). Total num frames: 13107200. Throughput: 0: 37986.2. Samples: 13273980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-17 22:05:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:05:30,897][12883] Updated weights for policy 0, policy_version 810 (0.0041) +[2024-06-17 22:05:31,994][12645] Fps is (10 sec: 37683.3, 60 sec: 37956.3, 300 sec: 37822.0). Total num frames: 13303808. Throughput: 0: 38078.6. Samples: 13389800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 22:05:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:05:34,945][12883] Updated weights for policy 0, policy_version 820 (0.0037) +[2024-06-17 22:05:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 38775.5, 300 sec: 37988.7). Total num frames: 13516800. Throughput: 0: 38182.8. Samples: 13618720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-17 22:05:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:05:39,577][12883] Updated weights for policy 0, policy_version 830 (0.0037) +[2024-06-17 22:05:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 38502.4, 300 sec: 38044.2). Total num frames: 13697024. Throughput: 0: 37990.2. Samples: 13837680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 22:05:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:05:43,834][12883] Updated weights for policy 0, policy_version 840 (0.0044) +[2024-06-17 22:05:46,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38229.3, 300 sec: 38044.2). Total num frames: 13877248. Throughput: 0: 38092.9. Samples: 13960760. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-17 22:05:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:05:48,470][12883] Updated weights for policy 0, policy_version 850 (0.0033) +[2024-06-17 22:05:51,994][12645] Fps is (10 sec: 37682.9, 60 sec: 37957.6, 300 sec: 38099.7). Total num frames: 14073856. Throughput: 0: 38272.9. Samples: 14191820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-17 22:05:52,007][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:05:52,127][12883] Updated weights for policy 0, policy_version 860 (0.0028) +[2024-06-17 22:05:56,518][12883] Updated weights for policy 0, policy_version 870 (0.0026) +[2024-06-17 22:05:56,994][12645] Fps is (10 sec: 37683.3, 60 sec: 37956.3, 300 sec: 38044.2). Total num frames: 14254080. Throughput: 0: 38156.0. Samples: 14412880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-17 22:05:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:06:00,750][12883] Updated weights for policy 0, policy_version 880 (0.0035) +[2024-06-17 22:06:01,994][12645] Fps is (10 sec: 36045.0, 60 sec: 37683.2, 300 sec: 38044.2). Total num frames: 14434304. Throughput: 0: 37955.9. Samples: 14528000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) +[2024-06-17 22:06:01,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:06:05,819][12883] Updated weights for policy 0, policy_version 890 (0.0047) +[2024-06-17 22:06:06,996][12645] Fps is (10 sec: 39312.8, 60 sec: 38228.0, 300 sec: 38043.9). Total num frames: 14647296. Throughput: 0: 38114.6. Samples: 14764660. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-17 22:06:06,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:06:09,030][12883] Updated weights for policy 0, policy_version 900 (0.0042) +[2024-06-17 22:06:11,994][12645] Fps is (10 sec: 37683.2, 60 sec: 37956.2, 300 sec: 38099.7). Total num frames: 14811136. Throughput: 0: 38087.1. Samples: 14987900. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-17 22:06:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:06:14,187][12883] Updated weights for policy 0, policy_version 910 (0.0037) +[2024-06-17 22:06:16,994][12645] Fps is (10 sec: 39330.2, 60 sec: 38230.0, 300 sec: 38099.7). Total num frames: 15040512. Throughput: 0: 38019.9. Samples: 15100700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-17 22:06:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:06:18,378][12883] Updated weights for policy 0, policy_version 920 (0.0038) +[2024-06-17 22:06:21,994][12645] Fps is (10 sec: 39321.9, 60 sec: 37956.3, 300 sec: 37988.8). Total num frames: 15204352. Throughput: 0: 37989.8. Samples: 15328260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 22:06:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:06:23,373][12883] Updated weights for policy 0, policy_version 930 (0.0037) +[2024-06-17 22:06:26,994][12645] Fps is (10 sec: 34406.3, 60 sec: 37956.3, 300 sec: 37988.9). Total num frames: 15384576. Throughput: 0: 38236.4. Samples: 15558320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) +[2024-06-17 22:06:26,995][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:06:27,126][12883] Updated weights for policy 0, policy_version 940 (0.0040) +[2024-06-17 22:06:31,319][12883] Updated weights for policy 0, policy_version 950 (0.0041) +[2024-06-17 22:06:31,994][12645] Fps is (10 sec: 37683.0, 60 sec: 37956.2, 300 sec: 37988.7). Total num frames: 15581184. Throughput: 0: 37935.5. Samples: 15667860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-17 22:06:31,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:06:33,612][12862] Signal inference workers to stop experience collection... 
(200 times) +[2024-06-17 22:06:33,646][12883] InferenceWorker_p0-w0: stopping experience collection (200 times) +[2024-06-17 22:06:33,675][12862] Signal inference workers to resume experience collection... (200 times) +[2024-06-17 22:06:33,676][12883] InferenceWorker_p0-w0: resuming experience collection (200 times) +[2024-06-17 22:06:36,099][12883] Updated weights for policy 0, policy_version 960 (0.0048) +[2024-06-17 22:06:36,994][12645] Fps is (10 sec: 37684.1, 60 sec: 37410.2, 300 sec: 38044.2). Total num frames: 15761408. Throughput: 0: 37986.9. Samples: 15901220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) +[2024-06-17 22:06:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:06:37,056][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000963_15777792.pth... +[2024-06-17 22:06:37,107][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000407_6668288.pth +[2024-06-17 22:06:40,010][12883] Updated weights for policy 0, policy_version 970 (0.0045) +[2024-06-17 22:06:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 37956.3, 300 sec: 38044.2). Total num frames: 15974400. Throughput: 0: 38044.9. Samples: 16124900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-17 22:06:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:06:44,491][12883] Updated weights for policy 0, policy_version 980 (0.0044) +[2024-06-17 22:06:46,994][12645] Fps is (10 sec: 39320.5, 60 sec: 37956.2, 300 sec: 38044.2). Total num frames: 16154624. Throughput: 0: 38094.2. Samples: 16242240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-17 22:06:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:06:48,573][12883] Updated weights for policy 0, policy_version 990 (0.0044) +[2024-06-17 22:06:51,994][12645] Fps is (10 sec: 39322.2, 60 sec: 38229.5, 300 sec: 38100.0). Total num frames: 16367616. Throughput: 0: 37864.7. Samples: 16468480. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-17 22:06:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:06:53,782][12883] Updated weights for policy 0, policy_version 1000 (0.0045) +[2024-06-17 22:06:56,997][12645] Fps is (10 sec: 37670.4, 60 sec: 37954.1, 300 sec: 37932.7). Total num frames: 16531456. Throughput: 0: 37894.5. Samples: 16693280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-17 22:06:56,998][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:06:57,451][12883] Updated weights for policy 0, policy_version 1010 (0.0038) +[2024-06-17 22:07:01,623][12883] Updated weights for policy 0, policy_version 1020 (0.0028) +[2024-06-17 22:07:01,994][12645] Fps is (10 sec: 34405.9, 60 sec: 37956.3, 300 sec: 37989.2). Total num frames: 16711680. Throughput: 0: 37948.0. Samples: 16808360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) +[2024-06-17 22:07:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:07:05,675][12883] Updated weights for policy 0, policy_version 1030 (0.0034) +[2024-06-17 22:07:06,994][12645] Fps is (10 sec: 39335.4, 60 sec: 37957.7, 300 sec: 38044.2). Total num frames: 16924672. Throughput: 0: 38004.5. Samples: 17038460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 22:07:06,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:07:10,287][12883] Updated weights for policy 0, policy_version 1040 (0.0039) +[2024-06-17 22:07:11,994][12645] Fps is (10 sec: 37683.8, 60 sec: 37956.4, 300 sec: 37933.1). Total num frames: 17088512. Throughput: 0: 37896.2. Samples: 17263640. Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) +[2024-06-17 22:07:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:07:14,182][12883] Updated weights for policy 0, policy_version 1050 (0.0026) +[2024-06-17 22:07:16,994][12645] Fps is (10 sec: 37682.9, 60 sec: 37683.2, 300 sec: 38044.2). Total num frames: 17301504. Throughput: 0: 37864.0. Samples: 17371740. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-17 22:07:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:07:18,848][12883] Updated weights for policy 0, policy_version 1060 (0.0061) +[2024-06-17 22:07:21,993][12645] Fps is (10 sec: 40960.4, 60 sec: 38229.5, 300 sec: 38044.2). Total num frames: 17498112. Throughput: 0: 37936.9. Samples: 17608380. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) +[2024-06-17 22:07:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:07:22,682][12883] Updated weights for policy 0, policy_version 1070 (0.0032) +[2024-06-17 22:07:26,994][12645] Fps is (10 sec: 36045.1, 60 sec: 37956.3, 300 sec: 37988.7). Total num frames: 17661952. Throughput: 0: 37969.8. Samples: 17833540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-17 22:07:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:07:27,503][12883] Updated weights for policy 0, policy_version 1080 (0.0037) +[2024-06-17 22:07:31,215][12883] Updated weights for policy 0, policy_version 1090 (0.0034) +[2024-06-17 22:07:31,994][12645] Fps is (10 sec: 37682.3, 60 sec: 38229.4, 300 sec: 37989.5). Total num frames: 17874944. Throughput: 0: 37845.9. Samples: 17945300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-17 22:07:31,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:07:36,151][12883] Updated weights for policy 0, policy_version 1100 (0.0046) +[2024-06-17 22:07:36,994][12645] Fps is (10 sec: 36045.1, 60 sec: 37683.2, 300 sec: 37822.1). Total num frames: 18022400. Throughput: 0: 37901.3. Samples: 18174040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 22:07:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:07:39,694][12883] Updated weights for policy 0, policy_version 1110 (0.0040) +[2024-06-17 22:07:41,994][12645] Fps is (10 sec: 37683.5, 60 sec: 37956.3, 300 sec: 37988.7). Total num frames: 18251776. Throughput: 0: 38050.6. Samples: 18405420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-17 22:07:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:07:44,873][12883] Updated weights for policy 0, policy_version 1120 (0.0039) +[2024-06-17 22:07:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 37683.3, 300 sec: 37933.1). Total num frames: 18415616. Throughput: 0: 38011.2. Samples: 18518860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-17 22:07:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:07:48,418][12883] Updated weights for policy 0, policy_version 1130 (0.0040) +[2024-06-17 22:07:51,994][12645] Fps is (10 sec: 36044.8, 60 sec: 37410.1, 300 sec: 37933.2). Total num frames: 18612224. Throughput: 0: 37860.5. Samples: 18742180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-17 22:07:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:07:53,501][12883] Updated weights for policy 0, policy_version 1140 (0.0044) +[2024-06-17 22:07:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 37958.5, 300 sec: 37877.6). Total num frames: 18808832. Throughput: 0: 37960.9. Samples: 18971880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-17 22:07:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:07:57,275][12883] Updated weights for policy 0, policy_version 1150 (0.0042) +[2024-06-17 22:08:01,994][12645] Fps is (10 sec: 37682.9, 60 sec: 37956.3, 300 sec: 37933.1). Total num frames: 18989056. Throughput: 0: 38019.1. Samples: 19082600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-17 22:08:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:08:02,031][12883] Updated weights for policy 0, policy_version 1160 (0.0038) +[2024-06-17 22:08:05,650][12883] Updated weights for policy 0, policy_version 1170 (0.0045) +[2024-06-17 22:08:06,995][12645] Fps is (10 sec: 39317.6, 60 sec: 37955.7, 300 sec: 38044.1). Total num frames: 19202048. Throughput: 0: 37822.1. Samples: 19310420. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-17 22:08:06,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:08:10,509][12883] Updated weights for policy 0, policy_version 1180 (0.0035) +[2024-06-17 22:08:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38229.3, 300 sec: 37933.1). Total num frames: 19382272. Throughput: 0: 38116.0. Samples: 19548760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-17 22:08:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:08:14,397][12883] Updated weights for policy 0, policy_version 1190 (0.0035) +[2024-06-17 22:08:15,903][12862] Signal inference workers to stop experience collection... (250 times) +[2024-06-17 22:08:15,924][12883] InferenceWorker_p0-w0: stopping experience collection (250 times) +[2024-06-17 22:08:15,958][12862] Signal inference workers to resume experience collection... (250 times) +[2024-06-17 22:08:15,960][12883] InferenceWorker_p0-w0: resuming experience collection (250 times) +[2024-06-17 22:08:16,994][12645] Fps is (10 sec: 37686.7, 60 sec: 37956.3, 300 sec: 37933.1). Total num frames: 19578880. Throughput: 0: 38102.2. Samples: 19659900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) +[2024-06-17 22:08:16,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:08:19,225][12883] Updated weights for policy 0, policy_version 1200 (0.0055) +[2024-06-17 22:08:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 37683.1, 300 sec: 38044.2). Total num frames: 19759104. Throughput: 0: 38064.9. Samples: 19886960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) +[2024-06-17 22:08:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:08:23,046][12883] Updated weights for policy 0, policy_version 1210 (0.0038) +[2024-06-17 22:08:26,994][12645] Fps is (10 sec: 36045.0, 60 sec: 37956.3, 300 sec: 37933.1). Total num frames: 19939328. Throughput: 0: 38080.9. Samples: 20119060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 22:08:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:08:27,791][12883] Updated weights for policy 0, policy_version 1220 (0.0024) +[2024-06-17 22:08:31,794][12883] Updated weights for policy 0, policy_version 1230 (0.0036) +[2024-06-17 22:08:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 37956.3, 300 sec: 37988.7). Total num frames: 20152320. Throughput: 0: 38098.7. Samples: 20233300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 22:08:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:08:36,479][12883] Updated weights for policy 0, policy_version 1240 (0.0047) +[2024-06-17 22:08:36,996][12645] Fps is (10 sec: 39312.8, 60 sec: 38500.9, 300 sec: 37988.4). Total num frames: 20332544. Throughput: 0: 38099.0. Samples: 20456720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-17 22:08:36,997][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:08:37,149][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000001242_20348928.pth... +[2024-06-17 22:08:37,201][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000683_11190272.pth +[2024-06-17 22:08:40,638][12883] Updated weights for policy 0, policy_version 1250 (0.0035) +[2024-06-17 22:08:42,000][12645] Fps is (10 sec: 37659.6, 60 sec: 37952.3, 300 sec: 37932.3). Total num frames: 20529152. Throughput: 0: 38093.8. Samples: 20686340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-17 22:08:42,001][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:08:44,988][12883] Updated weights for policy 0, policy_version 1260 (0.0050) +[2024-06-17 22:08:46,994][12645] Fps is (10 sec: 39330.8, 60 sec: 38502.5, 300 sec: 37988.7). Total num frames: 20725760. Throughput: 0: 38306.8. Samples: 20806400. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) +[2024-06-17 22:08:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:08:49,406][12883] Updated weights for policy 0, policy_version 1270 (0.0034) +[2024-06-17 22:08:51,996][12645] Fps is (10 sec: 36059.4, 60 sec: 37954.8, 300 sec: 37932.8). Total num frames: 20889600. Throughput: 0: 38219.4. Samples: 21030340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-17 22:08:51,996][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:08:53,335][12883] Updated weights for policy 0, policy_version 1280 (0.0035) +[2024-06-17 22:08:56,994][12645] Fps is (10 sec: 37682.9, 60 sec: 38229.3, 300 sec: 38044.8). Total num frames: 21102592. Throughput: 0: 37968.1. Samples: 21257320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-17 22:08:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:08:58,167][12883] Updated weights for policy 0, policy_version 1290 (0.0032) +[2024-06-17 22:09:01,919][12883] Updated weights for policy 0, policy_version 1300 (0.0039) +[2024-06-17 22:09:01,994][12645] Fps is (10 sec: 40969.5, 60 sec: 38502.5, 300 sec: 38044.2). Total num frames: 21299200. Throughput: 0: 38037.0. Samples: 21371560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-17 22:09:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:09:06,598][12883] Updated weights for policy 0, policy_version 1310 (0.0032) +[2024-06-17 22:09:06,994][12645] Fps is (10 sec: 37683.2, 60 sec: 37956.9, 300 sec: 37988.7). Total num frames: 21479424. Throughput: 0: 38139.5. Samples: 21603240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) +[2024-06-17 22:09:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:09:10,675][12883] Updated weights for policy 0, policy_version 1320 (0.0030) +[2024-06-17 22:09:11,994][12645] Fps is (10 sec: 37682.4, 60 sec: 38229.3, 300 sec: 38044.5). Total num frames: 21676032. Throughput: 0: 37983.5. Samples: 21828320. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 22:09:11,999][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:09:15,225][12883] Updated weights for policy 0, policy_version 1330 (0.0043) +[2024-06-17 22:09:16,994][12645] Fps is (10 sec: 36044.3, 60 sec: 37683.1, 300 sec: 37877.6). Total num frames: 21839872. Throughput: 0: 37992.3. Samples: 21942960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 22:09:16,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:09:19,405][12883] Updated weights for policy 0, policy_version 1340 (0.0044) +[2024-06-17 22:09:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 38775.5, 300 sec: 38155.3). Total num frames: 22085632. Throughput: 0: 38106.3. Samples: 22171420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 22:09:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:09:23,874][12883] Updated weights for policy 0, policy_version 1350 (0.0031) +[2024-06-17 22:09:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 38229.3, 300 sec: 37988.7). Total num frames: 22233088. Throughput: 0: 38040.8. Samples: 22397940. Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) +[2024-06-17 22:09:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:09:28,178][12883] Updated weights for policy 0, policy_version 1360 (0.0042) +[2024-06-17 22:09:31,995][12645] Fps is (10 sec: 34402.1, 60 sec: 37955.5, 300 sec: 38099.6). Total num frames: 22429696. Throughput: 0: 37885.5. Samples: 22511300. Policy #0 lag: (min: 1.0, avg: 12.6, max: 26.0) +[2024-06-17 22:09:31,996][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:09:32,503][12883] Updated weights for policy 0, policy_version 1370 (0.0036) +[2024-06-17 22:09:36,613][12883] Updated weights for policy 0, policy_version 1380 (0.0029) +[2024-06-17 22:09:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 37957.6, 300 sec: 38044.2). Total num frames: 22609920. Throughput: 0: 38169.8. Samples: 22747900. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-17 22:09:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:09:40,535][12883] Updated weights for policy 0, policy_version 1390 (0.0040) +[2024-06-17 22:09:41,994][12645] Fps is (10 sec: 36048.7, 60 sec: 37687.1, 300 sec: 37988.6). Total num frames: 22790144. Throughput: 0: 38031.4. Samples: 22968740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 22:09:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:09:45,204][12883] Updated weights for policy 0, policy_version 1400 (0.0035) +[2024-06-17 22:09:45,931][12862] Signal inference workers to stop experience collection... (300 times) +[2024-06-17 22:09:45,973][12883] InferenceWorker_p0-w0: stopping experience collection (300 times) +[2024-06-17 22:09:45,984][12862] Signal inference workers to resume experience collection... (300 times) +[2024-06-17 22:09:45,993][12883] InferenceWorker_p0-w0: resuming experience collection (300 times) +[2024-06-17 22:09:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 38229.3, 300 sec: 38044.5). Total num frames: 23019520. Throughput: 0: 38021.2. Samples: 23082520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) +[2024-06-17 22:09:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:09:49,112][12883] Updated weights for policy 0, policy_version 1410 (0.0034) +[2024-06-17 22:09:51,994][12645] Fps is (10 sec: 37684.0, 60 sec: 37957.7, 300 sec: 37933.1). Total num frames: 23166976. Throughput: 0: 38142.7. Samples: 23319660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) +[2024-06-17 22:09:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:09:53,922][12883] Updated weights for policy 0, policy_version 1420 (0.0045) +[2024-06-17 22:09:56,994][12645] Fps is (10 sec: 39321.5, 60 sec: 38502.4, 300 sec: 38099.8). Total num frames: 23412736. Throughput: 0: 38024.1. Samples: 23539400. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-17 22:09:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:09:57,223][12883] Updated weights for policy 0, policy_version 1430 (0.0043) +[2024-06-17 22:10:01,994][12645] Fps is (10 sec: 39321.2, 60 sec: 37683.1, 300 sec: 37988.7). Total num frames: 23560192. Throughput: 0: 37988.1. Samples: 23652420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-17 22:10:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:10:02,608][12883] Updated weights for policy 0, policy_version 1440 (0.0038) +[2024-06-17 22:10:06,323][12883] Updated weights for policy 0, policy_version 1450 (0.0040) +[2024-06-17 22:10:06,994][12645] Fps is (10 sec: 34406.4, 60 sec: 37956.2, 300 sec: 38044.2). Total num frames: 23756800. Throughput: 0: 37899.1. Samples: 23876880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-17 22:10:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:10:11,293][12883] Updated weights for policy 0, policy_version 1460 (0.0027) +[2024-06-17 22:10:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 37956.4, 300 sec: 37988.8). Total num frames: 23953408. Throughput: 0: 38041.9. Samples: 24109820. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) +[2024-06-17 22:10:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:10:15,110][12883] Updated weights for policy 0, policy_version 1470 (0.0032) +[2024-06-17 22:10:16,994][12645] Fps is (10 sec: 36045.1, 60 sec: 37956.4, 300 sec: 37933.1). Total num frames: 24117248. Throughput: 0: 37911.7. Samples: 24217280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 22:10:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:10:19,739][12883] Updated weights for policy 0, policy_version 1480 (0.0047) +[2024-06-17 22:10:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 37683.2, 300 sec: 38099.8). Total num frames: 24346624. Throughput: 0: 37714.3. Samples: 24445040. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-17 22:10:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:10:23,223][12883] Updated weights for policy 0, policy_version 1490 (0.0030) +[2024-06-17 22:10:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 37956.3, 300 sec: 37988.7). Total num frames: 24510464. Throughput: 0: 38170.7. Samples: 24686420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-17 22:10:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:10:28,462][12883] Updated weights for policy 0, policy_version 1500 (0.0042) +[2024-06-17 22:10:31,994][12645] Fps is (10 sec: 37682.9, 60 sec: 38230.1, 300 sec: 37988.7). Total num frames: 24723456. Throughput: 0: 37953.7. Samples: 24790440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) +[2024-06-17 22:10:31,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:10:32,097][12883] Updated weights for policy 0, policy_version 1510 (0.0046) +[2024-06-17 22:10:36,564][12883] Updated weights for policy 0, policy_version 1520 (0.0025) +[2024-06-17 22:10:36,996][12645] Fps is (10 sec: 40950.7, 60 sec: 38501.0, 300 sec: 38043.9). Total num frames: 24920064. Throughput: 0: 37884.2. Samples: 25024540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 22:10:36,997][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:10:37,030][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000001521_24920064.pth... +[2024-06-17 22:10:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000963_15777792.pth +[2024-06-17 22:10:40,803][12883] Updated weights for policy 0, policy_version 1530 (0.0035) +[2024-06-17 22:10:41,994][12645] Fps is (10 sec: 36045.1, 60 sec: 38229.4, 300 sec: 37988.7). Total num frames: 25083904. Throughput: 0: 38108.9. Samples: 25254300. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) +[2024-06-17 22:10:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:10:45,628][12883] Updated weights for policy 0, policy_version 1540 (0.0043) +[2024-06-17 22:10:46,994][12645] Fps is (10 sec: 37692.0, 60 sec: 37956.3, 300 sec: 38044.2). Total num frames: 25296896. Throughput: 0: 38099.2. Samples: 25366880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 19.0) +[2024-06-17 22:10:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:10:49,786][12883] Updated weights for policy 0, policy_version 1550 (0.0038) +[2024-06-17 22:10:51,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38229.3, 300 sec: 37988.7). Total num frames: 25460736. Throughput: 0: 38268.4. Samples: 25598960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-17 22:10:51,996][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:10:54,079][12883] Updated weights for policy 0, policy_version 1560 (0.0032) +[2024-06-17 22:10:56,994][12645] Fps is (10 sec: 37683.1, 60 sec: 37683.2, 300 sec: 38099.8). Total num frames: 25673728. Throughput: 0: 37950.2. Samples: 25817580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) +[2024-06-17 22:10:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:10:58,061][12883] Updated weights for policy 0, policy_version 1570 (0.0048) +[2024-06-17 22:11:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 37956.3, 300 sec: 37933.4). Total num frames: 25837568. Throughput: 0: 38303.1. Samples: 25940920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-17 22:11:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:11:02,632][12883] Updated weights for policy 0, policy_version 1580 (0.0040) +[2024-06-17 22:11:06,612][12883] Updated weights for policy 0, policy_version 1590 (0.0036) +[2024-06-17 22:11:06,994][12645] Fps is (10 sec: 37682.9, 60 sec: 38229.3, 300 sec: 38099.7). Total num frames: 26050560. Throughput: 0: 38105.7. Samples: 26159800. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-17 22:11:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:11:11,377][12883] Updated weights for policy 0, policy_version 1600 (0.0043) +[2024-06-17 22:11:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 38229.4, 300 sec: 37988.7). Total num frames: 26247168. Throughput: 0: 37972.1. Samples: 26395160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-17 22:11:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:11:15,460][12883] Updated weights for policy 0, policy_version 1610 (0.0044) +[2024-06-17 22:11:16,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38502.3, 300 sec: 38044.2). Total num frames: 26427392. Throughput: 0: 38155.5. Samples: 26507440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-17 22:11:16,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:11:20,186][12883] Updated weights for policy 0, policy_version 1620 (0.0031) +[2024-06-17 22:11:21,994][12645] Fps is (10 sec: 37682.5, 60 sec: 37956.2, 300 sec: 38099.7). Total num frames: 26624000. Throughput: 0: 37991.7. Samples: 26734080. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-17 22:11:21,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:11:23,558][12862] Signal inference workers to stop experience collection... (350 times) +[2024-06-17 22:11:23,586][12883] InferenceWorker_p0-w0: stopping experience collection (350 times) +[2024-06-17 22:11:23,616][12862] Signal inference workers to resume experience collection... (350 times) +[2024-06-17 22:11:23,618][12883] InferenceWorker_p0-w0: resuming experience collection (350 times) +[2024-06-17 22:11:23,767][12883] Updated weights for policy 0, policy_version 1630 (0.0045) +[2024-06-17 22:11:26,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38229.3, 300 sec: 38044.2). Total num frames: 26804224. Throughput: 0: 38090.2. Samples: 26968360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 22:11:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:11:28,526][12883] Updated weights for policy 0, policy_version 1640 (0.0031) +[2024-06-17 22:11:31,994][12645] Fps is (10 sec: 37683.1, 60 sec: 37956.2, 300 sec: 38099.7). Total num frames: 27000832. Throughput: 0: 38031.0. Samples: 27078280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 22:11:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:11:32,478][12883] Updated weights for policy 0, policy_version 1650 (0.0045) +[2024-06-17 22:11:36,994][12645] Fps is (10 sec: 37683.5, 60 sec: 37684.7, 300 sec: 37988.7). Total num frames: 27181056. Throughput: 0: 38042.7. Samples: 27310880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 22:11:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:11:37,090][12883] Updated weights for policy 0, policy_version 1660 (0.0040) +[2024-06-17 22:11:41,145][12883] Updated weights for policy 0, policy_version 1670 (0.0045) +[2024-06-17 22:11:41,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38229.3, 300 sec: 38044.2). Total num frames: 27377664. Throughput: 0: 38211.5. Samples: 27537100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-17 22:11:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:11:46,056][12883] Updated weights for policy 0, policy_version 1680 (0.0039) +[2024-06-17 22:11:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 37956.3, 300 sec: 37988.7). Total num frames: 27574272. Throughput: 0: 38106.3. Samples: 27655700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 22:11:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:11:49,388][12883] Updated weights for policy 0, policy_version 1690 (0.0045) +[2024-06-17 22:11:51,994][12645] Fps is (10 sec: 37683.5, 60 sec: 38229.4, 300 sec: 38044.7). Total num frames: 27754496. Throughput: 0: 38363.6. Samples: 27886160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-17 22:11:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:11:54,625][12883] Updated weights for policy 0, policy_version 1700 (0.0049) +[2024-06-17 22:11:56,994][12645] Fps is (10 sec: 40959.3, 60 sec: 38502.4, 300 sec: 38210.8). Total num frames: 27983872. Throughput: 0: 38113.6. Samples: 28110280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-17 22:11:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:11:57,831][12883] Updated weights for policy 0, policy_version 1710 (0.0028) +[2024-06-17 22:12:01,994][12645] Fps is (10 sec: 36045.0, 60 sec: 37956.3, 300 sec: 37933.1). Total num frames: 28114944. Throughput: 0: 38317.9. Samples: 28231740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-17 22:12:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:12:03,243][12883] Updated weights for policy 0, policy_version 1720 (0.0039) +[2024-06-17 22:12:05,778][12883] Updated weights for policy 0, policy_version 1730 (0.0033) +[2024-06-17 22:12:06,994][12645] Fps is (10 sec: 37683.3, 60 sec: 38502.4, 300 sec: 38210.8). Total num frames: 28360704. Throughput: 0: 38288.5. Samples: 28457060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-17 22:12:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:12:11,994][12645] Fps is (10 sec: 37683.0, 60 sec: 37410.1, 300 sec: 37933.1). Total num frames: 28491776. Throughput: 0: 38327.2. Samples: 28693080. Policy #0 lag: (min: 0.0, avg: 7.7, max: 22.0) +[2024-06-17 22:12:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:12:12,015][12883] Updated weights for policy 0, policy_version 1740 (0.0039) +[2024-06-17 22:12:15,075][12883] Updated weights for policy 0, policy_version 1750 (0.0039) +[2024-06-17 22:12:16,994][12645] Fps is (10 sec: 36045.2, 60 sec: 38229.5, 300 sec: 38044.2). Total num frames: 28721152. Throughput: 0: 38216.6. Samples: 28798020. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-17 22:12:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:12:20,258][12883] Updated weights for policy 0, policy_version 1760 (0.0042) +[2024-06-17 22:12:21,994][12645] Fps is (10 sec: 44236.0, 60 sec: 38502.4, 300 sec: 38210.8). Total num frames: 28934144. Throughput: 0: 38176.8. Samples: 29028840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 22:12:21,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:12:23,166][12883] Updated weights for policy 0, policy_version 1770 (0.0041) +[2024-06-17 22:12:26,996][12645] Fps is (10 sec: 36036.3, 60 sec: 37954.9, 300 sec: 37988.4). Total num frames: 29081600. Throughput: 0: 38376.3. Samples: 29264120. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-17 22:12:26,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:12:28,909][12883] Updated weights for policy 0, policy_version 1780 (0.0032) +[2024-06-17 22:12:31,994][12645] Fps is (10 sec: 37683.8, 60 sec: 38502.5, 300 sec: 38266.4). Total num frames: 29310976. Throughput: 0: 38245.7. Samples: 29376760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-17 22:12:31,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:12:32,015][12883] Updated weights for policy 0, policy_version 1790 (0.0036) +[2024-06-17 22:12:36,994][12645] Fps is (10 sec: 37691.4, 60 sec: 37956.2, 300 sec: 37988.6). Total num frames: 29458432. Throughput: 0: 38324.3. Samples: 29610760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-17 22:12:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:12:37,102][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000001799_29474816.pth... 
+[2024-06-17 22:12:37,159][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000001242_20348928.pth +[2024-06-17 22:12:37,360][12883] Updated weights for policy 0, policy_version 1800 (0.0034) +[2024-06-17 22:12:40,356][12883] Updated weights for policy 0, policy_version 1810 (0.0033) +[2024-06-17 22:12:41,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38502.4, 300 sec: 38210.8). Total num frames: 29687808. Throughput: 0: 38212.0. Samples: 29829820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-17 22:12:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:12:45,810][12883] Updated weights for policy 0, policy_version 1820 (0.0026) +[2024-06-17 22:12:46,998][12645] Fps is (10 sec: 37666.5, 60 sec: 37680.3, 300 sec: 38043.6). Total num frames: 29835264. Throughput: 0: 38195.2. Samples: 29950700. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0) +[2024-06-17 22:12:46,999][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:12:47,482][12862] Signal inference workers to stop experience collection... (400 times) +[2024-06-17 22:12:47,534][12883] InferenceWorker_p0-w0: stopping experience collection (400 times) +[2024-06-17 22:12:47,603][12862] Signal inference workers to resume experience collection... (400 times) +[2024-06-17 22:12:47,603][12883] InferenceWorker_p0-w0: resuming experience collection (400 times) +[2024-06-17 22:12:48,961][12883] Updated weights for policy 0, policy_version 1830 (0.0035) +[2024-06-17 22:12:51,996][12645] Fps is (10 sec: 34398.6, 60 sec: 37954.8, 300 sec: 38043.9). Total num frames: 30031872. Throughput: 0: 38212.3. Samples: 30176700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 22:12:51,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:12:54,402][12883] Updated weights for policy 0, policy_version 1840 (0.0029) +[2024-06-17 22:12:56,994][12645] Fps is (10 sec: 45896.1, 60 sec: 38502.5, 300 sec: 38321.9). Total num frames: 30294016. Throughput: 0: 38157.8. 
Samples: 30410180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 26.0) +[2024-06-17 22:12:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:12:57,589][12883] Updated weights for policy 0, policy_version 1850 (0.0037) +[2024-06-17 22:13:01,994][12645] Fps is (10 sec: 39331.0, 60 sec: 38502.4, 300 sec: 38044.3). Total num frames: 30425088. Throughput: 0: 38597.3. Samples: 30534900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-17 22:13:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:13:02,846][12883] Updated weights for policy 0, policy_version 1860 (0.0029) +[2024-06-17 22:13:05,759][12883] Updated weights for policy 0, policy_version 1870 (0.0033) +[2024-06-17 22:13:06,994][12645] Fps is (10 sec: 36044.4, 60 sec: 38229.3, 300 sec: 38210.8). Total num frames: 30654464. Throughput: 0: 38437.8. Samples: 30758540. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) +[2024-06-17 22:13:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:13:11,457][12883] Updated weights for policy 0, policy_version 1880 (0.0033) +[2024-06-17 22:13:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39048.6, 300 sec: 38155.3). Total num frames: 30834688. Throughput: 0: 38482.4. Samples: 30995740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-17 22:13:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:13:14,270][12883] Updated weights for policy 0, policy_version 1890 (0.0036) +[2024-06-17 22:13:16,994][12645] Fps is (10 sec: 36045.2, 60 sec: 38229.3, 300 sec: 38155.3). Total num frames: 31014912. Throughput: 0: 38412.0. Samples: 31105300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) +[2024-06-17 22:13:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:13:19,688][12883] Updated weights for policy 0, policy_version 1900 (0.0031) +[2024-06-17 22:13:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 38502.5, 300 sec: 38321.9). Total num frames: 31244288. Throughput: 0: 38544.1. Samples: 31345240. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) +[2024-06-17 22:13:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:13:22,792][12883] Updated weights for policy 0, policy_version 1910 (0.0048) +[2024-06-17 22:13:26,994][12645] Fps is (10 sec: 36044.8, 60 sec: 38230.8, 300 sec: 38044.2). Total num frames: 31375360. Throughput: 0: 38889.8. Samples: 31579860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-17 22:13:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:13:28,279][12883] Updated weights for policy 0, policy_version 1920 (0.0040) +[2024-06-17 22:13:31,437][12883] Updated weights for policy 0, policy_version 1930 (0.0023) +[2024-06-17 22:13:31,994][12645] Fps is (10 sec: 39321.1, 60 sec: 38775.4, 300 sec: 38322.2). Total num frames: 31637504. Throughput: 0: 38600.3. Samples: 31687540. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) +[2024-06-17 22:13:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:13:36,693][12883] Updated weights for policy 0, policy_version 1940 (0.0038) +[2024-06-17 22:13:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 39048.6, 300 sec: 38211.6). Total num frames: 31801344. Throughput: 0: 38948.3. Samples: 31929280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-17 22:13:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:13:39,676][12883] Updated weights for policy 0, policy_version 1950 (0.0037) +[2024-06-17 22:13:41,994][12645] Fps is (10 sec: 34406.7, 60 sec: 38229.4, 300 sec: 38155.3). Total num frames: 31981568. Throughput: 0: 38948.0. Samples: 32162840. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) +[2024-06-17 22:13:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:13:44,893][12883] Updated weights for policy 0, policy_version 1960 (0.0047) +[2024-06-17 22:13:46,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39597.6, 300 sec: 38377.7). Total num frames: 32210944. Throughput: 0: 38875.0. Samples: 32284280. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 24.0) +[2024-06-17 22:13:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:13:48,230][12883] Updated weights for policy 0, policy_version 1970 (0.0038) +[2024-06-17 22:13:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39050.1, 300 sec: 38210.8). Total num frames: 32374784. Throughput: 0: 38976.1. Samples: 32512460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 22:13:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:13:53,523][12883] Updated weights for policy 0, policy_version 1980 (0.0040) +[2024-06-17 22:13:56,897][12883] Updated weights for policy 0, policy_version 1990 (0.0041) +[2024-06-17 22:13:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 38502.4, 300 sec: 38321.9). Total num frames: 32604160. Throughput: 0: 38772.8. Samples: 32740520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) +[2024-06-17 22:13:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:14:01,993][12645] Fps is (10 sec: 39322.1, 60 sec: 39048.6, 300 sec: 38266.4). Total num frames: 32768000. Throughput: 0: 39105.5. Samples: 32865040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-17 22:14:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:14:01,999][12883] Updated weights for policy 0, policy_version 2000 (0.0030) +[2024-06-17 22:14:04,698][12862] Signal inference workers to stop experience collection... (450 times) +[2024-06-17 22:14:04,744][12883] InferenceWorker_p0-w0: stopping experience collection (450 times) +[2024-06-17 22:14:04,812][12862] Signal inference workers to resume experience collection... (450 times) +[2024-06-17 22:14:04,812][12883] InferenceWorker_p0-w0: resuming experience collection (450 times) +[2024-06-17 22:14:04,953][12883] Updated weights for policy 0, policy_version 2010 (0.0038) +[2024-06-17 22:14:06,996][12645] Fps is (10 sec: 36036.7, 60 sec: 38501.0, 300 sec: 38266.1). Total num frames: 32964608. Throughput: 0: 38702.4. 
Samples: 33086940. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) +[2024-06-17 22:14:06,997][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:14:10,597][12883] Updated weights for policy 0, policy_version 2020 (0.0039) +[2024-06-17 22:14:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 39048.5, 300 sec: 38433.0). Total num frames: 33177600. Throughput: 0: 38700.0. Samples: 33321360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 22:14:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:14:13,858][12883] Updated weights for policy 0, policy_version 2030 (0.0034) +[2024-06-17 22:14:16,996][12645] Fps is (10 sec: 37683.1, 60 sec: 38774.0, 300 sec: 38155.0). Total num frames: 33341440. Throughput: 0: 38705.7. Samples: 33429380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 22:14:16,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:14:19,285][12883] Updated weights for policy 0, policy_version 2040 (0.0036) +[2024-06-17 22:14:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 38775.4, 300 sec: 38433.0). Total num frames: 33570816. Throughput: 0: 38647.0. Samples: 33668400. Policy #0 lag: (min: 2.0, avg: 12.0, max: 24.0) +[2024-06-17 22:14:21,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:14:22,036][12883] Updated weights for policy 0, policy_version 2050 (0.0044) +[2024-06-17 22:14:26,994][12645] Fps is (10 sec: 37691.3, 60 sec: 39048.5, 300 sec: 38266.5). Total num frames: 33718272. Throughput: 0: 38442.6. Samples: 33892760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-17 22:14:26,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:14:27,735][12883] Updated weights for policy 0, policy_version 2060 (0.0044) +[2024-06-17 22:14:31,337][12883] Updated weights for policy 0, policy_version 2070 (0.0036) +[2024-06-17 22:14:31,994][12645] Fps is (10 sec: 36045.5, 60 sec: 38229.4, 300 sec: 38377.5). Total num frames: 33931264. Throughput: 0: 38161.0. Samples: 34001520. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-17 22:14:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:14:36,286][12883] Updated weights for policy 0, policy_version 2080 (0.0030) +[2024-06-17 22:14:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 38775.4, 300 sec: 38433.0). Total num frames: 34127872. Throughput: 0: 38475.6. Samples: 34243860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) +[2024-06-17 22:14:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:14:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002083_34127872.pth... +[2024-06-17 22:14:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000001521_24920064.pth +[2024-06-17 22:14:39,353][12883] Updated weights for policy 0, policy_version 2090 (0.0040) +[2024-06-17 22:14:41,994][12645] Fps is (10 sec: 37682.8, 60 sec: 38775.4, 300 sec: 38266.4). Total num frames: 34308096. Throughput: 0: 38486.2. Samples: 34472400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-17 22:14:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:14:44,648][12883] Updated weights for policy 0, policy_version 2100 (0.0033) +[2024-06-17 22:14:46,994][12645] Fps is (10 sec: 39321.1, 60 sec: 38502.4, 300 sec: 38488.5). Total num frames: 34521088. Throughput: 0: 38292.7. Samples: 34588220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-17 22:14:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:14:48,231][12883] Updated weights for policy 0, policy_version 2110 (0.0041) +[2024-06-17 22:14:51,994][12645] Fps is (10 sec: 34406.2, 60 sec: 37956.2, 300 sec: 38099.7). Total num frames: 34652160. Throughput: 0: 38454.7. Samples: 34817320. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 22:14:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:14:53,022][12883] Updated weights for policy 0, policy_version 2120 (0.0043) +[2024-06-17 22:14:56,361][12883] Updated weights for policy 0, policy_version 2130 (0.0038) +[2024-06-17 22:14:56,994][12645] Fps is (10 sec: 37683.6, 60 sec: 38229.3, 300 sec: 38433.0). Total num frames: 34897920. Throughput: 0: 38247.1. Samples: 35042480. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) +[2024-06-17 22:14:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:15:01,359][12883] Updated weights for policy 0, policy_version 2140 (0.0028) +[2024-06-17 22:15:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 38775.4, 300 sec: 38433.0). Total num frames: 35094528. Throughput: 0: 38609.1. Samples: 35166700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-17 22:15:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:15:04,943][12883] Updated weights for policy 0, policy_version 2150 (0.0038) +[2024-06-17 22:15:06,994][12645] Fps is (10 sec: 37683.2, 60 sec: 38503.9, 300 sec: 38377.4). Total num frames: 35274752. Throughput: 0: 38303.2. Samples: 35392040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-17 22:15:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:15:09,914][12883] Updated weights for policy 0, policy_version 2160 (0.0031) +[2024-06-17 22:15:11,994][12645] Fps is (10 sec: 37682.8, 60 sec: 38229.3, 300 sec: 38488.5). Total num frames: 35471360. Throughput: 0: 38429.4. Samples: 35622080. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) +[2024-06-17 22:15:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:15:13,911][12883] Updated weights for policy 0, policy_version 2170 (0.0036) +[2024-06-17 22:15:16,994][12645] Fps is (10 sec: 36045.1, 60 sec: 38230.8, 300 sec: 38266.4). Total num frames: 35635200. Throughput: 0: 38644.0. Samples: 35740500. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-17 22:15:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:15:18,129][12883] Updated weights for policy 0, policy_version 2180 (0.0042) +[2024-06-17 22:15:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 37956.4, 300 sec: 38433.0). Total num frames: 35848192. Throughput: 0: 38507.1. Samples: 35976680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 22:15:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:15:22,191][12883] Updated weights for policy 0, policy_version 2190 (0.0043) +[2024-06-17 22:15:26,718][12883] Updated weights for policy 0, policy_version 2200 (0.0041) +[2024-06-17 22:15:26,994][12645] Fps is (10 sec: 42597.4, 60 sec: 39048.5, 300 sec: 38433.0). Total num frames: 36061184. Throughput: 0: 38561.7. Samples: 36207680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-17 22:15:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:15:30,457][12862] Signal inference workers to stop experience collection... (500 times) +[2024-06-17 22:15:30,493][12883] InferenceWorker_p0-w0: stopping experience collection (500 times) +[2024-06-17 22:15:30,513][12862] Signal inference workers to resume experience collection... (500 times) +[2024-06-17 22:15:30,515][12883] InferenceWorker_p0-w0: resuming experience collection (500 times) +[2024-06-17 22:15:30,836][12883] Updated weights for policy 0, policy_version 2210 (0.0035) +[2024-06-17 22:15:31,994][12645] Fps is (10 sec: 37683.5, 60 sec: 38229.3, 300 sec: 38322.2). Total num frames: 36225024. Throughput: 0: 38565.5. Samples: 36323660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 22:15:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:15:34,946][12883] Updated weights for policy 0, policy_version 2220 (0.0042) +[2024-06-17 22:15:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 38502.3, 300 sec: 38488.5). Total num frames: 36438016. Throughput: 0: 38611.1. 
Samples: 36554820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 22:15:36,996][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:15:39,259][12883] Updated weights for policy 0, policy_version 2230 (0.0038) +[2024-06-17 22:15:41,994][12645] Fps is (10 sec: 40959.2, 60 sec: 38775.4, 300 sec: 38433.0). Total num frames: 36634624. Throughput: 0: 38661.7. Samples: 36782260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-17 22:15:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:15:43,731][12883] Updated weights for policy 0, policy_version 2240 (0.0035) +[2024-06-17 22:15:46,994][12645] Fps is (10 sec: 37683.8, 60 sec: 38229.4, 300 sec: 38488.5). Total num frames: 36814848. Throughput: 0: 38536.5. Samples: 36900840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 22:15:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:15:48,057][12883] Updated weights for policy 0, policy_version 2250 (0.0043) +[2024-06-17 22:15:51,761][12883] Updated weights for policy 0, policy_version 2260 (0.0048) +[2024-06-17 22:15:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39594.6, 300 sec: 38488.5). Total num frames: 37027840. Throughput: 0: 38660.8. Samples: 37131780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-17 22:15:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:15:56,265][12883] Updated weights for policy 0, policy_version 2270 (0.0037) +[2024-06-17 22:15:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 38502.4, 300 sec: 38544.0). Total num frames: 37208064. Throughput: 0: 38645.4. Samples: 37361120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 22:15:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:16:00,445][12883] Updated weights for policy 0, policy_version 2280 (0.0038) +[2024-06-17 22:16:01,994][12645] Fps is (10 sec: 36045.0, 60 sec: 38229.3, 300 sec: 38433.0). Total num frames: 37388288. Throughput: 0: 38454.1. Samples: 37470940. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-17 22:16:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:16:04,961][12883] Updated weights for policy 0, policy_version 2290 (0.0044) +[2024-06-17 22:16:06,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38229.3, 300 sec: 38377.4). Total num frames: 37568512. Throughput: 0: 38477.7. Samples: 37708180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0) +[2024-06-17 22:16:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:16:08,590][12883] Updated weights for policy 0, policy_version 2300 (0.0033) +[2024-06-17 22:16:11,994][12645] Fps is (10 sec: 37683.6, 60 sec: 38229.4, 300 sec: 38433.0). Total num frames: 37765120. Throughput: 0: 38669.9. Samples: 37947820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-17 22:16:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:16:13,618][12883] Updated weights for policy 0, policy_version 2310 (0.0036) +[2024-06-17 22:16:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 39321.5, 300 sec: 38544.1). Total num frames: 37994496. Throughput: 0: 38504.4. Samples: 38056360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) +[2024-06-17 22:16:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:16:17,091][12883] Updated weights for policy 0, policy_version 2320 (0.0039) +[2024-06-17 22:16:21,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38229.3, 300 sec: 38433.0). Total num frames: 38141952. Throughput: 0: 38699.6. Samples: 38296300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-17 22:16:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 22:16:22,127][12862] Saving new best policy, reward=0.006! +[2024-06-17 22:16:22,370][12883] Updated weights for policy 0, policy_version 2330 (0.0032) +[2024-06-17 22:16:25,318][12883] Updated weights for policy 0, policy_version 2340 (0.0029) +[2024-06-17 22:16:26,994][12645] Fps is (10 sec: 36044.8, 60 sec: 38229.4, 300 sec: 38488.5). Total num frames: 38354944. 
Throughput: 0: 38746.3. Samples: 38525840. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0) +[2024-06-17 22:16:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:16:30,367][12883] Updated weights for policy 0, policy_version 2350 (0.0041) +[2024-06-17 22:16:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 39321.5, 300 sec: 38655.1). Total num frames: 38584320. Throughput: 0: 38849.7. Samples: 38649080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-17 22:16:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:16:34,331][12883] Updated weights for policy 0, policy_version 2360 (0.0034) +[2024-06-17 22:16:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38502.5, 300 sec: 38544.1). Total num frames: 38748160. Throughput: 0: 38741.9. Samples: 38875160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-17 22:16:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:16:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002366_38764544.pth... +[2024-06-17 22:16:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000001799_29474816.pth +[2024-06-17 22:16:39,046][12883] Updated weights for policy 0, policy_version 2370 (0.0034) +[2024-06-17 22:16:41,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38775.5, 300 sec: 38599.6). Total num frames: 38961152. Throughput: 0: 38712.8. Samples: 39103200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 22:16:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:16:42,601][12883] Updated weights for policy 0, policy_version 2380 (0.0054) +[2024-06-17 22:16:46,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38502.4, 300 sec: 38544.1). Total num frames: 39124992. Throughput: 0: 38972.1. Samples: 39224680. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-17 22:16:47,000][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:16:47,418][12883] Updated weights for policy 0, policy_version 2390 (0.0042) +[2024-06-17 22:16:50,730][12883] Updated weights for policy 0, policy_version 2400 (0.0050) +[2024-06-17 22:16:51,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38502.4, 300 sec: 38488.5). Total num frames: 39337984. Throughput: 0: 38839.0. Samples: 39455940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-17 22:16:51,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:16:55,670][12883] Updated weights for policy 0, policy_version 2410 (0.0031) +[2024-06-17 22:16:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 38775.4, 300 sec: 38710.6). Total num frames: 39534592. Throughput: 0: 38790.1. Samples: 39693380. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) +[2024-06-17 22:16:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:16:58,265][12862] Signal inference workers to stop experience collection... (550 times) +[2024-06-17 22:16:58,265][12862] Signal inference workers to resume experience collection... (550 times) +[2024-06-17 22:16:58,304][12883] InferenceWorker_p0-w0: stopping experience collection (550 times) +[2024-06-17 22:16:58,304][12883] InferenceWorker_p0-w0: resuming experience collection (550 times) +[2024-06-17 22:16:59,307][12883] Updated weights for policy 0, policy_version 2420 (0.0048) +[2024-06-17 22:17:01,994][12645] Fps is (10 sec: 37683.7, 60 sec: 38775.5, 300 sec: 38488.5). Total num frames: 39714816. Throughput: 0: 38830.2. Samples: 39803720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) +[2024-06-17 22:17:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:17:04,370][12883] Updated weights for policy 0, policy_version 2430 (0.0031) +[2024-06-17 22:17:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39594.7, 300 sec: 38821.7). Total num frames: 39944192. Throughput: 0: 38756.4. 
Samples: 40040340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-17 22:17:07,000][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:17:07,289][12883] Updated weights for policy 0, policy_version 2440 (0.0034) +[2024-06-17 22:17:11,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38502.3, 300 sec: 38488.5). Total num frames: 40075264. Throughput: 0: 38909.3. Samples: 40276760. Policy #0 lag: (min: 0.0, avg: 6.9, max: 20.0) +[2024-06-17 22:17:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:17:13,119][12883] Updated weights for policy 0, policy_version 2450 (0.0043) +[2024-06-17 22:17:15,830][12883] Updated weights for policy 0, policy_version 2460 (0.0039) +[2024-06-17 22:17:16,994][12645] Fps is (10 sec: 37683.1, 60 sec: 38775.4, 300 sec: 38599.6). Total num frames: 40321024. Throughput: 0: 38541.3. Samples: 40383440. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0) +[2024-06-17 22:17:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:17:21,642][12883] Updated weights for policy 0, policy_version 2470 (0.0026) +[2024-06-17 22:17:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39321.6, 300 sec: 38711.0). Total num frames: 40501248. Throughput: 0: 38855.9. Samples: 40623680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-17 22:17:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:17:24,278][12883] Updated weights for policy 0, policy_version 2480 (0.0038) +[2024-06-17 22:17:26,994][12645] Fps is (10 sec: 36044.9, 60 sec: 38775.4, 300 sec: 38544.0). Total num frames: 40681472. Throughput: 0: 38815.1. Samples: 40849880. Policy #0 lag: (min: 1.0, avg: 12.3, max: 24.0) +[2024-06-17 22:17:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:17:30,234][12883] Updated weights for policy 0, policy_version 2490 (0.0041) +[2024-06-17 22:17:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38502.4, 300 sec: 38766.2). Total num frames: 40894464. Throughput: 0: 38743.0. Samples: 40968120. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) +[2024-06-17 22:17:31,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:17:33,278][12883] Updated weights for policy 0, policy_version 2500 (0.0044) +[2024-06-17 22:17:36,996][12645] Fps is (10 sec: 37675.0, 60 sec: 38501.0, 300 sec: 38543.8). Total num frames: 41058304. Throughput: 0: 38772.0. Samples: 41200760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 22:17:36,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:17:38,558][12883] Updated weights for policy 0, policy_version 2510 (0.0039) +[2024-06-17 22:17:41,182][12883] Updated weights for policy 0, policy_version 2520 (0.0040) +[2024-06-17 22:17:41,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39048.6, 300 sec: 38877.9). Total num frames: 41304064. Throughput: 0: 38591.7. Samples: 41430000. Policy #0 lag: (min: 2.0, avg: 9.7, max: 23.0) +[2024-06-17 22:17:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:17:46,946][12883] Updated weights for policy 0, policy_version 2530 (0.0027) +[2024-06-17 22:17:46,995][12645] Fps is (10 sec: 39326.0, 60 sec: 38774.7, 300 sec: 38710.8). Total num frames: 41451520. Throughput: 0: 38811.5. Samples: 41550280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) +[2024-06-17 22:17:46,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:17:49,873][12883] Updated weights for policy 0, policy_version 2540 (0.0042) +[2024-06-17 22:17:51,994][12645] Fps is (10 sec: 34406.1, 60 sec: 38502.5, 300 sec: 38488.5). Total num frames: 41648128. Throughput: 0: 38595.1. Samples: 41777120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) +[2024-06-17 22:17:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:17:55,270][12883] Updated weights for policy 0, policy_version 2550 (0.0047) +[2024-06-17 22:17:56,994][12645] Fps is (10 sec: 42603.3, 60 sec: 39048.6, 300 sec: 38821.7). Total num frames: 41877504. Throughput: 0: 38613.9. Samples: 42014380. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) +[2024-06-17 22:17:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:17:58,589][12883] Updated weights for policy 0, policy_version 2560 (0.0043) +[2024-06-17 22:18:01,996][12645] Fps is (10 sec: 39313.0, 60 sec: 38774.0, 300 sec: 38599.3). Total num frames: 42041344. Throughput: 0: 38866.1. Samples: 42132500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-17 22:18:01,997][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:18:03,842][12883] Updated weights for policy 0, policy_version 2570 (0.0039) +[2024-06-17 22:18:06,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38229.4, 300 sec: 38655.1). Total num frames: 42237952. Throughput: 0: 38560.0. Samples: 42358880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-17 22:18:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:18:07,206][12883] Updated weights for policy 0, policy_version 2580 (0.0042) +[2024-06-17 22:18:11,994][12645] Fps is (10 sec: 37691.2, 60 sec: 39048.5, 300 sec: 38655.1). Total num frames: 42418176. Throughput: 0: 38915.5. Samples: 42601080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-17 22:18:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:18:12,088][12883] Updated weights for policy 0, policy_version 2590 (0.0038) +[2024-06-17 22:18:15,618][12883] Updated weights for policy 0, policy_version 2600 (0.0035) +[2024-06-17 22:18:16,994][12645] Fps is (10 sec: 37682.7, 60 sec: 38229.3, 300 sec: 38544.0). Total num frames: 42614784. Throughput: 0: 38731.0. Samples: 42711020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-17 22:18:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:18:20,166][12883] Updated weights for policy 0, policy_version 2610 (0.0047) +[2024-06-17 22:18:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 38775.5, 300 sec: 38821.8). Total num frames: 42827776. Throughput: 0: 38881.1. Samples: 42950320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-17 22:18:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:18:24,632][12883] Updated weights for policy 0, policy_version 2620 (0.0038) +[2024-06-17 22:18:26,223][12862] Signal inference workers to stop experience collection... (600 times) +[2024-06-17 22:18:26,225][12862] Signal inference workers to resume experience collection... (600 times) +[2024-06-17 22:18:26,263][12883] InferenceWorker_p0-w0: stopping experience collection (600 times) +[2024-06-17 22:18:26,263][12883] InferenceWorker_p0-w0: resuming experience collection (600 times) +[2024-06-17 22:18:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39048.5, 300 sec: 38599.6). Total num frames: 43024384. Throughput: 0: 38899.9. Samples: 43180500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-17 22:18:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:18:28,557][12883] Updated weights for policy 0, policy_version 2630 (0.0041) +[2024-06-17 22:18:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 38775.5, 300 sec: 38710.7). Total num frames: 43220992. Throughput: 0: 38777.9. Samples: 43295240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 22:18:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:18:32,639][12883] Updated weights for policy 0, policy_version 2640 (0.0039) +[2024-06-17 22:18:36,945][12883] Updated weights for policy 0, policy_version 2650 (0.0031) +[2024-06-17 22:18:36,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39323.0, 300 sec: 38766.2). Total num frames: 43417600. Throughput: 0: 39024.9. Samples: 43533240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-17 22:18:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:18:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002650_43417600.pth... 
+[2024-06-17 22:18:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002083_34127872.pth +[2024-06-17 22:18:41,482][12883] Updated weights for policy 0, policy_version 2660 (0.0036) +[2024-06-17 22:18:41,994][12645] Fps is (10 sec: 39321.0, 60 sec: 38502.3, 300 sec: 38655.1). Total num frames: 43614208. Throughput: 0: 38858.6. Samples: 43763020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-17 22:18:41,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:18:45,362][12883] Updated weights for policy 0, policy_version 2670 (0.0036) +[2024-06-17 22:18:46,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38776.1, 300 sec: 38655.1). Total num frames: 43778048. Throughput: 0: 38782.3. Samples: 43877620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-17 22:18:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:18:49,566][12883] Updated weights for policy 0, policy_version 2680 (0.0036) +[2024-06-17 22:18:51,994][12645] Fps is (10 sec: 36045.5, 60 sec: 38775.6, 300 sec: 38544.1). Total num frames: 43974656. Throughput: 0: 38833.9. Samples: 44106400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 22:18:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:18:53,802][12883] Updated weights for policy 0, policy_version 2690 (0.0035) +[2024-06-17 22:18:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38229.3, 300 sec: 38655.1). Total num frames: 44171264. Throughput: 0: 38588.9. Samples: 44337580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-17 22:18:57,000][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:18:58,459][12883] Updated weights for policy 0, policy_version 2700 (0.0039) +[2024-06-17 22:19:01,971][12883] Updated weights for policy 0, policy_version 2710 (0.0051) +[2024-06-17 22:19:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 39323.1, 300 sec: 38766.5). Total num frames: 44400640. Throughput: 0: 38700.6. Samples: 44452540. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 22:19:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:19:06,782][12883] Updated weights for policy 0, policy_version 2720 (0.0044) +[2024-06-17 22:19:06,994][12645] Fps is (10 sec: 39322.4, 60 sec: 38775.6, 300 sec: 38599.6). Total num frames: 44564480. Throughput: 0: 38442.8. Samples: 44680240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-17 22:19:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:19:10,961][12883] Updated weights for policy 0, policy_version 2730 (0.0039) +[2024-06-17 22:19:11,994][12645] Fps is (10 sec: 34406.2, 60 sec: 38775.5, 300 sec: 38655.4). Total num frames: 44744704. Throughput: 0: 38478.7. Samples: 44912040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) +[2024-06-17 22:19:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:19:15,379][12883] Updated weights for policy 0, policy_version 2740 (0.0048) +[2024-06-17 22:19:16,994][12645] Fps is (10 sec: 37682.2, 60 sec: 38775.5, 300 sec: 38544.0). Total num frames: 44941312. Throughput: 0: 38490.5. Samples: 45027320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-17 22:19:16,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:19:19,221][12883] Updated weights for policy 0, policy_version 2750 (0.0048) +[2024-06-17 22:19:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 38229.3, 300 sec: 38655.2). Total num frames: 45121536. Throughput: 0: 38317.4. Samples: 45257520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-17 22:19:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:19:24,104][12883] Updated weights for policy 0, policy_version 2760 (0.0050) +[2024-06-17 22:19:26,995][12645] Fps is (10 sec: 39318.7, 60 sec: 38501.9, 300 sec: 38655.0). Total num frames: 45334528. Throughput: 0: 38347.3. Samples: 45488680. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-17 22:19:26,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:19:27,975][12883] Updated weights for policy 0, policy_version 2770 (0.0040) +[2024-06-17 22:19:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 38502.3, 300 sec: 38655.1). Total num frames: 45531136. Throughput: 0: 38593.3. Samples: 45614320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-17 22:19:31,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:19:32,234][12883] Updated weights for policy 0, policy_version 2780 (0.0035) +[2024-06-17 22:19:36,309][12883] Updated weights for policy 0, policy_version 2790 (0.0037) +[2024-06-17 22:19:36,994][12645] Fps is (10 sec: 37686.3, 60 sec: 38229.3, 300 sec: 38655.1). Total num frames: 45711360. Throughput: 0: 38471.9. Samples: 45837640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 22:19:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:19:40,864][12883] Updated weights for policy 0, policy_version 2800 (0.0032) +[2024-06-17 22:19:41,994][12645] Fps is (10 sec: 36045.3, 60 sec: 37956.4, 300 sec: 38544.1). Total num frames: 45891584. Throughput: 0: 38455.7. Samples: 46068080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-17 22:19:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:19:45,078][12883] Updated weights for policy 0, policy_version 2810 (0.0039) +[2024-06-17 22:19:46,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38502.4, 300 sec: 38766.2). Total num frames: 46088192. Throughput: 0: 38408.0. Samples: 46180900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-17 22:19:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:19:49,527][12883] Updated weights for policy 0, policy_version 2820 (0.0032) +[2024-06-17 22:19:51,994][12645] Fps is (10 sec: 39320.9, 60 sec: 38502.3, 300 sec: 38599.6). Total num frames: 46284800. Throughput: 0: 38439.3. Samples: 46410020. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-17 22:19:51,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:19:53,514][12883] Updated weights for policy 0, policy_version 2830 (0.0046) +[2024-06-17 22:19:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 38502.5, 300 sec: 38599.6). Total num frames: 46481408. Throughput: 0: 38530.3. Samples: 46645900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-17 22:19:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:19:57,904][12883] Updated weights for policy 0, policy_version 2840 (0.0034) +[2024-06-17 22:20:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 37956.2, 300 sec: 38655.1). Total num frames: 46678016. Throughput: 0: 38531.6. Samples: 46761240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-17 22:20:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:20:02,032][12883] Updated weights for policy 0, policy_version 2850 (0.0035) +[2024-06-17 22:20:06,455][12883] Updated weights for policy 0, policy_version 2860 (0.0035) +[2024-06-17 22:20:06,491][12862] Signal inference workers to stop experience collection... (650 times) +[2024-06-17 22:20:06,491][12862] Signal inference workers to resume experience collection... (650 times) +[2024-06-17 22:20:06,506][12883] InferenceWorker_p0-w0: stopping experience collection (650 times) +[2024-06-17 22:20:06,506][12883] InferenceWorker_p0-w0: resuming experience collection (650 times) +[2024-06-17 22:20:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 38502.3, 300 sec: 38655.1). Total num frames: 46874624. Throughput: 0: 38560.0. Samples: 46992720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-17 22:20:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:20:11,240][12883] Updated weights for policy 0, policy_version 2870 (0.0041) +[2024-06-17 22:20:11,994][12645] Fps is (10 sec: 34406.5, 60 sec: 37956.2, 300 sec: 38599.6). Total num frames: 47022080. Throughput: 0: 38585.6. 
Samples: 47225000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-17 22:20:11,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:20:15,005][12883] Updated weights for policy 0, policy_version 2880 (0.0031) +[2024-06-17 22:20:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39048.6, 300 sec: 38766.2). Total num frames: 47284224. Throughput: 0: 38304.1. Samples: 47338000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) +[2024-06-17 22:20:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:20:19,286][12883] Updated weights for policy 0, policy_version 2890 (0.0045) +[2024-06-17 22:20:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 38502.3, 300 sec: 38544.1). Total num frames: 47431680. Throughput: 0: 38649.4. Samples: 47576860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-17 22:20:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:20:23,411][12883] Updated weights for policy 0, policy_version 2900 (0.0035) +[2024-06-17 22:20:26,994][12645] Fps is (10 sec: 36044.7, 60 sec: 38503.0, 300 sec: 38710.7). Total num frames: 47644672. Throughput: 0: 38614.6. Samples: 47805740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-17 22:20:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:20:28,187][12883] Updated weights for policy 0, policy_version 2910 (0.0043) +[2024-06-17 22:20:31,790][12883] Updated weights for policy 0, policy_version 2920 (0.0043) +[2024-06-17 22:20:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 38502.4, 300 sec: 38655.1). Total num frames: 47841280. Throughput: 0: 38749.7. Samples: 47924640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-17 22:20:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:20:36,572][12883] Updated weights for policy 0, policy_version 2930 (0.0030) +[2024-06-17 22:20:36,996][12645] Fps is (10 sec: 36036.7, 60 sec: 38227.9, 300 sec: 38543.8). Total num frames: 48005120. Throughput: 0: 38711.1. Samples: 48152100. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-17 22:20:36,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:20:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002930_48005120.pth... +[2024-06-17 22:20:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002366_38764544.pth +[2024-06-17 22:20:40,343][12883] Updated weights for policy 0, policy_version 2940 (0.0038) +[2024-06-17 22:20:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39321.6, 300 sec: 38766.2). Total num frames: 48250880. Throughput: 0: 38538.6. Samples: 48380140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-17 22:20:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:20:45,029][12883] Updated weights for policy 0, policy_version 2950 (0.0041) +[2024-06-17 22:20:46,994][12645] Fps is (10 sec: 40968.9, 60 sec: 38775.4, 300 sec: 38599.6). Total num frames: 48414720. Throughput: 0: 38747.1. Samples: 48504860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) +[2024-06-17 22:20:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:20:49,017][12883] Updated weights for policy 0, policy_version 2960 (0.0037) +[2024-06-17 22:20:51,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39048.7, 300 sec: 38710.7). Total num frames: 48627712. Throughput: 0: 38592.5. Samples: 48729380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) +[2024-06-17 22:20:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:20:53,620][12883] Updated weights for policy 0, policy_version 2970 (0.0042) +[2024-06-17 22:20:56,994][12645] Fps is (10 sec: 39322.2, 60 sec: 38775.5, 300 sec: 38710.7). Total num frames: 48807936. Throughput: 0: 38753.9. Samples: 48968920. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) +[2024-06-17 22:20:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:20:57,121][12883] Updated weights for policy 0, policy_version 2980 (0.0046) +[2024-06-17 22:21:01,986][12883] Updated weights for policy 0, policy_version 2990 (0.0031) +[2024-06-17 22:21:01,994][12645] Fps is (10 sec: 36044.8, 60 sec: 38502.5, 300 sec: 38710.7). Total num frames: 48988160. Throughput: 0: 38754.2. Samples: 49081940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) +[2024-06-17 22:21:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:21:05,483][12883] Updated weights for policy 0, policy_version 3000 (0.0033) +[2024-06-17 22:21:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 39321.6, 300 sec: 38877.3). Total num frames: 49233920. Throughput: 0: 38609.0. Samples: 49314260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 22:21:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:21:10,307][12883] Updated weights for policy 0, policy_version 3010 (0.0040) +[2024-06-17 22:21:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39594.7, 300 sec: 38655.1). Total num frames: 49397760. Throughput: 0: 38928.0. Samples: 49557500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-17 22:21:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:21:13,955][12883] Updated weights for policy 0, policy_version 3020 (0.0039) +[2024-06-17 22:21:16,994][12645] Fps is (10 sec: 36044.5, 60 sec: 38502.4, 300 sec: 38821.7). Total num frames: 49594368. Throughput: 0: 38855.6. Samples: 49673140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-17 22:21:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:21:18,500][12883] Updated weights for policy 0, policy_version 3030 (0.0042) +[2024-06-17 22:21:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39048.6, 300 sec: 38710.7). Total num frames: 49774592. Throughput: 0: 38986.9. Samples: 49906420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 22:21:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:21:22,237][12883] Updated weights for policy 0, policy_version 3040 (0.0040) +[2024-06-17 22:21:26,994][12645] Fps is (10 sec: 36044.7, 60 sec: 38502.4, 300 sec: 38544.1). Total num frames: 49954816. Throughput: 0: 39036.8. Samples: 50136800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-17 22:21:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:21:27,353][12883] Updated weights for policy 0, policy_version 3050 (0.0056) +[2024-06-17 22:21:30,825][12883] Updated weights for policy 0, policy_version 3060 (0.0045) +[2024-06-17 22:21:31,214][12862] Signal inference workers to stop experience collection... (700 times) +[2024-06-17 22:21:31,236][12883] InferenceWorker_p0-w0: stopping experience collection (700 times) +[2024-06-17 22:21:31,329][12862] Signal inference workers to resume experience collection... (700 times) +[2024-06-17 22:21:31,330][12883] InferenceWorker_p0-w0: resuming experience collection (700 times) +[2024-06-17 22:21:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 39048.6, 300 sec: 38766.2). Total num frames: 50184192. Throughput: 0: 38932.5. Samples: 50256820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) +[2024-06-17 22:21:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:21:35,304][12883] Updated weights for policy 0, policy_version 3070 (0.0038) +[2024-06-17 22:21:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39050.0, 300 sec: 38599.6). Total num frames: 50348032. Throughput: 0: 39021.7. Samples: 50485360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 22:21:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:21:39,375][12883] Updated weights for policy 0, policy_version 3080 (0.0042) +[2024-06-17 22:21:41,994][12645] Fps is (10 sec: 37683.3, 60 sec: 38502.4, 300 sec: 38766.2). Total num frames: 50561024. Throughput: 0: 38775.0. 
Samples: 50713800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-17 22:21:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:21:44,191][12883] Updated weights for policy 0, policy_version 3090 (0.0035) +[2024-06-17 22:21:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39048.6, 300 sec: 38710.7). Total num frames: 50757632. Throughput: 0: 38970.6. Samples: 50835620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) +[2024-06-17 22:21:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:21:47,827][12883] Updated weights for policy 0, policy_version 3100 (0.0038) +[2024-06-17 22:21:51,994][12645] Fps is (10 sec: 37683.2, 60 sec: 38502.4, 300 sec: 38655.1). Total num frames: 50937856. Throughput: 0: 38923.1. Samples: 51065800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-17 22:21:51,996][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:21:52,333][12883] Updated weights for policy 0, policy_version 3110 (0.0050) +[2024-06-17 22:21:56,423][12883] Updated weights for policy 0, policy_version 3120 (0.0035) +[2024-06-17 22:21:56,996][12645] Fps is (10 sec: 37674.8, 60 sec: 38774.0, 300 sec: 38710.4). Total num frames: 51134464. Throughput: 0: 38580.3. Samples: 51293700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) +[2024-06-17 22:21:56,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:22:00,890][12883] Updated weights for policy 0, policy_version 3130 (0.0046) +[2024-06-17 22:22:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39048.5, 300 sec: 38599.6). Total num frames: 51331072. Throughput: 0: 38640.0. Samples: 51411940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 22:22:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:22:04,749][12883] Updated weights for policy 0, policy_version 3140 (0.0048) +[2024-06-17 22:22:06,994][12645] Fps is (10 sec: 39330.9, 60 sec: 38229.4, 300 sec: 38821.8). Total num frames: 51527680. Throughput: 0: 38606.2. Samples: 51643700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-17 22:22:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:22:08,709][12883] Updated weights for policy 0, policy_version 3150 (0.0026) +[2024-06-17 22:22:11,994][12645] Fps is (10 sec: 36044.9, 60 sec: 38229.3, 300 sec: 38544.1). Total num frames: 51691520. Throughput: 0: 38641.9. Samples: 51875680. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-17 22:22:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:22:13,314][12883] Updated weights for policy 0, policy_version 3160 (0.0034) +[2024-06-17 22:22:16,994][12645] Fps is (10 sec: 36044.1, 60 sec: 38229.3, 300 sec: 38599.6). Total num frames: 51888128. Throughput: 0: 38447.5. Samples: 51986960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-17 22:22:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 22:22:17,827][12883] Updated weights for policy 0, policy_version 3170 (0.0034) +[2024-06-17 22:22:21,564][12883] Updated weights for policy 0, policy_version 3180 (0.0039) +[2024-06-17 22:22:21,994][12645] Fps is (10 sec: 42597.7, 60 sec: 39048.4, 300 sec: 38766.2). Total num frames: 52117504. Throughput: 0: 38613.7. Samples: 52222980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-17 22:22:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:22:25,883][12883] Updated weights for policy 0, policy_version 3190 (0.0039) +[2024-06-17 22:22:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39048.5, 300 sec: 38655.1). Total num frames: 52297728. Throughput: 0: 38799.9. Samples: 52459800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 22:22:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:22:30,045][12883] Updated weights for policy 0, policy_version 3200 (0.0038) +[2024-06-17 22:22:31,994][12645] Fps is (10 sec: 36045.5, 60 sec: 38229.4, 300 sec: 38711.0). Total num frames: 52477952. Throughput: 0: 38642.3. Samples: 52574520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-17 22:22:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:22:34,992][12883] Updated weights for policy 0, policy_version 3210 (0.0039) +[2024-06-17 22:22:37,000][12645] Fps is (10 sec: 39297.3, 60 sec: 39044.5, 300 sec: 38598.8). Total num frames: 52690944. Throughput: 0: 38639.5. Samples: 52804820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 22:22:37,001][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:22:37,029][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000003216_52690944.pth... +[2024-06-17 22:22:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002650_43417600.pth +[2024-06-17 22:22:38,298][12883] Updated weights for policy 0, policy_version 3220 (0.0039) +[2024-06-17 22:22:41,994][12645] Fps is (10 sec: 37682.9, 60 sec: 38229.3, 300 sec: 38655.3). Total num frames: 52854784. Throughput: 0: 38904.6. Samples: 53044320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) +[2024-06-17 22:22:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:22:43,012][12883] Updated weights for policy 0, policy_version 3230 (0.0041) +[2024-06-17 22:22:46,457][12883] Updated weights for policy 0, policy_version 3240 (0.0033) +[2024-06-17 22:22:46,994][12645] Fps is (10 sec: 39346.2, 60 sec: 38775.5, 300 sec: 38766.2). Total num frames: 53084160. Throughput: 0: 38752.4. Samples: 53155800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-17 22:22:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:22:51,410][12883] Updated weights for policy 0, policy_version 3250 (0.0033) +[2024-06-17 22:22:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 38775.4, 300 sec: 38599.6). Total num frames: 53264384. Throughput: 0: 38936.7. Samples: 53395860. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) +[2024-06-17 22:22:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:22:55,373][12883] Updated weights for policy 0, policy_version 3260 (0.0035) +[2024-06-17 22:22:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39050.1, 300 sec: 38766.5). Total num frames: 53477376. Throughput: 0: 38908.5. Samples: 53626560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-17 22:22:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:22:59,870][12883] Updated weights for policy 0, policy_version 3270 (0.0039) +[2024-06-17 22:23:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39048.5, 300 sec: 38766.2). Total num frames: 53673984. Throughput: 0: 39019.6. Samples: 53742840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 22:23:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:23:03,615][12862] Signal inference workers to stop experience collection... (750 times) +[2024-06-17 22:23:03,615][12862] Signal inference workers to resume experience collection... (750 times) +[2024-06-17 22:23:03,650][12883] InferenceWorker_p0-w0: stopping experience collection (750 times) +[2024-06-17 22:23:03,650][12883] InferenceWorker_p0-w0: resuming experience collection (750 times) +[2024-06-17 22:23:03,767][12883] Updated weights for policy 0, policy_version 3280 (0.0044) +[2024-06-17 22:23:06,994][12645] Fps is (10 sec: 36044.3, 60 sec: 38502.3, 300 sec: 38710.7). Total num frames: 53837824. Throughput: 0: 38819.2. Samples: 53969840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-17 22:23:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:23:08,367][12883] Updated weights for policy 0, policy_version 3290 (0.0029) +[2024-06-17 22:23:11,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39321.5, 300 sec: 38766.2). Total num frames: 54050816. Throughput: 0: 38748.0. Samples: 54203460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 25.0) +[2024-06-17 22:23:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:23:12,165][12883] Updated weights for policy 0, policy_version 3300 (0.0052) +[2024-06-17 22:23:16,936][12883] Updated weights for policy 0, policy_version 3310 (0.0048) +[2024-06-17 22:23:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39048.5, 300 sec: 38655.1). Total num frames: 54231040. Throughput: 0: 38902.6. Samples: 54325140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 22:23:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:23:20,693][12883] Updated weights for policy 0, policy_version 3320 (0.0036) +[2024-06-17 22:23:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38775.5, 300 sec: 38710.7). Total num frames: 54444032. Throughput: 0: 38895.6. Samples: 54554880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 22:23:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:23:25,559][12883] Updated weights for policy 0, policy_version 3330 (0.0036) +[2024-06-17 22:23:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39048.5, 300 sec: 38710.6). Total num frames: 54640640. Throughput: 0: 38644.4. Samples: 54783320. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) +[2024-06-17 22:23:27,003][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:23:28,638][12883] Updated weights for policy 0, policy_version 3340 (0.0027) +[2024-06-17 22:23:31,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39048.5, 300 sec: 38655.1). Total num frames: 54820864. Throughput: 0: 38877.3. Samples: 54905280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-17 22:23:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:23:33,663][12883] Updated weights for policy 0, policy_version 3350 (0.0037) +[2024-06-17 22:23:36,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39052.6, 300 sec: 38710.7). Total num frames: 55033856. Throughput: 0: 38757.8. Samples: 55139960. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) +[2024-06-17 22:23:36,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:23:37,229][12883] Updated weights for policy 0, policy_version 3360 (0.0045) +[2024-06-17 22:23:41,603][12883] Updated weights for policy 0, policy_version 3370 (0.0045) +[2024-06-17 22:23:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39321.6, 300 sec: 38766.2). Total num frames: 55214080. Throughput: 0: 38930.1. Samples: 55378420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 22:23:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:23:45,354][12883] Updated weights for policy 0, policy_version 3380 (0.0026) +[2024-06-17 22:23:46,995][12645] Fps is (10 sec: 36038.6, 60 sec: 38501.3, 300 sec: 38710.4). Total num frames: 55394304. Throughput: 0: 39009.2. Samples: 55498320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-17 22:23:46,996][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:23:50,285][12883] Updated weights for policy 0, policy_version 3390 (0.0038) +[2024-06-17 22:23:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 39321.6, 300 sec: 38821.7). Total num frames: 55623680. Throughput: 0: 39116.0. Samples: 55730060. Policy #0 lag: (min: 1.0, avg: 12.8, max: 26.0) +[2024-06-17 22:23:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:23:53,885][12883] Updated weights for policy 0, policy_version 3400 (0.0036) +[2024-06-17 22:23:56,994][12645] Fps is (10 sec: 40967.3, 60 sec: 38775.4, 300 sec: 38655.1). Total num frames: 55803904. Throughput: 0: 39163.7. Samples: 55965820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-17 22:23:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:23:59,052][12883] Updated weights for policy 0, policy_version 3410 (0.0033) +[2024-06-17 22:24:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39048.5, 300 sec: 38821.7). Total num frames: 56016896. Throughput: 0: 38945.8. Samples: 56077700. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-17 22:24:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:24:02,417][12883] Updated weights for policy 0, policy_version 3420 (0.0033) +[2024-06-17 22:24:06,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39048.6, 300 sec: 38766.2). Total num frames: 56180736. Throughput: 0: 39136.5. Samples: 56316020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-17 22:24:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:24:07,551][12883] Updated weights for policy 0, policy_version 3430 (0.0032) +[2024-06-17 22:24:11,081][12883] Updated weights for policy 0, policy_version 3440 (0.0041) +[2024-06-17 22:24:11,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39048.5, 300 sec: 38821.8). Total num frames: 56393728. Throughput: 0: 39138.2. Samples: 56544540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-17 22:24:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:24:15,555][12883] Updated weights for policy 0, policy_version 3450 (0.0050) +[2024-06-17 22:24:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39048.5, 300 sec: 38821.7). Total num frames: 56573952. Throughput: 0: 38929.3. Samples: 56657100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 22:24:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:24:19,131][12883] Updated weights for policy 0, policy_version 3460 (0.0028) +[2024-06-17 22:24:21,994][12645] Fps is (10 sec: 37683.6, 60 sec: 38775.5, 300 sec: 38766.3). Total num frames: 56770560. Throughput: 0: 38834.7. Samples: 56887520. Policy #0 lag: (min: 1.0, avg: 8.5, max: 18.0) +[2024-06-17 22:24:22,003][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:24:23,811][12883] Updated weights for policy 0, policy_version 3470 (0.0043) +[2024-06-17 22:24:26,994][12645] Fps is (10 sec: 37683.5, 60 sec: 38502.5, 300 sec: 38710.7). Total num frames: 56950784. Throughput: 0: 38923.2. Samples: 57129960. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-17 22:24:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:24:27,910][12883] Updated weights for policy 0, policy_version 3480 (0.0044) +[2024-06-17 22:24:30,893][12862] Signal inference workers to stop experience collection... (800 times) +[2024-06-17 22:24:30,933][12883] InferenceWorker_p0-w0: stopping experience collection (800 times) +[2024-06-17 22:24:30,941][12862] Signal inference workers to resume experience collection... (800 times) +[2024-06-17 22:24:30,954][12883] InferenceWorker_p0-w0: resuming experience collection (800 times) +[2024-06-17 22:24:31,643][12883] Updated weights for policy 0, policy_version 3490 (0.0042) +[2024-06-17 22:24:31,999][12645] Fps is (10 sec: 40936.2, 60 sec: 39317.8, 300 sec: 38876.5). Total num frames: 57180160. Throughput: 0: 38848.9. Samples: 57246680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) +[2024-06-17 22:24:32,000][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:24:36,222][12883] Updated weights for policy 0, policy_version 3500 (0.0035) +[2024-06-17 22:24:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 39321.6, 300 sec: 38988.4). Total num frames: 57393152. Throughput: 0: 39071.2. Samples: 57488260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) +[2024-06-17 22:24:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:24:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000003503_57393152.pth... +[2024-06-17 22:24:37,072][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002930_48005120.pth +[2024-06-17 22:24:40,187][12883] Updated weights for policy 0, policy_version 3510 (0.0033) +[2024-06-17 22:24:41,994][12645] Fps is (10 sec: 36065.8, 60 sec: 38775.5, 300 sec: 38821.8). Total num frames: 57540608. Throughput: 0: 38926.6. Samples: 57717520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 22:24:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:24:44,921][12883] Updated weights for policy 0, policy_version 3520 (0.0045) +[2024-06-17 22:24:46,994][12645] Fps is (10 sec: 34406.0, 60 sec: 39049.6, 300 sec: 38821.8). Total num frames: 57737216. Throughput: 0: 38934.6. Samples: 57829760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-17 22:24:46,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:24:48,383][12883] Updated weights for policy 0, policy_version 3530 (0.0040) +[2024-06-17 22:24:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 38502.4, 300 sec: 38821.7). Total num frames: 57933824. Throughput: 0: 39029.3. Samples: 58072340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-17 22:24:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:24:52,942][12883] Updated weights for policy 0, policy_version 3540 (0.0044) +[2024-06-17 22:24:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39048.5, 300 sec: 38877.3). Total num frames: 58146816. Throughput: 0: 39058.3. Samples: 58302160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) +[2024-06-17 22:24:56,996][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:24:57,072][12883] Updated weights for policy 0, policy_version 3550 (0.0034) +[2024-06-17 22:25:01,262][12883] Updated weights for policy 0, policy_version 3560 (0.0034) +[2024-06-17 22:25:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 38775.5, 300 sec: 38877.3). Total num frames: 58343424. Throughput: 0: 39254.8. Samples: 58423560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 22:25:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:25:05,116][12883] Updated weights for policy 0, policy_version 3570 (0.0041) +[2024-06-17 22:25:06,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39048.5, 300 sec: 38988.4). Total num frames: 58523648. Throughput: 0: 39215.0. Samples: 58652200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 22:25:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:25:09,777][12883] Updated weights for policy 0, policy_version 3580 (0.0030) +[2024-06-17 22:25:11,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38775.6, 300 sec: 38766.2). Total num frames: 58720256. Throughput: 0: 39005.8. Samples: 58885220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) +[2024-06-17 22:25:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:25:14,043][12883] Updated weights for policy 0, policy_version 3590 (0.0039) +[2024-06-17 22:25:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39048.6, 300 sec: 38932.8). Total num frames: 58916864. Throughput: 0: 38888.1. Samples: 58996420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) +[2024-06-17 22:25:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:25:18,655][12883] Updated weights for policy 0, policy_version 3600 (0.0036) +[2024-06-17 22:25:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39048.6, 300 sec: 38877.3). Total num frames: 59113472. Throughput: 0: 38741.4. Samples: 59231620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-17 22:25:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:25:22,275][12883] Updated weights for policy 0, policy_version 3610 (0.0033) +[2024-06-17 22:25:26,807][12883] Updated weights for policy 0, policy_version 3620 (0.0032) +[2024-06-17 22:25:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39594.6, 300 sec: 38932.8). Total num frames: 59326464. Throughput: 0: 38962.1. Samples: 59470820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-17 22:25:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:25:31,187][12883] Updated weights for policy 0, policy_version 3630 (0.0033) +[2024-06-17 22:25:31,999][12645] Fps is (10 sec: 37661.2, 60 sec: 38502.4, 300 sec: 38932.4). Total num frames: 59490304. Throughput: 0: 38997.3. Samples: 59584860. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-17 22:25:32,000][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:25:35,301][12883] Updated weights for policy 0, policy_version 3640 (0.0037) +[2024-06-17 22:25:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 38775.5, 300 sec: 38877.3). Total num frames: 59719680. Throughput: 0: 38716.6. Samples: 59814580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 22:25:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:25:39,470][12883] Updated weights for policy 0, policy_version 3650 (0.0034) +[2024-06-17 22:25:41,994][12645] Fps is (10 sec: 39344.6, 60 sec: 39048.6, 300 sec: 38877.3). Total num frames: 59883520. Throughput: 0: 38855.7. Samples: 60050660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-17 22:25:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:25:43,712][12883] Updated weights for policy 0, policy_version 3660 (0.0044) +[2024-06-17 22:25:46,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39321.6, 300 sec: 38877.3). Total num frames: 60096512. Throughput: 0: 38744.8. Samples: 60167080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 22:25:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:25:47,885][12883] Updated weights for policy 0, policy_version 3670 (0.0039) +[2024-06-17 22:25:51,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38775.6, 300 sec: 38821.8). Total num frames: 60260352. Throughput: 0: 38850.4. Samples: 60400460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) +[2024-06-17 22:25:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:25:52,333][12883] Updated weights for policy 0, policy_version 3680 (0.0041) +[2024-06-17 22:25:56,401][12883] Updated weights for policy 0, policy_version 3690 (0.0047) +[2024-06-17 22:25:56,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38775.5, 300 sec: 38932.8). Total num frames: 60473344. Throughput: 0: 38900.3. Samples: 60635740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-17 22:25:57,003][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:25:57,926][12862] Signal inference workers to stop experience collection... (850 times) +[2024-06-17 22:25:57,975][12883] InferenceWorker_p0-w0: stopping experience collection (850 times) +[2024-06-17 22:25:57,984][12862] Signal inference workers to resume experience collection... (850 times) +[2024-06-17 22:25:57,987][12883] InferenceWorker_p0-w0: resuming experience collection (850 times) +[2024-06-17 22:26:00,222][12883] Updated weights for policy 0, policy_version 3700 (0.0037) +[2024-06-17 22:26:01,994][12645] Fps is (10 sec: 42597.4, 60 sec: 39048.4, 300 sec: 38821.7). Total num frames: 60686336. Throughput: 0: 39081.2. Samples: 60755080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) +[2024-06-17 22:26:02,004][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:26:04,715][12883] Updated weights for policy 0, policy_version 3710 (0.0024) +[2024-06-17 22:26:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39048.5, 300 sec: 38877.3). Total num frames: 60866560. Throughput: 0: 38966.6. Samples: 60985120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 22:26:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:26:09,239][12883] Updated weights for policy 0, policy_version 3720 (0.0039) +[2024-06-17 22:26:11,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39048.4, 300 sec: 38877.3). Total num frames: 61063168. Throughput: 0: 38863.6. Samples: 61219680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) +[2024-06-17 22:26:12,008][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:26:13,101][12883] Updated weights for policy 0, policy_version 3730 (0.0039) +[2024-06-17 22:26:16,994][12645] Fps is (10 sec: 37683.3, 60 sec: 38775.5, 300 sec: 38877.3). Total num frames: 61243392. Throughput: 0: 38921.0. Samples: 61336080. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-17 22:26:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:26:17,418][12883] Updated weights for policy 0, policy_version 3740 (0.0031) +[2024-06-17 22:26:21,523][12883] Updated weights for policy 0, policy_version 3750 (0.0042) +[2024-06-17 22:26:21,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38775.4, 300 sec: 38932.8). Total num frames: 61440000. Throughput: 0: 39062.5. Samples: 61572400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-17 22:26:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:26:25,698][12883] Updated weights for policy 0, policy_version 3760 (0.0043) +[2024-06-17 22:26:26,996][12645] Fps is (10 sec: 39312.9, 60 sec: 38501.0, 300 sec: 38821.5). Total num frames: 61636608. Throughput: 0: 38979.3. Samples: 61804820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-17 22:26:26,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:26:29,833][12883] Updated weights for policy 0, policy_version 3770 (0.0034) +[2024-06-17 22:26:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39325.4, 300 sec: 38988.4). Total num frames: 61849600. Throughput: 0: 39099.5. Samples: 61926560. Policy #0 lag: (min: 1.0, avg: 12.1, max: 23.0) +[2024-06-17 22:26:31,995][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 22:26:33,908][12883] Updated weights for policy 0, policy_version 3780 (0.0053) +[2024-06-17 22:26:36,994][12645] Fps is (10 sec: 39330.2, 60 sec: 38502.3, 300 sec: 38877.3). Total num frames: 62029824. Throughput: 0: 39188.3. Samples: 62163940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 22:26:36,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:26:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000003786_62029824.pth... 
+[2024-06-17 22:26:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000003216_52690944.pth +[2024-06-17 22:26:38,249][12883] Updated weights for policy 0, policy_version 3790 (0.0036) +[2024-06-17 22:26:41,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39048.6, 300 sec: 38877.3). Total num frames: 62226432. Throughput: 0: 39033.9. Samples: 62392260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-17 22:26:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:26:42,386][12883] Updated weights for policy 0, policy_version 3800 (0.0034) +[2024-06-17 22:26:46,489][12883] Updated weights for policy 0, policy_version 3810 (0.0029) +[2024-06-17 22:26:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38775.5, 300 sec: 38932.8). Total num frames: 62423040. Throughput: 0: 38955.6. Samples: 62508080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 22:26:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:26:50,486][12883] Updated weights for policy 0, policy_version 3820 (0.0037) +[2024-06-17 22:26:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39321.6, 300 sec: 38933.1). Total num frames: 62619648. Throughput: 0: 39138.8. Samples: 62746360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 22:26:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:26:55,919][12883] Updated weights for policy 0, policy_version 3830 (0.0046) +[2024-06-17 22:26:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39321.6, 300 sec: 38988.4). Total num frames: 62832640. Throughput: 0: 38875.1. Samples: 62969060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-17 22:26:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:26:59,522][12883] Updated weights for policy 0, policy_version 3840 (0.0030) +[2024-06-17 22:27:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 38775.5, 300 sec: 38932.8). Total num frames: 63012864. Throughput: 0: 38998.6. Samples: 63091020. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-17 22:27:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:27:03,983][12883] Updated weights for policy 0, policy_version 3850 (0.0027) +[2024-06-17 22:27:06,994][12645] Fps is (10 sec: 34406.9, 60 sec: 38502.5, 300 sec: 38932.8). Total num frames: 63176704. Throughput: 0: 38851.3. Samples: 63320700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) +[2024-06-17 22:27:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:27:07,646][12883] Updated weights for policy 0, policy_version 3860 (0.0044) +[2024-06-17 22:27:11,994][12645] Fps is (10 sec: 36044.8, 60 sec: 38502.4, 300 sec: 38932.8). Total num frames: 63373312. Throughput: 0: 38894.8. Samples: 63555000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-17 22:27:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:27:12,422][12883] Updated weights for policy 0, policy_version 3870 (0.0037) +[2024-06-17 22:27:15,934][12883] Updated weights for policy 0, policy_version 3880 (0.0026) +[2024-06-17 22:27:16,997][12645] Fps is (10 sec: 42582.6, 60 sec: 39319.3, 300 sec: 38932.4). Total num frames: 63602688. Throughput: 0: 38782.2. Samples: 63671900. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) +[2024-06-17 22:27:16,998][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:27:20,751][12883] Updated weights for policy 0, policy_version 3890 (0.0037) +[2024-06-17 22:27:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 39594.7, 300 sec: 39043.9). Total num frames: 63815680. Throughput: 0: 38906.2. Samples: 63914720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-17 22:27:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:27:24,109][12862] Signal inference workers to stop experience collection... 
(900 times) +[2024-06-17 22:27:24,136][12883] InferenceWorker_p0-w0: stopping experience collection (900 times) +[2024-06-17 22:27:24,229][12862] Signal inference workers to resume experience collection... (900 times) +[2024-06-17 22:27:24,230][12883] InferenceWorker_p0-w0: resuming experience collection (900 times) +[2024-06-17 22:27:24,367][12883] Updated weights for policy 0, policy_version 3900 (0.0025) +[2024-06-17 22:27:26,994][12645] Fps is (10 sec: 37697.2, 60 sec: 39050.0, 300 sec: 38988.4). Total num frames: 63979520. Throughput: 0: 38896.0. Samples: 64142580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-17 22:27:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:27:28,714][12883] Updated weights for policy 0, policy_version 3910 (0.0036) +[2024-06-17 22:27:31,994][12645] Fps is (10 sec: 37680.6, 60 sec: 39048.1, 300 sec: 38989.1). Total num frames: 64192512. Throughput: 0: 38923.9. Samples: 64259680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 22:27:31,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:27:32,548][12883] Updated weights for policy 0, policy_version 3920 (0.0039) +[2024-06-17 22:27:36,996][12645] Fps is (10 sec: 39312.4, 60 sec: 39047.1, 300 sec: 39043.6). Total num frames: 64372736. Throughput: 0: 38924.7. Samples: 64498060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 22:27:36,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:27:37,292][12883] Updated weights for policy 0, policy_version 3930 (0.0036) +[2024-06-17 22:27:41,182][12883] Updated weights for policy 0, policy_version 3940 (0.0036) +[2024-06-17 22:27:41,994][12645] Fps is (10 sec: 36047.8, 60 sec: 38775.5, 300 sec: 38877.3). Total num frames: 64552960. Throughput: 0: 39129.5. Samples: 64729880. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) +[2024-06-17 22:27:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:27:45,453][12883] Updated weights for policy 0, policy_version 3950 (0.0050) +[2024-06-17 22:27:46,994][12645] Fps is (10 sec: 39330.4, 60 sec: 39048.5, 300 sec: 38988.4). Total num frames: 64765952. Throughput: 0: 39075.1. Samples: 64849400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-17 22:27:46,996][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:27:49,736][12883] Updated weights for policy 0, policy_version 3960 (0.0038) +[2024-06-17 22:27:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38775.5, 300 sec: 38877.3). Total num frames: 64946176. Throughput: 0: 39266.7. Samples: 65087700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-17 22:27:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:27:53,889][12883] Updated weights for policy 0, policy_version 3970 (0.0041) +[2024-06-17 22:27:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 39048.6, 300 sec: 38988.4). Total num frames: 65175552. Throughput: 0: 39100.1. Samples: 65314500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) +[2024-06-17 22:27:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:27:58,445][12883] Updated weights for policy 0, policy_version 3980 (0.0044) +[2024-06-17 22:28:01,994][12645] Fps is (10 sec: 40958.7, 60 sec: 39048.4, 300 sec: 39043.9). Total num frames: 65355776. Throughput: 0: 39339.9. Samples: 65442060. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) +[2024-06-17 22:28:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:28:02,403][12883] Updated weights for policy 0, policy_version 3990 (0.0036) +[2024-06-17 22:28:06,591][12883] Updated weights for policy 0, policy_version 4000 (0.0040) +[2024-06-17 22:28:06,994][12645] Fps is (10 sec: 36044.1, 60 sec: 39321.5, 300 sec: 38932.8). Total num frames: 65536000. Throughput: 0: 39220.8. Samples: 65679660. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 22:28:06,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:28:10,362][12883] Updated weights for policy 0, policy_version 4010 (0.0036) +[2024-06-17 22:28:11,994][12645] Fps is (10 sec: 40961.1, 60 sec: 39867.8, 300 sec: 39099.5). Total num frames: 65765376. Throughput: 0: 39271.1. Samples: 65909780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-17 22:28:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:28:14,577][12883] Updated weights for policy 0, policy_version 4020 (0.0037) +[2024-06-17 22:28:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39050.9, 300 sec: 38988.4). Total num frames: 65945600. Throughput: 0: 39388.1. Samples: 66032120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-17 22:28:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:28:18,525][12883] Updated weights for policy 0, policy_version 4030 (0.0050) +[2024-06-17 22:28:21,996][12645] Fps is (10 sec: 37674.7, 60 sec: 38774.1, 300 sec: 38988.1). Total num frames: 66142208. Throughput: 0: 39370.7. Samples: 66269740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 22:28:21,996][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:28:22,953][12883] Updated weights for policy 0, policy_version 4040 (0.0052) +[2024-06-17 22:28:26,764][12883] Updated weights for policy 0, policy_version 4050 (0.0040) +[2024-06-17 22:28:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 39867.6, 300 sec: 39155.0). Total num frames: 66371584. Throughput: 0: 39582.9. Samples: 66511120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-17 22:28:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:28:30,854][12883] Updated weights for policy 0, policy_version 4060 (0.0026) +[2024-06-17 22:28:31,994][12645] Fps is (10 sec: 39330.6, 60 sec: 39049.1, 300 sec: 38988.4). Total num frames: 66535424. Throughput: 0: 39461.4. Samples: 66625160. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) +[2024-06-17 22:28:31,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 22:28:35,211][12883] Updated weights for policy 0, policy_version 4070 (0.0043) +[2024-06-17 22:28:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39869.1, 300 sec: 39155.0). Total num frames: 66764800. Throughput: 0: 39474.8. Samples: 66864080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-17 22:28:36,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:28:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004075_66764800.pth... +[2024-06-17 22:28:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000003503_57393152.pth +[2024-06-17 22:28:39,737][12883] Updated weights for policy 0, policy_version 4080 (0.0047) +[2024-06-17 22:28:41,994][12645] Fps is (10 sec: 39320.7, 60 sec: 39594.5, 300 sec: 39099.7). Total num frames: 66928640. Throughput: 0: 39632.3. Samples: 67097960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-17 22:28:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:28:43,249][12883] Updated weights for policy 0, policy_version 4090 (0.0042) +[2024-06-17 22:28:46,994][12645] Fps is (10 sec: 36045.1, 60 sec: 39321.5, 300 sec: 38988.4). Total num frames: 67125248. Throughput: 0: 39406.7. Samples: 67215360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-17 22:28:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:28:47,817][12883] Updated weights for policy 0, policy_version 4100 (0.0035) +[2024-06-17 22:28:50,030][12862] Signal inference workers to stop experience collection... (950 times) +[2024-06-17 22:28:50,084][12883] InferenceWorker_p0-w0: stopping experience collection (950 times) +[2024-06-17 22:28:50,085][12862] Signal inference workers to resume experience collection... 
(950 times) +[2024-06-17 22:28:50,101][12883] InferenceWorker_p0-w0: resuming experience collection (950 times) +[2024-06-17 22:28:51,664][12883] Updated weights for policy 0, policy_version 4110 (0.0035) +[2024-06-17 22:28:51,994][12645] Fps is (10 sec: 40960.9, 60 sec: 39867.7, 300 sec: 39099.4). Total num frames: 67338240. Throughput: 0: 39452.6. Samples: 67455020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-17 22:28:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:28:56,173][12883] Updated weights for policy 0, policy_version 4120 (0.0049) +[2024-06-17 22:28:56,995][12645] Fps is (10 sec: 40954.1, 60 sec: 39320.5, 300 sec: 39043.7). Total num frames: 67534848. Throughput: 0: 39515.9. Samples: 67688060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-17 22:28:56,996][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 22:28:59,663][12883] Updated weights for policy 0, policy_version 4130 (0.0042) +[2024-06-17 22:29:01,994][12645] Fps is (10 sec: 39321.0, 60 sec: 39594.8, 300 sec: 39155.0). Total num frames: 67731456. Throughput: 0: 39361.3. Samples: 67803380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-17 22:29:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:29:04,360][12883] Updated weights for policy 0, policy_version 4140 (0.0043) +[2024-06-17 22:29:06,994][12645] Fps is (10 sec: 37689.1, 60 sec: 39594.8, 300 sec: 39043.9). Total num frames: 67911680. Throughput: 0: 39325.0. Samples: 68039280. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) +[2024-06-17 22:29:06,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 22:29:08,645][12883] Updated weights for policy 0, policy_version 4150 (0.0038) +[2024-06-17 22:29:11,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39048.5, 300 sec: 39099.5). Total num frames: 68108288. Throughput: 0: 39058.4. Samples: 68268740. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-17 22:29:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:29:12,850][12883] Updated weights for policy 0, policy_version 4160 (0.0040) +[2024-06-17 22:29:16,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39321.6, 300 sec: 39099.4). Total num frames: 68304896. Throughput: 0: 39201.2. Samples: 68389220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-17 22:29:17,003][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:29:17,179][12883] Updated weights for policy 0, policy_version 4170 (0.0043) +[2024-06-17 22:29:21,428][12883] Updated weights for policy 0, policy_version 4180 (0.0058) +[2024-06-17 22:29:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39323.0, 300 sec: 39155.0). Total num frames: 68501504. Throughput: 0: 38879.7. Samples: 68613660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-17 22:29:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:29:25,682][12883] Updated weights for policy 0, policy_version 4190 (0.0044) +[2024-06-17 22:29:26,994][12645] Fps is (10 sec: 37683.9, 60 sec: 38502.5, 300 sec: 38989.1). Total num frames: 68681728. Throughput: 0: 38936.2. Samples: 68850080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-17 22:29:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:29:29,813][12883] Updated weights for policy 0, policy_version 4200 (0.0038) +[2024-06-17 22:29:31,994][12645] Fps is (10 sec: 36044.7, 60 sec: 38775.4, 300 sec: 38877.3). Total num frames: 68861952. Throughput: 0: 38904.0. Samples: 68966040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 22:29:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:29:34,023][12883] Updated weights for policy 0, policy_version 4210 (0.0042) +[2024-06-17 22:29:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 38775.6, 300 sec: 39155.0). Total num frames: 69091328. Throughput: 0: 38812.8. Samples: 69201600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-17 22:29:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:29:38,005][12883] Updated weights for policy 0, policy_version 4220 (0.0036) +[2024-06-17 22:29:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 39048.6, 300 sec: 39099.5). Total num frames: 69271552. Throughput: 0: 38897.3. Samples: 69438380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-17 22:29:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:29:43,019][12883] Updated weights for policy 0, policy_version 4230 (0.0036) +[2024-06-17 22:29:46,222][12883] Updated weights for policy 0, policy_version 4240 (0.0034) +[2024-06-17 22:29:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39321.7, 300 sec: 39155.0). Total num frames: 69484544. Throughput: 0: 38853.0. Samples: 69551760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-17 22:29:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:29:51,617][12883] Updated weights for policy 0, policy_version 4250 (0.0037) +[2024-06-17 22:29:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 38502.3, 300 sec: 38988.4). Total num frames: 69648384. Throughput: 0: 38810.6. Samples: 69785760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 22:29:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:29:55,178][12883] Updated weights for policy 0, policy_version 4260 (0.0049) +[2024-06-17 22:29:56,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38776.5, 300 sec: 39043.9). Total num frames: 69861376. Throughput: 0: 38781.8. Samples: 70013920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-17 22:29:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:30:00,011][12883] Updated weights for policy 0, policy_version 4270 (0.0034) +[2024-06-17 22:30:01,996][12645] Fps is (10 sec: 40951.3, 60 sec: 38774.1, 300 sec: 39099.2). Total num frames: 70057984. Throughput: 0: 38716.8. Samples: 70131560. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-17 22:30:01,996][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:30:03,680][12883] Updated weights for policy 0, policy_version 4280 (0.0047) +[2024-06-17 22:30:06,996][12645] Fps is (10 sec: 36036.6, 60 sec: 38500.9, 300 sec: 38988.1). Total num frames: 70221824. Throughput: 0: 38911.9. Samples: 70364780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-17 22:30:06,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:30:08,360][12883] Updated weights for policy 0, policy_version 4290 (0.0046) +[2024-06-17 22:30:11,792][12883] Updated weights for policy 0, policy_version 4300 (0.0048) +[2024-06-17 22:30:11,994][12645] Fps is (10 sec: 39330.4, 60 sec: 39048.5, 300 sec: 39099.5). Total num frames: 70451200. Throughput: 0: 38790.6. Samples: 70595660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-17 22:30:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:30:16,346][12862] Signal inference workers to stop experience collection... (1000 times) +[2024-06-17 22:30:16,356][12862] Signal inference workers to resume experience collection... (1000 times) +[2024-06-17 22:30:16,383][12883] InferenceWorker_p0-w0: stopping experience collection (1000 times) +[2024-06-17 22:30:16,383][12883] InferenceWorker_p0-w0: resuming experience collection (1000 times) +[2024-06-17 22:30:16,544][12883] Updated weights for policy 0, policy_version 4310 (0.0041) +[2024-06-17 22:30:16,994][12645] Fps is (10 sec: 40969.5, 60 sec: 38775.5, 300 sec: 39043.9). Total num frames: 70631424. Throughput: 0: 38949.0. Samples: 70718740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-17 22:30:16,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 22:30:20,476][12883] Updated weights for policy 0, policy_version 4320 (0.0040) +[2024-06-17 22:30:21,994][12645] Fps is (10 sec: 37682.7, 60 sec: 38775.4, 300 sec: 38988.4). Total num frames: 70828032. Throughput: 0: 38758.6. 
Samples: 70945740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 22:30:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:30:24,759][12883] Updated weights for policy 0, policy_version 4330 (0.0046) +[2024-06-17 22:30:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39321.5, 300 sec: 39155.7). Total num frames: 71041024. Throughput: 0: 38699.5. Samples: 71179860. Policy #0 lag: (min: 0.0, avg: 7.2, max: 19.0) +[2024-06-17 22:30:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:30:28,810][12883] Updated weights for policy 0, policy_version 4340 (0.0038) +[2024-06-17 22:30:31,994][12645] Fps is (10 sec: 37683.8, 60 sec: 39048.6, 300 sec: 38932.8). Total num frames: 71204864. Throughput: 0: 38713.8. Samples: 71293880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) +[2024-06-17 22:30:31,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:30:33,233][12883] Updated weights for policy 0, policy_version 4350 (0.0034) +[2024-06-17 22:30:36,782][12883] Updated weights for policy 0, policy_version 4360 (0.0032) +[2024-06-17 22:30:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39048.5, 300 sec: 39155.0). Total num frames: 71434240. Throughput: 0: 38819.2. Samples: 71532620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) +[2024-06-17 22:30:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:30:37,031][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004360_71434240.pth... +[2024-06-17 22:30:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000003786_62029824.pth +[2024-06-17 22:30:41,492][12883] Updated weights for policy 0, policy_version 4370 (0.0039) +[2024-06-17 22:30:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39048.6, 300 sec: 39043.9). Total num frames: 71614464. Throughput: 0: 38863.6. Samples: 71762780. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-17 22:30:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:30:45,577][12883] Updated weights for policy 0, policy_version 4380 (0.0038) +[2024-06-17 22:30:46,994][12645] Fps is (10 sec: 36044.3, 60 sec: 38502.3, 300 sec: 39099.4). Total num frames: 71794688. Throughput: 0: 38686.2. Samples: 71872360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 22:30:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:30:50,172][12883] Updated weights for policy 0, policy_version 4390 (0.0041) +[2024-06-17 22:30:51,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38775.5, 300 sec: 38988.4). Total num frames: 71974912. Throughput: 0: 38623.3. Samples: 72102740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-17 22:30:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:30:54,275][12883] Updated weights for policy 0, policy_version 4400 (0.0037) +[2024-06-17 22:30:56,994][12645] Fps is (10 sec: 37683.6, 60 sec: 38502.4, 300 sec: 38932.8). Total num frames: 72171520. Throughput: 0: 38835.5. Samples: 72343260. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-17 22:30:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:30:58,441][12883] Updated weights for policy 0, policy_version 4410 (0.0035) +[2024-06-17 22:31:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 38776.8, 300 sec: 39043.9). Total num frames: 72384512. Throughput: 0: 38692.8. Samples: 72459920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 22:31:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:31:03,174][12883] Updated weights for policy 0, policy_version 4420 (0.0042) +[2024-06-17 22:31:06,994][12645] Fps is (10 sec: 37683.3, 60 sec: 38776.9, 300 sec: 38932.8). Total num frames: 72548352. Throughput: 0: 38675.2. Samples: 72686120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 22:31:07,000][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:31:07,230][12883] Updated weights for policy 0, policy_version 4430 (0.0039) +[2024-06-17 22:31:11,139][12883] Updated weights for policy 0, policy_version 4440 (0.0029) +[2024-06-17 22:31:11,994][12645] Fps is (10 sec: 39322.4, 60 sec: 38775.5, 300 sec: 39099.5). Total num frames: 72777728. Throughput: 0: 38605.5. Samples: 72917100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) +[2024-06-17 22:31:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:31:15,227][12883] Updated weights for policy 0, policy_version 4450 (0.0045) +[2024-06-17 22:31:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 38775.4, 300 sec: 39043.9). Total num frames: 72957952. Throughput: 0: 38891.4. Samples: 73044000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-17 22:31:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:31:19,498][12883] Updated weights for policy 0, policy_version 4460 (0.0050) +[2024-06-17 22:31:21,994][12645] Fps is (10 sec: 37682.7, 60 sec: 38775.5, 300 sec: 39044.2). Total num frames: 73154560. Throughput: 0: 38571.5. Samples: 73268340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-17 22:31:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:31:23,682][12883] Updated weights for policy 0, policy_version 4470 (0.0031) +[2024-06-17 22:31:26,994][12645] Fps is (10 sec: 37683.3, 60 sec: 38229.3, 300 sec: 38932.8). Total num frames: 73334784. Throughput: 0: 38820.8. Samples: 73509720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 22:31:26,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:31:28,053][12883] Updated weights for policy 0, policy_version 4480 (0.0038) +[2024-06-17 22:31:31,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39048.4, 300 sec: 39043.9). Total num frames: 73547776. Throughput: 0: 38940.0. Samples: 73624660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-17 22:31:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:31:32,153][12883] Updated weights for policy 0, policy_version 4490 (0.0036) +[2024-06-17 22:31:36,114][12883] Updated weights for policy 0, policy_version 4500 (0.0032) +[2024-06-17 22:31:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 38775.4, 300 sec: 39099.4). Total num frames: 73760768. Throughput: 0: 39182.2. Samples: 73865940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) +[2024-06-17 22:31:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:31:40,892][12883] Updated weights for policy 0, policy_version 4510 (0.0045) +[2024-06-17 22:31:41,994][12645] Fps is (10 sec: 37684.0, 60 sec: 38502.4, 300 sec: 38988.4). Total num frames: 73924608. Throughput: 0: 39103.6. Samples: 74102920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-17 22:31:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:31:44,539][12883] Updated weights for policy 0, policy_version 4520 (0.0031) +[2024-06-17 22:31:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39321.6, 300 sec: 39099.4). Total num frames: 74153984. Throughput: 0: 39022.7. Samples: 74215940. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) +[2024-06-17 22:31:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:31:49,430][12883] Updated weights for policy 0, policy_version 4530 (0.0031) +[2024-06-17 22:31:51,994][12645] Fps is (10 sec: 40959.4, 60 sec: 39321.6, 300 sec: 38988.4). Total num frames: 74334208. Throughput: 0: 39291.0. Samples: 74454220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-17 22:31:51,995][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:31:52,655][12883] Updated weights for policy 0, policy_version 4540 (0.0036) +[2024-06-17 22:31:56,994][12645] Fps is (10 sec: 36045.2, 60 sec: 39048.5, 300 sec: 38988.4). Total num frames: 74514432. Throughput: 0: 39376.8. Samples: 74689060. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-17 22:31:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:31:58,004][12883] Updated weights for policy 0, policy_version 4550 (0.0045) +[2024-06-17 22:31:59,709][12862] Signal inference workers to stop experience collection... (1050 times) +[2024-06-17 22:31:59,710][12862] Signal inference workers to resume experience collection... (1050 times) +[2024-06-17 22:31:59,726][12883] InferenceWorker_p0-w0: stopping experience collection (1050 times) +[2024-06-17 22:31:59,727][12883] InferenceWorker_p0-w0: resuming experience collection (1050 times) +[2024-06-17 22:32:00,741][12883] Updated weights for policy 0, policy_version 4560 (0.0032) +[2024-06-17 22:32:01,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39048.6, 300 sec: 39155.0). Total num frames: 74727424. Throughput: 0: 39277.5. Samples: 74811480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-17 22:32:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:32:05,971][12883] Updated weights for policy 0, policy_version 4570 (0.0042) +[2024-06-17 22:32:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39594.6, 300 sec: 39155.0). Total num frames: 74924032. Throughput: 0: 39609.7. Samples: 75050780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-17 22:32:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:32:09,362][12883] Updated weights for policy 0, policy_version 4580 (0.0047) +[2024-06-17 22:32:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39048.5, 300 sec: 39044.4). Total num frames: 75120640. Throughput: 0: 39201.9. Samples: 75273800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-17 22:32:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:32:14,160][12883] Updated weights for policy 0, policy_version 4590 (0.0035) +[2024-06-17 22:32:16,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39321.7, 300 sec: 38988.4). Total num frames: 75317248. Throughput: 0: 39324.2. 
Samples: 75394240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-17 22:32:16,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:32:17,391][12883] Updated weights for policy 0, policy_version 4600 (0.0033) +[2024-06-17 22:32:21,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39048.5, 300 sec: 39043.9). Total num frames: 75497472. Throughput: 0: 39197.4. Samples: 75629820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 22:32:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:32:22,526][12883] Updated weights for policy 0, policy_version 4610 (0.0035) +[2024-06-17 22:32:26,755][12883] Updated weights for policy 0, policy_version 4620 (0.0044) +[2024-06-17 22:32:26,994][12645] Fps is (10 sec: 37682.4, 60 sec: 39321.6, 300 sec: 38988.4). Total num frames: 75694080. Throughput: 0: 39137.6. Samples: 75864120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-17 22:32:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:32:30,940][12883] Updated weights for policy 0, policy_version 4630 (0.0041) +[2024-06-17 22:32:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39048.6, 300 sec: 39044.2). Total num frames: 75890688. Throughput: 0: 39192.9. Samples: 75979620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-17 22:32:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:32:34,668][12883] Updated weights for policy 0, policy_version 4640 (0.0041) +[2024-06-17 22:32:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 39048.6, 300 sec: 39155.0). Total num frames: 76103680. Throughput: 0: 39091.7. Samples: 76213340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-17 22:32:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:32:37,137][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004646_76120064.pth... 
+[2024-06-17 22:32:37,179][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004075_66764800.pth +[2024-06-17 22:32:39,426][12883] Updated weights for policy 0, policy_version 4650 (0.0040) +[2024-06-17 22:32:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39321.6, 300 sec: 39043.9). Total num frames: 76283904. Throughput: 0: 38985.4. Samples: 76443400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-17 22:32:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:32:43,281][12883] Updated weights for policy 0, policy_version 4660 (0.0040) +[2024-06-17 22:32:46,994][12645] Fps is (10 sec: 37682.7, 60 sec: 38775.5, 300 sec: 39099.4). Total num frames: 76480512. Throughput: 0: 38947.9. Samples: 76564140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 22:32:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:32:47,808][12883] Updated weights for policy 0, policy_version 4670 (0.0038) +[2024-06-17 22:32:51,681][12883] Updated weights for policy 0, policy_version 4680 (0.0042) +[2024-06-17 22:32:52,000][12645] Fps is (10 sec: 39296.9, 60 sec: 39044.5, 300 sec: 38987.5). Total num frames: 76677120. Throughput: 0: 38771.6. Samples: 76795740. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) +[2024-06-17 22:32:52,001][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:32:56,112][12883] Updated weights for policy 0, policy_version 4690 (0.0040) +[2024-06-17 22:32:56,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39321.7, 300 sec: 39043.9). Total num frames: 76873728. Throughput: 0: 38980.0. Samples: 77027900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 22:32:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:33:00,508][12883] Updated weights for policy 0, policy_version 4700 (0.0038) +[2024-06-17 22:33:01,994][12645] Fps is (10 sec: 39346.1, 60 sec: 39048.5, 300 sec: 39099.5). Total num frames: 77070336. Throughput: 0: 38817.2. Samples: 77141020. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) +[2024-06-17 22:33:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:33:04,216][12883] Updated weights for policy 0, policy_version 4710 (0.0028) +[2024-06-17 22:33:06,994][12645] Fps is (10 sec: 37682.8, 60 sec: 38775.5, 300 sec: 38932.8). Total num frames: 77250560. Throughput: 0: 38755.1. Samples: 77373800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 22:33:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:33:08,601][12883] Updated weights for policy 0, policy_version 4720 (0.0029) +[2024-06-17 22:33:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39048.5, 300 sec: 39043.9). Total num frames: 77463552. Throughput: 0: 38914.3. Samples: 77615260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-17 22:33:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:33:12,951][12883] Updated weights for policy 0, policy_version 4730 (0.0043) +[2024-06-17 22:33:16,961][12883] Updated weights for policy 0, policy_version 4740 (0.0046) +[2024-06-17 22:33:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39048.5, 300 sec: 39044.2). Total num frames: 77660160. Throughput: 0: 38884.9. Samples: 77729440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 22:33:16,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 22:33:20,974][12883] Updated weights for policy 0, policy_version 4750 (0.0035) +[2024-06-17 22:33:21,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39321.7, 300 sec: 38932.8). Total num frames: 77856768. Throughput: 0: 38985.8. Samples: 77967700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-17 22:33:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:33:25,163][12883] Updated weights for policy 0, policy_version 4760 (0.0038) +[2024-06-17 22:33:26,994][12645] Fps is (10 sec: 36045.0, 60 sec: 38775.6, 300 sec: 38932.8). Total num frames: 78020608. Throughput: 0: 39204.4. Samples: 78207600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-17 22:33:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:33:29,154][12883] Updated weights for policy 0, policy_version 4770 (0.0048) +[2024-06-17 22:33:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39321.7, 300 sec: 38932.9). Total num frames: 78249984. Throughput: 0: 39081.0. Samples: 78322780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 22:33:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:33:33,426][12883] Updated weights for policy 0, policy_version 4780 (0.0034) +[2024-06-17 22:33:36,996][12645] Fps is (10 sec: 40950.8, 60 sec: 38774.0, 300 sec: 38988.1). Total num frames: 78430208. Throughput: 0: 39333.8. Samples: 78565600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-17 22:33:36,997][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:33:37,666][12883] Updated weights for policy 0, policy_version 4790 (0.0031) +[2024-06-17 22:33:41,507][12883] Updated weights for policy 0, policy_version 4800 (0.0034) +[2024-06-17 22:33:41,996][12645] Fps is (10 sec: 39312.7, 60 sec: 39320.1, 300 sec: 39043.6). Total num frames: 78643200. Throughput: 0: 39067.8. Samples: 78786040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-17 22:33:41,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:33:46,152][12883] Updated weights for policy 0, policy_version 4810 (0.0029) +[2024-06-17 22:33:46,994][12645] Fps is (10 sec: 40968.9, 60 sec: 39321.6, 300 sec: 38988.3). Total num frames: 78839808. Throughput: 0: 39423.1. Samples: 78915060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-17 22:33:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:33:47,758][12862] Signal inference workers to stop experience collection... 
(1100 times) +[2024-06-17 22:33:47,795][12883] InferenceWorker_p0-w0: stopping experience collection (1100 times) +[2024-06-17 22:33:47,869][12862] Signal inference workers to resume experience collection... (1100 times) +[2024-06-17 22:33:47,869][12883] InferenceWorker_p0-w0: resuming experience collection (1100 times) +[2024-06-17 22:33:50,297][12883] Updated weights for policy 0, policy_version 4820 (0.0041) +[2024-06-17 22:33:51,994][12645] Fps is (10 sec: 36052.5, 60 sec: 38779.4, 300 sec: 38877.5). Total num frames: 79003648. Throughput: 0: 39278.6. Samples: 79141340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-17 22:33:51,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:33:54,955][12883] Updated weights for policy 0, policy_version 4830 (0.0037) +[2024-06-17 22:33:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39594.6, 300 sec: 39043.9). Total num frames: 79249408. Throughput: 0: 39044.9. Samples: 79372280. Policy #0 lag: (min: 0.0, avg: 12.5, max: 20.0) +[2024-06-17 22:33:56,996][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:33:58,310][12883] Updated weights for policy 0, policy_version 4840 (0.0037) +[2024-06-17 22:34:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 38775.4, 300 sec: 38932.8). Total num frames: 79396864. Throughput: 0: 39341.7. Samples: 79499820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-17 22:34:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:34:03,308][12883] Updated weights for policy 0, policy_version 4850 (0.0046) +[2024-06-17 22:34:06,685][12883] Updated weights for policy 0, policy_version 4860 (0.0034) +[2024-06-17 22:34:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39867.8, 300 sec: 39099.4). Total num frames: 79642624. Throughput: 0: 39129.8. Samples: 79728540. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-17 22:34:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:34:11,446][12883] Updated weights for policy 0, policy_version 4870 (0.0031) +[2024-06-17 22:34:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38775.5, 300 sec: 38932.8). Total num frames: 79790080. Throughput: 0: 39154.6. Samples: 79969560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-17 22:34:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:34:15,260][12883] Updated weights for policy 0, policy_version 4880 (0.0025) +[2024-06-17 22:34:16,994][12645] Fps is (10 sec: 36044.7, 60 sec: 39048.5, 300 sec: 38988.4). Total num frames: 80003072. Throughput: 0: 39081.3. Samples: 80081440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) +[2024-06-17 22:34:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:34:20,153][12883] Updated weights for policy 0, policy_version 4890 (0.0049) +[2024-06-17 22:34:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 39594.6, 300 sec: 39155.0). Total num frames: 80232448. Throughput: 0: 39027.7. Samples: 80321760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-17 22:34:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:34:23,251][12883] Updated weights for policy 0, policy_version 4900 (0.0037) +[2024-06-17 22:34:26,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39321.6, 300 sec: 39043.9). Total num frames: 80379904. Throughput: 0: 39550.0. Samples: 80565700. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) +[2024-06-17 22:34:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:34:28,057][12883] Updated weights for policy 0, policy_version 4910 (0.0030) +[2024-06-17 22:34:31,096][12883] Updated weights for policy 0, policy_version 4920 (0.0039) +[2024-06-17 22:34:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39594.7, 300 sec: 39099.4). Total num frames: 80625664. Throughput: 0: 39297.0. Samples: 80683420. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-17 22:34:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:34:36,389][12883] Updated weights for policy 0, policy_version 4930 (0.0040) +[2024-06-17 22:34:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 39596.2, 300 sec: 39099.5). Total num frames: 80805888. Throughput: 0: 39839.7. Samples: 80934120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-17 22:34:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:34:37,034][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004933_80822272.pth... +[2024-06-17 22:34:37,101][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004360_71434240.pth +[2024-06-17 22:34:39,590][12883] Updated weights for policy 0, policy_version 4940 (0.0037) +[2024-06-17 22:34:41,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39050.0, 300 sec: 38988.4). Total num frames: 80986112. Throughput: 0: 39868.9. Samples: 81166380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 22:34:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:34:44,454][12883] Updated weights for policy 0, policy_version 4950 (0.0046) +[2024-06-17 22:34:46,994][12645] Fps is (10 sec: 39320.9, 60 sec: 39321.6, 300 sec: 39155.0). Total num frames: 81199104. Throughput: 0: 39682.7. Samples: 81285540. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) +[2024-06-17 22:34:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:34:47,606][12883] Updated weights for policy 0, policy_version 4960 (0.0035) +[2024-06-17 22:34:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39594.7, 300 sec: 39043.9). Total num frames: 81379328. Throughput: 0: 39872.8. Samples: 81522820. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-17 22:34:51,999][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:34:52,583][12883] Updated weights for policy 0, policy_version 4970 (0.0040) +[2024-06-17 22:34:56,152][12883] Updated weights for policy 0, policy_version 4980 (0.0042) +[2024-06-17 22:34:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39048.5, 300 sec: 39099.7). Total num frames: 81592320. Throughput: 0: 39445.8. Samples: 81744620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-17 22:34:57,000][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:35:01,140][12883] Updated weights for policy 0, policy_version 4990 (0.0044) +[2024-06-17 22:35:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 39867.8, 300 sec: 39210.8). Total num frames: 81788928. Throughput: 0: 39666.3. Samples: 81866420. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) +[2024-06-17 22:35:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:35:03,951][12862] Signal inference workers to stop experience collection... (1150 times) +[2024-06-17 22:35:04,004][12883] InferenceWorker_p0-w0: stopping experience collection (1150 times) +[2024-06-17 22:35:04,010][12862] Signal inference workers to resume experience collection... (1150 times) +[2024-06-17 22:35:04,019][12883] InferenceWorker_p0-w0: resuming experience collection (1150 times) +[2024-06-17 22:35:04,150][12883] Updated weights for policy 0, policy_version 5000 (0.0041) +[2024-06-17 22:35:06,994][12645] Fps is (10 sec: 37683.5, 60 sec: 38775.5, 300 sec: 39043.9). Total num frames: 81969152. Throughput: 0: 39422.7. Samples: 82095780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) +[2024-06-17 22:35:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:35:09,713][12883] Updated weights for policy 0, policy_version 5010 (0.0039) +[2024-06-17 22:35:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39867.9, 300 sec: 39155.0). Total num frames: 82182144. Throughput: 0: 39269.9. 
Samples: 82332840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-17 22:35:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:35:12,587][12883] Updated weights for policy 0, policy_version 5020 (0.0033) +[2024-06-17 22:35:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39594.7, 300 sec: 39155.0). Total num frames: 82378752. Throughput: 0: 39400.0. Samples: 82456420. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-17 22:35:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:35:18,119][12883] Updated weights for policy 0, policy_version 5030 (0.0033) +[2024-06-17 22:35:21,302][12883] Updated weights for policy 0, policy_version 5040 (0.0037) +[2024-06-17 22:35:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 39048.6, 300 sec: 39099.5). Total num frames: 82575360. Throughput: 0: 39004.8. Samples: 82689340. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) +[2024-06-17 22:35:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:35:26,018][12883] Updated weights for policy 0, policy_version 5050 (0.0039) +[2024-06-17 22:35:26,994][12645] Fps is (10 sec: 37682.6, 60 sec: 39594.6, 300 sec: 39155.0). Total num frames: 82755584. Throughput: 0: 39154.6. Samples: 82928340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 22:35:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:35:29,532][12883] Updated weights for policy 0, policy_version 5060 (0.0038) +[2024-06-17 22:35:31,996][12645] Fps is (10 sec: 37674.8, 60 sec: 38774.0, 300 sec: 39043.6). Total num frames: 82952192. Throughput: 0: 38963.5. Samples: 83038980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 22:35:31,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:35:34,686][12883] Updated weights for policy 0, policy_version 5070 (0.0054) +[2024-06-17 22:35:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39321.6, 300 sec: 39155.0). Total num frames: 83165184. Throughput: 0: 39090.3. Samples: 83281880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 22:35:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:35:38,078][12883] Updated weights for policy 0, policy_version 5080 (0.0028) +[2024-06-17 22:35:41,994][12645] Fps is (10 sec: 39330.4, 60 sec: 39321.6, 300 sec: 39155.0). Total num frames: 83345408. Throughput: 0: 39481.4. Samples: 83521280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 22:35:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:35:42,555][12883] Updated weights for policy 0, policy_version 5090 (0.0035) +[2024-06-17 22:35:46,179][12883] Updated weights for policy 0, policy_version 5100 (0.0032) +[2024-06-17 22:35:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39594.7, 300 sec: 39321.6). Total num frames: 83574784. Throughput: 0: 39407.1. Samples: 83639740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) +[2024-06-17 22:35:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:35:51,210][12883] Updated weights for policy 0, policy_version 5110 (0.0036) +[2024-06-17 22:35:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39321.7, 300 sec: 39210.5). Total num frames: 83738624. Throughput: 0: 39418.3. Samples: 83869600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 22:35:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:35:54,868][12883] Updated weights for policy 0, policy_version 5120 (0.0026) +[2024-06-17 22:35:56,994][12645] Fps is (10 sec: 36044.4, 60 sec: 39048.5, 300 sec: 39155.0). Total num frames: 83935232. Throughput: 0: 39381.2. Samples: 84105000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 22:35:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:35:59,592][12883] Updated weights for policy 0, policy_version 5130 (0.0029) +[2024-06-17 22:36:01,994][12645] Fps is (10 sec: 37682.9, 60 sec: 38775.4, 300 sec: 39210.5). Total num frames: 84115456. Throughput: 0: 39195.5. Samples: 84220220. 
Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) +[2024-06-17 22:36:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:36:03,421][12883] Updated weights for policy 0, policy_version 5140 (0.0045) +[2024-06-17 22:36:06,994][12645] Fps is (10 sec: 40960.7, 60 sec: 39594.7, 300 sec: 39210.5). Total num frames: 84344832. Throughput: 0: 39234.7. Samples: 84454900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) +[2024-06-17 22:36:06,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 22:36:07,844][12883] Updated weights for policy 0, policy_version 5150 (0.0030) +[2024-06-17 22:36:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39048.4, 300 sec: 39210.5). Total num frames: 84525056. Throughput: 0: 39216.9. Samples: 84693100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-17 22:36:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:36:12,180][12883] Updated weights for policy 0, policy_version 5160 (0.0038) +[2024-06-17 22:36:16,135][12883] Updated weights for policy 0, policy_version 5170 (0.0035) +[2024-06-17 22:36:16,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39048.6, 300 sec: 39210.5). Total num frames: 84721664. Throughput: 0: 39359.8. Samples: 84810080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-17 22:36:16,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:36:20,331][12883] Updated weights for policy 0, policy_version 5180 (0.0035) +[2024-06-17 22:36:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39321.5, 300 sec: 39321.6). Total num frames: 84934656. Throughput: 0: 39171.4. Samples: 85044600. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) +[2024-06-17 22:36:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:36:22,079][12862] Signal inference workers to stop experience collection... (1200 times) +[2024-06-17 22:36:22,080][12862] Signal inference workers to resume experience collection... 
(1200 times) +[2024-06-17 22:36:22,129][12883] InferenceWorker_p0-w0: stopping experience collection (1200 times) +[2024-06-17 22:36:22,129][12883] InferenceWorker_p0-w0: resuming experience collection (1200 times) +[2024-06-17 22:36:24,462][12883] Updated weights for policy 0, policy_version 5190 (0.0046) +[2024-06-17 22:36:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39321.7, 300 sec: 39210.5). Total num frames: 85114880. Throughput: 0: 39308.4. Samples: 85290160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-17 22:36:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:36:28,844][12883] Updated weights for policy 0, policy_version 5200 (0.0043) +[2024-06-17 22:36:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39596.1, 300 sec: 39210.5). Total num frames: 85327872. Throughput: 0: 39227.1. Samples: 85404960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-17 22:36:31,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:36:32,633][12883] Updated weights for policy 0, policy_version 5210 (0.0043) +[2024-06-17 22:36:36,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39048.4, 300 sec: 39266.0). Total num frames: 85508096. Throughput: 0: 39445.2. Samples: 85644640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-17 22:36:36,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:36:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000005220_85524480.pth... +[2024-06-17 22:36:37,007][12883] Updated weights for policy 0, policy_version 5220 (0.0045) +[2024-06-17 22:36:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004646_76120064.pth +[2024-06-17 22:36:40,693][12883] Updated weights for policy 0, policy_version 5230 (0.0039) +[2024-06-17 22:36:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39867.6, 300 sec: 39266.1). Total num frames: 85737472. Throughput: 0: 39278.7. Samples: 85872540. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 22:36:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:36:45,497][12883] Updated weights for policy 0, policy_version 5240 (0.0042) +[2024-06-17 22:36:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 39321.6, 300 sec: 39321.6). Total num frames: 85934080. Throughput: 0: 39699.9. Samples: 86006720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 22:36:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:36:48,640][12883] Updated weights for policy 0, policy_version 5250 (0.0034) +[2024-06-17 22:36:51,994][12645] Fps is (10 sec: 36045.1, 60 sec: 39321.5, 300 sec: 39266.1). Total num frames: 86097920. Throughput: 0: 39630.6. Samples: 86238280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 22:36:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:36:53,565][12883] Updated weights for policy 0, policy_version 5260 (0.0034) +[2024-06-17 22:36:56,730][12883] Updated weights for policy 0, policy_version 5270 (0.0041) +[2024-06-17 22:36:56,995][12645] Fps is (10 sec: 40953.1, 60 sec: 40139.7, 300 sec: 39376.9). Total num frames: 86343680. Throughput: 0: 39563.9. Samples: 86473540. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) +[2024-06-17 22:36:56,996][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:37:01,848][12883] Updated weights for policy 0, policy_version 5280 (0.0040) +[2024-06-17 22:37:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39867.8, 300 sec: 39266.1). Total num frames: 86507520. Throughput: 0: 39780.8. Samples: 86600220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-17 22:37:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:37:04,904][12883] Updated weights for policy 0, policy_version 5290 (0.0052) +[2024-06-17 22:37:06,994][12645] Fps is (10 sec: 39328.0, 60 sec: 39867.6, 300 sec: 39377.1). Total num frames: 86736896. Throughput: 0: 39798.7. Samples: 86835540. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) +[2024-06-17 22:37:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:37:10,419][12883] Updated weights for policy 0, policy_version 5300 (0.0032) +[2024-06-17 22:37:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.8, 300 sec: 39321.6). Total num frames: 86917120. Throughput: 0: 39564.0. Samples: 87070540. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) +[2024-06-17 22:37:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:37:13,230][12883] Updated weights for policy 0, policy_version 5310 (0.0046) +[2024-06-17 22:37:16,994][12645] Fps is (10 sec: 36044.9, 60 sec: 39594.6, 300 sec: 39321.6). Total num frames: 87097344. Throughput: 0: 39635.6. Samples: 87188560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-17 22:37:16,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:37:18,474][12883] Updated weights for policy 0, policy_version 5320 (0.0041) +[2024-06-17 22:37:21,353][12883] Updated weights for policy 0, policy_version 5330 (0.0058) +[2024-06-17 22:37:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40140.9, 300 sec: 39488.2). Total num frames: 87343104. Throughput: 0: 39705.0. Samples: 87431360. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) +[2024-06-17 22:37:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:37:26,498][12883] Updated weights for policy 0, policy_version 5340 (0.0048) +[2024-06-17 22:37:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39594.7, 300 sec: 39321.6). Total num frames: 87490560. Throughput: 0: 39983.7. Samples: 87671800. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) +[2024-06-17 22:37:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:37:28,861][12862] Signal inference workers to stop experience collection... 
(1250 times) +[2024-06-17 22:37:28,879][12883] InferenceWorker_p0-w0: stopping experience collection (1250 times) +[2024-06-17 22:37:28,981][12862] Signal inference workers to resume experience collection... (1250 times) +[2024-06-17 22:37:28,981][12883] InferenceWorker_p0-w0: resuming experience collection (1250 times) +[2024-06-17 22:37:29,747][12883] Updated weights for policy 0, policy_version 5350 (0.0029) +[2024-06-17 22:37:31,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39867.8, 300 sec: 39377.1). Total num frames: 87719936. Throughput: 0: 39523.1. Samples: 87785260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) +[2024-06-17 22:37:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:37:34,954][12883] Updated weights for policy 0, policy_version 5360 (0.0045) +[2024-06-17 22:37:36,994][12645] Fps is (10 sec: 40959.2, 60 sec: 39867.7, 300 sec: 39377.1). Total num frames: 87900160. Throughput: 0: 39788.4. Samples: 88028760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) +[2024-06-17 22:37:36,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:37:38,093][12883] Updated weights for policy 0, policy_version 5370 (0.0044) +[2024-06-17 22:37:41,994][12645] Fps is (10 sec: 34406.3, 60 sec: 38775.5, 300 sec: 39266.1). Total num frames: 88064000. Throughput: 0: 39719.7. Samples: 88260860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) +[2024-06-17 22:37:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:37:43,496][12883] Updated weights for policy 0, policy_version 5380 (0.0040) +[2024-06-17 22:37:46,539][12883] Updated weights for policy 0, policy_version 5390 (0.0043) +[2024-06-17 22:37:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 39867.7, 300 sec: 39489.0). Total num frames: 88326144. Throughput: 0: 39476.3. Samples: 88376660. 
Policy #0 lag: (min: 0.0, avg: 7.0, max: 19.0) +[2024-06-17 22:37:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:37:51,654][12883] Updated weights for policy 0, policy_version 5400 (0.0024) +[2024-06-17 22:37:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.7, 300 sec: 39377.1). Total num frames: 88489984. Throughput: 0: 39686.7. Samples: 88621440. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) +[2024-06-17 22:37:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:37:54,954][12883] Updated weights for policy 0, policy_version 5410 (0.0036) +[2024-06-17 22:37:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39595.8, 300 sec: 39488.2). Total num frames: 88719360. Throughput: 0: 39493.8. Samples: 88847760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-17 22:37:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:37:59,951][12883] Updated weights for policy 0, policy_version 5420 (0.0047) +[2024-06-17 22:38:01,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39594.7, 300 sec: 39432.7). Total num frames: 88883200. Throughput: 0: 39569.5. Samples: 88969180. Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) +[2024-06-17 22:38:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:38:03,796][12883] Updated weights for policy 0, policy_version 5430 (0.0055) +[2024-06-17 22:38:06,994][12645] Fps is (10 sec: 32767.6, 60 sec: 38502.4, 300 sec: 39266.1). Total num frames: 89047040. Throughput: 0: 39119.5. Samples: 89191740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-17 22:38:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 22:38:08,375][12883] Updated weights for policy 0, policy_version 5440 (0.0030) +[2024-06-17 22:38:11,955][12883] Updated weights for policy 0, policy_version 5450 (0.0043) +[2024-06-17 22:38:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 39594.6, 300 sec: 39432.7). Total num frames: 89292800. Throughput: 0: 39101.2. Samples: 89431360. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) +[2024-06-17 22:38:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:38:16,826][12883] Updated weights for policy 0, policy_version 5460 (0.0053) +[2024-06-17 22:38:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39321.6, 300 sec: 39321.6). Total num frames: 89456640. Throughput: 0: 39303.1. Samples: 89553900. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) +[2024-06-17 22:38:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:38:20,039][12883] Updated weights for policy 0, policy_version 5470 (0.0040) +[2024-06-17 22:38:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39048.5, 300 sec: 39543.8). Total num frames: 89686016. Throughput: 0: 38977.9. Samples: 89782760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-17 22:38:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:38:25,246][12883] Updated weights for policy 0, policy_version 5480 (0.0050) +[2024-06-17 22:38:27,000][12645] Fps is (10 sec: 40934.8, 60 sec: 39590.5, 300 sec: 39376.3). Total num frames: 89866240. Throughput: 0: 39188.8. Samples: 90024600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-17 22:38:27,001][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:38:28,463][12883] Updated weights for policy 0, policy_version 5490 (0.0044) +[2024-06-17 22:38:31,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38775.4, 300 sec: 39377.4). Total num frames: 90046464. Throughput: 0: 39074.2. Samples: 90135000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-17 22:38:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:38:33,503][12883] Updated weights for policy 0, policy_version 5500 (0.0047) +[2024-06-17 22:38:36,697][12883] Updated weights for policy 0, policy_version 5510 (0.0031) +[2024-06-17 22:38:36,994][12645] Fps is (10 sec: 40985.5, 60 sec: 39594.7, 300 sec: 39433.0). Total num frames: 90275840. Throughput: 0: 39012.9. Samples: 90377020. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-17 22:38:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:38:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000005511_90292224.pth... +[2024-06-17 22:38:37,048][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004933_80822272.pth +[2024-06-17 22:38:41,576][12883] Updated weights for policy 0, policy_version 5520 (0.0053) +[2024-06-17 22:38:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39867.8, 300 sec: 39377.1). Total num frames: 90456064. Throughput: 0: 39280.9. Samples: 90615400. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) +[2024-06-17 22:38:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:38:45,042][12883] Updated weights for policy 0, policy_version 5530 (0.0031) +[2024-06-17 22:38:46,994][12645] Fps is (10 sec: 37682.9, 60 sec: 38775.4, 300 sec: 39488.2). Total num frames: 90652672. Throughput: 0: 39130.5. Samples: 90730060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-17 22:38:46,995][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:38:49,496][12883] Updated weights for policy 0, policy_version 5540 (0.0036) +[2024-06-17 22:38:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39321.6, 300 sec: 39321.6). Total num frames: 90849280. Throughput: 0: 39449.0. Samples: 90966940. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-17 22:38:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:38:53,695][12883] Updated weights for policy 0, policy_version 5550 (0.0036) +[2024-06-17 22:38:53,728][12862] Signal inference workers to stop experience collection... (1300 times) +[2024-06-17 22:38:53,729][12862] Signal inference workers to resume experience collection... 
(1300 times) +[2024-06-17 22:38:53,749][12883] InferenceWorker_p0-w0: stopping experience collection (1300 times) +[2024-06-17 22:38:53,749][12883] InferenceWorker_p0-w0: resuming experience collection (1300 times) +[2024-06-17 22:38:56,994][12645] Fps is (10 sec: 37683.9, 60 sec: 38502.4, 300 sec: 39432.7). Total num frames: 91029504. Throughput: 0: 39389.4. Samples: 91203880. Policy #0 lag: (min: 1.0, avg: 11.6, max: 27.0) +[2024-06-17 22:38:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:38:58,289][12883] Updated weights for policy 0, policy_version 5560 (0.0028) +[2024-06-17 22:39:01,892][12883] Updated weights for policy 0, policy_version 5570 (0.0060) +[2024-06-17 22:39:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39594.5, 300 sec: 39377.1). Total num frames: 91258880. Throughput: 0: 39279.1. Samples: 91321460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-17 22:39:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:39:06,812][12883] Updated weights for policy 0, policy_version 5580 (0.0044) +[2024-06-17 22:39:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39867.8, 300 sec: 39488.2). Total num frames: 91439104. Throughput: 0: 39400.0. Samples: 91555760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 22:39:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:39:10,360][12883] Updated weights for policy 0, policy_version 5590 (0.0035) +[2024-06-17 22:39:11,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39048.5, 300 sec: 39432.7). Total num frames: 91635712. Throughput: 0: 39134.3. Samples: 91785400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 22:39:11,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 22:39:15,036][12883] Updated weights for policy 0, policy_version 5600 (0.0034) +[2024-06-17 22:39:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39321.7, 300 sec: 39266.1). Total num frames: 91815936. Throughput: 0: 39495.6. Samples: 91912300. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-17 22:39:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:39:18,923][12883] Updated weights for policy 0, policy_version 5610 (0.0044) +[2024-06-17 22:39:21,994][12645] Fps is (10 sec: 37683.1, 60 sec: 38775.4, 300 sec: 39432.7). Total num frames: 92012544. Throughput: 0: 39024.4. Samples: 92133120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-17 22:39:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:39:23,207][12883] Updated weights for policy 0, policy_version 5620 (0.0046) +[2024-06-17 22:39:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39325.7, 300 sec: 39321.6). Total num frames: 92225536. Throughput: 0: 39107.2. Samples: 92375220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-17 22:39:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:39:27,040][12883] Updated weights for policy 0, policy_version 5630 (0.0029) +[2024-06-17 22:39:31,652][12883] Updated weights for policy 0, policy_version 5640 (0.0041) +[2024-06-17 22:39:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39321.6, 300 sec: 39321.6). Total num frames: 92405760. Throughput: 0: 39144.1. Samples: 92491540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 22:39:32,003][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:39:35,471][12883] Updated weights for policy 0, policy_version 5650 (0.0041) +[2024-06-17 22:39:36,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39048.5, 300 sec: 39432.7). Total num frames: 92618752. Throughput: 0: 39138.7. Samples: 92728180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 22:39:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:39:40,015][12883] Updated weights for policy 0, policy_version 5660 (0.0040) +[2024-06-17 22:39:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39321.6, 300 sec: 39377.1). Total num frames: 92815360. Throughput: 0: 39178.1. Samples: 92966900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) +[2024-06-17 22:39:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:39:44,224][12883] Updated weights for policy 0, policy_version 5670 (0.0038) +[2024-06-17 22:39:46,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39048.5, 300 sec: 39377.1). Total num frames: 92995584. Throughput: 0: 39172.9. Samples: 93084240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-17 22:39:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:39:47,931][12883] Updated weights for policy 0, policy_version 5680 (0.0031) +[2024-06-17 22:39:51,994][12645] Fps is (10 sec: 39322.4, 60 sec: 39321.7, 300 sec: 39377.2). Total num frames: 93208576. Throughput: 0: 39350.8. Samples: 93326540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 22:39:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:39:52,043][12883] Updated weights for policy 0, policy_version 5690 (0.0039) +[2024-06-17 22:39:56,340][12883] Updated weights for policy 0, policy_version 5700 (0.0046) +[2024-06-17 22:39:56,994][12645] Fps is (10 sec: 40961.1, 60 sec: 39594.7, 300 sec: 39377.2). Total num frames: 93405184. Throughput: 0: 39470.4. Samples: 93561560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-17 22:39:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:40:00,737][12883] Updated weights for policy 0, policy_version 5710 (0.0056) +[2024-06-17 22:40:01,994][12645] Fps is (10 sec: 37682.6, 60 sec: 38775.5, 300 sec: 39377.1). Total num frames: 93585408. Throughput: 0: 39237.7. Samples: 93678000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-17 22:40:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:40:04,409][12883] Updated weights for policy 0, policy_version 5720 (0.0040) +[2024-06-17 22:40:06,994][12645] Fps is (10 sec: 40958.9, 60 sec: 39594.6, 300 sec: 39432.6). Total num frames: 93814784. Throughput: 0: 39654.6. Samples: 93917580. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 22:40:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:40:08,840][12883] Updated weights for policy 0, policy_version 5730 (0.0036) +[2024-06-17 22:40:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39321.6, 300 sec: 39377.1). Total num frames: 93995008. Throughput: 0: 39565.3. Samples: 94155660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-17 22:40:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:40:13,152][12883] Updated weights for policy 0, policy_version 5740 (0.0034) +[2024-06-17 22:40:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39594.6, 300 sec: 39377.1). Total num frames: 94191616. Throughput: 0: 39598.1. Samples: 94273460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-17 22:40:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:40:17,320][12883] Updated weights for policy 0, policy_version 5750 (0.0032) +[2024-06-17 22:40:21,248][12883] Updated weights for policy 0, policy_version 5760 (0.0044) +[2024-06-17 22:40:21,556][12862] Signal inference workers to stop experience collection... (1350 times) +[2024-06-17 22:40:21,613][12883] InferenceWorker_p0-w0: stopping experience collection (1350 times) +[2024-06-17 22:40:21,614][12862] Signal inference workers to resume experience collection... (1350 times) +[2024-06-17 22:40:21,630][12883] InferenceWorker_p0-w0: resuming experience collection (1350 times) +[2024-06-17 22:40:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39867.8, 300 sec: 39488.2). Total num frames: 94404608. Throughput: 0: 39713.4. Samples: 94515280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-17 22:40:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:40:25,895][12883] Updated weights for policy 0, policy_version 5770 (0.0042) +[2024-06-17 22:40:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39594.6, 300 sec: 39488.5). Total num frames: 94601216. Throughput: 0: 39742.3. 
Samples: 94755300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) +[2024-06-17 22:40:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:40:29,703][12883] Updated weights for policy 0, policy_version 5780 (0.0036) +[2024-06-17 22:40:31,994][12645] Fps is (10 sec: 39320.6, 60 sec: 39867.6, 300 sec: 39432.6). Total num frames: 94797824. Throughput: 0: 39784.9. Samples: 94874560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-17 22:40:31,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:40:33,668][12883] Updated weights for policy 0, policy_version 5790 (0.0052) +[2024-06-17 22:40:36,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39594.7, 300 sec: 39488.2). Total num frames: 94994432. Throughput: 0: 39701.2. Samples: 95113100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-17 22:40:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:40:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000005798_94994432.pth... +[2024-06-17 22:40:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000005220_85524480.pth +[2024-06-17 22:40:37,730][12883] Updated weights for policy 0, policy_version 5800 (0.0045) +[2024-06-17 22:40:41,521][12883] Updated weights for policy 0, policy_version 5810 (0.0041) +[2024-06-17 22:40:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39594.7, 300 sec: 39377.1). Total num frames: 95191040. Throughput: 0: 39692.8. Samples: 95347740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-17 22:40:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:40:46,396][12883] Updated weights for policy 0, policy_version 5820 (0.0048) +[2024-06-17 22:40:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.9, 300 sec: 39543.8). Total num frames: 95404032. Throughput: 0: 39869.0. Samples: 95472100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-17 22:40:46,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 22:40:49,891][12883] Updated weights for policy 0, policy_version 5830 (0.0047) +[2024-06-17 22:40:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39321.5, 300 sec: 39432.7). Total num frames: 95567872. Throughput: 0: 39649.0. Samples: 95701780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-17 22:40:52,000][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:40:54,438][12883] Updated weights for policy 0, policy_version 5840 (0.0036) +[2024-06-17 22:40:56,994][12645] Fps is (10 sec: 37682.7, 60 sec: 39594.5, 300 sec: 39543.7). Total num frames: 95780864. Throughput: 0: 39476.4. Samples: 95932100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-17 22:40:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:40:58,046][12883] Updated weights for policy 0, policy_version 5850 (0.0049) +[2024-06-17 22:41:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39594.7, 300 sec: 39377.1). Total num frames: 95961088. Throughput: 0: 39654.3. Samples: 96057900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-17 22:41:01,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 22:41:02,622][12883] Updated weights for policy 0, policy_version 5860 (0.0046) +[2024-06-17 22:41:06,757][12883] Updated weights for policy 0, policy_version 5870 (0.0043) +[2024-06-17 22:41:06,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39321.7, 300 sec: 39488.2). Total num frames: 96174080. Throughput: 0: 39375.1. Samples: 96287160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-17 22:41:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:41:11,160][12883] Updated weights for policy 0, policy_version 5880 (0.0037) +[2024-06-17 22:41:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39594.7, 300 sec: 39488.2). Total num frames: 96370688. Throughput: 0: 39164.9. Samples: 96517720. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-17 22:41:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:41:15,468][12883] Updated weights for policy 0, policy_version 5890 (0.0062) +[2024-06-17 22:41:16,994][12645] Fps is (10 sec: 37682.4, 60 sec: 39321.6, 300 sec: 39377.1). Total num frames: 96550912. Throughput: 0: 39240.5. Samples: 96640380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-17 22:41:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:41:19,203][12883] Updated weights for policy 0, policy_version 5900 (0.0047) +[2024-06-17 22:41:21,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39048.5, 300 sec: 39432.7). Total num frames: 96747520. Throughput: 0: 39172.4. Samples: 96875860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 22:41:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:41:23,713][12883] Updated weights for policy 0, policy_version 5910 (0.0036) +[2024-06-17 22:41:26,994][12645] Fps is (10 sec: 40960.9, 60 sec: 39321.6, 300 sec: 39432.7). Total num frames: 96960512. Throughput: 0: 39210.3. Samples: 97112200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) +[2024-06-17 22:41:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:41:27,191][12883] Updated weights for policy 0, policy_version 5920 (0.0040) +[2024-06-17 22:41:31,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38775.6, 300 sec: 39377.2). Total num frames: 97124352. Throughput: 0: 38893.7. Samples: 97222320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) +[2024-06-17 22:41:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:41:32,998][12883] Updated weights for policy 0, policy_version 5930 (0.0042) +[2024-06-17 22:41:35,547][12883] Updated weights for policy 0, policy_version 5940 (0.0040) +[2024-06-17 22:41:36,996][12645] Fps is (10 sec: 40950.5, 60 sec: 39593.2, 300 sec: 39432.4). Total num frames: 97370112. Throughput: 0: 39061.6. Samples: 97459640. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-17 22:41:36,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:41:41,116][12883] Updated weights for policy 0, policy_version 5950 (0.0034) +[2024-06-17 22:41:41,994][12645] Fps is (10 sec: 37683.3, 60 sec: 38502.4, 300 sec: 39210.5). Total num frames: 97501184. Throughput: 0: 39389.0. Samples: 97704600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-17 22:41:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:41:44,133][12883] Updated weights for policy 0, policy_version 5960 (0.0035) +[2024-06-17 22:41:46,994][12645] Fps is (10 sec: 37691.5, 60 sec: 39048.5, 300 sec: 39488.2). Total num frames: 97746944. Throughput: 0: 38973.3. Samples: 97811700. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) +[2024-06-17 22:41:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:41:49,576][12883] Updated weights for policy 0, policy_version 5970 (0.0030) +[2024-06-17 22:41:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 39594.6, 300 sec: 39321.8). Total num frames: 97943552. Throughput: 0: 39321.2. Samples: 98056620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-17 22:41:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:41:52,228][12883] Updated weights for policy 0, policy_version 5980 (0.0049) +[2024-06-17 22:41:52,836][12862] Signal inference workers to stop experience collection... (1400 times) +[2024-06-17 22:41:52,880][12883] InferenceWorker_p0-w0: stopping experience collection (1400 times) +[2024-06-17 22:41:52,885][12862] Signal inference workers to resume experience collection... (1400 times) +[2024-06-17 22:41:52,892][12883] InferenceWorker_p0-w0: resuming experience collection (1400 times) +[2024-06-17 22:41:56,996][12645] Fps is (10 sec: 36037.0, 60 sec: 38774.1, 300 sec: 39321.3). Total num frames: 98107392. Throughput: 0: 39408.2. Samples: 98291180. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-17 22:41:56,996][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:41:57,819][12883] Updated weights for policy 0, policy_version 5990 (0.0039) +[2024-06-17 22:42:00,828][12883] Updated weights for policy 0, policy_version 6000 (0.0037) +[2024-06-17 22:42:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39867.7, 300 sec: 39377.1). Total num frames: 98353152. Throughput: 0: 39328.5. Samples: 98410160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-17 22:42:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:42:05,998][12883] Updated weights for policy 0, policy_version 6010 (0.0041) +[2024-06-17 22:42:06,994][12645] Fps is (10 sec: 40968.7, 60 sec: 39048.4, 300 sec: 39321.6). Total num frames: 98516992. Throughput: 0: 39455.1. Samples: 98651340. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) +[2024-06-17 22:42:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:42:09,130][12883] Updated weights for policy 0, policy_version 6020 (0.0029) +[2024-06-17 22:42:11,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39321.6, 300 sec: 39432.7). Total num frames: 98729984. Throughput: 0: 39147.5. Samples: 98873840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 22:42:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:42:14,362][12883] Updated weights for policy 0, policy_version 6030 (0.0031) +[2024-06-17 22:42:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39594.8, 300 sec: 39266.1). Total num frames: 98926592. Throughput: 0: 39530.7. Samples: 99001200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-17 22:42:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:42:17,195][12883] Updated weights for policy 0, policy_version 6040 (0.0044) +[2024-06-17 22:42:21,994][12645] Fps is (10 sec: 36044.8, 60 sec: 39048.6, 300 sec: 39321.6). Total num frames: 99090432. Throughput: 0: 39555.8. Samples: 99239560. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-17 22:42:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:42:22,722][12883] Updated weights for policy 0, policy_version 6050 (0.0034) +[2024-06-17 22:42:25,226][12883] Updated weights for policy 0, policy_version 6060 (0.0044) +[2024-06-17 22:42:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 39594.6, 300 sec: 39377.1). Total num frames: 99336192. Throughput: 0: 39252.0. Samples: 99470940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-17 22:42:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:42:30,874][12883] Updated weights for policy 0, policy_version 6070 (0.0048) +[2024-06-17 22:42:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39594.7, 300 sec: 39321.6). Total num frames: 99500032. Throughput: 0: 39652.6. Samples: 99596060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) +[2024-06-17 22:42:31,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 22:42:33,833][12883] Updated weights for policy 0, policy_version 6080 (0.0038) +[2024-06-17 22:42:36,994][12645] Fps is (10 sec: 36044.5, 60 sec: 38776.9, 300 sec: 39432.7). Total num frames: 99696640. Throughput: 0: 39326.2. Samples: 99826300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) +[2024-06-17 22:42:36,999][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:42:37,025][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006085_99696640.pth... +[2024-06-17 22:42:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000005511_90292224.pth +[2024-06-17 22:42:39,010][12883] Updated weights for policy 0, policy_version 6090 (0.0038) +[2024-06-17 22:42:41,994][12645] Fps is (10 sec: 42597.4, 60 sec: 40413.7, 300 sec: 39321.6). Total num frames: 99926016. Throughput: 0: 39317.4. Samples: 100060380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-17 22:42:41,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:42:42,159][12883] Updated weights for policy 0, policy_version 6100 (0.0036) +[2024-06-17 22:42:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39048.6, 300 sec: 39321.6). Total num frames: 100089856. Throughput: 0: 39301.4. Samples: 100178720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-17 22:42:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:42:47,314][12883] Updated weights for policy 0, policy_version 6110 (0.0037) +[2024-06-17 22:42:50,762][12883] Updated weights for policy 0, policy_version 6120 (0.0040) +[2024-06-17 22:42:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39594.7, 300 sec: 39321.6). Total num frames: 100319232. Throughput: 0: 39233.4. Samples: 100416840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 22:42:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:42:55,728][12883] Updated weights for policy 0, policy_version 6130 (0.0032) +[2024-06-17 22:42:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40142.2, 300 sec: 39432.7). Total num frames: 100515840. Throughput: 0: 39426.1. Samples: 100648020. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-17 22:42:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:42:58,727][12883] Updated weights for policy 0, policy_version 6140 (0.0039) +[2024-06-17 22:43:01,994][12645] Fps is (10 sec: 34406.7, 60 sec: 38502.5, 300 sec: 39377.2). Total num frames: 100663296. Throughput: 0: 39216.9. Samples: 100765960. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-17 22:43:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:43:03,873][12883] Updated weights for policy 0, policy_version 6150 (0.0048) +[2024-06-17 22:43:06,818][12883] Updated weights for policy 0, policy_version 6160 (0.0045) +[2024-06-17 22:43:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40140.8, 300 sec: 39432.7). Total num frames: 100925440. Throughput: 0: 39316.8. Samples: 101008820. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-17 22:43:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:43:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39048.5, 300 sec: 39377.1). Total num frames: 101072896. Throughput: 0: 39606.6. Samples: 101253240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-17 22:43:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:43:12,122][12883] Updated weights for policy 0, policy_version 6170 (0.0036) +[2024-06-17 22:43:14,307][12862] Signal inference workers to stop experience collection... (1450 times) +[2024-06-17 22:43:14,364][12883] InferenceWorker_p0-w0: stopping experience collection (1450 times) +[2024-06-17 22:43:14,421][12862] Signal inference workers to resume experience collection... (1450 times) +[2024-06-17 22:43:14,422][12883] InferenceWorker_p0-w0: resuming experience collection (1450 times) +[2024-06-17 22:43:14,943][12883] Updated weights for policy 0, policy_version 6180 (0.0046) +[2024-06-17 22:43:16,996][12645] Fps is (10 sec: 36037.0, 60 sec: 39320.1, 300 sec: 39321.3). Total num frames: 101285888. Throughput: 0: 39227.3. Samples: 101361380. Policy #0 lag: (min: 2.0, avg: 9.7, max: 20.0) +[2024-06-17 22:43:16,997][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:43:20,354][12883] Updated weights for policy 0, policy_version 6190 (0.0039) +[2024-06-17 22:43:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.7, 300 sec: 39378.0). Total num frames: 101482496. Throughput: 0: 39442.2. 
Samples: 101601200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-17 22:43:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:43:23,482][12883] Updated weights for policy 0, policy_version 6200 (0.0041) +[2024-06-17 22:43:26,994][12645] Fps is (10 sec: 37691.5, 60 sec: 38775.5, 300 sec: 39377.1). Total num frames: 101662720. Throughput: 0: 39587.7. Samples: 101841820. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) +[2024-06-17 22:43:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:43:28,634][12883] Updated weights for policy 0, policy_version 6210 (0.0042) +[2024-06-17 22:43:31,571][12883] Updated weights for policy 0, policy_version 6220 (0.0033) +[2024-06-17 22:43:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40140.8, 300 sec: 39432.7). Total num frames: 101908480. Throughput: 0: 39482.7. Samples: 101955440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-17 22:43:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:43:36,818][12883] Updated weights for policy 0, policy_version 6230 (0.0044) +[2024-06-17 22:43:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39594.7, 300 sec: 39377.1). Total num frames: 102072320. Throughput: 0: 39662.7. Samples: 102201660. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) +[2024-06-17 22:43:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:43:40,552][12883] Updated weights for policy 0, policy_version 6240 (0.0039) +[2024-06-17 22:43:41,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39048.6, 300 sec: 39377.1). Total num frames: 102268928. Throughput: 0: 39624.5. Samples: 102431120. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) +[2024-06-17 22:43:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:43:44,952][12883] Updated weights for policy 0, policy_version 6250 (0.0038) +[2024-06-17 22:43:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39594.7, 300 sec: 39377.1). Total num frames: 102465536. Throughput: 0: 39774.2. Samples: 102555800. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 22:43:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:43:48,647][12883] Updated weights for policy 0, policy_version 6260 (0.0038) +[2024-06-17 22:43:51,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38775.5, 300 sec: 39377.1). Total num frames: 102645760. Throughput: 0: 39660.5. Samples: 102793540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-17 22:43:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:43:53,603][12883] Updated weights for policy 0, policy_version 6270 (0.0040) +[2024-06-17 22:43:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 39321.6, 300 sec: 39377.1). Total num frames: 102875136. Throughput: 0: 39199.5. Samples: 103017220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) +[2024-06-17 22:43:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:43:57,566][12883] Updated weights for policy 0, policy_version 6280 (0.0036) +[2024-06-17 22:44:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39594.6, 300 sec: 39321.6). Total num frames: 103038976. Throughput: 0: 39669.9. Samples: 103146440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 22:44:01,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:44:02,130][12883] Updated weights for policy 0, policy_version 6290 (0.0041) +[2024-06-17 22:44:05,461][12883] Updated weights for policy 0, policy_version 6300 (0.0035) +[2024-06-17 22:44:06,996][12645] Fps is (10 sec: 37674.9, 60 sec: 38774.1, 300 sec: 39376.8). Total num frames: 103251968. Throughput: 0: 39567.9. Samples: 103381840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 22:44:06,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:44:10,234][12883] Updated weights for policy 0, policy_version 6310 (0.0038) +[2024-06-17 22:44:11,995][12645] Fps is (10 sec: 44229.7, 60 sec: 40139.7, 300 sec: 39543.5). Total num frames: 103481344. Throughput: 0: 39484.7. Samples: 103618700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 22:44:11,996][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:44:13,995][12883] Updated weights for policy 0, policy_version 6320 (0.0047) +[2024-06-17 22:44:16,996][12645] Fps is (10 sec: 39321.6, 60 sec: 39321.6, 300 sec: 39432.4). Total num frames: 103645184. Throughput: 0: 39542.9. Samples: 103734960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-17 22:44:16,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:44:18,533][12883] Updated weights for policy 0, policy_version 6330 (0.0044) +[2024-06-17 22:44:21,994][12645] Fps is (10 sec: 39328.0, 60 sec: 39867.7, 300 sec: 39488.2). Total num frames: 103874560. Throughput: 0: 39377.8. Samples: 103973660. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-17 22:44:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:44:22,001][12883] Updated weights for policy 0, policy_version 6340 (0.0048) +[2024-06-17 22:44:26,612][12883] Updated weights for policy 0, policy_version 6350 (0.0036) +[2024-06-17 22:44:26,994][12645] Fps is (10 sec: 40969.4, 60 sec: 39867.8, 300 sec: 39488.2). Total num frames: 104054784. Throughput: 0: 39653.8. Samples: 104215540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-17 22:44:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:44:30,070][12883] Updated weights for policy 0, policy_version 6360 (0.0038) +[2024-06-17 22:44:31,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39048.5, 300 sec: 39432.7). Total num frames: 104251392. Throughput: 0: 39516.0. Samples: 104334020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 20.0) +[2024-06-17 22:44:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:44:34,162][12862] Signal inference workers to stop experience collection... (1500 times) +[2024-06-17 22:44:34,162][12862] Signal inference workers to resume experience collection... 
(1500 times) +[2024-06-17 22:44:34,185][12883] InferenceWorker_p0-w0: stopping experience collection (1500 times) +[2024-06-17 22:44:34,185][12883] InferenceWorker_p0-w0: resuming experience collection (1500 times) +[2024-06-17 22:44:34,617][12883] Updated weights for policy 0, policy_version 6370 (0.0034) +[2024-06-17 22:44:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 39867.7, 300 sec: 39488.2). Total num frames: 104464384. Throughput: 0: 39579.4. Samples: 104574620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 26.0) +[2024-06-17 22:44:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:44:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006376_104464384.pth... +[2024-06-17 22:44:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000005798_94994432.pth +[2024-06-17 22:44:38,116][12883] Updated weights for policy 0, policy_version 6380 (0.0035) +[2024-06-17 22:44:41,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39048.5, 300 sec: 39377.2). Total num frames: 104611840. Throughput: 0: 39950.2. Samples: 104814980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 26.0) +[2024-06-17 22:44:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:44:43,122][12883] Updated weights for policy 0, policy_version 6390 (0.0035) +[2024-06-17 22:44:46,297][12883] Updated weights for policy 0, policy_version 6400 (0.0038) +[2024-06-17 22:44:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39867.7, 300 sec: 39488.2). Total num frames: 104857600. Throughput: 0: 39528.0. Samples: 104925200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-17 22:44:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:44:51,466][12883] Updated weights for policy 0, policy_version 6410 (0.0040) +[2024-06-17 22:44:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.7, 300 sec: 39432.7). Total num frames: 105037824. Throughput: 0: 39734.8. Samples: 105169820. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) +[2024-06-17 22:44:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:44:54,385][12883] Updated weights for policy 0, policy_version 6420 (0.0032) +[2024-06-17 22:44:56,994][12645] Fps is (10 sec: 37683.9, 60 sec: 39321.7, 300 sec: 39488.2). Total num frames: 105234432. Throughput: 0: 39588.2. Samples: 105400100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-17 22:44:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:44:59,739][12883] Updated weights for policy 0, policy_version 6430 (0.0040) +[2024-06-17 22:45:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40414.0, 300 sec: 39488.2). Total num frames: 105463808. Throughput: 0: 39782.5. Samples: 105525080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-17 22:45:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:45:03,230][12883] Updated weights for policy 0, policy_version 6440 (0.0035) +[2024-06-17 22:45:06,997][12645] Fps is (10 sec: 37671.7, 60 sec: 39321.1, 300 sec: 39376.7). Total num frames: 105611264. Throughput: 0: 39653.4. Samples: 105758180. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) +[2024-06-17 22:45:06,997][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:45:08,345][12883] Updated weights for policy 0, policy_version 6450 (0.0041) +[2024-06-17 22:45:11,259][12883] Updated weights for policy 0, policy_version 6460 (0.0047) +[2024-06-17 22:45:11,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39595.8, 300 sec: 39543.8). Total num frames: 105857024. Throughput: 0: 39410.2. Samples: 105989000. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) +[2024-06-17 22:45:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:45:16,473][12883] Updated weights for policy 0, policy_version 6470 (0.0031) +[2024-06-17 22:45:16,994][12645] Fps is (10 sec: 40972.5, 60 sec: 39596.2, 300 sec: 39377.1). Total num frames: 106020864. Throughput: 0: 39609.8. Samples: 106116460. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) +[2024-06-17 22:45:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:45:19,698][12883] Updated weights for policy 0, policy_version 6480 (0.0029) +[2024-06-17 22:45:21,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39321.7, 300 sec: 39432.7). Total num frames: 106233856. Throughput: 0: 39395.7. Samples: 106347420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-17 22:45:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:45:24,560][12883] Updated weights for policy 0, policy_version 6490 (0.0027) +[2024-06-17 22:45:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.8, 300 sec: 39488.2). Total num frames: 106446848. Throughput: 0: 39537.8. Samples: 106594180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 22:45:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:45:27,619][12883] Updated weights for policy 0, policy_version 6500 (0.0037) +[2024-06-17 22:45:31,994][12645] Fps is (10 sec: 36044.9, 60 sec: 39048.5, 300 sec: 39321.6). Total num frames: 106594304. Throughput: 0: 39711.7. Samples: 106712220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 22:45:31,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:45:32,741][12883] Updated weights for policy 0, policy_version 6510 (0.0039) +[2024-06-17 22:45:36,070][12883] Updated weights for policy 0, policy_version 6520 (0.0027) +[2024-06-17 22:45:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39867.9, 300 sec: 39543.8). Total num frames: 106856448. Throughput: 0: 39664.1. Samples: 106954700. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) +[2024-06-17 22:45:36,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:45:40,820][12883] Updated weights for policy 0, policy_version 6530 (0.0038) +[2024-06-17 22:45:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 40413.9, 300 sec: 39432.7). Total num frames: 107036672. Throughput: 0: 39781.2. Samples: 107190260. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) +[2024-06-17 22:45:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:45:44,386][12883] Updated weights for policy 0, policy_version 6540 (0.0049) +[2024-06-17 22:45:46,994][12645] Fps is (10 sec: 34405.8, 60 sec: 39048.5, 300 sec: 39432.7). Total num frames: 107200512. Throughput: 0: 39610.5. Samples: 107307560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) +[2024-06-17 22:45:47,003][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:45:49,230][12883] Updated weights for policy 0, policy_version 6550 (0.0034) +[2024-06-17 22:45:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40140.8, 300 sec: 39543.8). Total num frames: 107446272. Throughput: 0: 39830.1. Samples: 107550420. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) +[2024-06-17 22:45:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:45:52,589][12883] Updated weights for policy 0, policy_version 6560 (0.0035) +[2024-06-17 22:45:56,295][12862] Signal inference workers to stop experience collection... (1550 times) +[2024-06-17 22:45:56,295][12862] Signal inference workers to resume experience collection... (1550 times) +[2024-06-17 22:45:56,336][12883] InferenceWorker_p0-w0: stopping experience collection (1550 times) +[2024-06-17 22:45:56,336][12883] InferenceWorker_p0-w0: resuming experience collection (1550 times) +[2024-06-17 22:45:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39594.6, 300 sec: 39488.2). Total num frames: 107610112. Throughput: 0: 40048.5. Samples: 107791180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-17 22:45:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:45:57,550][12883] Updated weights for policy 0, policy_version 6570 (0.0032) +[2024-06-17 22:46:00,607][12883] Updated weights for policy 0, policy_version 6580 (0.0025) +[2024-06-17 22:46:01,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39321.5, 300 sec: 39488.2). Total num frames: 107823104. Throughput: 0: 39822.5. 
Samples: 107908480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-17 22:46:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:46:05,635][12883] Updated weights for policy 0, policy_version 6590 (0.0045) +[2024-06-17 22:46:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40415.8, 300 sec: 39543.7). Total num frames: 108036096. Throughput: 0: 40073.3. Samples: 108150720. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) +[2024-06-17 22:46:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:46:08,758][12883] Updated weights for policy 0, policy_version 6600 (0.0032) +[2024-06-17 22:46:11,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39048.6, 300 sec: 39488.2). Total num frames: 108199936. Throughput: 0: 39814.6. Samples: 108385840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-17 22:46:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:46:13,982][12883] Updated weights for policy 0, policy_version 6610 (0.0038) +[2024-06-17 22:46:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40413.8, 300 sec: 39654.8). Total num frames: 108445696. Throughput: 0: 39714.7. Samples: 108499380. Policy #0 lag: (min: 2.0, avg: 10.5, max: 25.0) +[2024-06-17 22:46:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:46:17,057][12883] Updated weights for policy 0, policy_version 6620 (0.0031) +[2024-06-17 22:46:21,996][12883] Updated weights for policy 0, policy_version 6630 (0.0036) +[2024-06-17 22:46:21,996][12645] Fps is (10 sec: 42588.8, 60 sec: 39866.2, 300 sec: 39543.4). Total num frames: 108625920. Throughput: 0: 39760.6. Samples: 108744020. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) +[2024-06-17 22:46:21,996][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:46:25,307][12883] Updated weights for policy 0, policy_version 6640 (0.0039) +[2024-06-17 22:46:26,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39594.6, 300 sec: 39654.8). Total num frames: 108822528. Throughput: 0: 39638.2. 
Samples: 108973980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) +[2024-06-17 22:46:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:46:30,075][12883] Updated weights for policy 0, policy_version 6650 (0.0030) +[2024-06-17 22:46:31,994][12645] Fps is (10 sec: 37691.7, 60 sec: 40140.8, 300 sec: 39433.0). Total num frames: 109002752. Throughput: 0: 39711.7. Samples: 109094580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-17 22:46:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:46:33,734][12883] Updated weights for policy 0, policy_version 6660 (0.0039) +[2024-06-17 22:46:36,994][12645] Fps is (10 sec: 37683.6, 60 sec: 39048.5, 300 sec: 39654.8). Total num frames: 109199360. Throughput: 0: 39606.8. Samples: 109332720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-17 22:46:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:46:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006665_109199360.pth... +[2024-06-17 22:46:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006085_99696640.pth +[2024-06-17 22:46:38,697][12883] Updated weights for policy 0, policy_version 6670 (0.0034) +[2024-06-17 22:46:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 39867.8, 300 sec: 39599.3). Total num frames: 109428736. Throughput: 0: 39501.4. Samples: 109568740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-17 22:46:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:46:42,015][12883] Updated weights for policy 0, policy_version 6680 (0.0029) +[2024-06-17 22:46:46,727][12883] Updated weights for policy 0, policy_version 6690 (0.0031) +[2024-06-17 22:46:46,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40413.9, 300 sec: 39599.3). Total num frames: 109625344. Throughput: 0: 39761.3. Samples: 109697740. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-17 22:46:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:46:50,204][12883] Updated weights for policy 0, policy_version 6700 (0.0037) +[2024-06-17 22:46:51,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39321.6, 300 sec: 39655.1). Total num frames: 109805568. Throughput: 0: 39487.6. Samples: 109927660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-17 22:46:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:46:54,874][12883] Updated weights for policy 0, policy_version 6710 (0.0034) +[2024-06-17 22:46:56,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40140.9, 300 sec: 39543.8). Total num frames: 110018560. Throughput: 0: 39596.0. Samples: 110167660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 22:46:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:46:58,693][12883] Updated weights for policy 0, policy_version 6720 (0.0053) +[2024-06-17 22:47:01,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39321.7, 300 sec: 39543.8). Total num frames: 110182400. Throughput: 0: 39729.8. Samples: 110287220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-17 22:47:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:47:03,290][12883] Updated weights for policy 0, policy_version 6730 (0.0051) +[2024-06-17 22:47:06,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39321.7, 300 sec: 39543.8). Total num frames: 110395392. Throughput: 0: 39362.0. Samples: 110515220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-17 22:47:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:47:07,750][12883] Updated weights for policy 0, policy_version 6740 (0.0037) +[2024-06-17 22:47:11,347][12883] Updated weights for policy 0, policy_version 6750 (0.0040) +[2024-06-17 22:47:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40140.7, 300 sec: 39599.3). Total num frames: 110608384. Throughput: 0: 39517.3. Samples: 110752260. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) +[2024-06-17 22:47:11,996][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:47:15,741][12883] Updated weights for policy 0, policy_version 6760 (0.0026) +[2024-06-17 22:47:16,994][12645] Fps is (10 sec: 39320.9, 60 sec: 39048.5, 300 sec: 39654.8). Total num frames: 110788608. Throughput: 0: 39666.6. Samples: 110879580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-17 22:47:16,998][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:47:19,503][12883] Updated weights for policy 0, policy_version 6770 (0.0044) +[2024-06-17 22:47:21,994][12645] Fps is (10 sec: 37683.8, 60 sec: 39323.1, 300 sec: 39488.2). Total num frames: 110985216. Throughput: 0: 39560.9. Samples: 111112960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-17 22:47:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:47:24,276][12883] Updated weights for policy 0, policy_version 6780 (0.0040) +[2024-06-17 22:47:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39321.6, 300 sec: 39599.3). Total num frames: 111181824. Throughput: 0: 39768.8. Samples: 111358340. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) +[2024-06-17 22:47:26,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-17 22:47:27,010][12862] Saving new best policy, reward=0.013! +[2024-06-17 22:47:27,712][12883] Updated weights for policy 0, policy_version 6790 (0.0041) +[2024-06-17 22:47:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39867.7, 300 sec: 39654.8). Total num frames: 111394816. Throughput: 0: 39492.5. Samples: 111474900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-17 22:47:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:47:32,152][12883] Updated weights for policy 0, policy_version 6800 (0.0048) +[2024-06-17 22:47:34,457][12862] Signal inference workers to stop experience collection... 
(1600 times) +[2024-06-17 22:47:34,502][12883] InferenceWorker_p0-w0: stopping experience collection (1600 times) +[2024-06-17 22:47:34,514][12862] Signal inference workers to resume experience collection... (1600 times) +[2024-06-17 22:47:34,529][12883] InferenceWorker_p0-w0: resuming experience collection (1600 times) +[2024-06-17 22:47:35,783][12883] Updated weights for policy 0, policy_version 6810 (0.0050) +[2024-06-17 22:47:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 40413.7, 300 sec: 39654.8). Total num frames: 111624192. Throughput: 0: 39598.1. Samples: 111709580. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) +[2024-06-17 22:47:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 22:47:40,827][12883] Updated weights for policy 0, policy_version 6820 (0.0045) +[2024-06-17 22:47:41,994][12645] Fps is (10 sec: 36044.8, 60 sec: 38775.4, 300 sec: 39543.8). Total num frames: 111755264. Throughput: 0: 39667.9. Samples: 111952720. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) +[2024-06-17 22:47:41,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 22:47:44,091][12883] Updated weights for policy 0, policy_version 6830 (0.0023) +[2024-06-17 22:47:46,994][12645] Fps is (10 sec: 37683.8, 60 sec: 39594.7, 300 sec: 39599.3). Total num frames: 112001024. Throughput: 0: 39340.8. Samples: 112057560. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-17 22:47:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:47:49,179][12883] Updated weights for policy 0, policy_version 6840 (0.0039) +[2024-06-17 22:47:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 39594.6, 300 sec: 39543.8). Total num frames: 112181248. Throughput: 0: 39728.7. Samples: 112303020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-17 22:47:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:47:52,667][12883] Updated weights for policy 0, policy_version 6850 (0.0036) +[2024-06-17 22:47:56,994][12645] Fps is (10 sec: 34406.6, 60 sec: 38775.4, 300 sec: 39599.3). Total num frames: 112345088. Throughput: 0: 39771.2. Samples: 112541960. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) +[2024-06-17 22:47:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:47:57,344][12883] Updated weights for policy 0, policy_version 6860 (0.0040) +[2024-06-17 22:48:00,643][12883] Updated weights for policy 0, policy_version 6870 (0.0033) +[2024-06-17 22:48:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40140.7, 300 sec: 39543.8). Total num frames: 112590848. Throughput: 0: 39459.6. Samples: 112655260. Policy #0 lag: (min: 1.0, avg: 12.6, max: 22.0) +[2024-06-17 22:48:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:48:06,053][12883] Updated weights for policy 0, policy_version 6880 (0.0045) +[2024-06-17 22:48:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39321.6, 300 sec: 39599.3). Total num frames: 112754688. Throughput: 0: 39630.7. Samples: 112896340. Policy #0 lag: (min: 1.0, avg: 12.6, max: 22.0) +[2024-06-17 22:48:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:48:09,189][12883] Updated weights for policy 0, policy_version 6890 (0.0045) +[2024-06-17 22:48:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 39594.6, 300 sec: 39655.1). Total num frames: 112984064. Throughput: 0: 39437.3. Samples: 113133020. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) +[2024-06-17 22:48:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:48:14,134][12883] Updated weights for policy 0, policy_version 6900 (0.0033) +[2024-06-17 22:48:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.8, 300 sec: 39654.9). Total num frames: 113180672. Throughput: 0: 39605.0. Samples: 113257120. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 22:48:16,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:48:17,431][12883] Updated weights for policy 0, policy_version 6910 (0.0054) +[2024-06-17 22:48:21,994][12645] Fps is (10 sec: 36044.9, 60 sec: 39321.5, 300 sec: 39599.3). Total num frames: 113344512. Throughput: 0: 39524.5. Samples: 113488180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-17 22:48:21,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:48:22,740][12883] Updated weights for policy 0, policy_version 6920 (0.0050) +[2024-06-17 22:48:25,632][12883] Updated weights for policy 0, policy_version 6930 (0.0032) +[2024-06-17 22:48:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40140.9, 300 sec: 39599.3). Total num frames: 113590272. Throughput: 0: 39256.1. Samples: 113719240. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) +[2024-06-17 22:48:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:48:31,105][12883] Updated weights for policy 0, policy_version 6940 (0.0044) +[2024-06-17 22:48:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39321.6, 300 sec: 39599.3). Total num frames: 113754112. Throughput: 0: 39709.3. Samples: 113844480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) +[2024-06-17 22:48:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:48:33,974][12883] Updated weights for policy 0, policy_version 6950 (0.0038) +[2024-06-17 22:48:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39048.7, 300 sec: 39654.8). Total num frames: 113967104. Throughput: 0: 39470.4. Samples: 114079180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-17 22:48:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:48:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006956_113967104.pth... 
+[2024-06-17 22:48:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006376_104464384.pth +[2024-06-17 22:48:39,329][12883] Updated weights for policy 0, policy_version 6960 (0.0043) +[2024-06-17 22:48:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40414.0, 300 sec: 39710.4). Total num frames: 114180096. Throughput: 0: 39540.5. Samples: 114321280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-17 22:48:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:48:42,070][12883] Updated weights for policy 0, policy_version 6970 (0.0041) +[2024-06-17 22:48:46,994][12645] Fps is (10 sec: 36045.2, 60 sec: 38775.6, 300 sec: 39599.3). Total num frames: 114327552. Throughput: 0: 39589.0. Samples: 114436760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-17 22:48:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:48:47,275][12883] Updated weights for policy 0, policy_version 6980 (0.0030) +[2024-06-17 22:48:50,293][12862] Signal inference workers to stop experience collection... (1650 times) +[2024-06-17 22:48:50,293][12862] Signal inference workers to resume experience collection... (1650 times) +[2024-06-17 22:48:50,342][12883] InferenceWorker_p0-w0: stopping experience collection (1650 times) +[2024-06-17 22:48:50,342][12883] InferenceWorker_p0-w0: resuming experience collection (1650 times) +[2024-06-17 22:48:50,474][12883] Updated weights for policy 0, policy_version 6990 (0.0044) +[2024-06-17 22:48:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 39867.8, 300 sec: 39654.8). Total num frames: 114573312. Throughput: 0: 39578.6. Samples: 114677380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-17 22:48:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:48:55,236][12883] Updated weights for policy 0, policy_version 7000 (0.0047) +[2024-06-17 22:48:56,994][12645] Fps is (10 sec: 40959.0, 60 sec: 39867.6, 300 sec: 39654.8). Total num frames: 114737152. 
Throughput: 0: 39927.1. Samples: 114929740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-17 22:48:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:48:58,834][12883] Updated weights for policy 0, policy_version 7010 (0.0027) +[2024-06-17 22:49:01,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39594.7, 300 sec: 39710.7). Total num frames: 114966528. Throughput: 0: 39608.5. Samples: 115039500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) +[2024-06-17 22:49:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:49:03,142][12883] Updated weights for policy 0, policy_version 7020 (0.0031) +[2024-06-17 22:49:06,512][12883] Updated weights for policy 0, policy_version 7030 (0.0046) +[2024-06-17 22:49:06,998][12645] Fps is (10 sec: 44218.3, 60 sec: 40410.9, 300 sec: 39654.5). Total num frames: 115179520. Throughput: 0: 40073.1. Samples: 115291640. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) +[2024-06-17 22:49:06,999][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:49:11,447][12883] Updated weights for policy 0, policy_version 7040 (0.0038) +[2024-06-17 22:49:11,994][12645] Fps is (10 sec: 39320.8, 60 sec: 39594.7, 300 sec: 39710.7). Total num frames: 115359744. Throughput: 0: 40234.5. Samples: 115529800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-17 22:49:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:49:14,867][12883] Updated weights for policy 0, policy_version 7050 (0.0034) +[2024-06-17 22:49:16,994][12645] Fps is (10 sec: 39338.8, 60 sec: 39867.7, 300 sec: 39654.9). Total num frames: 115572736. Throughput: 0: 40054.3. Samples: 115646920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-17 22:49:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:49:19,448][12883] Updated weights for policy 0, policy_version 7060 (0.0048) +[2024-06-17 22:49:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.9, 300 sec: 39710.4). Total num frames: 115769344. 
Throughput: 0: 40306.6. Samples: 115892980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) +[2024-06-17 22:49:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:49:22,862][12883] Updated weights for policy 0, policy_version 7070 (0.0046) +[2024-06-17 22:49:26,994][12645] Fps is (10 sec: 39320.8, 60 sec: 39594.5, 300 sec: 39710.4). Total num frames: 115965952. Throughput: 0: 40134.0. Samples: 116127320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-17 22:49:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:49:27,401][12883] Updated weights for policy 0, policy_version 7080 (0.0055) +[2024-06-17 22:49:31,043][12883] Updated weights for policy 0, policy_version 7090 (0.0035) +[2024-06-17 22:49:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.9, 300 sec: 39710.4). Total num frames: 116178944. Throughput: 0: 40363.9. Samples: 116253140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-17 22:49:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:49:35,585][12883] Updated weights for policy 0, policy_version 7100 (0.0040) +[2024-06-17 22:49:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39594.6, 300 sec: 39765.9). Total num frames: 116342784. Throughput: 0: 40245.3. Samples: 116488420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-17 22:49:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:49:39,303][12883] Updated weights for policy 0, policy_version 7110 (0.0039) +[2024-06-17 22:49:41,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 116572160. Throughput: 0: 40041.5. Samples: 116731600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-17 22:49:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:49:43,901][12883] Updated weights for policy 0, policy_version 7120 (0.0036) +[2024-06-17 22:49:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40686.8, 300 sec: 39765.9). Total num frames: 116768768. 
Throughput: 0: 40171.9. Samples: 116847240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-17 22:49:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:49:47,496][12883] Updated weights for policy 0, policy_version 7130 (0.0039) +[2024-06-17 22:49:51,994][12645] Fps is (10 sec: 39320.8, 60 sec: 39867.6, 300 sec: 39765.9). Total num frames: 116965376. Throughput: 0: 39918.8. Samples: 117087820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 22:49:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:49:52,017][12883] Updated weights for policy 0, policy_version 7140 (0.0029) +[2024-06-17 22:49:56,086][12883] Updated weights for policy 0, policy_version 7150 (0.0033) +[2024-06-17 22:49:56,994][12645] Fps is (10 sec: 40959.2, 60 sec: 40686.9, 300 sec: 39710.3). Total num frames: 117178368. Throughput: 0: 39778.2. Samples: 117319820. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-17 22:49:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:50:00,494][12883] Updated weights for policy 0, policy_version 7160 (0.0044) +[2024-06-17 22:50:01,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39867.6, 300 sec: 39821.8). Total num frames: 117358592. Throughput: 0: 39957.7. Samples: 117445020. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-17 22:50:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:50:04,455][12883] Updated weights for policy 0, policy_version 7170 (0.0041) +[2024-06-17 22:50:06,996][12645] Fps is (10 sec: 37675.5, 60 sec: 39596.0, 300 sec: 39654.5). Total num frames: 117555200. Throughput: 0: 39751.4. Samples: 117681880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 22:50:06,997][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:50:08,584][12883] Updated weights for policy 0, policy_version 7180 (0.0038) +[2024-06-17 22:50:12,000][12645] Fps is (10 sec: 40934.7, 60 sec: 40136.7, 300 sec: 39820.6). Total num frames: 117768192. 
Throughput: 0: 39887.0. Samples: 117922480. Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) +[2024-06-17 22:50:12,001][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:50:12,446][12883] Updated weights for policy 0, policy_version 7190 (0.0031) +[2024-06-17 22:50:16,781][12883] Updated weights for policy 0, policy_version 7200 (0.0045) +[2024-06-17 22:50:16,996][12645] Fps is (10 sec: 40959.7, 60 sec: 39866.2, 300 sec: 39765.6). Total num frames: 117964800. Throughput: 0: 39770.9. Samples: 118042920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-17 22:50:16,997][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:50:20,665][12883] Updated weights for policy 0, policy_version 7210 (0.0033) +[2024-06-17 22:50:21,994][12645] Fps is (10 sec: 39346.0, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 118161408. Throughput: 0: 39902.7. Samples: 118284040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-17 22:50:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:50:25,182][12883] Updated weights for policy 0, policy_version 7220 (0.0036) +[2024-06-17 22:50:26,876][12862] Signal inference workers to stop experience collection... (1700 times) +[2024-06-17 22:50:26,876][12862] Signal inference workers to resume experience collection... (1700 times) +[2024-06-17 22:50:26,920][12883] InferenceWorker_p0-w0: stopping experience collection (1700 times) +[2024-06-17 22:50:26,920][12883] InferenceWorker_p0-w0: resuming experience collection (1700 times) +[2024-06-17 22:50:26,994][12645] Fps is (10 sec: 39330.1, 60 sec: 39867.7, 300 sec: 39877.0). Total num frames: 118358016. Throughput: 0: 39720.7. Samples: 118519040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 22:50:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:50:28,961][12883] Updated weights for policy 0, policy_version 7230 (0.0055) +[2024-06-17 22:50:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39594.6, 300 sec: 39654.8). 
Total num frames: 118554624. Throughput: 0: 39710.2. Samples: 118634200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-17 22:50:31,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 22:50:33,159][12883] Updated weights for policy 0, policy_version 7240 (0.0045) +[2024-06-17 22:50:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.9, 300 sec: 39765.9). Total num frames: 118767616. Throughput: 0: 39776.9. Samples: 118877780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-17 22:50:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:50:37,070][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000007250_118784000.pth... +[2024-06-17 22:50:37,079][12883] Updated weights for policy 0, policy_version 7250 (0.0029) +[2024-06-17 22:50:37,116][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006665_109199360.pth +[2024-06-17 22:50:41,685][12883] Updated weights for policy 0, policy_version 7260 (0.0048) +[2024-06-17 22:50:41,996][12645] Fps is (10 sec: 40951.0, 60 sec: 39866.2, 300 sec: 39876.7). Total num frames: 118964224. Throughput: 0: 39825.7. Samples: 119112060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-17 22:50:41,997][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:50:45,783][12883] Updated weights for policy 0, policy_version 7270 (0.0035) +[2024-06-17 22:50:46,994][12645] Fps is (10 sec: 36045.3, 60 sec: 39321.6, 300 sec: 39599.3). Total num frames: 119128064. Throughput: 0: 39641.9. Samples: 119228900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-17 22:50:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:50:50,077][12883] Updated weights for policy 0, policy_version 7280 (0.0034) +[2024-06-17 22:50:51,994][12645] Fps is (10 sec: 39330.3, 60 sec: 39867.8, 300 sec: 39821.4). Total num frames: 119357440. Throughput: 0: 39689.0. Samples: 119467800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-17 22:50:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:50:54,049][12883] Updated weights for policy 0, policy_version 7290 (0.0035) +[2024-06-17 22:50:56,994][12645] Fps is (10 sec: 40959.3, 60 sec: 39321.6, 300 sec: 39710.4). Total num frames: 119537664. Throughput: 0: 39642.7. Samples: 119706160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-17 22:50:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:50:58,325][12883] Updated weights for policy 0, policy_version 7300 (0.0047) +[2024-06-17 22:51:01,994][12645] Fps is (10 sec: 39322.3, 60 sec: 39867.9, 300 sec: 39710.4). Total num frames: 119750656. Throughput: 0: 39485.2. Samples: 119819660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-17 22:51:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:51:02,155][12883] Updated weights for policy 0, policy_version 7310 (0.0035) +[2024-06-17 22:51:06,486][12883] Updated weights for policy 0, policy_version 7320 (0.0036) +[2024-06-17 22:51:06,994][12645] Fps is (10 sec: 40960.8, 60 sec: 39869.3, 300 sec: 39821.5). Total num frames: 119947264. Throughput: 0: 39605.4. Samples: 120066280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-17 22:51:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:51:10,603][12883] Updated weights for policy 0, policy_version 7330 (0.0036) +[2024-06-17 22:51:11,994][12645] Fps is (10 sec: 39320.9, 60 sec: 39598.7, 300 sec: 39654.8). Total num frames: 120143872. Throughput: 0: 39499.2. Samples: 120296500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) +[2024-06-17 22:51:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:51:14,679][12883] Updated weights for policy 0, policy_version 7340 (0.0035) +[2024-06-17 22:51:16,994][12645] Fps is (10 sec: 40959.3, 60 sec: 39869.2, 300 sec: 39766.2). Total num frames: 120356864. Throughput: 0: 39676.0. Samples: 120419620. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) +[2024-06-17 22:51:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:51:18,552][12883] Updated weights for policy 0, policy_version 7350 (0.0041) +[2024-06-17 22:51:21,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39048.5, 300 sec: 39599.3). Total num frames: 120504320. Throughput: 0: 39558.6. Samples: 120657920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) +[2024-06-17 22:51:21,996][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:51:23,238][12883] Updated weights for policy 0, policy_version 7360 (0.0044) +[2024-06-17 22:51:26,891][12883] Updated weights for policy 0, policy_version 7370 (0.0032) +[2024-06-17 22:51:26,994][12645] Fps is (10 sec: 39322.3, 60 sec: 39867.9, 300 sec: 39821.5). Total num frames: 120750080. Throughput: 0: 39656.3. Samples: 120896500. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) +[2024-06-17 22:51:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:51:31,503][12883] Updated weights for policy 0, policy_version 7380 (0.0039) +[2024-06-17 22:51:31,994][12645] Fps is (10 sec: 42599.3, 60 sec: 39594.8, 300 sec: 39765.9). Total num frames: 120930304. Throughput: 0: 39777.4. Samples: 121018880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 22:51:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:51:35,009][12883] Updated weights for policy 0, policy_version 7390 (0.0044) +[2024-06-17 22:51:36,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39321.6, 300 sec: 39654.8). Total num frames: 121126912. Throughput: 0: 39715.6. Samples: 121255000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-17 22:51:36,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:51:39,411][12883] Updated weights for policy 0, policy_version 7400 (0.0045) +[2024-06-17 22:51:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 39869.2, 300 sec: 39765.9). Total num frames: 121356288. Throughput: 0: 39553.4. Samples: 121486060. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-17 22:51:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:51:43,081][12883] Updated weights for policy 0, policy_version 7410 (0.0045) +[2024-06-17 22:51:46,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39867.8, 300 sec: 39710.4). Total num frames: 121520128. Throughput: 0: 39868.0. Samples: 121613720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 22:51:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:51:47,391][12883] Updated weights for policy 0, policy_version 7420 (0.0039) +[2024-06-17 22:51:51,850][12883] Updated weights for policy 0, policy_version 7430 (0.0031) +[2024-06-17 22:51:51,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39594.7, 300 sec: 39710.4). Total num frames: 121733120. Throughput: 0: 39690.2. Samples: 121852340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-17 22:51:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:51:55,810][12883] Updated weights for policy 0, policy_version 7440 (0.0035) +[2024-06-17 22:51:56,996][12645] Fps is (10 sec: 40950.9, 60 sec: 39866.4, 300 sec: 39821.1). Total num frames: 121929728. Throughput: 0: 39720.8. Samples: 122084020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) +[2024-06-17 22:51:56,997][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:51:59,874][12883] Updated weights for policy 0, policy_version 7450 (0.0039) +[2024-06-17 22:52:01,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39321.5, 300 sec: 39710.4). Total num frames: 122109952. Throughput: 0: 39762.4. Samples: 122208920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) +[2024-06-17 22:52:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:52:03,547][12862] Signal inference workers to stop experience collection... 
(1750 times) +[2024-06-17 22:52:03,574][12883] InferenceWorker_p0-w0: stopping experience collection (1750 times) +[2024-06-17 22:52:03,605][12862] Signal inference workers to resume experience collection... (1750 times) +[2024-06-17 22:52:03,613][12883] InferenceWorker_p0-w0: resuming experience collection (1750 times) +[2024-06-17 22:52:03,750][12883] Updated weights for policy 0, policy_version 7460 (0.0041) +[2024-06-17 22:52:06,994][12645] Fps is (10 sec: 39329.9, 60 sec: 39594.6, 300 sec: 39710.4). Total num frames: 122322944. Throughput: 0: 39752.5. Samples: 122446780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) +[2024-06-17 22:52:06,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:52:08,823][12883] Updated weights for policy 0, policy_version 7470 (0.0037) +[2024-06-17 22:52:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.8, 300 sec: 39821.5). Total num frames: 122535936. Throughput: 0: 39472.4. Samples: 122672760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 22:52:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:52:12,274][12883] Updated weights for policy 0, policy_version 7480 (0.0040) +[2024-06-17 22:52:16,707][12883] Updated weights for policy 0, policy_version 7490 (0.0033) +[2024-06-17 22:52:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39321.7, 300 sec: 39765.9). Total num frames: 122716160. Throughput: 0: 39543.1. Samples: 122798320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-17 22:52:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:52:20,994][12883] Updated weights for policy 0, policy_version 7500 (0.0038) +[2024-06-17 22:52:21,994][12645] Fps is (10 sec: 36044.9, 60 sec: 39867.8, 300 sec: 39710.4). Total num frames: 122896384. Throughput: 0: 39579.6. Samples: 123036080. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-17 22:52:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:52:24,862][12883] Updated weights for policy 0, policy_version 7510 (0.0040) +[2024-06-17 22:52:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39321.5, 300 sec: 39710.4). Total num frames: 123109376. Throughput: 0: 39721.3. Samples: 123273520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-17 22:52:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:52:29,416][12883] Updated weights for policy 0, policy_version 7520 (0.0048) +[2024-06-17 22:52:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39594.6, 300 sec: 39599.3). Total num frames: 123305984. Throughput: 0: 39473.7. Samples: 123390040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) +[2024-06-17 22:52:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:52:32,984][12883] Updated weights for policy 0, policy_version 7530 (0.0036) +[2024-06-17 22:52:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39594.6, 300 sec: 39821.4). Total num frames: 123502592. Throughput: 0: 39406.6. Samples: 123625640. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) +[2024-06-17 22:52:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:52:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000007538_123502592.pth... +[2024-06-17 22:52:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006956_113967104.pth +[2024-06-17 22:52:37,659][12883] Updated weights for policy 0, policy_version 7540 (0.0035) +[2024-06-17 22:52:41,767][12883] Updated weights for policy 0, policy_version 7550 (0.0044) +[2024-06-17 22:52:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39321.7, 300 sec: 39710.4). Total num frames: 123715584. Throughput: 0: 39574.4. Samples: 123864780. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) +[2024-06-17 22:52:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:52:45,933][12883] Updated weights for policy 0, policy_version 7560 (0.0038) +[2024-06-17 22:52:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39594.6, 300 sec: 39710.4). Total num frames: 123895808. Throughput: 0: 39393.7. Samples: 123981640. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) +[2024-06-17 22:52:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:52:49,723][12883] Updated weights for policy 0, policy_version 7570 (0.0035) +[2024-06-17 22:52:51,994][12645] Fps is (10 sec: 39320.6, 60 sec: 39594.6, 300 sec: 39877.0). Total num frames: 124108800. Throughput: 0: 39462.6. Samples: 124222600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-17 22:52:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:52:54,100][12883] Updated weights for policy 0, policy_version 7580 (0.0032) +[2024-06-17 22:52:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39323.0, 300 sec: 39654.8). Total num frames: 124289024. Throughput: 0: 39580.9. Samples: 124453900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-17 22:52:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:52:57,827][12883] Updated weights for policy 0, policy_version 7590 (0.0028) +[2024-06-17 22:53:01,994][12645] Fps is (10 sec: 37683.6, 60 sec: 39594.6, 300 sec: 39765.9). Total num frames: 124485632. Throughput: 0: 39499.9. Samples: 124575820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-17 22:53:01,999][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:53:02,395][12883] Updated weights for policy 0, policy_version 7600 (0.0036) +[2024-06-17 22:53:06,338][12883] Updated weights for policy 0, policy_version 7610 (0.0042) +[2024-06-17 22:53:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39321.6, 300 sec: 39654.9). Total num frames: 124682240. Throughput: 0: 39436.0. Samples: 124810700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-17 22:53:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:53:10,351][12883] Updated weights for policy 0, policy_version 7620 (0.0038) +[2024-06-17 22:53:11,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39048.6, 300 sec: 39654.8). Total num frames: 124878848. Throughput: 0: 39531.7. Samples: 125052440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-17 22:53:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:53:14,511][12883] Updated weights for policy 0, policy_version 7630 (0.0033) +[2024-06-17 22:53:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39594.7, 300 sec: 39821.5). Total num frames: 125091840. Throughput: 0: 39535.1. Samples: 125169120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-17 22:53:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:53:18,303][12883] Updated weights for policy 0, policy_version 7640 (0.0026) +[2024-06-17 22:53:21,994][12645] Fps is (10 sec: 42597.4, 60 sec: 40140.7, 300 sec: 39710.3). Total num frames: 125304832. Throughput: 0: 39701.7. Samples: 125412220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-17 22:53:21,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:53:23,527][12883] Updated weights for policy 0, policy_version 7650 (0.0033) +[2024-06-17 22:53:26,523][12883] Updated weights for policy 0, policy_version 7660 (0.0032) +[2024-06-17 22:53:26,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40140.8, 300 sec: 39877.0). Total num frames: 125517824. Throughput: 0: 39626.9. Samples: 125648000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 22:53:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:53:31,595][12883] Updated weights for policy 0, policy_version 7670 (0.0039) +[2024-06-17 22:53:31,994][12645] Fps is (10 sec: 36045.4, 60 sec: 39321.6, 300 sec: 39654.8). Total num frames: 125665280. Throughput: 0: 39813.4. Samples: 125773240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-17 22:53:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:53:34,456][12862] Signal inference workers to stop experience collection... (1800 times) +[2024-06-17 22:53:34,457][12862] Signal inference workers to resume experience collection... (1800 times) +[2024-06-17 22:53:34,497][12883] InferenceWorker_p0-w0: stopping experience collection (1800 times) +[2024-06-17 22:53:34,497][12883] InferenceWorker_p0-w0: resuming experience collection (1800 times) +[2024-06-17 22:53:34,610][12883] Updated weights for policy 0, policy_version 7680 (0.0039) +[2024-06-17 22:53:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39867.8, 300 sec: 39710.4). Total num frames: 125894656. Throughput: 0: 39648.5. Samples: 126006780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) +[2024-06-17 22:53:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:53:39,848][12883] Updated weights for policy 0, policy_version 7690 (0.0033) +[2024-06-17 22:53:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 39594.6, 300 sec: 39877.0). Total num frames: 126091264. Throughput: 0: 39800.8. Samples: 126244940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) +[2024-06-17 22:53:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:53:43,275][12883] Updated weights for policy 0, policy_version 7700 (0.0042) +[2024-06-17 22:53:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 126287872. Throughput: 0: 39635.5. Samples: 126359420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-17 22:53:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:53:48,186][12883] Updated weights for policy 0, policy_version 7710 (0.0035) +[2024-06-17 22:53:51,330][12883] Updated weights for policy 0, policy_version 7720 (0.0036) +[2024-06-17 22:53:51,996][12645] Fps is (10 sec: 40951.0, 60 sec: 39866.3, 300 sec: 39876.7). Total num frames: 126500864. Throughput: 0: 39787.3. 
Samples: 126601220. Policy #0 lag: (min: 1.0, avg: 9.4, max: 23.0) +[2024-06-17 22:53:51,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:53:56,609][12883] Updated weights for policy 0, policy_version 7730 (0.0045) +[2024-06-17 22:53:56,994][12645] Fps is (10 sec: 36045.1, 60 sec: 39321.6, 300 sec: 39599.3). Total num frames: 126648320. Throughput: 0: 39756.0. Samples: 126841460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-17 22:53:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:53:59,329][12883] Updated weights for policy 0, policy_version 7740 (0.0039) +[2024-06-17 22:54:01,994][12645] Fps is (10 sec: 37691.8, 60 sec: 39867.8, 300 sec: 39655.4). Total num frames: 126877696. Throughput: 0: 39617.3. Samples: 126951900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-17 22:54:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:54:04,826][12883] Updated weights for policy 0, policy_version 7750 (0.0037) +[2024-06-17 22:54:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 127074304. Throughput: 0: 39674.4. Samples: 127197560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 22:54:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:54:07,680][12883] Updated weights for policy 0, policy_version 7760 (0.0030) +[2024-06-17 22:54:11,994][12645] Fps is (10 sec: 37682.7, 60 sec: 39594.5, 300 sec: 39599.3). Total num frames: 127254528. Throughput: 0: 39660.0. Samples: 127432700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-17 22:54:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:54:13,003][12883] Updated weights for policy 0, policy_version 7770 (0.0039) +[2024-06-17 22:54:16,125][12883] Updated weights for policy 0, policy_version 7780 (0.0031) +[2024-06-17 22:54:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 127483904. Throughput: 0: 39512.4. Samples: 127551300. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) +[2024-06-17 22:54:16,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:54:21,684][12883] Updated weights for policy 0, policy_version 7790 (0.0034) +[2024-06-17 22:54:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39321.7, 300 sec: 39654.8). Total num frames: 127664128. Throughput: 0: 39580.4. Samples: 127787900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) +[2024-06-17 22:54:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:54:24,334][12883] Updated weights for policy 0, policy_version 7800 (0.0031) +[2024-06-17 22:54:26,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39048.6, 300 sec: 39599.3). Total num frames: 127860736. Throughput: 0: 39512.5. Samples: 128023000. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) +[2024-06-17 22:54:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:54:29,511][12883] Updated weights for policy 0, policy_version 7810 (0.0032) +[2024-06-17 22:54:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40413.8, 300 sec: 39821.5). Total num frames: 128090112. Throughput: 0: 39763.5. Samples: 128148780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-17 22:54:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:54:33,096][12883] Updated weights for policy 0, policy_version 7820 (0.0038) +[2024-06-17 22:54:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39321.6, 300 sec: 39599.3). Total num frames: 128253952. Throughput: 0: 39644.1. Samples: 128385120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) +[2024-06-17 22:54:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:54:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000007828_128253952.pth... 
+[2024-06-17 22:54:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000007250_118784000.pth +[2024-06-17 22:54:37,912][12883] Updated weights for policy 0, policy_version 7830 (0.0039) +[2024-06-17 22:54:41,350][12883] Updated weights for policy 0, policy_version 7840 (0.0041) +[2024-06-17 22:54:41,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 128483328. Throughput: 0: 39446.6. Samples: 128616560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) +[2024-06-17 22:54:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:54:46,255][12883] Updated weights for policy 0, policy_version 7850 (0.0034) +[2024-06-17 22:54:46,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39321.7, 300 sec: 39599.3). Total num frames: 128647168. Throughput: 0: 39651.6. Samples: 128736220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-17 22:54:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:54:49,250][12883] Updated weights for policy 0, policy_version 7860 (0.0040) +[2024-06-17 22:54:51,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39323.1, 300 sec: 39599.3). Total num frames: 128860160. Throughput: 0: 39555.2. Samples: 128977540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-17 22:54:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:54:54,699][12883] Updated weights for policy 0, policy_version 7870 (0.0043) +[2024-06-17 22:54:56,994][12645] Fps is (10 sec: 42597.2, 60 sec: 40413.7, 300 sec: 39710.4). Total num frames: 129073152. Throughput: 0: 39592.8. Samples: 129214380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-17 22:54:56,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:54:57,511][12883] Updated weights for policy 0, policy_version 7880 (0.0035) +[2024-06-17 22:55:01,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39321.5, 300 sec: 39599.6). Total num frames: 129236992. Throughput: 0: 39584.9. Samples: 129332620. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 22:55:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:55:03,060][12883] Updated weights for policy 0, policy_version 7890 (0.0045) +[2024-06-17 22:55:03,446][12862] Signal inference workers to stop experience collection... (1850 times) +[2024-06-17 22:55:03,446][12862] Signal inference workers to resume experience collection... (1850 times) +[2024-06-17 22:55:03,474][12883] InferenceWorker_p0-w0: stopping experience collection (1850 times) +[2024-06-17 22:55:03,475][12883] InferenceWorker_p0-w0: resuming experience collection (1850 times) +[2024-06-17 22:55:05,769][12883] Updated weights for policy 0, policy_version 7900 (0.0041) +[2024-06-17 22:55:06,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39867.7, 300 sec: 39655.7). Total num frames: 129466368. Throughput: 0: 39649.8. Samples: 129572140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 22:55:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:55:10,997][12883] Updated weights for policy 0, policy_version 7910 (0.0031) +[2024-06-17 22:55:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.8, 300 sec: 39599.6). Total num frames: 129646592. Throughput: 0: 39849.3. Samples: 129816220. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-17 22:55:11,996][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:55:13,911][12883] Updated weights for policy 0, policy_version 7920 (0.0029) +[2024-06-17 22:55:16,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39594.7, 300 sec: 39654.8). Total num frames: 129859584. Throughput: 0: 39676.1. Samples: 129934200. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-17 22:55:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:55:18,994][12883] Updated weights for policy 0, policy_version 7930 (0.0038) +[2024-06-17 22:55:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39867.8, 300 sec: 39654.9). Total num frames: 130056192. Throughput: 0: 39539.2. 
Samples: 130164380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 22:55:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:55:22,303][12883] Updated weights for policy 0, policy_version 7940 (0.0032) +[2024-06-17 22:55:26,930][12883] Updated weights for policy 0, policy_version 7950 (0.0034) +[2024-06-17 22:55:26,996][12645] Fps is (10 sec: 39312.5, 60 sec: 39866.2, 300 sec: 39654.5). Total num frames: 130252800. Throughput: 0: 39884.3. Samples: 130411440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-17 22:55:26,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:55:30,406][12883] Updated weights for policy 0, policy_version 7960 (0.0043) +[2024-06-17 22:55:31,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39321.6, 300 sec: 39599.3). Total num frames: 130449408. Throughput: 0: 40025.7. Samples: 130537380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-17 22:55:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:55:34,770][12883] Updated weights for policy 0, policy_version 7970 (0.0031) +[2024-06-17 22:55:36,994][12645] Fps is (10 sec: 39330.0, 60 sec: 39867.7, 300 sec: 39599.6). Total num frames: 130646016. Throughput: 0: 39956.3. Samples: 130775580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-17 22:55:36,995][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 22:55:38,975][12883] Updated weights for policy 0, policy_version 7980 (0.0038) +[2024-06-17 22:55:41,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39321.5, 300 sec: 39710.3). Total num frames: 130842624. Throughput: 0: 40000.0. Samples: 131014380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-17 22:55:41,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:55:43,068][12883] Updated weights for policy 0, policy_version 7990 (0.0037) +[2024-06-17 22:55:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40140.7, 300 sec: 39654.8). Total num frames: 131055616. Throughput: 0: 40111.9. 
Samples: 131137660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) +[2024-06-17 22:55:46,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:55:47,175][12883] Updated weights for policy 0, policy_version 8000 (0.0034) +[2024-06-17 22:55:51,169][12883] Updated weights for policy 0, policy_version 8010 (0.0039) +[2024-06-17 22:55:51,994][12645] Fps is (10 sec: 40960.9, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 131252224. Throughput: 0: 40143.2. Samples: 131378580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) +[2024-06-17 22:55:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:55:55,113][12883] Updated weights for policy 0, policy_version 8020 (0.0035) +[2024-06-17 22:55:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40140.9, 300 sec: 39765.9). Total num frames: 131481600. Throughput: 0: 40044.9. Samples: 131618240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 22:55:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:55:59,588][12883] Updated weights for policy 0, policy_version 8030 (0.0034) +[2024-06-17 22:56:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40140.9, 300 sec: 39654.8). Total num frames: 131645440. Throughput: 0: 40237.8. Samples: 131744900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 22:56:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:56:03,283][12883] Updated weights for policy 0, policy_version 8040 (0.0034) +[2024-06-17 22:56:06,994][12645] Fps is (10 sec: 37683.6, 60 sec: 39867.8, 300 sec: 39710.4). Total num frames: 131858432. Throughput: 0: 40404.4. Samples: 131982580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 22:56:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:56:07,432][12883] Updated weights for policy 0, policy_version 8050 (0.0028) +[2024-06-17 22:56:11,869][12883] Updated weights for policy 0, policy_version 8060 (0.0031) +[2024-06-17 22:56:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40140.9, 300 sec: 39654.9). Total num frames: 132055040. Throughput: 0: 40256.3. Samples: 132222880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 22:56:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:56:15,524][12883] Updated weights for policy 0, policy_version 8070 (0.0028) +[2024-06-17 22:56:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.8, 300 sec: 39877.0). Total num frames: 132268032. Throughput: 0: 40017.5. Samples: 132338160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-17 22:56:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:56:19,890][12883] Updated weights for policy 0, policy_version 8080 (0.0032) +[2024-06-17 22:56:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39867.7, 300 sec: 39654.8). Total num frames: 132448256. Throughput: 0: 40074.7. Samples: 132578940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-17 22:56:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:56:23,410][12883] Updated weights for policy 0, policy_version 8090 (0.0031) +[2024-06-17 22:56:26,996][12645] Fps is (10 sec: 37675.2, 60 sec: 39867.9, 300 sec: 39710.1). Total num frames: 132644864. Throughput: 0: 40092.6. Samples: 132818620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-17 22:56:26,996][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:56:28,231][12883] Updated weights for policy 0, policy_version 8100 (0.0033) +[2024-06-17 22:56:31,782][12883] Updated weights for policy 0, policy_version 8110 (0.0033) +[2024-06-17 22:56:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40413.9, 300 sec: 39821.5). Total num frames: 132874240. Throughput: 0: 39904.1. Samples: 132933340. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-17 22:56:31,998][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:56:36,158][12883] Updated weights for policy 0, policy_version 8120 (0.0050) +[2024-06-17 22:56:36,996][12645] Fps is (10 sec: 40959.1, 60 sec: 40139.4, 300 sec: 39654.5). Total num frames: 133054464. Throughput: 0: 40030.0. Samples: 133180020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) +[2024-06-17 22:56:36,996][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:56:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008122_133070848.pth... +[2024-06-17 22:56:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000007538_123502592.pth +[2024-06-17 22:56:40,422][12883] Updated weights for policy 0, policy_version 8130 (0.0033) +[2024-06-17 22:56:41,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40140.9, 300 sec: 39765.9). Total num frames: 133251072. Throughput: 0: 39974.7. Samples: 133417100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) +[2024-06-17 22:56:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:56:44,634][12883] Updated weights for policy 0, policy_version 8140 (0.0038) +[2024-06-17 22:56:46,318][12862] Signal inference workers to stop experience collection... (1900 times) +[2024-06-17 22:56:46,319][12862] Signal inference workers to resume experience collection... 
(1900 times) +[2024-06-17 22:56:46,335][12883] InferenceWorker_p0-w0: stopping experience collection (1900 times) +[2024-06-17 22:56:46,335][12883] InferenceWorker_p0-w0: resuming experience collection (1900 times) +[2024-06-17 22:56:46,994][12645] Fps is (10 sec: 39329.7, 60 sec: 39867.7, 300 sec: 39710.3). Total num frames: 133447680. Throughput: 0: 39765.6. Samples: 133534360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-17 22:56:46,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:56:48,518][12883] Updated weights for policy 0, policy_version 8150 (0.0029) +[2024-06-17 22:56:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.8, 300 sec: 39766.2). Total num frames: 133660672. Throughput: 0: 39796.5. Samples: 133773420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) +[2024-06-17 22:56:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:56:52,721][12883] Updated weights for policy 0, policy_version 8160 (0.0040) +[2024-06-17 22:56:56,524][12883] Updated weights for policy 0, policy_version 8170 (0.0037) +[2024-06-17 22:56:56,994][12645] Fps is (10 sec: 40961.0, 60 sec: 39594.8, 300 sec: 39821.5). Total num frames: 133857280. Throughput: 0: 39755.2. Samples: 134011860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) +[2024-06-17 22:56:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:57:01,080][12883] Updated weights for policy 0, policy_version 8180 (0.0054) +[2024-06-17 22:57:01,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39867.6, 300 sec: 39710.4). Total num frames: 134037504. Throughput: 0: 39865.6. Samples: 134132120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-17 22:57:01,996][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 22:57:05,144][12883] Updated weights for policy 0, policy_version 8190 (0.0038) +[2024-06-17 22:57:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40140.7, 300 sec: 39765.9). Total num frames: 134266880. Throughput: 0: 39848.0. Samples: 134372100. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-17 22:57:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:57:09,517][12883] Updated weights for policy 0, policy_version 8200 (0.0039) +[2024-06-17 22:57:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39594.7, 300 sec: 39710.4). Total num frames: 134430720. Throughput: 0: 39740.5. Samples: 134606860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 22:57:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:57:13,183][12883] Updated weights for policy 0, policy_version 8210 (0.0042) +[2024-06-17 22:57:16,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39594.5, 300 sec: 39821.4). Total num frames: 134643712. Throughput: 0: 39837.3. Samples: 134726020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 22:57:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:57:17,695][12883] Updated weights for policy 0, policy_version 8220 (0.0046) +[2024-06-17 22:57:21,611][12883] Updated weights for policy 0, policy_version 8230 (0.0037) +[2024-06-17 22:57:21,996][12645] Fps is (10 sec: 42588.7, 60 sec: 40139.3, 300 sec: 39821.1). Total num frames: 134856704. Throughput: 0: 39696.8. Samples: 134966380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-17 22:57:21,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:57:25,631][12883] Updated weights for policy 0, policy_version 8240 (0.0034) +[2024-06-17 22:57:26,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39869.0, 300 sec: 39765.9). Total num frames: 135036928. Throughput: 0: 39716.4. Samples: 135204340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-17 22:57:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:57:29,659][12883] Updated weights for policy 0, policy_version 8250 (0.0037) +[2024-06-17 22:57:31,996][12645] Fps is (10 sec: 39321.8, 60 sec: 39593.2, 300 sec: 39821.2). Total num frames: 135249920. Throughput: 0: 39796.4. Samples: 135325280. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-17 22:57:32,005][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:57:34,165][12883] Updated weights for policy 0, policy_version 8260 (0.0036) +[2024-06-17 22:57:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39869.2, 300 sec: 39765.9). Total num frames: 135446528. Throughput: 0: 39988.0. Samples: 135572880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-17 22:57:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:57:37,510][12883] Updated weights for policy 0, policy_version 8270 (0.0035) +[2024-06-17 22:57:41,994][12645] Fps is (10 sec: 37691.6, 60 sec: 39594.7, 300 sec: 39765.9). Total num frames: 135626752. Throughput: 0: 39941.7. Samples: 135809240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-17 22:57:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:57:42,297][12883] Updated weights for policy 0, policy_version 8280 (0.0047) +[2024-06-17 22:57:45,866][12883] Updated weights for policy 0, policy_version 8290 (0.0038) +[2024-06-17 22:57:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40140.9, 300 sec: 39821.5). Total num frames: 135856128. Throughput: 0: 39912.9. Samples: 135928200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-17 22:57:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:57:50,218][12883] Updated weights for policy 0, policy_version 8300 (0.0028) +[2024-06-17 22:57:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39321.6, 300 sec: 39765.9). Total num frames: 136019968. Throughput: 0: 39768.6. Samples: 136161680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-17 22:57:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:57:54,007][12883] Updated weights for policy 0, policy_version 8310 (0.0038) +[2024-06-17 22:57:56,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39594.6, 300 sec: 39821.5). Total num frames: 136232960. Throughput: 0: 39826.7. Samples: 136399060. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) +[2024-06-17 22:57:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:57:58,635][12883] Updated weights for policy 0, policy_version 8320 (0.0037) +[2024-06-17 22:58:01,994][12645] Fps is (10 sec: 44236.1, 60 sec: 40413.9, 300 sec: 39932.5). Total num frames: 136462336. Throughput: 0: 39951.1. Samples: 136523820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-17 22:58:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:58:02,279][12883] Updated weights for policy 0, policy_version 8330 (0.0042) +[2024-06-17 22:58:06,556][12883] Updated weights for policy 0, policy_version 8340 (0.0042) +[2024-06-17 22:58:06,994][12645] Fps is (10 sec: 40959.1, 60 sec: 39594.6, 300 sec: 39876.9). Total num frames: 136642560. Throughput: 0: 39870.2. Samples: 136760460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-17 22:58:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:58:10,868][12883] Updated weights for policy 0, policy_version 8350 (0.0037) +[2024-06-17 22:58:11,994][12645] Fps is (10 sec: 37683.0, 60 sec: 40140.7, 300 sec: 39821.4). Total num frames: 136839168. Throughput: 0: 40040.4. Samples: 137006160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-17 22:58:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:58:14,449][12883] Updated weights for policy 0, policy_version 8360 (0.0033) +[2024-06-17 22:58:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.8, 300 sec: 39821.4). Total num frames: 137052160. Throughput: 0: 39832.5. Samples: 137117660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-17 22:58:16,995][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:58:18,848][12883] Updated weights for policy 0, policy_version 8370 (0.0042) +[2024-06-17 22:58:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39869.2, 300 sec: 39765.9). Total num frames: 137248768. Throughput: 0: 39784.4. Samples: 137363180. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-17 22:58:21,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:58:23,287][12883] Updated weights for policy 0, policy_version 8380 (0.0038) +[2024-06-17 22:58:27,000][12645] Fps is (10 sec: 39297.8, 60 sec: 40136.7, 300 sec: 39931.7). Total num frames: 137445376. Throughput: 0: 39727.4. Samples: 137597220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-17 22:58:27,000][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:58:27,138][12883] Updated weights for policy 0, policy_version 8390 (0.0054) +[2024-06-17 22:58:31,314][12883] Updated weights for policy 0, policy_version 8400 (0.0034) +[2024-06-17 22:58:31,994][12645] Fps is (10 sec: 39321.0, 60 sec: 39869.1, 300 sec: 39821.4). Total num frames: 137641984. Throughput: 0: 39988.3. Samples: 137727680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-17 22:58:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:58:35,353][12883] Updated weights for policy 0, policy_version 8410 (0.0040) +[2024-06-17 22:58:36,994][12645] Fps is (10 sec: 39346.3, 60 sec: 39867.7, 300 sec: 39821.5). Total num frames: 137838592. Throughput: 0: 39856.0. Samples: 137955200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-17 22:58:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:58:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008413_137838592.pth... +[2024-06-17 22:58:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000007828_128253952.pth +[2024-06-17 22:58:37,652][12862] Signal inference workers to stop experience collection... (1950 times) +[2024-06-17 22:58:37,652][12862] Signal inference workers to resume experience collection... 
(1950 times) +[2024-06-17 22:58:37,699][12883] InferenceWorker_p0-w0: stopping experience collection (1950 times) +[2024-06-17 22:58:37,699][12883] InferenceWorker_p0-w0: resuming experience collection (1950 times) +[2024-06-17 22:58:39,583][12883] Updated weights for policy 0, policy_version 8420 (0.0030) +[2024-06-17 22:58:41,993][12645] Fps is (10 sec: 39322.8, 60 sec: 40140.9, 300 sec: 39821.5). Total num frames: 138035200. Throughput: 0: 39903.2. Samples: 138194700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-17 22:58:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:58:43,886][12883] Updated weights for policy 0, policy_version 8430 (0.0033) +[2024-06-17 22:58:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39594.7, 300 sec: 39766.2). Total num frames: 138231808. Throughput: 0: 39865.4. Samples: 138317760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) +[2024-06-17 22:58:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:58:47,786][12883] Updated weights for policy 0, policy_version 8440 (0.0047) +[2024-06-17 22:58:51,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 138428416. Throughput: 0: 39845.1. Samples: 138553480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) +[2024-06-17 22:58:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 22:58:52,067][12883] Updated weights for policy 0, policy_version 8450 (0.0031) +[2024-06-17 22:58:55,859][12883] Updated weights for policy 0, policy_version 8460 (0.0038) +[2024-06-17 22:58:56,996][12645] Fps is (10 sec: 42588.7, 60 sec: 40412.3, 300 sec: 39932.2). Total num frames: 138657792. Throughput: 0: 39713.7. Samples: 138793360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 22:58:56,997][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:59:00,207][12883] Updated weights for policy 0, policy_version 8470 (0.0046) +[2024-06-17 22:59:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39321.7, 300 sec: 39821.5). Total num frames: 138821632. Throughput: 0: 39932.6. Samples: 138914620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) +[2024-06-17 22:59:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:59:04,075][12883] Updated weights for policy 0, policy_version 8480 (0.0043) +[2024-06-17 22:59:06,994][12645] Fps is (10 sec: 36053.0, 60 sec: 39594.8, 300 sec: 39877.0). Total num frames: 139018240. Throughput: 0: 39652.0. Samples: 139147520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 22:59:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:59:08,733][12883] Updated weights for policy 0, policy_version 8490 (0.0040) +[2024-06-17 22:59:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39867.9, 300 sec: 39821.5). Total num frames: 139231232. Throughput: 0: 39900.8. Samples: 139392500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 22:59:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:59:12,142][12883] Updated weights for policy 0, policy_version 8500 (0.0030) +[2024-06-17 22:59:16,752][12883] Updated weights for policy 0, policy_version 8510 (0.0040) +[2024-06-17 22:59:17,000][12645] Fps is (10 sec: 40934.4, 60 sec: 39590.7, 300 sec: 39876.2). Total num frames: 139427840. Throughput: 0: 39633.7. Samples: 139511440. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-17 22:59:17,000][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 22:59:20,469][12883] Updated weights for policy 0, policy_version 8520 (0.0049) +[2024-06-17 22:59:21,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39594.7, 300 sec: 39877.0). Total num frames: 139624448. Throughput: 0: 39678.7. Samples: 139740740. 
Policy #0 lag: (min: 1.0, avg: 12.7, max: 24.0) +[2024-06-17 22:59:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:59:24,954][12883] Updated weights for policy 0, policy_version 8530 (0.0049) +[2024-06-17 22:59:26,994][12645] Fps is (10 sec: 37706.9, 60 sec: 39325.7, 300 sec: 39710.4). Total num frames: 139804672. Throughput: 0: 39804.8. Samples: 139985920. Policy #0 lag: (min: 1.0, avg: 12.7, max: 24.0) +[2024-06-17 22:59:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:59:28,951][12883] Updated weights for policy 0, policy_version 8540 (0.0043) +[2024-06-17 22:59:31,994][12645] Fps is (10 sec: 39321.0, 60 sec: 39594.7, 300 sec: 39877.0). Total num frames: 140017664. Throughput: 0: 39696.3. Samples: 140104100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) +[2024-06-17 22:59:31,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 22:59:33,223][12883] Updated weights for policy 0, policy_version 8550 (0.0036) +[2024-06-17 22:59:36,828][12883] Updated weights for policy 0, policy_version 8560 (0.0047) +[2024-06-17 22:59:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 40140.8, 300 sec: 39877.0). Total num frames: 140247040. Throughput: 0: 39918.3. Samples: 140349800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-17 22:59:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:59:41,400][12883] Updated weights for policy 0, policy_version 8570 (0.0051) +[2024-06-17 22:59:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39594.6, 300 sec: 39877.0). Total num frames: 140410880. Throughput: 0: 39751.4. Samples: 140582080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-17 22:59:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 22:59:45,034][12883] Updated weights for policy 0, policy_version 8580 (0.0035) +[2024-06-17 22:59:46,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39594.6, 300 sec: 39821.4). Total num frames: 140607488. Throughput: 0: 39653.3. Samples: 140699020. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 22:59:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 22:59:49,749][12883] Updated weights for policy 0, policy_version 8590 (0.0039) +[2024-06-17 22:59:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39867.7, 300 sec: 39821.5). Total num frames: 140820480. Throughput: 0: 39823.6. Samples: 140939580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-17 22:59:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 22:59:53,492][12883] Updated weights for policy 0, policy_version 8600 (0.0046) +[2024-06-17 22:59:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39323.1, 300 sec: 39932.5). Total num frames: 141017088. Throughput: 0: 39798.5. Samples: 141183440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-17 22:59:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 22:59:58,062][12883] Updated weights for policy 0, policy_version 8610 (0.0041) +[2024-06-17 22:59:58,397][12862] Signal inference workers to stop experience collection... (2000 times) +[2024-06-17 22:59:58,398][12862] Signal inference workers to resume experience collection... (2000 times) +[2024-06-17 22:59:58,419][12883] InferenceWorker_p0-w0: stopping experience collection (2000 times) +[2024-06-17 22:59:58,419][12883] InferenceWorker_p0-w0: resuming experience collection (2000 times) +[2024-06-17 23:00:01,826][12883] Updated weights for policy 0, policy_version 8620 (0.0046) +[2024-06-17 23:00:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40140.7, 300 sec: 39877.0). Total num frames: 141230080. Throughput: 0: 39781.5. Samples: 141301360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-17 23:00:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:00:06,274][12883] Updated weights for policy 0, policy_version 8630 (0.0039) +[2024-06-17 23:00:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39867.6, 300 sec: 39877.0). Total num frames: 141410304. Throughput: 0: 39966.5. 
Samples: 141539240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) +[2024-06-17 23:00:06,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:00:10,381][12883] Updated weights for policy 0, policy_version 8640 (0.0038) +[2024-06-17 23:00:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39867.6, 300 sec: 39877.0). Total num frames: 141623296. Throughput: 0: 39781.2. Samples: 141776080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-17 23:00:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:00:14,303][12883] Updated weights for policy 0, policy_version 8650 (0.0045) +[2024-06-17 23:00:16,996][12645] Fps is (10 sec: 40951.3, 60 sec: 39870.4, 300 sec: 39876.7). Total num frames: 141819904. Throughput: 0: 39667.0. Samples: 141889200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-17 23:00:16,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:00:18,357][12883] Updated weights for policy 0, policy_version 8660 (0.0030) +[2024-06-17 23:00:21,994][12645] Fps is (10 sec: 37684.0, 60 sec: 39594.7, 300 sec: 39821.8). Total num frames: 142000128. Throughput: 0: 39714.7. Samples: 142136960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 23:00:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:00:22,567][12883] Updated weights for policy 0, policy_version 8670 (0.0042) +[2024-06-17 23:00:26,994][12645] Fps is (10 sec: 37691.9, 60 sec: 39867.8, 300 sec: 39821.5). Total num frames: 142196736. Throughput: 0: 39760.0. Samples: 142371280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-17 23:00:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:00:27,027][12883] Updated weights for policy 0, policy_version 8680 (0.0042) +[2024-06-17 23:00:31,065][12883] Updated weights for policy 0, policy_version 8690 (0.0049) +[2024-06-17 23:00:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 142426112. Throughput: 0: 39959.5. 
Samples: 142497200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-17 23:00:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:00:35,139][12883] Updated weights for policy 0, policy_version 8700 (0.0038) +[2024-06-17 23:00:36,995][12645] Fps is (10 sec: 40952.4, 60 sec: 39320.4, 300 sec: 39876.8). Total num frames: 142606336. Throughput: 0: 39785.5. Samples: 142730000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-17 23:00:36,996][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:00:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008704_142606336.pth... +[2024-06-17 23:00:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008122_133070848.pth +[2024-06-17 23:00:39,273][12883] Updated weights for policy 0, policy_version 8710 (0.0043) +[2024-06-17 23:00:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40413.8, 300 sec: 39932.5). Total num frames: 142835712. Throughput: 0: 39548.0. Samples: 142963100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 19.0) +[2024-06-17 23:00:41,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:00:43,095][12883] Updated weights for policy 0, policy_version 8720 (0.0039) +[2024-06-17 23:00:46,994][12645] Fps is (10 sec: 39328.3, 60 sec: 39867.7, 300 sec: 39821.4). Total num frames: 142999552. Throughput: 0: 39691.5. Samples: 143087480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 19.0) +[2024-06-17 23:00:46,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:00:47,443][12883] Updated weights for policy 0, policy_version 8730 (0.0039) +[2024-06-17 23:00:51,277][12883] Updated weights for policy 0, policy_version 8740 (0.0035) +[2024-06-17 23:00:51,996][12645] Fps is (10 sec: 37675.1, 60 sec: 39866.3, 300 sec: 39765.6). Total num frames: 143212544. Throughput: 0: 39691.5. Samples: 143325440. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) +[2024-06-17 23:00:51,996][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:00:56,001][12883] Updated weights for policy 0, policy_version 8750 (0.0043) +[2024-06-17 23:00:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39867.8, 300 sec: 39877.0). Total num frames: 143409152. Throughput: 0: 39603.7. Samples: 143558240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-17 23:00:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:00:59,377][12883] Updated weights for policy 0, policy_version 8760 (0.0033) +[2024-06-17 23:01:01,994][12645] Fps is (10 sec: 36052.5, 60 sec: 39048.5, 300 sec: 39710.4). Total num frames: 143572992. Throughput: 0: 39743.7. Samples: 143677580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-17 23:01:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:01:04,113][12883] Updated weights for policy 0, policy_version 8770 (0.0044) +[2024-06-17 23:01:06,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39867.9, 300 sec: 39821.5). Total num frames: 143802368. Throughput: 0: 39556.9. Samples: 143917020. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-17 23:01:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:01:07,690][12883] Updated weights for policy 0, policy_version 8780 (0.0034) +[2024-06-17 23:01:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39321.7, 300 sec: 39710.4). Total num frames: 143982592. Throughput: 0: 39765.8. Samples: 144160740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-17 23:01:11,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:01:12,216][12883] Updated weights for policy 0, policy_version 8790 (0.0039) +[2024-06-17 23:01:15,921][12883] Updated weights for policy 0, policy_version 8800 (0.0041) +[2024-06-17 23:01:16,994][12645] Fps is (10 sec: 39320.7, 60 sec: 39596.1, 300 sec: 39821.4). Total num frames: 144195584. Throughput: 0: 39492.4. Samples: 144274360. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-17 23:01:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:01:20,167][12883] Updated weights for policy 0, policy_version 8810 (0.0041) +[2024-06-17 23:01:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39867.7, 300 sec: 39821.7). Total num frames: 144392192. Throughput: 0: 39740.3. Samples: 144518240. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-17 23:01:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:01:24,306][12883] Updated weights for policy 0, policy_version 8820 (0.0033) +[2024-06-17 23:01:26,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 144588800. Throughput: 0: 39899.2. Samples: 144758560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 23:01:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:01:28,375][12883] Updated weights for policy 0, policy_version 8830 (0.0048) +[2024-06-17 23:01:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39594.6, 300 sec: 39821.7). Total num frames: 144801792. Throughput: 0: 39757.3. Samples: 144876560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 23:01:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:01:32,471][12883] Updated weights for policy 0, policy_version 8840 (0.0044) +[2024-06-17 23:01:36,434][12883] Updated weights for policy 0, policy_version 8850 (0.0036) +[2024-06-17 23:01:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 40141.9, 300 sec: 39877.0). Total num frames: 145014784. Throughput: 0: 39831.2. Samples: 145117760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-17 23:01:37,003][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:01:40,619][12883] Updated weights for policy 0, policy_version 8860 (0.0034) +[2024-06-17 23:01:41,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39048.5, 300 sec: 39765.9). Total num frames: 145178624. Throughput: 0: 39839.0. Samples: 145351000. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-17 23:01:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:01:43,463][12862] Signal inference workers to stop experience collection... (2050 times) +[2024-06-17 23:01:43,467][12862] Signal inference workers to resume experience collection... (2050 times) +[2024-06-17 23:01:43,492][12883] InferenceWorker_p0-w0: stopping experience collection (2050 times) +[2024-06-17 23:01:43,493][12883] InferenceWorker_p0-w0: resuming experience collection (2050 times) +[2024-06-17 23:01:44,707][12883] Updated weights for policy 0, policy_version 8870 (0.0030) +[2024-06-17 23:01:46,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.8, 300 sec: 39765.9). Total num frames: 145391616. Throughput: 0: 39832.9. Samples: 145470060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-17 23:01:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:01:49,102][12883] Updated weights for policy 0, policy_version 8880 (0.0033) +[2024-06-17 23:01:51,994][12645] Fps is (10 sec: 42597.0, 60 sec: 39868.9, 300 sec: 39821.4). Total num frames: 145604608. Throughput: 0: 39976.3. Samples: 145715980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-17 23:01:51,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:01:52,923][12883] Updated weights for policy 0, policy_version 8890 (0.0042) +[2024-06-17 23:01:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39594.6, 300 sec: 39821.4). Total num frames: 145784832. Throughput: 0: 39733.6. Samples: 145948760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) +[2024-06-17 23:01:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:01:57,600][12883] Updated weights for policy 0, policy_version 8900 (0.0043) +[2024-06-17 23:02:00,973][12883] Updated weights for policy 0, policy_version 8910 (0.0059) +[2024-06-17 23:02:01,994][12645] Fps is (10 sec: 39323.5, 60 sec: 40413.9, 300 sec: 39765.9). Total num frames: 145997824. Throughput: 0: 39873.9. 
Samples: 146068680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) +[2024-06-17 23:02:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:02:06,247][12883] Updated weights for policy 0, policy_version 8920 (0.0042) +[2024-06-17 23:02:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.5, 300 sec: 39877.0). Total num frames: 146194432. Throughput: 0: 39877.6. Samples: 146312740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-17 23:02:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:02:09,852][12883] Updated weights for policy 0, policy_version 8930 (0.0035) +[2024-06-17 23:02:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40140.8, 300 sec: 39821.5). Total num frames: 146391040. Throughput: 0: 39619.5. Samples: 146541440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-17 23:02:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:02:14,426][12883] Updated weights for policy 0, policy_version 8940 (0.0041) +[2024-06-17 23:02:16,996][12645] Fps is (10 sec: 40951.4, 60 sec: 40139.4, 300 sec: 39821.5). Total num frames: 146604032. Throughput: 0: 39756.8. Samples: 146665700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-17 23:02:16,996][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:02:18,095][12883] Updated weights for policy 0, policy_version 8950 (0.0040) +[2024-06-17 23:02:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39594.6, 300 sec: 39765.9). Total num frames: 146767872. Throughput: 0: 39688.1. Samples: 146903720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-17 23:02:21,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:02:23,013][12883] Updated weights for policy 0, policy_version 8960 (0.0030) +[2024-06-17 23:02:26,043][12883] Updated weights for policy 0, policy_version 8970 (0.0035) +[2024-06-17 23:02:26,994][12645] Fps is (10 sec: 37691.8, 60 sec: 39867.7, 300 sec: 39766.2). Total num frames: 146980864. Throughput: 0: 39754.4. Samples: 147139940. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-17 23:02:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:02:31,031][12883] Updated weights for policy 0, policy_version 8980 (0.0046) +[2024-06-17 23:02:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39594.8, 300 sec: 39765.9). Total num frames: 147177472. Throughput: 0: 39761.0. Samples: 147259300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-17 23:02:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:02:34,361][12883] Updated weights for policy 0, policy_version 8990 (0.0042) +[2024-06-17 23:02:36,994][12645] Fps is (10 sec: 39321.0, 60 sec: 39321.6, 300 sec: 39821.4). Total num frames: 147374080. Throughput: 0: 39562.5. Samples: 147496280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-17 23:02:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:02:37,026][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008995_147374080.pth... +[2024-06-17 23:02:37,111][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008413_137838592.pth +[2024-06-17 23:02:39,295][12883] Updated weights for policy 0, policy_version 9000 (0.0040) +[2024-06-17 23:02:41,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40140.9, 300 sec: 39765.9). Total num frames: 147587072. Throughput: 0: 39544.9. Samples: 147728280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-17 23:02:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:02:42,683][12883] Updated weights for policy 0, policy_version 9010 (0.0045) +[2024-06-17 23:02:46,996][12645] Fps is (10 sec: 37675.0, 60 sec: 39320.1, 300 sec: 39765.6). Total num frames: 147750912. Throughput: 0: 39580.2. Samples: 147849880. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-17 23:02:46,997][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:02:47,437][12883] Updated weights for policy 0, policy_version 9020 (0.0041) +[2024-06-17 23:02:50,860][12883] Updated weights for policy 0, policy_version 9030 (0.0038) +[2024-06-17 23:02:51,994][12645] Fps is (10 sec: 36044.9, 60 sec: 39048.8, 300 sec: 39710.4). Total num frames: 147947520. Throughput: 0: 39416.5. Samples: 148086480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-17 23:02:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:02:55,389][12883] Updated weights for policy 0, policy_version 9040 (0.0033) +[2024-06-17 23:02:56,994][12645] Fps is (10 sec: 40968.9, 60 sec: 39594.7, 300 sec: 39654.8). Total num frames: 148160512. Throughput: 0: 39734.1. Samples: 148329480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-17 23:02:56,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:02:58,879][12883] Updated weights for policy 0, policy_version 9050 (0.0046) +[2024-06-17 23:03:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39321.5, 300 sec: 39710.4). Total num frames: 148357120. Throughput: 0: 39595.2. Samples: 148447400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-17 23:03:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:03:03,846][12883] Updated weights for policy 0, policy_version 9060 (0.0034) +[2024-06-17 23:03:06,933][12883] Updated weights for policy 0, policy_version 9070 (0.0033) +[2024-06-17 23:03:06,994][12645] Fps is (10 sec: 44237.5, 60 sec: 40140.9, 300 sec: 39877.0). Total num frames: 148602880. Throughput: 0: 39645.8. Samples: 148687780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-17 23:03:07,000][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:03:09,886][12862] Signal inference workers to stop experience collection... 
(2100 times) +[2024-06-17 23:03:09,887][12862] Signal inference workers to resume experience collection... (2100 times) +[2024-06-17 23:03:09,910][12883] InferenceWorker_p0-w0: stopping experience collection (2100 times) +[2024-06-17 23:03:09,910][12883] InferenceWorker_p0-w0: resuming experience collection (2100 times) +[2024-06-17 23:03:11,995][12645] Fps is (10 sec: 37679.0, 60 sec: 39047.8, 300 sec: 39599.2). Total num frames: 148733952. Throughput: 0: 39718.4. Samples: 148927320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-17 23:03:11,996][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:03:12,342][12883] Updated weights for policy 0, policy_version 9080 (0.0031) +[2024-06-17 23:03:15,628][12883] Updated weights for policy 0, policy_version 9090 (0.0042) +[2024-06-17 23:03:16,994][12645] Fps is (10 sec: 34406.0, 60 sec: 39049.9, 300 sec: 39654.8). Total num frames: 148946944. Throughput: 0: 39598.9. Samples: 149041260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-17 23:03:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:03:20,568][12883] Updated weights for policy 0, policy_version 9100 (0.0034) +[2024-06-17 23:03:21,994][12645] Fps is (10 sec: 42604.0, 60 sec: 39867.8, 300 sec: 39711.2). Total num frames: 149159936. Throughput: 0: 39786.4. Samples: 149286660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) +[2024-06-17 23:03:21,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:03:23,658][12883] Updated weights for policy 0, policy_version 9110 (0.0034) +[2024-06-17 23:03:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39594.6, 300 sec: 39710.4). Total num frames: 149356544. Throughput: 0: 39911.1. Samples: 149524280. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) +[2024-06-17 23:03:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:03:28,376][12883] Updated weights for policy 0, policy_version 9120 (0.0040) +[2024-06-17 23:03:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39594.6, 300 sec: 39710.4). Total num frames: 149553152. Throughput: 0: 39803.4. Samples: 149640940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-17 23:03:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:03:32,488][12883] Updated weights for policy 0, policy_version 9130 (0.0051) +[2024-06-17 23:03:36,933][12883] Updated weights for policy 0, policy_version 9140 (0.0041) +[2024-06-17 23:03:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39594.7, 300 sec: 39710.3). Total num frames: 149749760. Throughput: 0: 39660.9. Samples: 149871220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-17 23:03:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:03:40,852][12883] Updated weights for policy 0, policy_version 9150 (0.0039) +[2024-06-17 23:03:41,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39321.6, 300 sec: 39710.4). Total num frames: 149946368. Throughput: 0: 39754.3. Samples: 150118420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-17 23:03:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:03:44,872][12883] Updated weights for policy 0, policy_version 9160 (0.0039) +[2024-06-17 23:03:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40142.4, 300 sec: 39765.9). Total num frames: 150159360. Throughput: 0: 39811.7. Samples: 150238920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 23:03:46,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:03:48,780][12883] Updated weights for policy 0, policy_version 9170 (0.0042) +[2024-06-17 23:03:51,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40140.9, 300 sec: 39655.1). Total num frames: 150355968. Throughput: 0: 39714.7. Samples: 150474940. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 23:03:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:03:53,012][12883] Updated weights for policy 0, policy_version 9180 (0.0038) +[2024-06-17 23:03:56,994][12645] Fps is (10 sec: 37682.7, 60 sec: 39594.7, 300 sec: 39710.4). Total num frames: 150536192. Throughput: 0: 39523.2. Samples: 150705820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 23:03:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:03:57,374][12883] Updated weights for policy 0, policy_version 9190 (0.0023) +[2024-06-17 23:04:01,170][12883] Updated weights for policy 0, policy_version 9200 (0.0027) +[2024-06-17 23:04:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40414.0, 300 sec: 39877.0). Total num frames: 150781952. Throughput: 0: 39639.7. Samples: 150825040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 23:04:01,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:04:05,352][12883] Updated weights for policy 0, policy_version 9210 (0.0030) +[2024-06-17 23:04:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 39048.4, 300 sec: 39710.3). Total num frames: 150945792. Throughput: 0: 39531.4. Samples: 151065580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-17 23:04:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:04:09,381][12883] Updated weights for policy 0, policy_version 9220 (0.0047) +[2024-06-17 23:04:11,994][12645] Fps is (10 sec: 36044.5, 60 sec: 40141.6, 300 sec: 39711.2). Total num frames: 151142400. Throughput: 0: 39502.2. Samples: 151301880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-17 23:04:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:04:13,839][12883] Updated weights for policy 0, policy_version 9230 (0.0047) +[2024-06-17 23:04:16,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39594.7, 300 sec: 39654.8). Total num frames: 151322624. Throughput: 0: 39659.1. Samples: 151425600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 23:04:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:04:17,586][12883] Updated weights for policy 0, policy_version 9240 (0.0046) +[2024-06-17 23:04:21,773][12883] Updated weights for policy 0, policy_version 9250 (0.0052) +[2024-06-17 23:04:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39867.6, 300 sec: 39821.4). Total num frames: 151552000. Throughput: 0: 39987.1. Samples: 151670640. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-17 23:04:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:04:25,480][12883] Updated weights for policy 0, policy_version 9260 (0.0048) +[2024-06-17 23:04:26,994][12645] Fps is (10 sec: 44236.2, 60 sec: 40140.7, 300 sec: 39821.4). Total num frames: 151764992. Throughput: 0: 39785.7. Samples: 151908780. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-17 23:04:26,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:04:30,035][12883] Updated weights for policy 0, policy_version 9270 (0.0030) +[2024-06-17 23:04:31,994][12645] Fps is (10 sec: 37683.9, 60 sec: 39594.7, 300 sec: 39599.3). Total num frames: 151928832. Throughput: 0: 39842.3. Samples: 152031820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-17 23:04:31,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:04:33,655][12883] Updated weights for policy 0, policy_version 9280 (0.0036) +[2024-06-17 23:04:36,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.7, 300 sec: 39765.9). Total num frames: 152141824. Throughput: 0: 40015.0. Samples: 152275620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-17 23:04:36,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:04:37,094][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000009287_152158208.pth... 
+[2024-06-17 23:04:37,160][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008704_142606336.pth +[2024-06-17 23:04:38,103][12883] Updated weights for policy 0, policy_version 9290 (0.0034) +[2024-06-17 23:04:41,617][12883] Updated weights for policy 0, policy_version 9300 (0.0050) +[2024-06-17 23:04:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 40413.9, 300 sec: 39877.0). Total num frames: 152371200. Throughput: 0: 40063.6. Samples: 152508680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-17 23:04:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:04:46,761][12883] Updated weights for policy 0, policy_version 9310 (0.0034) +[2024-06-17 23:04:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39867.6, 300 sec: 39765.9). Total num frames: 152551424. Throughput: 0: 40218.5. Samples: 152634880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:04:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:04:50,020][12883] Updated weights for policy 0, policy_version 9320 (0.0030) +[2024-06-17 23:04:50,057][12862] Signal inference workers to stop experience collection... (2150 times) +[2024-06-17 23:04:50,057][12862] Signal inference workers to resume experience collection... (2150 times) +[2024-06-17 23:04:50,067][12883] InferenceWorker_p0-w0: stopping experience collection (2150 times) +[2024-06-17 23:04:50,091][12883] InferenceWorker_p0-w0: resuming experience collection (2150 times) +[2024-06-17 23:04:51,994][12645] Fps is (10 sec: 37682.7, 60 sec: 39867.6, 300 sec: 39765.9). Total num frames: 152748032. Throughput: 0: 40034.7. Samples: 152867140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:04:51,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:04:54,704][12883] Updated weights for policy 0, policy_version 9330 (0.0046) +[2024-06-17 23:04:56,997][12645] Fps is (10 sec: 40944.8, 60 sec: 40411.4, 300 sec: 39765.4). Total num frames: 152961024. 
Throughput: 0: 40228.2. Samples: 153112300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 23:04:56,998][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:04:58,138][12883] Updated weights for policy 0, policy_version 9340 (0.0031) +[2024-06-17 23:05:01,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39048.4, 300 sec: 39710.4). Total num frames: 153124864. Throughput: 0: 40091.0. Samples: 153229700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:05:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:05:03,363][12883] Updated weights for policy 0, policy_version 9350 (0.0038) +[2024-06-17 23:05:06,142][12883] Updated weights for policy 0, policy_version 9360 (0.0051) +[2024-06-17 23:05:06,994][12645] Fps is (10 sec: 40975.2, 60 sec: 40413.9, 300 sec: 39821.5). Total num frames: 153370624. Throughput: 0: 39958.7. Samples: 153468780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:05:06,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:05:11,537][12883] Updated weights for policy 0, policy_version 9370 (0.0034) +[2024-06-17 23:05:11,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40140.8, 300 sec: 39766.2). Total num frames: 153550848. Throughput: 0: 40198.4. Samples: 153717700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-17 23:05:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:05:14,065][12883] Updated weights for policy 0, policy_version 9380 (0.0029) +[2024-06-17 23:05:17,000][12645] Fps is (10 sec: 37659.9, 60 sec: 40409.6, 300 sec: 39820.6). Total num frames: 153747456. Throughput: 0: 39974.3. Samples: 153830920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-17 23:05:17,001][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:05:19,582][12883] Updated weights for policy 0, policy_version 9390 (0.0037) +[2024-06-17 23:05:21,998][12645] Fps is (10 sec: 42577.8, 60 sec: 40410.7, 300 sec: 39931.9). Total num frames: 153976832. 
Throughput: 0: 39918.9. Samples: 154072160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-17 23:05:21,999][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:05:22,628][12883] Updated weights for policy 0, policy_version 9400 (0.0040) +[2024-06-17 23:05:26,994][12645] Fps is (10 sec: 36067.6, 60 sec: 39048.6, 300 sec: 39599.3). Total num frames: 154107904. Throughput: 0: 40170.7. Samples: 154316360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-17 23:05:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:05:27,605][12883] Updated weights for policy 0, policy_version 9410 (0.0047) +[2024-06-17 23:05:30,883][12883] Updated weights for policy 0, policy_version 9420 (0.0029) +[2024-06-17 23:05:31,996][12645] Fps is (10 sec: 40970.5, 60 sec: 40958.4, 300 sec: 39932.5). Total num frames: 154386432. Throughput: 0: 39885.2. Samples: 154429800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-17 23:05:31,997][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:05:35,747][12883] Updated weights for policy 0, policy_version 9430 (0.0026) +[2024-06-17 23:05:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 39867.8, 300 sec: 39654.8). Total num frames: 154533888. Throughput: 0: 39913.4. Samples: 154663240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-17 23:05:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:05:39,022][12883] Updated weights for policy 0, policy_version 9440 (0.0037) +[2024-06-17 23:05:41,994][12645] Fps is (10 sec: 32775.5, 60 sec: 39048.5, 300 sec: 39710.4). Total num frames: 154714112. Throughput: 0: 39829.2. Samples: 154904460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-17 23:05:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:05:44,083][12883] Updated weights for policy 0, policy_version 9450 (0.0032) +[2024-06-17 23:05:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 40413.9, 300 sec: 39877.3). Total num frames: 154976256. 
Throughput: 0: 39911.7. Samples: 155025720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-17 23:05:46,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-17 23:05:47,063][12883] Updated weights for policy 0, policy_version 9460 (0.0035) +[2024-06-17 23:05:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39594.8, 300 sec: 39710.4). Total num frames: 155123712. Throughput: 0: 40036.6. Samples: 155270420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 23:05:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:05:52,368][12883] Updated weights for policy 0, policy_version 9470 (0.0041) +[2024-06-17 23:05:55,488][12883] Updated weights for policy 0, policy_version 9480 (0.0030) +[2024-06-17 23:05:57,000][12645] Fps is (10 sec: 37659.6, 60 sec: 39866.1, 300 sec: 39931.7). Total num frames: 155353088. Throughput: 0: 39584.7. Samples: 155499260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 23:05:57,000][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:06:00,467][12883] Updated weights for policy 0, policy_version 9490 (0.0047) +[2024-06-17 23:06:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40140.9, 300 sec: 39765.9). Total num frames: 155533312. Throughput: 0: 39895.9. Samples: 155625980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 23.0) +[2024-06-17 23:06:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:06:03,796][12883] Updated weights for policy 0, policy_version 9500 (0.0043) +[2024-06-17 23:06:06,996][12645] Fps is (10 sec: 36058.4, 60 sec: 39047.0, 300 sec: 39765.6). Total num frames: 155713536. Throughput: 0: 39779.9. Samples: 155862160. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) +[2024-06-17 23:06:06,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:06:08,312][12862] Signal inference workers to stop experience collection... 
(2200 times) +[2024-06-17 23:06:08,340][12883] InferenceWorker_p0-w0: stopping experience collection (2200 times) +[2024-06-17 23:06:08,379][12862] Signal inference workers to resume experience collection... (2200 times) +[2024-06-17 23:06:08,380][12883] InferenceWorker_p0-w0: resuming experience collection (2200 times) +[2024-06-17 23:06:08,523][12883] Updated weights for policy 0, policy_version 9510 (0.0030) +[2024-06-17 23:06:11,743][12883] Updated weights for policy 0, policy_version 9520 (0.0034) +[2024-06-17 23:06:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 40413.9, 300 sec: 39932.6). Total num frames: 155975680. Throughput: 0: 39773.8. Samples: 156106180. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) +[2024-06-17 23:06:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:06:16,635][12883] Updated weights for policy 0, policy_version 9530 (0.0042) +[2024-06-17 23:06:16,994][12645] Fps is (10 sec: 44247.9, 60 sec: 40145.0, 300 sec: 39877.0). Total num frames: 156155904. Throughput: 0: 40182.9. Samples: 156237940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-17 23:06:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:06:19,979][12883] Updated weights for policy 0, policy_version 9540 (0.0040) +[2024-06-17 23:06:21,994][12645] Fps is (10 sec: 37682.3, 60 sec: 39597.7, 300 sec: 39877.0). Total num frames: 156352512. Throughput: 0: 40246.6. Samples: 156474340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-17 23:06:22,003][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:06:24,582][12883] Updated weights for policy 0, policy_version 9550 (0.0024) +[2024-06-17 23:06:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 39877.0). Total num frames: 156565504. Throughput: 0: 40461.4. Samples: 156725220. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-17 23:06:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:06:27,939][12883] Updated weights for policy 0, policy_version 9560 (0.0040) +[2024-06-17 23:06:31,994][12645] Fps is (10 sec: 39322.3, 60 sec: 39323.1, 300 sec: 39765.9). Total num frames: 156745728. Throughput: 0: 40341.3. Samples: 156841080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-17 23:06:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:06:33,224][12883] Updated weights for policy 0, policy_version 9570 (0.0035) +[2024-06-17 23:06:35,990][12883] Updated weights for policy 0, policy_version 9580 (0.0032) +[2024-06-17 23:06:36,994][12645] Fps is (10 sec: 40959.1, 60 sec: 40686.9, 300 sec: 39988.1). Total num frames: 156975104. Throughput: 0: 40163.8. Samples: 157077800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-17 23:06:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:06:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000009581_156975104.pth... +[2024-06-17 23:06:37,082][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008995_147374080.pth +[2024-06-17 23:06:41,155][12883] Updated weights for policy 0, policy_version 9590 (0.0048) +[2024-06-17 23:06:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40686.9, 300 sec: 39877.0). Total num frames: 157155328. Throughput: 0: 40462.1. Samples: 157319800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 23:06:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:06:44,655][12883] Updated weights for policy 0, policy_version 9600 (0.0037) +[2024-06-17 23:06:46,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39594.6, 300 sec: 39821.5). Total num frames: 157351936. Throughput: 0: 40147.0. Samples: 157432600. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 23:06:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:06:49,554][12883] Updated weights for policy 0, policy_version 9610 (0.0038) +[2024-06-17 23:06:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40686.9, 300 sec: 39932.6). Total num frames: 157564928. Throughput: 0: 40339.6. Samples: 157677340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 23:06:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:06:52,647][12883] Updated weights for policy 0, policy_version 9620 (0.0034) +[2024-06-17 23:06:56,996][12645] Fps is (10 sec: 39313.1, 60 sec: 39870.4, 300 sec: 39821.1). Total num frames: 157745152. Throughput: 0: 40326.4. Samples: 157920960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-17 23:06:56,997][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:06:57,602][12883] Updated weights for policy 0, policy_version 9630 (0.0044) +[2024-06-17 23:07:01,316][12883] Updated weights for policy 0, policy_version 9640 (0.0029) +[2024-06-17 23:07:01,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40413.8, 300 sec: 39877.0). Total num frames: 157958144. Throughput: 0: 39921.7. Samples: 158034420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-17 23:07:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:07:05,582][12883] Updated weights for policy 0, policy_version 9650 (0.0040) +[2024-06-17 23:07:06,994][12645] Fps is (10 sec: 40968.7, 60 sec: 40688.5, 300 sec: 39877.0). Total num frames: 158154752. Throughput: 0: 40048.0. Samples: 158276500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-17 23:07:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:07:09,323][12883] Updated weights for policy 0, policy_version 9660 (0.0036) +[2024-06-17 23:07:11,996][12645] Fps is (10 sec: 39312.8, 60 sec: 39593.1, 300 sec: 39821.4). Total num frames: 158351360. Throughput: 0: 39765.1. Samples: 158514740. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-17 23:07:11,996][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:07:13,610][12883] Updated weights for policy 0, policy_version 9670 (0.0038) +[2024-06-17 23:07:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40140.7, 300 sec: 39988.1). Total num frames: 158564352. Throughput: 0: 39797.2. Samples: 158631960. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-17 23:07:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:07:17,757][12883] Updated weights for policy 0, policy_version 9680 (0.0042) +[2024-06-17 23:07:21,509][12883] Updated weights for policy 0, policy_version 9690 (0.0052) +[2024-06-17 23:07:21,994][12645] Fps is (10 sec: 40968.9, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 158760960. Throughput: 0: 39920.1. Samples: 158874200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-17 23:07:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:07:26,137][12883] Updated weights for policy 0, policy_version 9700 (0.0043) +[2024-06-17 23:07:26,996][12645] Fps is (10 sec: 40951.0, 60 sec: 40139.2, 300 sec: 39987.7). Total num frames: 158973952. Throughput: 0: 39796.2. Samples: 159110720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) +[2024-06-17 23:07:26,997][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:07:28,917][12862] Signal inference workers to stop experience collection... (2250 times) +[2024-06-17 23:07:28,971][12862] Signal inference workers to resume experience collection... (2250 times) +[2024-06-17 23:07:28,972][12883] InferenceWorker_p0-w0: stopping experience collection (2250 times) +[2024-06-17 23:07:28,991][12883] InferenceWorker_p0-w0: resuming experience collection (2250 times) +[2024-06-17 23:07:29,760][12883] Updated weights for policy 0, policy_version 9710 (0.0046) +[2024-06-17 23:07:31,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39867.6, 300 sec: 39877.0). Total num frames: 159137792. Throughput: 0: 39865.7. 
Samples: 159226560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) +[2024-06-17 23:07:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:07:34,262][12883] Updated weights for policy 0, policy_version 9720 (0.0042) +[2024-06-17 23:07:36,994][12645] Fps is (10 sec: 37692.0, 60 sec: 39594.8, 300 sec: 39877.0). Total num frames: 159350784. Throughput: 0: 39797.8. Samples: 159468240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-17 23:07:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:07:38,087][12883] Updated weights for policy 0, policy_version 9730 (0.0032) +[2024-06-17 23:07:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39867.7, 300 sec: 39988.4). Total num frames: 159547392. Throughput: 0: 39757.5. Samples: 159709960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-17 23:07:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:07:42,451][12883] Updated weights for policy 0, policy_version 9740 (0.0032) +[2024-06-17 23:07:46,068][12883] Updated weights for policy 0, policy_version 9750 (0.0044) +[2024-06-17 23:07:46,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40140.8, 300 sec: 40043.6). Total num frames: 159760384. Throughput: 0: 39885.3. Samples: 159829260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 23:07:46,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:07:50,830][12883] Updated weights for policy 0, policy_version 9760 (0.0037) +[2024-06-17 23:07:51,995][12645] Fps is (10 sec: 37678.6, 60 sec: 39320.8, 300 sec: 39876.8). Total num frames: 159924224. Throughput: 0: 39813.2. Samples: 160068140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-17 23:07:51,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:07:54,674][12883] Updated weights for policy 0, policy_version 9770 (0.0031) +[2024-06-17 23:07:56,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39869.2, 300 sec: 39932.5). Total num frames: 160137216. Throughput: 0: 39745.1. Samples: 160303180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-17 23:07:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:07:59,683][12883] Updated weights for policy 0, policy_version 9780 (0.0034) +[2024-06-17 23:08:01,994][12645] Fps is (10 sec: 44242.3, 60 sec: 40140.8, 300 sec: 39877.0). Total num frames: 160366592. Throughput: 0: 39892.1. Samples: 160427100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-17 23:08:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:08:02,705][12883] Updated weights for policy 0, policy_version 9790 (0.0044) +[2024-06-17 23:08:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 39594.7, 300 sec: 39988.2). Total num frames: 160530432. Throughput: 0: 39750.6. Samples: 160662980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-17 23:08:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:08:07,609][12883] Updated weights for policy 0, policy_version 9800 (0.0044) +[2024-06-17 23:08:11,312][12883] Updated weights for policy 0, policy_version 9810 (0.0041) +[2024-06-17 23:08:11,994][12645] Fps is (10 sec: 36044.4, 60 sec: 39596.1, 300 sec: 39932.5). Total num frames: 160727040. Throughput: 0: 39748.6. Samples: 160899320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-17 23:08:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:08:15,970][12883] Updated weights for policy 0, policy_version 9820 (0.0040) +[2024-06-17 23:08:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39594.7, 300 sec: 39932.5). Total num frames: 160940032. Throughput: 0: 39907.2. Samples: 161022380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-17 23:08:16,999][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:08:19,249][12883] Updated weights for policy 0, policy_version 9830 (0.0022) +[2024-06-17 23:08:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39594.7, 300 sec: 39932.5). Total num frames: 161136640. Throughput: 0: 39747.5. Samples: 161256880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-17 23:08:21,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-17 23:08:24,060][12883] Updated weights for policy 0, policy_version 9840 (0.0040) +[2024-06-17 23:08:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39323.1, 300 sec: 39932.5). Total num frames: 161333248. Throughput: 0: 39620.0. Samples: 161492860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-17 23:08:26,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-17 23:08:27,035][12862] Saving new best policy, reward=0.015! +[2024-06-17 23:08:28,279][12883] Updated weights for policy 0, policy_version 9850 (0.0038) +[2024-06-17 23:08:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 39867.8, 300 sec: 39932.5). Total num frames: 161529856. Throughput: 0: 39668.0. Samples: 161614320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 23:08:32,000][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:08:32,136][12883] Updated weights for policy 0, policy_version 9860 (0.0036) +[2024-06-17 23:08:36,446][12883] Updated weights for policy 0, policy_version 9870 (0.0054) +[2024-06-17 23:08:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39594.6, 300 sec: 39932.5). Total num frames: 161726464. Throughput: 0: 39637.6. Samples: 161851780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 23:08:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:08:37,195][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000009873_161759232.pth... +[2024-06-17 23:08:37,256][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000009287_152158208.pth +[2024-06-17 23:08:40,331][12883] Updated weights for policy 0, policy_version 9880 (0.0034) +[2024-06-17 23:08:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39867.6, 300 sec: 39932.5). Total num frames: 161939456. Throughput: 0: 39699.9. Samples: 162089680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 23:08:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:08:44,582][12883] Updated weights for policy 0, policy_version 9890 (0.0039) +[2024-06-17 23:08:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39321.6, 300 sec: 39877.0). Total num frames: 162119680. Throughput: 0: 39651.1. Samples: 162211400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 23:08:46,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:08:48,416][12883] Updated weights for policy 0, policy_version 9900 (0.0034) +[2024-06-17 23:08:51,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39868.5, 300 sec: 39932.5). Total num frames: 162316288. Throughput: 0: 39585.0. Samples: 162444300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-17 23:08:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:08:52,787][12883] Updated weights for policy 0, policy_version 9910 (0.0049) +[2024-06-17 23:08:56,859][12883] Updated weights for policy 0, policy_version 9920 (0.0036) +[2024-06-17 23:08:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39867.7, 300 sec: 39821.4). Total num frames: 162529280. Throughput: 0: 39589.0. Samples: 162680820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-17 23:08:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:09:01,505][12883] Updated weights for policy 0, policy_version 9930 (0.0048) +[2024-06-17 23:09:01,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39048.6, 300 sec: 39877.0). Total num frames: 162709504. Throughput: 0: 39531.6. Samples: 162801300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-17 23:09:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:09:04,835][12883] Updated weights for policy 0, policy_version 9940 (0.0035) +[2024-06-17 23:09:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40140.8, 300 sec: 39988.1). Total num frames: 162938880. Throughput: 0: 39549.7. Samples: 163036620. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) +[2024-06-17 23:09:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:09:09,555][12862] Signal inference workers to stop experience collection... (2300 times) +[2024-06-17 23:09:09,559][12862] Signal inference workers to resume experience collection... (2300 times) +[2024-06-17 23:09:09,575][12883] InferenceWorker_p0-w0: stopping experience collection (2300 times) +[2024-06-17 23:09:09,612][12883] InferenceWorker_p0-w0: resuming experience collection (2300 times) +[2024-06-17 23:09:09,702][12883] Updated weights for policy 0, policy_version 9950 (0.0046) +[2024-06-17 23:09:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.8, 300 sec: 39988.1). Total num frames: 163119104. Throughput: 0: 39674.7. Samples: 163278220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 23:09:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:09:12,739][12883] Updated weights for policy 0, policy_version 9960 (0.0045) +[2024-06-17 23:09:16,994][12645] Fps is (10 sec: 36044.5, 60 sec: 39321.5, 300 sec: 39821.4). Total num frames: 163299328. Throughput: 0: 39648.8. Samples: 163398520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 23:09:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:09:17,844][12883] Updated weights for policy 0, policy_version 9970 (0.0035) +[2024-06-17 23:09:20,792][12883] Updated weights for policy 0, policy_version 9980 (0.0039) +[2024-06-17 23:09:21,994][12645] Fps is (10 sec: 40959.3, 60 sec: 39867.6, 300 sec: 39877.0). Total num frames: 163528704. Throughput: 0: 39586.1. Samples: 163633160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-17 23:09:22,000][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:09:25,923][12883] Updated weights for policy 0, policy_version 9990 (0.0046) +[2024-06-17 23:09:26,994][12645] Fps is (10 sec: 40960.9, 60 sec: 39594.7, 300 sec: 39932.5). Total num frames: 163708928. Throughput: 0: 39713.1. 
Samples: 163876760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-17 23:09:26,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:09:29,553][12883] Updated weights for policy 0, policy_version 10000 (0.0047) +[2024-06-17 23:09:31,994][12645] Fps is (10 sec: 37683.8, 60 sec: 39594.7, 300 sec: 39877.0). Total num frames: 163905536. Throughput: 0: 39517.4. Samples: 163989680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 23:09:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:09:34,187][12883] Updated weights for policy 0, policy_version 10010 (0.0047) +[2024-06-17 23:09:36,996][12645] Fps is (10 sec: 40951.3, 60 sec: 39866.3, 300 sec: 39821.2). Total num frames: 164118528. Throughput: 0: 39789.8. Samples: 164234920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-17 23:09:36,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:09:37,605][12883] Updated weights for policy 0, policy_version 10020 (0.0041) +[2024-06-17 23:09:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39321.7, 300 sec: 39821.5). Total num frames: 164298752. Throughput: 0: 39867.5. Samples: 164474860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-17 23:09:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:09:42,450][12883] Updated weights for policy 0, policy_version 10030 (0.0032) +[2024-06-17 23:09:45,931][12883] Updated weights for policy 0, policy_version 10040 (0.0031) +[2024-06-17 23:09:46,994][12645] Fps is (10 sec: 40968.1, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 164528128. Throughput: 0: 39927.4. Samples: 164598040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-17 23:09:46,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:09:50,738][12883] Updated weights for policy 0, policy_version 10050 (0.0035) +[2024-06-17 23:09:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40140.8, 300 sec: 39877.5). Total num frames: 164724736. Throughput: 0: 40081.8. 
Samples: 164840300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-17 23:09:51,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:09:53,964][12883] Updated weights for policy 0, policy_version 10060 (0.0042) +[2024-06-17 23:09:56,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39594.6, 300 sec: 39932.5). Total num frames: 164904960. Throughput: 0: 39937.2. Samples: 165075400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-17 23:09:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:09:58,842][12883] Updated weights for policy 0, policy_version 10070 (0.0046) +[2024-06-17 23:10:01,993][12645] Fps is (10 sec: 40960.8, 60 sec: 40413.9, 300 sec: 39877.0). Total num frames: 165134336. Throughput: 0: 39902.5. Samples: 165194120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) +[2024-06-17 23:10:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:10:02,105][12883] Updated weights for policy 0, policy_version 10080 (0.0041) +[2024-06-17 23:10:06,980][12883] Updated weights for policy 0, policy_version 10090 (0.0043) +[2024-06-17 23:10:06,994][12645] Fps is (10 sec: 40960.9, 60 sec: 39594.8, 300 sec: 39877.0). Total num frames: 165314560. Throughput: 0: 40070.4. Samples: 165436320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) +[2024-06-17 23:10:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:10:10,625][12883] Updated weights for policy 0, policy_version 10100 (0.0043) +[2024-06-17 23:10:11,994][12645] Fps is (10 sec: 37682.4, 60 sec: 39867.7, 300 sec: 39877.8). Total num frames: 165511168. Throughput: 0: 39847.9. Samples: 165669920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-17 23:10:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:10:15,034][12883] Updated weights for policy 0, policy_version 10110 (0.0026) +[2024-06-17 23:10:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40413.9, 300 sec: 39822.1). Total num frames: 165724160. Throughput: 0: 40071.0. 
Samples: 165792880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 23:10:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:10:18,786][12883] Updated weights for policy 0, policy_version 10120 (0.0043) +[2024-06-17 23:10:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39321.7, 300 sec: 39932.5). Total num frames: 165888000. Throughput: 0: 39785.0. Samples: 166025160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 23:10:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:10:23,558][12883] Updated weights for policy 0, policy_version 10130 (0.0039) +[2024-06-17 23:10:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40140.8, 300 sec: 39766.2). Total num frames: 166117376. Throughput: 0: 39698.7. Samples: 166261300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-17 23:10:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:10:27,020][12883] Updated weights for policy 0, policy_version 10140 (0.0034) +[2024-06-17 23:10:31,663][12883] Updated weights for policy 0, policy_version 10150 (0.0036) +[2024-06-17 23:10:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39867.7, 300 sec: 39877.0). Total num frames: 166297600. Throughput: 0: 39552.0. Samples: 166377880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-17 23:10:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:10:35,129][12883] Updated weights for policy 0, policy_version 10160 (0.0043) +[2024-06-17 23:10:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39869.1, 300 sec: 39988.1). Total num frames: 166510592. Throughput: 0: 39447.6. Samples: 166615440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 23:10:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:10:37,158][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000010164_166526976.pth... 
+[2024-06-17 23:10:37,201][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000009581_156975104.pth +[2024-06-17 23:10:39,815][12883] Updated weights for policy 0, policy_version 10170 (0.0033) +[2024-06-17 23:10:41,988][12862] Signal inference workers to stop experience collection... (2350 times) +[2024-06-17 23:10:41,988][12862] Signal inference workers to resume experience collection... (2350 times) +[2024-06-17 23:10:41,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39867.8, 300 sec: 39710.4). Total num frames: 166690816. Throughput: 0: 39711.7. Samples: 166862420. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-17 23:10:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:10:42,017][12883] InferenceWorker_p0-w0: stopping experience collection (2350 times) +[2024-06-17 23:10:42,017][12883] InferenceWorker_p0-w0: resuming experience collection (2350 times) +[2024-06-17 23:10:43,214][12883] Updated weights for policy 0, policy_version 10180 (0.0036) +[2024-06-17 23:10:46,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39321.7, 300 sec: 39877.0). Total num frames: 166887424. Throughput: 0: 39691.1. Samples: 166980220. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-17 23:10:46,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:10:47,722][12883] Updated weights for policy 0, policy_version 10190 (0.0030) +[2024-06-17 23:10:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 39867.8, 300 sec: 39877.8). Total num frames: 167116800. Throughput: 0: 39616.4. Samples: 167219060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-17 23:10:51,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-17 23:10:51,995][12862] Saving new best policy, reward=0.016! 
+[2024-06-17 23:10:51,998][12883] Updated weights for policy 0, policy_version 10200 (0.0035) +[2024-06-17 23:10:55,837][12883] Updated weights for policy 0, policy_version 10210 (0.0041) +[2024-06-17 23:10:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 39867.8, 300 sec: 39877.0). Total num frames: 167297024. Throughput: 0: 39807.1. Samples: 167461240. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-17 23:10:56,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-17 23:10:57,010][12862] Saving new best policy, reward=0.023! +[2024-06-17 23:11:00,234][12883] Updated weights for policy 0, policy_version 10220 (0.0046) +[2024-06-17 23:11:01,994][12645] Fps is (10 sec: 36045.1, 60 sec: 39048.5, 300 sec: 39877.3). Total num frames: 167477248. Throughput: 0: 39686.4. Samples: 167578760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 23:11:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:11:03,945][12883] Updated weights for policy 0, policy_version 10230 (0.0042) +[2024-06-17 23:11:06,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39867.7, 300 sec: 39765.9). Total num frames: 167706624. Throughput: 0: 39904.9. Samples: 167820880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 23:11:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:11:08,353][12883] Updated weights for policy 0, policy_version 10240 (0.0047) +[2024-06-17 23:11:11,996][12645] Fps is (10 sec: 44226.6, 60 sec: 40139.4, 300 sec: 39876.7). Total num frames: 167919616. Throughput: 0: 39835.8. Samples: 168054000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 23:11:11,997][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:11:12,224][12883] Updated weights for policy 0, policy_version 10250 (0.0035) +[2024-06-17 23:11:16,762][12883] Updated weights for policy 0, policy_version 10260 (0.0038) +[2024-06-17 23:11:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39594.8, 300 sec: 39821.5). Total num frames: 168099840. 
Throughput: 0: 40035.7. Samples: 168179480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 23:11:16,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:11:20,255][12883] Updated weights for policy 0, policy_version 10270 (0.0037) +[2024-06-17 23:11:21,994][12645] Fps is (10 sec: 36051.7, 60 sec: 39867.5, 300 sec: 39710.3). Total num frames: 168280064. Throughput: 0: 39959.8. Samples: 168413640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 23:11:21,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:11:24,851][12883] Updated weights for policy 0, policy_version 10280 (0.0041) +[2024-06-17 23:11:26,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39321.6, 300 sec: 39765.9). Total num frames: 168476672. Throughput: 0: 40070.7. Samples: 168665600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-17 23:11:26,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:11:28,422][12883] Updated weights for policy 0, policy_version 10290 (0.0044) +[2024-06-17 23:11:31,994][12645] Fps is (10 sec: 42599.4, 60 sec: 40140.8, 300 sec: 39765.9). Total num frames: 168706048. Throughput: 0: 39954.5. Samples: 168778180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-17 23:11:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:11:33,322][12883] Updated weights for policy 0, policy_version 10300 (0.0032) +[2024-06-17 23:11:36,746][12883] Updated weights for policy 0, policy_version 10310 (0.0038) +[2024-06-17 23:11:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 40140.8, 300 sec: 39877.0). Total num frames: 168919040. Throughput: 0: 40009.3. Samples: 169019480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-17 23:11:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:11:41,299][12883] Updated weights for policy 0, policy_version 10320 (0.0039) +[2024-06-17 23:11:41,994][12645] Fps is (10 sec: 37683.8, 60 sec: 39867.7, 300 sec: 39765.9). Total num frames: 169082880. 
Throughput: 0: 39942.3. Samples: 169258640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-17 23:11:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:11:44,899][12883] Updated weights for policy 0, policy_version 10330 (0.0037) +[2024-06-17 23:11:46,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40140.8, 300 sec: 39765.9). Total num frames: 169295872. Throughput: 0: 39905.3. Samples: 169374500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-17 23:11:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:11:49,676][12883] Updated weights for policy 0, policy_version 10340 (0.0044) +[2024-06-17 23:11:51,994][12645] Fps is (10 sec: 40959.2, 60 sec: 39594.6, 300 sec: 39821.7). Total num frames: 169492480. Throughput: 0: 39969.6. Samples: 169619520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-17 23:11:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:11:53,304][12883] Updated weights for policy 0, policy_version 10350 (0.0046) +[2024-06-17 23:11:56,996][12645] Fps is (10 sec: 40950.5, 60 sec: 40139.3, 300 sec: 39821.1). Total num frames: 169705472. Throughput: 0: 40072.0. Samples: 169857240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 23:11:56,996][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:11:57,772][12883] Updated weights for policy 0, policy_version 10360 (0.0035) +[2024-06-17 23:11:59,423][12862] Signal inference workers to stop experience collection... (2400 times) +[2024-06-17 23:11:59,471][12883] InferenceWorker_p0-w0: stopping experience collection (2400 times) +[2024-06-17 23:11:59,535][12862] Signal inference workers to resume experience collection... (2400 times) +[2024-06-17 23:11:59,535][12883] InferenceWorker_p0-w0: resuming experience collection (2400 times) +[2024-06-17 23:12:01,681][12883] Updated weights for policy 0, policy_version 10370 (0.0032) +[2024-06-17 23:12:01,996][12645] Fps is (10 sec: 42589.5, 60 sec: 40685.4, 300 sec: 39876.7). 
Total num frames: 169918464. Throughput: 0: 39910.9. Samples: 169975560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 23:12:01,996][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:12:06,255][12883] Updated weights for policy 0, policy_version 10380 (0.0031) +[2024-06-17 23:12:06,994][12645] Fps is (10 sec: 37691.9, 60 sec: 39594.7, 300 sec: 39766.2). Total num frames: 170082304. Throughput: 0: 39975.0. Samples: 170212500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:12:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:12:09,953][12883] Updated weights for policy 0, policy_version 10390 (0.0036) +[2024-06-17 23:12:11,994][12645] Fps is (10 sec: 36052.4, 60 sec: 39323.0, 300 sec: 39710.4). Total num frames: 170278912. Throughput: 0: 39612.7. Samples: 170448180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:12:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:12:14,382][12883] Updated weights for policy 0, policy_version 10400 (0.0030) +[2024-06-17 23:12:16,994][12645] Fps is (10 sec: 40959.2, 60 sec: 39867.6, 300 sec: 39765.9). Total num frames: 170491904. Throughput: 0: 39757.7. Samples: 170567280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) +[2024-06-17 23:12:16,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:12:18,100][12883] Updated weights for policy 0, policy_version 10410 (0.0029) +[2024-06-17 23:12:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40141.0, 300 sec: 39710.7). Total num frames: 170688512. Throughput: 0: 39747.1. Samples: 170808100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-17 23:12:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:12:22,478][12883] Updated weights for policy 0, policy_version 10420 (0.0034) +[2024-06-17 23:12:26,650][12883] Updated weights for policy 0, policy_version 10430 (0.0042) +[2024-06-17 23:12:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40140.8, 300 sec: 39821.5). 
Total num frames: 170885120. Throughput: 0: 39695.5. Samples: 171044940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-17 23:12:26,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:12:30,736][12883] Updated weights for policy 0, policy_version 10440 (0.0044) +[2024-06-17 23:12:31,996][12645] Fps is (10 sec: 37675.0, 60 sec: 39320.2, 300 sec: 39710.1). Total num frames: 171065344. Throughput: 0: 39793.1. Samples: 171165280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-17 23:12:31,996][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:12:34,974][12883] Updated weights for policy 0, policy_version 10450 (0.0040) +[2024-06-17 23:12:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39321.6, 300 sec: 39765.9). Total num frames: 171278336. Throughput: 0: 39622.3. Samples: 171402520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-17 23:12:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:12:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000010455_171294720.pth... +[2024-06-17 23:12:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000009873_161759232.pth +[2024-06-17 23:12:38,729][12883] Updated weights for policy 0, policy_version 10460 (0.0037) +[2024-06-17 23:12:41,994][12645] Fps is (10 sec: 44246.7, 60 sec: 40413.8, 300 sec: 39821.5). Total num frames: 171507712. Throughput: 0: 39601.1. Samples: 171639200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-17 23:12:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:12:43,004][12883] Updated weights for policy 0, policy_version 10470 (0.0037) +[2024-06-17 23:12:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 39867.6, 300 sec: 39877.1). Total num frames: 171687936. Throughput: 0: 39662.7. Samples: 171760300. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-17 23:12:46,995][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:12:47,191][12883] Updated weights for policy 0, policy_version 10480 (0.0024) +[2024-06-17 23:12:51,504][12883] Updated weights for policy 0, policy_version 10490 (0.0031) +[2024-06-17 23:12:51,996][12645] Fps is (10 sec: 37674.6, 60 sec: 39866.3, 300 sec: 39821.1). Total num frames: 171884544. Throughput: 0: 39694.8. Samples: 171998860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-17 23:12:51,997][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:12:55,583][12883] Updated weights for policy 0, policy_version 10500 (0.0045) +[2024-06-17 23:12:56,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39596.1, 300 sec: 39710.4). Total num frames: 172081152. Throughput: 0: 39809.7. Samples: 172239620. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) +[2024-06-17 23:12:56,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:12:59,753][12883] Updated weights for policy 0, policy_version 10510 (0.0037) +[2024-06-17 23:13:01,996][12645] Fps is (10 sec: 39321.7, 60 sec: 39321.6, 300 sec: 39821.2). Total num frames: 172277760. Throughput: 0: 39813.7. Samples: 172358980. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) +[2024-06-17 23:13:01,996][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:13:03,810][12883] Updated weights for policy 0, policy_version 10520 (0.0032) +[2024-06-17 23:13:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39867.6, 300 sec: 39821.5). Total num frames: 172474368. Throughput: 0: 39515.1. Samples: 172586280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) +[2024-06-17 23:13:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:13:07,874][12883] Updated weights for policy 0, policy_version 10530 (0.0050) +[2024-06-17 23:13:11,994][12645] Fps is (10 sec: 37691.2, 60 sec: 39594.7, 300 sec: 39710.4). Total num frames: 172654592. Throughput: 0: 39622.1. Samples: 172827940. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-17 23:13:11,995][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:13:12,485][12883] Updated weights for policy 0, policy_version 10540 (0.0052) +[2024-06-17 23:13:16,137][12883] Updated weights for policy 0, policy_version 10550 (0.0031) +[2024-06-17 23:13:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39594.7, 300 sec: 39765.9). Total num frames: 172867584. Throughput: 0: 39446.8. Samples: 172940300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-17 23:13:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:13:21,052][12883] Updated weights for policy 0, policy_version 10560 (0.0037) +[2024-06-17 23:13:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 39867.7, 300 sec: 39821.4). Total num frames: 173080576. Throughput: 0: 39685.2. Samples: 173188360. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) +[2024-06-17 23:13:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:13:24,299][12883] Updated weights for policy 0, policy_version 10570 (0.0024) +[2024-06-17 23:13:26,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39594.7, 300 sec: 39765.9). Total num frames: 173260800. Throughput: 0: 39600.9. Samples: 173421240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) +[2024-06-17 23:13:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:13:29,358][12883] Updated weights for policy 0, policy_version 10580 (0.0030) +[2024-06-17 23:13:29,699][12862] Signal inference workers to stop experience collection... (2450 times) +[2024-06-17 23:13:29,699][12862] Signal inference workers to resume experience collection... (2450 times) +[2024-06-17 23:13:29,729][12883] InferenceWorker_p0-w0: stopping experience collection (2450 times) +[2024-06-17 23:13:29,730][12883] InferenceWorker_p0-w0: resuming experience collection (2450 times) +[2024-06-17 23:13:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40142.2, 300 sec: 39821.4). Total num frames: 173473792. Throughput: 0: 39700.0. 
Samples: 173546800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-17 23:13:31,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:13:32,832][12883] Updated weights for policy 0, policy_version 10590 (0.0055) +[2024-06-17 23:13:36,994][12645] Fps is (10 sec: 37682.7, 60 sec: 39321.5, 300 sec: 39654.8). Total num frames: 173637632. Throughput: 0: 39739.7. Samples: 173787060. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) +[2024-06-17 23:13:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:13:37,375][12883] Updated weights for policy 0, policy_version 10600 (0.0029) +[2024-06-17 23:13:40,862][12883] Updated weights for policy 0, policy_version 10610 (0.0046) +[2024-06-17 23:13:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39594.6, 300 sec: 39877.0). Total num frames: 173883392. Throughput: 0: 39606.7. Samples: 174021920. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) +[2024-06-17 23:13:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:13:45,538][12883] Updated weights for policy 0, policy_version 10620 (0.0050) +[2024-06-17 23:13:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 39867.7, 300 sec: 39877.0). Total num frames: 174080000. Throughput: 0: 39785.0. Samples: 174149220. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) +[2024-06-17 23:13:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:13:48,957][12883] Updated weights for policy 0, policy_version 10630 (0.0039) +[2024-06-17 23:13:51,994][12645] Fps is (10 sec: 36045.1, 60 sec: 39323.1, 300 sec: 39710.4). Total num frames: 174243840. Throughput: 0: 39804.5. Samples: 174377480. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) +[2024-06-17 23:13:52,004][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:13:53,823][12883] Updated weights for policy 0, policy_version 10640 (0.0035) +[2024-06-17 23:13:56,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39594.7, 300 sec: 39821.4). Total num frames: 174456832. Throughput: 0: 39708.9. 
Samples: 174614840. Policy #0 lag: (min: 1.0, avg: 7.2, max: 19.0) +[2024-06-17 23:13:57,003][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:13:57,518][12883] Updated weights for policy 0, policy_version 10650 (0.0034) +[2024-06-17 23:14:01,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39050.0, 300 sec: 39599.3). Total num frames: 174620672. Throughput: 0: 39812.9. Samples: 174731880. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) +[2024-06-17 23:14:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:14:02,269][12883] Updated weights for policy 0, policy_version 10660 (0.0045) +[2024-06-17 23:14:05,405][12883] Updated weights for policy 0, policy_version 10670 (0.0043) +[2024-06-17 23:14:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40140.8, 300 sec: 39877.0). Total num frames: 174882816. Throughput: 0: 39650.7. Samples: 174972640. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) +[2024-06-17 23:14:07,000][12645] Avg episode reward: [(0, '0.018')] +[2024-06-17 23:14:10,622][12883] Updated weights for policy 0, policy_version 10680 (0.0039) +[2024-06-17 23:14:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 39867.7, 300 sec: 39821.5). Total num frames: 175046656. Throughput: 0: 39807.9. Samples: 175212600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-17 23:14:11,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:14:13,547][12883] Updated weights for policy 0, policy_version 10690 (0.0039) +[2024-06-17 23:14:16,994][12645] Fps is (10 sec: 34406.6, 60 sec: 39321.6, 300 sec: 39654.8). Total num frames: 175226880. Throughput: 0: 39465.8. Samples: 175322760. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-17 23:14:16,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:14:18,823][12883] Updated weights for policy 0, policy_version 10700 (0.0040) +[2024-06-17 23:14:21,992][12883] Updated weights for policy 0, policy_version 10710 (0.0044) +[2024-06-17 23:14:21,994][12645] Fps is (10 sec: 42599.2, 60 sec: 39867.9, 300 sec: 39877.0). Total num frames: 175472640. Throughput: 0: 39532.2. Samples: 175566000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) +[2024-06-17 23:14:21,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:14:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39321.5, 300 sec: 39710.4). Total num frames: 175620096. Throughput: 0: 39836.0. Samples: 175814540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) +[2024-06-17 23:14:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:14:27,119][12883] Updated weights for policy 0, policy_version 10720 (0.0034) +[2024-06-17 23:14:29,943][12883] Updated weights for policy 0, policy_version 10730 (0.0036) +[2024-06-17 23:14:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40140.8, 300 sec: 39877.3). Total num frames: 175882240. Throughput: 0: 39381.0. Samples: 175921360. Policy #0 lag: (min: 1.0, avg: 9.2, max: 24.0) +[2024-06-17 23:14:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:14:35,331][12883] Updated weights for policy 0, policy_version 10740 (0.0049) +[2024-06-17 23:14:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40140.9, 300 sec: 39821.5). Total num frames: 176046080. Throughput: 0: 39965.3. Samples: 176175920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-17 23:14:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:14:37,100][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000010746_176062464.pth... 
+[2024-06-17 23:14:37,154][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000010164_166526976.pth +[2024-06-17 23:14:38,141][12883] Updated weights for policy 0, policy_version 10750 (0.0047) +[2024-06-17 23:14:41,994][12645] Fps is (10 sec: 34406.5, 60 sec: 39048.6, 300 sec: 39654.8). Total num frames: 176226304. Throughput: 0: 39911.6. Samples: 176410860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-17 23:14:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:14:43,425][12883] Updated weights for policy 0, policy_version 10760 (0.0038) +[2024-06-17 23:14:46,080][12862] Signal inference workers to stop experience collection... (2500 times) +[2024-06-17 23:14:46,080][12862] Signal inference workers to resume experience collection... (2500 times) +[2024-06-17 23:14:46,128][12883] InferenceWorker_p0-w0: stopping experience collection (2500 times) +[2024-06-17 23:14:46,128][12883] InferenceWorker_p0-w0: resuming experience collection (2500 times) +[2024-06-17 23:14:46,211][12883] Updated weights for policy 0, policy_version 10770 (0.0025) +[2024-06-17 23:14:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 39867.8, 300 sec: 39821.4). Total num frames: 176472064. Throughput: 0: 40015.9. Samples: 176532600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 23:14:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:14:51,910][12883] Updated weights for policy 0, policy_version 10780 (0.0035) +[2024-06-17 23:14:51,996][12645] Fps is (10 sec: 39312.8, 60 sec: 39593.2, 300 sec: 39710.1). Total num frames: 176619520. Throughput: 0: 39991.9. Samples: 176772360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 23:14:51,997][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:14:54,576][12883] Updated weights for policy 0, policy_version 10790 (0.0037) +[2024-06-17 23:14:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.8, 300 sec: 39765.9). Total num frames: 176865280. 
Throughput: 0: 39754.7. Samples: 177001560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-17 23:14:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:15:00,345][12883] Updated weights for policy 0, policy_version 10800 (0.0031) +[2024-06-17 23:15:01,994][12645] Fps is (10 sec: 42607.8, 60 sec: 40413.9, 300 sec: 39765.9). Total num frames: 177045504. Throughput: 0: 40156.5. Samples: 177129800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) +[2024-06-17 23:15:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:15:02,898][12883] Updated weights for policy 0, policy_version 10810 (0.0053) +[2024-06-17 23:15:06,996][12645] Fps is (10 sec: 37675.2, 60 sec: 39320.2, 300 sec: 39765.6). Total num frames: 177242112. Throughput: 0: 39896.2. Samples: 177361420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) +[2024-06-17 23:15:06,996][12645] Avg episode reward: [(0, '0.013')] +[2024-06-17 23:15:08,362][12883] Updated weights for policy 0, policy_version 10820 (0.0036) +[2024-06-17 23:15:11,222][12883] Updated weights for policy 0, policy_version 10830 (0.0041) +[2024-06-17 23:15:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40140.8, 300 sec: 39765.9). Total num frames: 177455104. Throughput: 0: 39647.1. Samples: 177598660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:15:11,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:15:16,309][12883] Updated weights for policy 0, policy_version 10840 (0.0052) +[2024-06-17 23:15:16,994][12645] Fps is (10 sec: 37691.2, 60 sec: 39867.7, 300 sec: 39765.9). Total num frames: 177618944. Throughput: 0: 39988.8. Samples: 177720860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:15:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:15:19,818][12883] Updated weights for policy 0, policy_version 10850 (0.0036) +[2024-06-17 23:15:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39867.6, 300 sec: 39821.4). Total num frames: 177864704. 
Throughput: 0: 39476.4. Samples: 177952360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-17 23:15:21,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:15:24,957][12883] Updated weights for policy 0, policy_version 10860 (0.0031) +[2024-06-17 23:15:26,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39867.8, 300 sec: 39710.4). Total num frames: 178012160. Throughput: 0: 39642.2. Samples: 178194760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-17 23:15:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:15:28,253][12883] Updated weights for policy 0, policy_version 10870 (0.0034) +[2024-06-17 23:15:31,994][12645] Fps is (10 sec: 36045.1, 60 sec: 39048.6, 300 sec: 39710.4). Total num frames: 178225152. Throughput: 0: 39428.6. Samples: 178306880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-17 23:15:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:15:33,259][12883] Updated weights for policy 0, policy_version 10880 (0.0036) +[2024-06-17 23:15:36,495][12883] Updated weights for policy 0, policy_version 10890 (0.0048) +[2024-06-17 23:15:36,994][12645] Fps is (10 sec: 40959.4, 60 sec: 39594.6, 300 sec: 39765.9). Total num frames: 178421760. Throughput: 0: 39397.0. Samples: 178545140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-17 23:15:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:15:41,474][12883] Updated weights for policy 0, policy_version 10900 (0.0044) +[2024-06-17 23:15:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39867.7, 300 sec: 39765.9). Total num frames: 178618368. Throughput: 0: 39684.5. Samples: 178787360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-17 23:15:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:15:44,763][12883] Updated weights for policy 0, policy_version 10910 (0.0046) +[2024-06-17 23:15:46,994][12645] Fps is (10 sec: 42599.1, 60 sec: 39594.7, 300 sec: 39765.9). Total num frames: 178847744. 
Throughput: 0: 39424.0. Samples: 178903880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-17 23:15:46,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:15:49,587][12883] Updated weights for policy 0, policy_version 10920 (0.0043) +[2024-06-17 23:15:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40142.3, 300 sec: 39765.9). Total num frames: 179027968. Throughput: 0: 39604.2. Samples: 179143520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-17 23:15:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:15:52,842][12883] Updated weights for policy 0, policy_version 10930 (0.0043) +[2024-06-17 23:15:56,994][12645] Fps is (10 sec: 36044.5, 60 sec: 39048.6, 300 sec: 39765.9). Total num frames: 179208192. Throughput: 0: 39435.6. Samples: 179373260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-17 23:15:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:15:58,016][12883] Updated weights for policy 0, policy_version 10940 (0.0038) +[2024-06-17 23:16:01,122][12883] Updated weights for policy 0, policy_version 10950 (0.0038) +[2024-06-17 23:16:01,994][12645] Fps is (10 sec: 39321.0, 60 sec: 39594.6, 300 sec: 39710.3). Total num frames: 179421184. Throughput: 0: 39349.3. Samples: 179491580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-17 23:16:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:16:06,205][12883] Updated weights for policy 0, policy_version 10960 (0.0032) +[2024-06-17 23:16:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39596.1, 300 sec: 39655.1). Total num frames: 179617792. Throughput: 0: 39646.2. Samples: 179736440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 23:16:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:16:09,343][12883] Updated weights for policy 0, policy_version 10970 (0.0027) +[2024-06-17 23:16:11,994][12645] Fps is (10 sec: 39322.3, 60 sec: 39321.7, 300 sec: 39710.4). Total num frames: 179814400. 
Throughput: 0: 39561.8. Samples: 179975040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-17 23:16:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:16:12,982][12862] Signal inference workers to stop experience collection... (2550 times) +[2024-06-17 23:16:12,982][12862] Signal inference workers to resume experience collection... (2550 times) +[2024-06-17 23:16:13,003][12883] InferenceWorker_p0-w0: stopping experience collection (2550 times) +[2024-06-17 23:16:13,008][12883] InferenceWorker_p0-w0: resuming experience collection (2550 times) +[2024-06-17 23:16:14,263][12883] Updated weights for policy 0, policy_version 10980 (0.0049) +[2024-06-17 23:16:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.9, 300 sec: 39821.5). Total num frames: 180027392. Throughput: 0: 39788.4. Samples: 180097360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-17 23:16:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:16:17,608][12883] Updated weights for policy 0, policy_version 10990 (0.0035) +[2024-06-17 23:16:21,994][12645] Fps is (10 sec: 37682.7, 60 sec: 38775.4, 300 sec: 39710.3). Total num frames: 180191232. Throughput: 0: 39837.4. Samples: 180337820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-17 23:16:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:16:22,767][12883] Updated weights for policy 0, policy_version 11000 (0.0035) +[2024-06-17 23:16:25,672][12883] Updated weights for policy 0, policy_version 11010 (0.0051) +[2024-06-17 23:16:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40413.8, 300 sec: 39765.9). Total num frames: 180436992. Throughput: 0: 39701.3. Samples: 180573920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-17 23:16:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:16:30,797][12883] Updated weights for policy 0, policy_version 11020 (0.0033) +[2024-06-17 23:16:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 40140.7, 300 sec: 39710.4). 
Total num frames: 180633600. Throughput: 0: 39907.4. Samples: 180699720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) +[2024-06-17 23:16:31,995][12645] Avg episode reward: [(0, '0.014')] +[2024-06-17 23:16:33,723][12883] Updated weights for policy 0, policy_version 11030 (0.0038) +[2024-06-17 23:16:36,994][12645] Fps is (10 sec: 34406.4, 60 sec: 39321.7, 300 sec: 39654.8). Total num frames: 180781056. Throughput: 0: 39769.8. Samples: 180933160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) +[2024-06-17 23:16:36,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-17 23:16:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011035_180797440.pth... +[2024-06-17 23:16:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000010455_171294720.pth +[2024-06-17 23:16:38,960][12883] Updated weights for policy 0, policy_version 11040 (0.0046) +[2024-06-17 23:16:41,991][12883] Updated weights for policy 0, policy_version 11050 (0.0039) +[2024-06-17 23:16:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40413.8, 300 sec: 39821.4). Total num frames: 181043200. Throughput: 0: 39896.4. Samples: 181168600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-17 23:16:41,995][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:16:47,000][12645] Fps is (10 sec: 40934.1, 60 sec: 39044.4, 300 sec: 39654.0). Total num frames: 181190656. Throughput: 0: 40157.1. Samples: 181298900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) +[2024-06-17 23:16:47,001][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:16:47,158][12883] Updated weights for policy 0, policy_version 11060 (0.0037) +[2024-06-17 23:16:49,923][12883] Updated weights for policy 0, policy_version 11070 (0.0029) +[2024-06-17 23:16:51,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.7, 300 sec: 39710.7). Total num frames: 181420032. Throughput: 0: 39855.6. Samples: 181529940. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) +[2024-06-17 23:16:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:16:55,325][12883] Updated weights for policy 0, policy_version 11080 (0.0036) +[2024-06-17 23:16:56,994][12645] Fps is (10 sec: 42624.8, 60 sec: 40140.7, 300 sec: 39655.1). Total num frames: 181616640. Throughput: 0: 39851.9. Samples: 181768380. Policy #0 lag: (min: 1.0, avg: 6.9, max: 19.0) +[2024-06-17 23:16:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:16:57,869][12883] Updated weights for policy 0, policy_version 11090 (0.0043) +[2024-06-17 23:17:01,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39594.7, 300 sec: 39710.4). Total num frames: 181796864. Throughput: 0: 39844.4. Samples: 181890360. Policy #0 lag: (min: 1.0, avg: 6.9, max: 19.0) +[2024-06-17 23:17:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:17:03,249][12883] Updated weights for policy 0, policy_version 11100 (0.0025) +[2024-06-17 23:17:06,483][12883] Updated weights for policy 0, policy_version 11110 (0.0037) +[2024-06-17 23:17:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.8, 300 sec: 39821.5). Total num frames: 182026240. Throughput: 0: 39800.5. Samples: 182128840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-17 23:17:07,000][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:17:11,237][12883] Updated weights for policy 0, policy_version 11120 (0.0045) +[2024-06-17 23:17:11,994][12645] Fps is (10 sec: 40957.9, 60 sec: 39867.4, 300 sec: 39710.3). Total num frames: 182206464. Throughput: 0: 39922.6. Samples: 182370460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-17 23:17:11,995][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:17:14,756][12883] Updated weights for policy 0, policy_version 11130 (0.0036) +[2024-06-17 23:17:16,994][12645] Fps is (10 sec: 36044.5, 60 sec: 39321.5, 300 sec: 39654.8). Total num frames: 182386688. Throughput: 0: 39695.1. Samples: 182486000. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-17 23:17:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:17:19,796][12883] Updated weights for policy 0, policy_version 11140 (0.0042) +[2024-06-17 23:17:21,994][12645] Fps is (10 sec: 40962.5, 60 sec: 40414.0, 300 sec: 39765.9). Total num frames: 182616064. Throughput: 0: 39873.9. Samples: 182727480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-17 23:17:21,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:17:23,355][12883] Updated weights for policy 0, policy_version 11150 (0.0029) +[2024-06-17 23:17:26,994][12645] Fps is (10 sec: 39322.4, 60 sec: 39048.6, 300 sec: 39710.7). Total num frames: 182779904. Throughput: 0: 40000.6. Samples: 182968620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-17 23:17:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:17:27,989][12883] Updated weights for policy 0, policy_version 11160 (0.0055) +[2024-06-17 23:17:28,392][12862] Signal inference workers to stop experience collection... (2600 times) +[2024-06-17 23:17:28,435][12883] InferenceWorker_p0-w0: stopping experience collection (2600 times) +[2024-06-17 23:17:28,507][12862] Signal inference workers to resume experience collection... (2600 times) +[2024-06-17 23:17:28,508][12883] InferenceWorker_p0-w0: resuming experience collection (2600 times) +[2024-06-17 23:17:31,464][12883] Updated weights for policy 0, policy_version 11170 (0.0036) +[2024-06-17 23:17:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39867.9, 300 sec: 39821.5). Total num frames: 183025664. Throughput: 0: 39676.3. Samples: 183084080. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) +[2024-06-17 23:17:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:17:35,920][12883] Updated weights for policy 0, policy_version 11180 (0.0037) +[2024-06-17 23:17:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40413.9, 300 sec: 39654.8). Total num frames: 183205888. 
Throughput: 0: 39965.8. Samples: 183328400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) +[2024-06-17 23:17:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:17:39,999][12883] Updated weights for policy 0, policy_version 11190 (0.0040) +[2024-06-17 23:17:41,994][12645] Fps is (10 sec: 36044.4, 60 sec: 39048.6, 300 sec: 39654.9). Total num frames: 183386112. Throughput: 0: 40049.5. Samples: 183570600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-17 23:17:41,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-17 23:17:44,007][12883] Updated weights for policy 0, policy_version 11200 (0.0047) +[2024-06-17 23:17:46,994][12645] Fps is (10 sec: 42597.5, 60 sec: 40691.1, 300 sec: 39821.7). Total num frames: 183631872. Throughput: 0: 40014.5. Samples: 183691020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-17 23:17:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:17:48,008][12883] Updated weights for policy 0, policy_version 11210 (0.0034) +[2024-06-17 23:17:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 39867.7, 300 sec: 39765.9). Total num frames: 183812096. Throughput: 0: 40032.5. Samples: 183930300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 23:17:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:17:52,097][12883] Updated weights for policy 0, policy_version 11220 (0.0044) +[2024-06-17 23:17:55,835][12883] Updated weights for policy 0, policy_version 11230 (0.0031) +[2024-06-17 23:17:56,996][12645] Fps is (10 sec: 36037.2, 60 sec: 39593.3, 300 sec: 39710.4). Total num frames: 183992320. Throughput: 0: 40081.1. Samples: 184174180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 23:17:56,997][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:18:00,093][12883] Updated weights for policy 0, policy_version 11240 (0.0035) +[2024-06-17 23:18:01,994][12645] Fps is (10 sec: 44236.5, 60 sec: 40960.0, 300 sec: 39932.5). Total num frames: 184254464. 
Throughput: 0: 40155.2. Samples: 184292980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-17 23:18:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:18:04,573][12883] Updated weights for policy 0, policy_version 11250 (0.0051) +[2024-06-17 23:18:06,994][12645] Fps is (10 sec: 42607.4, 60 sec: 39867.7, 300 sec: 39877.0). Total num frames: 184418304. Throughput: 0: 40387.3. Samples: 184544920. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-17 23:18:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:18:08,112][12883] Updated weights for policy 0, policy_version 11260 (0.0045) +[2024-06-17 23:18:11,994][12645] Fps is (10 sec: 36044.8, 60 sec: 40141.1, 300 sec: 39821.5). Total num frames: 184614912. Throughput: 0: 40191.5. Samples: 184777240. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-17 23:18:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:18:13,149][12883] Updated weights for policy 0, policy_version 11270 (0.0043) +[2024-06-17 23:18:16,197][12883] Updated weights for policy 0, policy_version 11280 (0.0035) +[2024-06-17 23:18:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40687.0, 300 sec: 39821.5). Total num frames: 184827904. Throughput: 0: 40386.0. Samples: 184901460. Policy #0 lag: (min: 2.0, avg: 10.7, max: 24.0) +[2024-06-17 23:18:16,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:18:21,261][12883] Updated weights for policy 0, policy_version 11290 (0.0041) +[2024-06-17 23:18:21,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39594.6, 300 sec: 39765.9). Total num frames: 184991744. Throughput: 0: 40360.3. Samples: 185144620. Policy #0 lag: (min: 2.0, avg: 10.7, max: 24.0) +[2024-06-17 23:18:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:18:24,493][12883] Updated weights for policy 0, policy_version 11300 (0.0045) +[2024-06-17 23:18:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40959.9, 300 sec: 39877.0). Total num frames: 185237504. 
Throughput: 0: 40215.5. Samples: 185380300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 23:18:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:18:29,545][12883] Updated weights for policy 0, policy_version 11310 (0.0035) +[2024-06-17 23:18:31,994][12645] Fps is (10 sec: 44237.4, 60 sec: 40140.7, 300 sec: 39988.1). Total num frames: 185434112. Throughput: 0: 40424.6. Samples: 185510120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 23:18:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:18:32,804][12883] Updated weights for policy 0, policy_version 11320 (0.0042) +[2024-06-17 23:18:36,994][12645] Fps is (10 sec: 36044.8, 60 sec: 39867.6, 300 sec: 39710.4). Total num frames: 185597952. Throughput: 0: 40289.7. Samples: 185743340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 23:18:36,999][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:18:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011328_185597952.pth... +[2024-06-17 23:18:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000010746_176062464.pth +[2024-06-17 23:18:37,684][12883] Updated weights for policy 0, policy_version 11330 (0.0032) +[2024-06-17 23:18:40,783][12883] Updated weights for policy 0, policy_version 11340 (0.0032) +[2024-06-17 23:18:41,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40959.9, 300 sec: 39877.0). Total num frames: 185843712. Throughput: 0: 40080.6. Samples: 185977720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-17 23:18:42,000][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:18:45,689][12883] Updated weights for policy 0, policy_version 11350 (0.0044) +[2024-06-17 23:18:46,994][12645] Fps is (10 sec: 39322.5, 60 sec: 39321.8, 300 sec: 39821.5). Total num frames: 185991168. Throughput: 0: 40220.6. Samples: 186102900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-17 23:18:46,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:18:49,018][12883] Updated weights for policy 0, policy_version 11360 (0.0029) +[2024-06-17 23:18:52,000][12645] Fps is (10 sec: 39297.4, 60 sec: 40409.6, 300 sec: 39931.7). Total num frames: 186236928. Throughput: 0: 39786.2. Samples: 186335540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-17 23:18:52,001][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:18:54,382][12883] Updated weights for policy 0, policy_version 11370 (0.0033) +[2024-06-17 23:18:56,394][12862] Signal inference workers to stop experience collection... (2650 times) +[2024-06-17 23:18:56,395][12862] Signal inference workers to resume experience collection... (2650 times) +[2024-06-17 23:18:56,412][12883] InferenceWorker_p0-w0: stopping experience collection (2650 times) +[2024-06-17 23:18:56,412][12883] InferenceWorker_p0-w0: resuming experience collection (2650 times) +[2024-06-17 23:18:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40415.5, 300 sec: 39988.1). Total num frames: 186417152. Throughput: 0: 39861.5. Samples: 186571000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-17 23:18:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:18:57,157][12883] Updated weights for policy 0, policy_version 11380 (0.0035) +[2024-06-17 23:19:01,994][12645] Fps is (10 sec: 36067.3, 60 sec: 39048.6, 300 sec: 39710.4). Total num frames: 186597376. Throughput: 0: 39741.4. Samples: 186689820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-17 23:19:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:19:02,765][12883] Updated weights for policy 0, policy_version 11390 (0.0029) +[2024-06-17 23:19:05,210][12883] Updated weights for policy 0, policy_version 11400 (0.0033) +[2024-06-17 23:19:06,994][12645] Fps is (10 sec: 40959.0, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 186826752. 
Throughput: 0: 39519.1. Samples: 186922980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-17 23:19:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:19:10,647][12883] Updated weights for policy 0, policy_version 11410 (0.0026) +[2024-06-17 23:19:11,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39594.8, 300 sec: 39877.0). Total num frames: 186990592. Throughput: 0: 39739.7. Samples: 187168580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) +[2024-06-17 23:19:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:19:13,613][12883] Updated weights for policy 0, policy_version 11420 (0.0048) +[2024-06-17 23:19:16,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39594.6, 300 sec: 39765.9). Total num frames: 187203584. Throughput: 0: 39359.4. Samples: 187281300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-17 23:19:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:19:18,469][12883] Updated weights for policy 0, policy_version 11430 (0.0036) +[2024-06-17 23:19:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40140.9, 300 sec: 39932.5). Total num frames: 187400192. Throughput: 0: 39668.5. Samples: 187528420. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-17 23:19:21,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:19:22,223][12883] Updated weights for policy 0, policy_version 11440 (0.0041) +[2024-06-17 23:19:26,318][12883] Updated weights for policy 0, policy_version 11450 (0.0040) +[2024-06-17 23:19:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39321.7, 300 sec: 39710.4). Total num frames: 187596800. Throughput: 0: 39849.0. Samples: 187770920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-17 23:19:26,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:19:30,307][12883] Updated weights for policy 0, policy_version 11460 (0.0036) +[2024-06-17 23:19:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.7, 300 sec: 39932.5). Total num frames: 187826176. 
Throughput: 0: 39809.2. Samples: 187894320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-17 23:19:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:19:34,809][12883] Updated weights for policy 0, policy_version 11470 (0.0042) +[2024-06-17 23:19:36,994][12645] Fps is (10 sec: 37683.6, 60 sec: 39594.8, 300 sec: 39821.5). Total num frames: 187973632. Throughput: 0: 39635.4. Samples: 188118880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-17 23:19:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:19:38,908][12883] Updated weights for policy 0, policy_version 11480 (0.0041) +[2024-06-17 23:19:41,994][12645] Fps is (10 sec: 36044.4, 60 sec: 39048.5, 300 sec: 39710.4). Total num frames: 188186624. Throughput: 0: 39799.7. Samples: 188362000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-17 23:19:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:19:42,817][12883] Updated weights for policy 0, policy_version 11490 (0.0038) +[2024-06-17 23:19:46,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40140.7, 300 sec: 39932.8). Total num frames: 188399616. Throughput: 0: 39867.5. Samples: 188483860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 23:19:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:19:47,156][12883] Updated weights for policy 0, policy_version 11500 (0.0032) +[2024-06-17 23:19:51,065][12883] Updated weights for policy 0, policy_version 11510 (0.0044) +[2024-06-17 23:19:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 39598.8, 300 sec: 39821.5). Total num frames: 188612608. Throughput: 0: 39955.2. Samples: 188720960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 23:19:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:19:55,219][12883] Updated weights for policy 0, policy_version 11520 (0.0036) +[2024-06-17 23:19:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39594.6, 300 sec: 39821.5). Total num frames: 188792832. 
Throughput: 0: 39744.4. Samples: 188957080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-17 23:19:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:19:59,229][12883] Updated weights for policy 0, policy_version 11530 (0.0049) +[2024-06-17 23:20:01,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39867.7, 300 sec: 39821.7). Total num frames: 188989440. Throughput: 0: 39782.3. Samples: 189071500. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-17 23:20:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:20:03,939][12883] Updated weights for policy 0, policy_version 11540 (0.0026) +[2024-06-17 23:20:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39594.7, 300 sec: 39821.4). Total num frames: 189202432. Throughput: 0: 39569.3. Samples: 189309040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-17 23:20:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:20:07,361][12883] Updated weights for policy 0, policy_version 11550 (0.0048) +[2024-06-17 23:20:11,994][12645] Fps is (10 sec: 39322.4, 60 sec: 39867.7, 300 sec: 39877.0). Total num frames: 189382656. Throughput: 0: 39676.1. Samples: 189556340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 23:20:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:20:12,020][12883] Updated weights for policy 0, policy_version 11560 (0.0037) +[2024-06-17 23:20:16,157][12883] Updated weights for policy 0, policy_version 11570 (0.0049) +[2024-06-17 23:20:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39867.8, 300 sec: 39765.9). Total num frames: 189595648. Throughput: 0: 39529.8. Samples: 189673160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 23:20:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:20:20,220][12883] Updated weights for policy 0, policy_version 11580 (0.0048) +[2024-06-17 23:20:21,996][12645] Fps is (10 sec: 42588.2, 60 sec: 40139.3, 300 sec: 39987.8). Total num frames: 189808640. 
Throughput: 0: 39948.1. Samples: 189916640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 23:20:21,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:20:24,104][12883] Updated weights for policy 0, policy_version 11590 (0.0037) +[2024-06-17 23:20:24,878][12862] Signal inference workers to stop experience collection... (2700 times) +[2024-06-17 23:20:24,878][12862] Signal inference workers to resume experience collection... (2700 times) +[2024-06-17 23:20:24,894][12883] InferenceWorker_p0-w0: stopping experience collection (2700 times) +[2024-06-17 23:20:24,926][12883] InferenceWorker_p0-w0: resuming experience collection (2700 times) +[2024-06-17 23:20:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39867.7, 300 sec: 39877.0). Total num frames: 189988864. Throughput: 0: 39821.9. Samples: 190153980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 23:20:26,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:20:28,859][12883] Updated weights for policy 0, policy_version 11600 (0.0033) +[2024-06-17 23:20:31,994][12645] Fps is (10 sec: 39330.5, 60 sec: 39594.7, 300 sec: 39932.5). Total num frames: 190201856. Throughput: 0: 39797.8. Samples: 190274760. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) +[2024-06-17 23:20:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:20:32,135][12883] Updated weights for policy 0, policy_version 11610 (0.0031) +[2024-06-17 23:20:36,994][12645] Fps is (10 sec: 37683.8, 60 sec: 39867.7, 300 sec: 39821.5). Total num frames: 190365696. Throughput: 0: 39875.2. Samples: 190515340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) +[2024-06-17 23:20:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:20:37,065][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011620_190382080.pth... 
+[2024-06-17 23:20:37,072][12883] Updated weights for policy 0, policy_version 11620 (0.0035) +[2024-06-17 23:20:37,124][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011035_180797440.pth +[2024-06-17 23:20:40,637][12883] Updated weights for policy 0, policy_version 11630 (0.0039) +[2024-06-17 23:20:41,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40140.8, 300 sec: 39821.4). Total num frames: 190595072. Throughput: 0: 39848.8. Samples: 190750280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-17 23:20:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:20:45,291][12883] Updated weights for policy 0, policy_version 11640 (0.0027) +[2024-06-17 23:20:46,994][12645] Fps is (10 sec: 44235.8, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 190808064. Throughput: 0: 40084.4. Samples: 190875300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-17 23:20:46,998][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:20:48,661][12883] Updated weights for policy 0, policy_version 11650 (0.0037) +[2024-06-17 23:20:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39867.7, 300 sec: 39988.1). Total num frames: 191004672. Throughput: 0: 40145.8. Samples: 191115600. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) +[2024-06-17 23:20:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:20:53,412][12883] Updated weights for policy 0, policy_version 11660 (0.0046) +[2024-06-17 23:20:56,693][12883] Updated weights for policy 0, policy_version 11670 (0.0031) +[2024-06-17 23:20:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.7, 300 sec: 39932.5). Total num frames: 191201280. Throughput: 0: 39918.9. Samples: 191352700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-17 23:20:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:21:01,651][12883] Updated weights for policy 0, policy_version 11680 (0.0031) +[2024-06-17 23:21:01,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39867.8, 300 sec: 39877.0). Total num frames: 191381504. Throughput: 0: 40057.3. Samples: 191475740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-17 23:21:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:21:05,120][12883] Updated weights for policy 0, policy_version 11690 (0.0043) +[2024-06-17 23:21:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39867.8, 300 sec: 39932.5). Total num frames: 191594496. Throughput: 0: 40025.6. Samples: 191717700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-17 23:21:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:21:09,602][12883] Updated weights for policy 0, policy_version 11700 (0.0033) +[2024-06-17 23:21:11,996][12645] Fps is (10 sec: 44227.0, 60 sec: 40685.4, 300 sec: 39987.8). Total num frames: 191823872. Throughput: 0: 40184.3. Samples: 191962360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-17 23:21:11,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:21:12,985][12883] Updated weights for policy 0, policy_version 11710 (0.0035) +[2024-06-17 23:21:16,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39867.6, 300 sec: 39988.1). Total num frames: 191987712. Throughput: 0: 40299.4. Samples: 192088240. Policy #0 lag: (min: 2.0, avg: 9.2, max: 20.0) +[2024-06-17 23:21:16,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-17 23:21:17,770][12883] Updated weights for policy 0, policy_version 11720 (0.0043) +[2024-06-17 23:21:20,819][12883] Updated weights for policy 0, policy_version 11730 (0.0033) +[2024-06-17 23:21:21,994][12645] Fps is (10 sec: 37691.3, 60 sec: 39869.2, 300 sec: 39877.0). Total num frames: 192200704. Throughput: 0: 40105.6. Samples: 192320100. 
Policy #0 lag: (min: 2.0, avg: 9.2, max: 20.0) +[2024-06-17 23:21:21,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:21:25,704][12883] Updated weights for policy 0, policy_version 11740 (0.0036) +[2024-06-17 23:21:26,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40140.9, 300 sec: 39877.0). Total num frames: 192397312. Throughput: 0: 40332.6. Samples: 192565240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 23:21:26,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:21:28,891][12883] Updated weights for policy 0, policy_version 11750 (0.0027) +[2024-06-17 23:21:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39867.8, 300 sec: 40043.6). Total num frames: 192593920. Throughput: 0: 40185.4. Samples: 192683640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 23:21:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:21:33,768][12883] Updated weights for policy 0, policy_version 11760 (0.0036) +[2024-06-17 23:21:36,804][12883] Updated weights for policy 0, policy_version 11770 (0.0043) +[2024-06-17 23:21:37,000][12645] Fps is (10 sec: 44209.0, 60 sec: 41228.7, 300 sec: 39987.2). Total num frames: 192839680. Throughput: 0: 40275.7. Samples: 192928260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-17 23:21:37,001][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:21:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39867.8, 300 sec: 39988.9). Total num frames: 192987136. Throughput: 0: 40332.1. Samples: 193167640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-17 23:21:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:21:42,163][12883] Updated weights for policy 0, policy_version 11780 (0.0036) +[2024-06-17 23:21:45,025][12883] Updated weights for policy 0, policy_version 11790 (0.0034) +[2024-06-17 23:21:46,994][12645] Fps is (10 sec: 37706.9, 60 sec: 40140.9, 300 sec: 39988.1). Total num frames: 193216512. Throughput: 0: 40103.1. Samples: 193280380. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) +[2024-06-17 23:21:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:21:50,683][12883] Updated weights for policy 0, policy_version 11800 (0.0034) +[2024-06-17 23:21:51,994][12645] Fps is (10 sec: 40959.4, 60 sec: 39867.7, 300 sec: 39932.5). Total num frames: 193396736. Throughput: 0: 40269.7. Samples: 193529840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-17 23:21:51,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:21:52,485][12862] Signal inference workers to stop experience collection... (2750 times) +[2024-06-17 23:21:52,529][12862] Signal inference workers to resume experience collection... (2750 times) +[2024-06-17 23:21:52,530][12883] InferenceWorker_p0-w0: stopping experience collection (2750 times) +[2024-06-17 23:21:52,542][12883] InferenceWorker_p0-w0: resuming experience collection (2750 times) +[2024-06-17 23:21:52,995][12883] Updated weights for policy 0, policy_version 11810 (0.0032) +[2024-06-17 23:21:56,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39594.7, 300 sec: 39932.5). Total num frames: 193576960. Throughput: 0: 40024.2. Samples: 193763360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-17 23:21:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:21:58,955][12883] Updated weights for policy 0, policy_version 11820 (0.0042) +[2024-06-17 23:22:01,477][12883] Updated weights for policy 0, policy_version 11830 (0.0032) +[2024-06-17 23:22:01,994][12645] Fps is (10 sec: 44237.5, 60 sec: 40960.0, 300 sec: 40043.6). Total num frames: 193839104. Throughput: 0: 39862.8. Samples: 193882060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-17 23:22:01,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-17 23:22:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39594.7, 300 sec: 39877.1). Total num frames: 193970176. Throughput: 0: 40225.4. Samples: 194130240. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-17 23:22:06,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:22:07,098][12883] Updated weights for policy 0, policy_version 11840 (0.0035) +[2024-06-17 23:22:09,510][12883] Updated weights for policy 0, policy_version 11850 (0.0041) +[2024-06-17 23:22:11,994][12645] Fps is (10 sec: 36044.9, 60 sec: 39596.2, 300 sec: 40043.6). Total num frames: 194199552. Throughput: 0: 40021.8. Samples: 194366220. Policy #0 lag: (min: 1.0, avg: 12.6, max: 20.0) +[2024-06-17 23:22:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:22:15,118][12883] Updated weights for policy 0, policy_version 11860 (0.0028) +[2024-06-17 23:22:16,994][12645] Fps is (10 sec: 45875.0, 60 sec: 40687.0, 300 sec: 40043.6). Total num frames: 194428928. Throughput: 0: 40237.8. Samples: 194494340. Policy #0 lag: (min: 1.0, avg: 12.6, max: 20.0) +[2024-06-17 23:22:16,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:22:17,707][12883] Updated weights for policy 0, policy_version 11870 (0.0053) +[2024-06-17 23:22:21,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39594.7, 300 sec: 39988.1). Total num frames: 194576384. Throughput: 0: 40108.7. Samples: 194732900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-17 23:22:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:22:23,187][12883] Updated weights for policy 0, policy_version 11880 (0.0029) +[2024-06-17 23:22:25,600][12883] Updated weights for policy 0, policy_version 11890 (0.0032) +[2024-06-17 23:22:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40413.8, 300 sec: 39988.0). Total num frames: 194822144. Throughput: 0: 39972.4. Samples: 194966400. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-17 23:22:26,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:22:31,198][12883] Updated weights for policy 0, policy_version 11900 (0.0036) +[2024-06-17 23:22:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 40413.9, 300 sec: 40043.6). Total num frames: 195018752. Throughput: 0: 40426.7. Samples: 195099580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 23:22:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:22:33,768][12883] Updated weights for policy 0, policy_version 11910 (0.0040) +[2024-06-17 23:22:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39598.8, 300 sec: 40099.1). Total num frames: 195215360. Throughput: 0: 40163.2. Samples: 195337180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 23:22:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:22:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011915_195215360.pth... +[2024-06-17 23:22:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011328_185597952.pth +[2024-06-17 23:22:39,175][12883] Updated weights for policy 0, policy_version 11920 (0.0045) +[2024-06-17 23:22:41,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40686.9, 300 sec: 39988.1). Total num frames: 195428352. Throughput: 0: 40198.2. Samples: 195572280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-17 23:22:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:22:42,264][12883] Updated weights for policy 0, policy_version 11930 (0.0033) +[2024-06-17 23:22:46,994][12645] Fps is (10 sec: 37683.6, 60 sec: 39594.7, 300 sec: 39932.5). Total num frames: 195592192. Throughput: 0: 40364.9. Samples: 195698480. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 22.0) +[2024-06-17 23:22:46,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:22:47,190][12883] Updated weights for policy 0, policy_version 11940 (0.0035) +[2024-06-17 23:22:51,107][12883] Updated weights for policy 0, policy_version 11950 (0.0036) +[2024-06-17 23:22:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40687.0, 300 sec: 40155.0). Total num frames: 195837952. Throughput: 0: 40075.9. Samples: 195933660. Policy #0 lag: (min: 1.0, avg: 8.2, max: 22.0) +[2024-06-17 23:22:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:22:55,279][12883] Updated weights for policy 0, policy_version 11960 (0.0048) +[2024-06-17 23:22:56,971][12862] Signal inference workers to stop experience collection... (2800 times) +[2024-06-17 23:22:56,972][12862] Signal inference workers to resume experience collection... (2800 times) +[2024-06-17 23:22:56,992][12883] InferenceWorker_p0-w0: stopping experience collection (2800 times) +[2024-06-17 23:22:56,992][12883] InferenceWorker_p0-w0: resuming experience collection (2800 times) +[2024-06-17 23:22:56,994][12645] Fps is (10 sec: 42597.2, 60 sec: 40686.8, 300 sec: 39877.0). Total num frames: 196018176. Throughput: 0: 40261.6. Samples: 196178000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-17 23:22:56,995][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:22:59,074][12883] Updated weights for policy 0, policy_version 11970 (0.0048) +[2024-06-17 23:23:01,994][12645] Fps is (10 sec: 36044.8, 60 sec: 39321.5, 300 sec: 39932.5). Total num frames: 196198400. Throughput: 0: 40098.6. Samples: 196298780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-17 23:23:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:23:03,304][12883] Updated weights for policy 0, policy_version 11980 (0.0035) +[2024-06-17 23:23:06,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40959.9, 300 sec: 40043.6). Total num frames: 196427776. 
Throughput: 0: 40167.5. Samples: 196540440. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-17 23:23:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:23:07,051][12883] Updated weights for policy 0, policy_version 11990 (0.0041) +[2024-06-17 23:23:11,711][12883] Updated weights for policy 0, policy_version 12000 (0.0041) +[2024-06-17 23:23:11,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40413.9, 300 sec: 39988.1). Total num frames: 196624384. Throughput: 0: 40417.0. Samples: 196785160. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-17 23:23:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:23:15,133][12883] Updated weights for policy 0, policy_version 12010 (0.0047) +[2024-06-17 23:23:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 196837376. Throughput: 0: 40089.3. Samples: 196903600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 23:23:16,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:23:19,551][12883] Updated weights for policy 0, policy_version 12020 (0.0043) +[2024-06-17 23:23:21,994][12645] Fps is (10 sec: 42597.6, 60 sec: 41233.0, 300 sec: 40043.6). Total num frames: 197050368. Throughput: 0: 40235.0. Samples: 197147760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 23:23:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:23:23,515][12883] Updated weights for policy 0, policy_version 12030 (0.0041) +[2024-06-17 23:23:26,994][12645] Fps is (10 sec: 36045.0, 60 sec: 39594.8, 300 sec: 39877.0). Total num frames: 197197824. Throughput: 0: 40445.5. Samples: 197392320. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 23:23:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:23:27,752][12883] Updated weights for policy 0, policy_version 12040 (0.0033) +[2024-06-17 23:23:31,790][12883] Updated weights for policy 0, policy_version 12050 (0.0030) +[2024-06-17 23:23:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 197443584. Throughput: 0: 40200.3. Samples: 197507500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 23:23:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:23:35,865][12883] Updated weights for policy 0, policy_version 12060 (0.0045) +[2024-06-17 23:23:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 40413.9, 300 sec: 39988.1). Total num frames: 197640192. Throughput: 0: 40478.3. Samples: 197755180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-17 23:23:36,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:23:39,862][12883] Updated weights for policy 0, policy_version 12070 (0.0037) +[2024-06-17 23:23:41,994][12645] Fps is (10 sec: 36045.4, 60 sec: 39594.8, 300 sec: 40043.6). Total num frames: 197804032. Throughput: 0: 40295.4. Samples: 197991280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-17 23:23:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:23:43,769][12883] Updated weights for policy 0, policy_version 12080 (0.0034) +[2024-06-17 23:23:46,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40413.8, 300 sec: 39933.4). Total num frames: 198017024. Throughput: 0: 40191.6. Samples: 198107400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-17 23:23:46,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:23:48,012][12883] Updated weights for policy 0, policy_version 12090 (0.0045) +[2024-06-17 23:23:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.8, 300 sec: 40043.6). Total num frames: 198230016. Throughput: 0: 40154.8. Samples: 198347400. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) +[2024-06-17 23:23:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:23:52,076][12883] Updated weights for policy 0, policy_version 12100 (0.0049) +[2024-06-17 23:23:55,976][12883] Updated weights for policy 0, policy_version 12110 (0.0038) +[2024-06-17 23:23:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40140.9, 300 sec: 40099.1). Total num frames: 198426624. Throughput: 0: 39940.8. Samples: 198582500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) +[2024-06-17 23:23:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:24:00,280][12883] Updated weights for policy 0, policy_version 12120 (0.0044) +[2024-06-17 23:24:01,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40413.8, 300 sec: 39988.1). Total num frames: 198623232. Throughput: 0: 40005.7. Samples: 198703860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-17 23:24:01,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:24:03,942][12883] Updated weights for policy 0, policy_version 12130 (0.0036) +[2024-06-17 23:24:06,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39594.8, 300 sec: 40043.6). Total num frames: 198803456. Throughput: 0: 40030.8. Samples: 198949140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-17 23:24:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:24:08,514][12883] Updated weights for policy 0, policy_version 12140 (0.0048) +[2024-06-17 23:24:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39867.6, 300 sec: 40043.6). Total num frames: 199016448. Throughput: 0: 39628.8. Samples: 199175620. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 23:24:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:24:12,759][12883] Updated weights for policy 0, policy_version 12150 (0.0038) +[2024-06-17 23:24:16,765][12883] Updated weights for policy 0, policy_version 12160 (0.0031) +[2024-06-17 23:24:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 39867.7, 300 sec: 40099.1). Total num frames: 199229440. Throughput: 0: 39861.3. Samples: 199301260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 23:24:16,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:24:20,814][12883] Updated weights for policy 0, policy_version 12170 (0.0034) +[2024-06-17 23:24:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39321.7, 300 sec: 40043.6). Total num frames: 199409664. Throughput: 0: 39707.1. Samples: 199542000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-17 23:24:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:24:25,199][12883] Updated weights for policy 0, policy_version 12180 (0.0038) +[2024-06-17 23:24:26,994][12645] Fps is (10 sec: 37683.9, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 199606272. Throughput: 0: 39852.5. Samples: 199784640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-17 23:24:26,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:24:28,031][12862] Signal inference workers to stop experience collection... (2850 times) +[2024-06-17 23:24:28,032][12862] Signal inference workers to resume experience collection... (2850 times) +[2024-06-17 23:24:28,069][12883] InferenceWorker_p0-w0: stopping experience collection (2850 times) +[2024-06-17 23:24:28,069][12883] InferenceWorker_p0-w0: resuming experience collection (2850 times) +[2024-06-17 23:24:28,924][12883] Updated weights for policy 0, policy_version 12190 (0.0042) +[2024-06-17 23:24:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39594.7, 300 sec: 40154.7). Total num frames: 199819264. 
Throughput: 0: 39828.4. Samples: 199899680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-17 23:24:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:24:33,440][12883] Updated weights for policy 0, policy_version 12200 (0.0052) +[2024-06-17 23:24:36,994][12645] Fps is (10 sec: 42597.5, 60 sec: 39867.6, 300 sec: 40154.7). Total num frames: 200032256. Throughput: 0: 39939.4. Samples: 200144680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-17 23:24:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:24:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000012209_200032256.pth... +[2024-06-17 23:24:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011620_190382080.pth +[2024-06-17 23:24:37,565][12883] Updated weights for policy 0, policy_version 12210 (0.0042) +[2024-06-17 23:24:41,340][12883] Updated weights for policy 0, policy_version 12220 (0.0039) +[2024-06-17 23:24:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40140.7, 300 sec: 40043.6). Total num frames: 200212480. Throughput: 0: 39877.2. Samples: 200376980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 23:24:41,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:24:45,733][12883] Updated weights for policy 0, policy_version 12230 (0.0044) +[2024-06-17 23:24:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40140.7, 300 sec: 40043.6). Total num frames: 200425472. Throughput: 0: 40006.2. Samples: 200504140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 23:24:46,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:24:49,714][12883] Updated weights for policy 0, policy_version 12240 (0.0038) +[2024-06-17 23:24:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39594.6, 300 sec: 40043.6). Total num frames: 200605696. Throughput: 0: 39893.7. Samples: 200744360. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-17 23:24:51,996][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:24:53,971][12883] Updated weights for policy 0, policy_version 12250 (0.0036) +[2024-06-17 23:24:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 200851456. Throughput: 0: 40002.7. Samples: 200975740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-17 23:24:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:24:57,608][12883] Updated weights for policy 0, policy_version 12260 (0.0031) +[2024-06-17 23:25:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39867.8, 300 sec: 40043.6). Total num frames: 201015296. Throughput: 0: 40186.7. Samples: 201109660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-17 23:25:01,996][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:25:02,086][12883] Updated weights for policy 0, policy_version 12270 (0.0040) +[2024-06-17 23:25:05,488][12883] Updated weights for policy 0, policy_version 12280 (0.0028) +[2024-06-17 23:25:06,994][12645] Fps is (10 sec: 37683.0, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 201228288. Throughput: 0: 40020.4. Samples: 201342920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 23:25:06,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-17 23:25:10,153][12883] Updated weights for policy 0, policy_version 12290 (0.0036) +[2024-06-17 23:25:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 201441280. Throughput: 0: 40106.9. Samples: 201589460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 23:25:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:25:13,922][12883] Updated weights for policy 0, policy_version 12300 (0.0037) +[2024-06-17 23:25:16,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39867.8, 300 sec: 40043.9). Total num frames: 201621504. Throughput: 0: 40207.2. Samples: 201709000. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-17 23:25:16,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:25:18,139][12883] Updated weights for policy 0, policy_version 12310 (0.0044) +[2024-06-17 23:25:21,951][12883] Updated weights for policy 0, policy_version 12320 (0.0031) +[2024-06-17 23:25:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40686.9, 300 sec: 40210.2). Total num frames: 201850880. Throughput: 0: 40084.5. Samples: 201948480. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-17 23:25:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:25:26,107][12883] Updated weights for policy 0, policy_version 12330 (0.0042) +[2024-06-17 23:25:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40140.7, 300 sec: 40043.6). Total num frames: 202014720. Throughput: 0: 40290.7. Samples: 202190060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-17 23:25:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:25:30,222][12883] Updated weights for policy 0, policy_version 12340 (0.0034) +[2024-06-17 23:25:32,000][12645] Fps is (10 sec: 37659.6, 60 sec: 40136.6, 300 sec: 40209.4). Total num frames: 202227712. Throughput: 0: 40273.6. Samples: 202316700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-17 23:25:32,001][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:25:34,443][12883] Updated weights for policy 0, policy_version 12350 (0.0039) +[2024-06-17 23:25:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 202440704. Throughput: 0: 40278.2. Samples: 202556880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-17 23:25:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:25:38,331][12883] Updated weights for policy 0, policy_version 12360 (0.0035) +[2024-06-17 23:25:41,994][12645] Fps is (10 sec: 40986.0, 60 sec: 40414.0, 300 sec: 40099.2). Total num frames: 202637312. Throughput: 0: 40480.5. Samples: 202797360. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-17 23:25:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:25:42,766][12883] Updated weights for policy 0, policy_version 12370 (0.0034) +[2024-06-17 23:25:46,403][12883] Updated weights for policy 0, policy_version 12380 (0.0032) +[2024-06-17 23:25:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 202850304. Throughput: 0: 40099.1. Samples: 202914120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-17 23:25:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:25:50,772][12883] Updated weights for policy 0, policy_version 12390 (0.0028) +[2024-06-17 23:25:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.9, 300 sec: 40099.2). Total num frames: 203030528. Throughput: 0: 40370.2. Samples: 203159580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-17 23:25:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:25:54,144][12862] Signal inference workers to stop experience collection... (2900 times) +[2024-06-17 23:25:54,144][12862] Signal inference workers to resume experience collection... (2900 times) +[2024-06-17 23:25:54,163][12883] InferenceWorker_p0-w0: stopping experience collection (2900 times) +[2024-06-17 23:25:54,163][12883] InferenceWorker_p0-w0: resuming experience collection (2900 times) +[2024-06-17 23:25:54,434][12883] Updated weights for policy 0, policy_version 12400 (0.0036) +[2024-06-17 23:25:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39867.7, 300 sec: 40210.2). Total num frames: 203243520. Throughput: 0: 40155.1. Samples: 203396440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-17 23:25:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:25:58,790][12883] Updated weights for policy 0, policy_version 12410 (0.0048) +[2024-06-17 23:26:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 203440128. 
Throughput: 0: 40392.0. Samples: 203526640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-17 23:26:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:26:02,490][12883] Updated weights for policy 0, policy_version 12420 (0.0045) +[2024-06-17 23:26:06,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40140.9, 300 sec: 40043.9). Total num frames: 203636736. Throughput: 0: 40448.1. Samples: 203768640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-17 23:26:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:26:07,189][12883] Updated weights for policy 0, policy_version 12430 (0.0034) +[2024-06-17 23:26:10,796][12883] Updated weights for policy 0, policy_version 12440 (0.0052) +[2024-06-17 23:26:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40414.0, 300 sec: 40265.8). Total num frames: 203866112. Throughput: 0: 40361.9. Samples: 204006340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-17 23:26:11,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:26:15,175][12883] Updated weights for policy 0, policy_version 12450 (0.0035) +[2024-06-17 23:26:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.8, 300 sec: 40099.2). Total num frames: 204029952. Throughput: 0: 40319.9. Samples: 204130840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 23:26:16,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:26:18,928][12883] Updated weights for policy 0, policy_version 12460 (0.0030) +[2024-06-17 23:26:21,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39867.8, 300 sec: 40154.7). Total num frames: 204242944. Throughput: 0: 40155.6. Samples: 204363880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 23:26:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:26:23,219][12883] Updated weights for policy 0, policy_version 12470 (0.0033) +[2024-06-17 23:26:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40687.0, 300 sec: 40210.2). Total num frames: 204455936. 
Throughput: 0: 40142.6. Samples: 204603780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 23:26:26,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:26:27,028][12883] Updated weights for policy 0, policy_version 12480 (0.0033) +[2024-06-17 23:26:31,463][12883] Updated weights for policy 0, policy_version 12490 (0.0041) +[2024-06-17 23:26:31,997][12645] Fps is (10 sec: 39308.3, 60 sec: 40142.8, 300 sec: 39988.5). Total num frames: 204636160. Throughput: 0: 40277.0. Samples: 204726720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 23:26:31,998][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:26:35,051][12883] Updated weights for policy 0, policy_version 12500 (0.0035) +[2024-06-17 23:26:36,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 204849152. Throughput: 0: 40066.6. Samples: 204962580. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-17 23:26:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:26:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000012503_204849152.pth... +[2024-06-17 23:26:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011915_195215360.pth +[2024-06-17 23:26:39,237][12883] Updated weights for policy 0, policy_version 12510 (0.0041) +[2024-06-17 23:26:41,994][12645] Fps is (10 sec: 40973.5, 60 sec: 40140.7, 300 sec: 40099.1). Total num frames: 205045760. Throughput: 0: 40317.3. Samples: 205210720. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-17 23:26:41,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:26:43,342][12883] Updated weights for policy 0, policy_version 12520 (0.0042) +[2024-06-17 23:26:46,999][12645] Fps is (10 sec: 40936.7, 60 sec: 40137.0, 300 sec: 40209.5). Total num frames: 205258752. Throughput: 0: 40119.3. Samples: 205332240. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-17 23:26:47,000][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:26:47,422][12883] Updated weights for policy 0, policy_version 12530 (0.0034) +[2024-06-17 23:26:51,502][12883] Updated weights for policy 0, policy_version 12540 (0.0045) +[2024-06-17 23:26:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40687.0, 300 sec: 40321.3). Total num frames: 205471744. Throughput: 0: 40137.7. Samples: 205574840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-17 23:26:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:26:55,997][12883] Updated weights for policy 0, policy_version 12550 (0.0038) +[2024-06-17 23:26:56,994][12645] Fps is (10 sec: 40983.3, 60 sec: 40413.8, 300 sec: 40099.1). Total num frames: 205668352. Throughput: 0: 40174.1. Samples: 205814180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-17 23:26:56,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:26:59,818][12883] Updated weights for policy 0, policy_version 12560 (0.0030) +[2024-06-17 23:27:01,994][12645] Fps is (10 sec: 39319.7, 60 sec: 40413.5, 300 sec: 40321.2). Total num frames: 205864960. Throughput: 0: 39995.1. Samples: 205930640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) +[2024-06-17 23:27:01,995][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:27:03,984][12883] Updated weights for policy 0, policy_version 12570 (0.0035) +[2024-06-17 23:27:06,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40140.7, 300 sec: 40154.7). Total num frames: 206045184. Throughput: 0: 40327.0. Samples: 206178600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) +[2024-06-17 23:27:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:27:07,523][12862] Signal inference workers to stop experience collection... (2950 times) +[2024-06-17 23:27:07,523][12862] Signal inference workers to resume experience collection... 
(2950 times) +[2024-06-17 23:27:07,544][12883] InferenceWorker_p0-w0: stopping experience collection (2950 times) +[2024-06-17 23:27:07,544][12883] InferenceWorker_p0-w0: resuming experience collection (2950 times) +[2024-06-17 23:27:07,843][12883] Updated weights for policy 0, policy_version 12580 (0.0030) +[2024-06-17 23:27:11,994][12645] Fps is (10 sec: 39323.6, 60 sec: 39867.7, 300 sec: 40099.2). Total num frames: 206258176. Throughput: 0: 40318.2. Samples: 206418100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-17 23:27:11,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-17 23:27:12,344][12883] Updated weights for policy 0, policy_version 12590 (0.0042) +[2024-06-17 23:27:16,323][12883] Updated weights for policy 0, policy_version 12600 (0.0041) +[2024-06-17 23:27:16,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40686.9, 300 sec: 40321.3). Total num frames: 206471168. Throughput: 0: 40287.5. Samples: 206539520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-17 23:27:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:27:20,267][12883] Updated weights for policy 0, policy_version 12610 (0.0044) +[2024-06-17 23:27:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40140.9, 300 sec: 40099.2). Total num frames: 206651392. Throughput: 0: 40431.7. Samples: 206782000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-17 23:27:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:27:24,188][12883] Updated weights for policy 0, policy_version 12620 (0.0035) +[2024-06-17 23:27:26,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.8, 300 sec: 40099.1). Total num frames: 206848000. Throughput: 0: 40399.3. Samples: 207028680. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-17 23:27:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:27:28,246][12883] Updated weights for policy 0, policy_version 12630 (0.0029) +[2024-06-17 23:27:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40689.3, 300 sec: 40210.2). Total num frames: 207077376. Throughput: 0: 40308.8. Samples: 207145900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-17 23:27:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:27:32,287][12883] Updated weights for policy 0, policy_version 12640 (0.0033) +[2024-06-17 23:27:36,401][12883] Updated weights for policy 0, policy_version 12650 (0.0030) +[2024-06-17 23:27:36,994][12645] Fps is (10 sec: 42597.7, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 207273984. Throughput: 0: 40371.0. Samples: 207391540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-17 23:27:36,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:27:40,420][12883] Updated weights for policy 0, policy_version 12660 (0.0028) +[2024-06-17 23:27:41,996][12645] Fps is (10 sec: 39312.5, 60 sec: 40412.4, 300 sec: 40265.4). Total num frames: 207470592. Throughput: 0: 40307.9. Samples: 207628120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-17 23:27:41,996][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:27:44,473][12883] Updated weights for policy 0, policy_version 12670 (0.0043) +[2024-06-17 23:27:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40144.6, 300 sec: 40099.1). Total num frames: 207667200. Throughput: 0: 40252.8. Samples: 207742000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-17 23:27:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:27:48,899][12883] Updated weights for policy 0, policy_version 12680 (0.0033) +[2024-06-17 23:27:51,994][12645] Fps is (10 sec: 37691.7, 60 sec: 39594.7, 300 sec: 40099.2). Total num frames: 207847424. Throughput: 0: 40070.3. Samples: 207981760. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) +[2024-06-17 23:27:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:27:52,834][12883] Updated weights for policy 0, policy_version 12690 (0.0045) +[2024-06-17 23:27:56,994][12883] Updated weights for policy 0, policy_version 12700 (0.0028) +[2024-06-17 23:27:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.8, 300 sec: 40265.8). Total num frames: 208076800. Throughput: 0: 40203.9. Samples: 208227280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) +[2024-06-17 23:27:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:28:00,778][12883] Updated weights for policy 0, policy_version 12710 (0.0031) +[2024-06-17 23:28:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40141.2, 300 sec: 40154.7). Total num frames: 208273408. Throughput: 0: 40263.6. Samples: 208351380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-17 23:28:01,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:28:05,274][12883] Updated weights for policy 0, policy_version 12720 (0.0039) +[2024-06-17 23:28:07,000][12645] Fps is (10 sec: 39297.5, 60 sec: 40409.7, 300 sec: 40153.8). Total num frames: 208470016. Throughput: 0: 40196.6. Samples: 208591100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-17 23:28:07,000][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:28:08,932][12883] Updated weights for policy 0, policy_version 12730 (0.0037) +[2024-06-17 23:28:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40140.8, 300 sec: 40099.1). Total num frames: 208666624. Throughput: 0: 40053.3. Samples: 208831080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-17 23:28:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:28:13,143][12883] Updated weights for policy 0, policy_version 12740 (0.0037) +[2024-06-17 23:28:16,996][12645] Fps is (10 sec: 40976.1, 60 sec: 40139.3, 300 sec: 40098.9). Total num frames: 208879616. Throughput: 0: 40136.1. Samples: 208952120. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-17 23:28:16,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:28:17,339][12883] Updated weights for policy 0, policy_version 12750 (0.0040) +[2024-06-17 23:28:21,834][12883] Updated weights for policy 0, policy_version 12760 (0.0040) +[2024-06-17 23:28:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40140.7, 300 sec: 40210.2). Total num frames: 209059840. Throughput: 0: 39954.7. Samples: 209189500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-17 23:28:21,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:28:25,857][12883] Updated weights for policy 0, policy_version 12770 (0.0036) +[2024-06-17 23:28:26,994][12645] Fps is (10 sec: 39330.7, 60 sec: 40413.8, 300 sec: 40099.2). Total num frames: 209272832. Throughput: 0: 40002.0. Samples: 209428120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-17 23:28:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:28:29,705][12883] Updated weights for policy 0, policy_version 12780 (0.0026) +[2024-06-17 23:28:31,994][12645] Fps is (10 sec: 40960.8, 60 sec: 39867.8, 300 sec: 40099.2). Total num frames: 209469440. Throughput: 0: 40175.3. Samples: 209549880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:28:31,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:28:33,949][12883] Updated weights for policy 0, policy_version 12790 (0.0037) +[2024-06-17 23:28:36,995][12645] Fps is (10 sec: 39316.9, 60 sec: 39867.0, 300 sec: 40210.1). Total num frames: 209666048. Throughput: 0: 40080.3. Samples: 209785420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:28:36,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:28:37,141][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000012798_209682432.pth... 
+[2024-06-17 23:28:37,197][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000012209_200032256.pth +[2024-06-17 23:28:38,101][12883] Updated weights for policy 0, policy_version 12800 (0.0038) +[2024-06-17 23:28:41,996][12645] Fps is (10 sec: 39312.9, 60 sec: 39867.8, 300 sec: 40154.4). Total num frames: 209862656. Throughput: 0: 40108.4. Samples: 210032240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-17 23:28:41,996][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:28:42,249][12883] Updated weights for policy 0, policy_version 12810 (0.0036) +[2024-06-17 23:28:46,414][12883] Updated weights for policy 0, policy_version 12820 (0.0033) +[2024-06-17 23:28:46,996][12645] Fps is (10 sec: 40957.2, 60 sec: 40139.7, 300 sec: 40154.4). Total num frames: 210075648. Throughput: 0: 39919.2. Samples: 210147820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-17 23:28:46,996][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:28:50,274][12883] Updated weights for policy 0, policy_version 12830 (0.0047) +[2024-06-17 23:28:51,994][12645] Fps is (10 sec: 40968.5, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 210272256. Throughput: 0: 40025.9. Samples: 210392020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 23:28:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:28:54,381][12883] Updated weights for policy 0, policy_version 12840 (0.0031) +[2024-06-17 23:28:56,995][12645] Fps is (10 sec: 39323.0, 60 sec: 39866.8, 300 sec: 40154.5). Total num frames: 210468864. Throughput: 0: 40003.6. Samples: 210631300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 23:28:56,996][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:28:58,345][12883] Updated weights for policy 0, policy_version 12850 (0.0030) +[2024-06-17 23:29:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39867.7, 300 sec: 40210.2). Total num frames: 210665472. Throughput: 0: 40032.2. Samples: 210753480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-17 23:29:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:29:02,395][12883] Updated weights for policy 0, policy_version 12860 (0.0037) +[2024-06-17 23:29:06,201][12862] Signal inference workers to stop experience collection... (3000 times) +[2024-06-17 23:29:06,251][12883] InferenceWorker_p0-w0: stopping experience collection (3000 times) +[2024-06-17 23:29:06,316][12862] Signal inference workers to resume experience collection... (3000 times) +[2024-06-17 23:29:06,316][12883] InferenceWorker_p0-w0: resuming experience collection (3000 times) +[2024-06-17 23:29:06,458][12883] Updated weights for policy 0, policy_version 12870 (0.0038) +[2024-06-17 23:29:06,994][12645] Fps is (10 sec: 40966.2, 60 sec: 40145.0, 300 sec: 40210.2). Total num frames: 210878464. Throughput: 0: 40169.4. Samples: 210997120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-17 23:29:06,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:29:10,360][12883] Updated weights for policy 0, policy_version 12880 (0.0046) +[2024-06-17 23:29:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 211091456. Throughput: 0: 40203.9. Samples: 211237300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-17 23:29:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:29:14,600][12883] Updated weights for policy 0, policy_version 12890 (0.0046) +[2024-06-17 23:29:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39869.3, 300 sec: 40210.2). Total num frames: 211271680. Throughput: 0: 40153.7. Samples: 211356800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-17 23:29:16,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-17 23:29:18,406][12883] Updated weights for policy 0, policy_version 12900 (0.0031) +[2024-06-17 23:29:21,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 211468288. 
Throughput: 0: 40241.4. Samples: 211596240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 23:29:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:29:22,590][12883] Updated weights for policy 0, policy_version 12910 (0.0031) +[2024-06-17 23:29:26,529][12883] Updated weights for policy 0, policy_version 12920 (0.0045) +[2024-06-17 23:29:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40140.7, 300 sec: 40210.2). Total num frames: 211681280. Throughput: 0: 40003.6. Samples: 211832320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 23:29:26,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:29:30,690][12883] Updated weights for policy 0, policy_version 12930 (0.0043) +[2024-06-17 23:29:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39867.6, 300 sec: 40099.1). Total num frames: 211861504. Throughput: 0: 40264.2. Samples: 211959640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-17 23:29:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:29:34,457][12883] Updated weights for policy 0, policy_version 12940 (0.0027) +[2024-06-17 23:29:36,996][12645] Fps is (10 sec: 39313.0, 60 sec: 40140.1, 300 sec: 40209.9). Total num frames: 212074496. Throughput: 0: 40188.7. Samples: 212200600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-17 23:29:36,997][12645] Avg episode reward: [(0, '0.012')] +[2024-06-17 23:29:38,863][12883] Updated weights for policy 0, policy_version 12950 (0.0032) +[2024-06-17 23:29:41,994][12645] Fps is (10 sec: 44237.5, 60 sec: 40688.4, 300 sec: 40265.8). Total num frames: 212303872. Throughput: 0: 40344.1. Samples: 212446720. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) +[2024-06-17 23:29:41,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-17 23:29:43,146][12883] Updated weights for policy 0, policy_version 12960 (0.0031) +[2024-06-17 23:29:46,994][12645] Fps is (10 sec: 40969.5, 60 sec: 40142.1, 300 sec: 40265.8). Total num frames: 212484096. 
Throughput: 0: 40369.0. Samples: 212570080. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) +[2024-06-17 23:29:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:29:47,102][12883] Updated weights for policy 0, policy_version 12970 (0.0040) +[2024-06-17 23:29:51,137][12883] Updated weights for policy 0, policy_version 12980 (0.0037) +[2024-06-17 23:29:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 212697088. Throughput: 0: 40176.4. Samples: 212805060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 23:29:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:29:54,979][12883] Updated weights for policy 0, policy_version 12990 (0.0035) +[2024-06-17 23:29:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40414.9, 300 sec: 40265.8). Total num frames: 212893696. Throughput: 0: 40232.1. Samples: 213047740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 23:29:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:29:59,427][12883] Updated weights for policy 0, policy_version 13000 (0.0045) +[2024-06-17 23:30:01,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40414.0, 300 sec: 40210.2). Total num frames: 213090304. Throughput: 0: 40369.4. Samples: 213173420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 23:30:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:30:02,820][12883] Updated weights for policy 0, policy_version 13010 (0.0042) +[2024-06-17 23:30:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 213303296. Throughput: 0: 40559.2. Samples: 213421400. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 23:30:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:30:07,287][12883] Updated weights for policy 0, policy_version 13020 (0.0040) +[2024-06-17 23:30:10,904][12883] Updated weights for policy 0, policy_version 13030 (0.0036) +[2024-06-17 23:30:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40140.8, 300 sec: 40265.7). Total num frames: 213499904. Throughput: 0: 40606.2. Samples: 213659600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 23:30:11,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:30:15,826][12883] Updated weights for policy 0, policy_version 13040 (0.0039) +[2024-06-17 23:30:16,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.7, 300 sec: 40154.7). Total num frames: 213696512. Throughput: 0: 40463.1. Samples: 213780480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 23:30:16,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:30:18,958][12883] Updated weights for policy 0, policy_version 13050 (0.0044) +[2024-06-17 23:30:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40687.0, 300 sec: 40321.3). Total num frames: 213909504. Throughput: 0: 40612.2. Samples: 214028060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-17 23:30:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:30:23,714][12883] Updated weights for policy 0, policy_version 13060 (0.0030) +[2024-06-17 23:30:26,461][12862] Signal inference workers to stop experience collection... (3050 times) +[2024-06-17 23:30:26,504][12883] InferenceWorker_p0-w0: stopping experience collection (3050 times) +[2024-06-17 23:30:26,514][12862] Signal inference workers to resume experience collection... 
(3050 times) +[2024-06-17 23:30:26,524][12883] InferenceWorker_p0-w0: resuming experience collection (3050 times) +[2024-06-17 23:30:26,836][12883] Updated weights for policy 0, policy_version 13070 (0.0037) +[2024-06-17 23:30:26,994][12645] Fps is (10 sec: 45875.6, 60 sec: 41233.1, 300 sec: 40433.2). Total num frames: 214155264. Throughput: 0: 40561.7. Samples: 214272000. Policy #0 lag: (min: 2.0, avg: 11.7, max: 21.0) +[2024-06-17 23:30:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:30:31,586][12883] Updated weights for policy 0, policy_version 13080 (0.0038) +[2024-06-17 23:30:31,995][12645] Fps is (10 sec: 39316.2, 60 sec: 40686.0, 300 sec: 40210.0). Total num frames: 214302720. Throughput: 0: 40492.4. Samples: 214392300. Policy #0 lag: (min: 2.0, avg: 11.7, max: 21.0) +[2024-06-17 23:30:32,000][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:30:35,259][12883] Updated weights for policy 0, policy_version 13090 (0.0052) +[2024-06-17 23:30:36,994][12645] Fps is (10 sec: 36044.9, 60 sec: 40688.4, 300 sec: 40265.8). Total num frames: 214515712. Throughput: 0: 40641.3. Samples: 214633920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 23:30:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:30:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013093_214515712.pth... +[2024-06-17 23:30:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000012503_204849152.pth +[2024-06-17 23:30:40,049][12883] Updated weights for policy 0, policy_version 13100 (0.0033) +[2024-06-17 23:30:41,994][12645] Fps is (10 sec: 40966.0, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 214712320. Throughput: 0: 40618.7. Samples: 214875580. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 23:30:41,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:30:43,414][12883] Updated weights for policy 0, policy_version 13110 (0.0042) +[2024-06-17 23:30:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40413.8, 300 sec: 40265.8). Total num frames: 214908928. Throughput: 0: 40458.9. Samples: 214994080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-17 23:30:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:30:47,999][12883] Updated weights for policy 0, policy_version 13120 (0.0043) +[2024-06-17 23:30:51,881][12883] Updated weights for policy 0, policy_version 13130 (0.0049) +[2024-06-17 23:30:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40413.8, 300 sec: 40265.8). Total num frames: 215121920. Throughput: 0: 40350.2. Samples: 215237160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-17 23:30:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:30:56,326][12883] Updated weights for policy 0, policy_version 13140 (0.0038) +[2024-06-17 23:30:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40413.8, 300 sec: 40265.7). Total num frames: 215318528. Throughput: 0: 40546.7. Samples: 215484200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-17 23:30:56,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:30:59,820][12883] Updated weights for policy 0, policy_version 13150 (0.0037) +[2024-06-17 23:31:01,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40413.8, 300 sec: 40265.8). Total num frames: 215515136. Throughput: 0: 40441.5. Samples: 215600340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-17 23:31:01,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:31:04,467][12883] Updated weights for policy 0, policy_version 13160 (0.0039) +[2024-06-17 23:31:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 215711744. Throughput: 0: 40240.0. Samples: 215838860. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-17 23:31:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:31:08,282][12883] Updated weights for policy 0, policy_version 13170 (0.0031) +[2024-06-17 23:31:11,994][12645] Fps is (10 sec: 37682.2, 60 sec: 39867.6, 300 sec: 40210.2). Total num frames: 215891968. Throughput: 0: 40206.5. Samples: 216081300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-17 23:31:11,995][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:31:12,556][12883] Updated weights for policy 0, policy_version 13180 (0.0045) +[2024-06-17 23:31:16,173][12883] Updated weights for policy 0, policy_version 13190 (0.0044) +[2024-06-17 23:31:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40413.9, 300 sec: 40265.8). Total num frames: 216121344. Throughput: 0: 40123.9. Samples: 216197820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-17 23:31:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:31:20,610][12883] Updated weights for policy 0, policy_version 13200 (0.0040) +[2024-06-17 23:31:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39867.7, 300 sec: 40154.7). Total num frames: 216301568. Throughput: 0: 39972.9. Samples: 216432700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-17 23:31:21,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-17 23:31:24,778][12883] Updated weights for policy 0, policy_version 13210 (0.0035) +[2024-06-17 23:31:26,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39048.5, 300 sec: 40210.7). Total num frames: 216498176. Throughput: 0: 39929.7. Samples: 216672420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-17 23:31:26,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:31:28,700][12883] Updated weights for policy 0, policy_version 13220 (0.0040) +[2024-06-17 23:31:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40141.7, 300 sec: 40210.2). Total num frames: 216711168. Throughput: 0: 39981.4. Samples: 216793240. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:31:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:31:32,746][12883] Updated weights for policy 0, policy_version 13230 (0.0045) +[2024-06-17 23:31:36,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39594.7, 300 sec: 40154.7). Total num frames: 216891392. Throughput: 0: 39759.6. Samples: 217026340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:31:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:31:37,191][12883] Updated weights for policy 0, policy_version 13240 (0.0034) +[2024-06-17 23:31:40,634][12883] Updated weights for policy 0, policy_version 13250 (0.0033) +[2024-06-17 23:31:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40140.8, 300 sec: 40211.0). Total num frames: 217120768. Throughput: 0: 39613.4. Samples: 217266800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) +[2024-06-17 23:31:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:31:45,227][12883] Updated weights for policy 0, policy_version 13260 (0.0037) +[2024-06-17 23:31:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 39867.7, 300 sec: 40099.1). Total num frames: 217300992. Throughput: 0: 39733.6. Samples: 217388360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) +[2024-06-17 23:31:47,000][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:31:49,002][12883] Updated weights for policy 0, policy_version 13270 (0.0030) +[2024-06-17 23:31:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39867.8, 300 sec: 40154.7). Total num frames: 217513984. Throughput: 0: 39788.9. Samples: 217629360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 23:31:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:31:53,523][12883] Updated weights for policy 0, policy_version 13280 (0.0034) +[2024-06-17 23:31:56,994][12645] Fps is (10 sec: 42599.3, 60 sec: 40140.9, 300 sec: 40210.3). Total num frames: 217726976. Throughput: 0: 39680.2. Samples: 217866900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 23:31:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:31:57,071][12883] Updated weights for policy 0, policy_version 13290 (0.0030) +[2024-06-17 23:32:01,652][12883] Updated weights for policy 0, policy_version 13300 (0.0046) +[2024-06-17 23:32:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39867.6, 300 sec: 40210.2). Total num frames: 217907200. Throughput: 0: 39909.7. Samples: 217993760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-17 23:32:01,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:32:05,566][12883] Updated weights for policy 0, policy_version 13310 (0.0053) +[2024-06-17 23:32:06,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.8, 300 sec: 40154.7). Total num frames: 218103808. Throughput: 0: 39925.9. Samples: 218229360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-17 23:32:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:32:10,157][12883] Updated weights for policy 0, policy_version 13320 (0.0048) +[2024-06-17 23:32:10,165][12862] Signal inference workers to stop experience collection... (3100 times) +[2024-06-17 23:32:10,165][12862] Signal inference workers to resume experience collection... (3100 times) +[2024-06-17 23:32:10,188][12883] InferenceWorker_p0-w0: stopping experience collection (3100 times) +[2024-06-17 23:32:10,188][12883] InferenceWorker_p0-w0: resuming experience collection (3100 times) +[2024-06-17 23:32:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 218316800. Throughput: 0: 39930.7. Samples: 218469300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 23:32:11,995][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:32:13,795][12883] Updated weights for policy 0, policy_version 13330 (0.0032) +[2024-06-17 23:32:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.8, 300 sec: 40210.2). Total num frames: 218513408. 
Throughput: 0: 40074.8. Samples: 218596600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 23:32:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:32:17,826][12883] Updated weights for policy 0, policy_version 13340 (0.0031) +[2024-06-17 23:32:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 218710016. Throughput: 0: 40207.9. Samples: 218835700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 23:32:21,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:32:22,150][12883] Updated weights for policy 0, policy_version 13350 (0.0030) +[2024-06-17 23:32:25,933][12883] Updated weights for policy 0, policy_version 13360 (0.0050) +[2024-06-17 23:32:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40414.0, 300 sec: 40154.7). Total num frames: 218923008. Throughput: 0: 40157.0. Samples: 219073860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 23:32:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:32:30,471][12883] Updated weights for policy 0, policy_version 13370 (0.0027) +[2024-06-17 23:32:31,994][12645] Fps is (10 sec: 39322.4, 60 sec: 39867.8, 300 sec: 40099.2). Total num frames: 219103232. Throughput: 0: 40190.4. Samples: 219196920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-17 23:32:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:32:34,052][12883] Updated weights for policy 0, policy_version 13380 (0.0026) +[2024-06-17 23:32:36,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40413.9, 300 sec: 40155.0). Total num frames: 219316224. Throughput: 0: 40102.3. Samples: 219433960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-17 23:32:36,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:32:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013386_219316224.pth... 
+[2024-06-17 23:32:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000012798_209682432.pth +[2024-06-17 23:32:38,572][12883] Updated weights for policy 0, policy_version 13390 (0.0040) +[2024-06-17 23:32:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40140.9, 300 sec: 40210.3). Total num frames: 219529216. Throughput: 0: 40174.2. Samples: 219674740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-17 23:32:41,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:32:42,010][12883] Updated weights for policy 0, policy_version 13400 (0.0029) +[2024-06-17 23:32:46,817][12883] Updated weights for policy 0, policy_version 13410 (0.0040) +[2024-06-17 23:32:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40140.9, 300 sec: 40210.2). Total num frames: 219709440. Throughput: 0: 39993.8. Samples: 219793480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-17 23:32:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:32:50,265][12883] Updated weights for policy 0, policy_version 13420 (0.0038) +[2024-06-17 23:32:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40414.0, 300 sec: 40210.3). Total num frames: 219938816. Throughput: 0: 40285.3. Samples: 220042200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-17 23:32:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:32:55,240][12883] Updated weights for policy 0, policy_version 13430 (0.0052) +[2024-06-17 23:32:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39867.6, 300 sec: 40154.7). Total num frames: 220119040. Throughput: 0: 40332.0. Samples: 220284240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-17 23:32:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:32:58,381][12883] Updated weights for policy 0, policy_version 13440 (0.0038) +[2024-06-17 23:33:01,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40413.9, 300 sec: 40211.1). Total num frames: 220332032. Throughput: 0: 40103.0. Samples: 220401240. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) +[2024-06-17 23:33:01,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:33:03,111][12883] Updated weights for policy 0, policy_version 13450 (0.0033) +[2024-06-17 23:33:06,621][12883] Updated weights for policy 0, policy_version 13460 (0.0045) +[2024-06-17 23:33:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40686.9, 300 sec: 40265.8). Total num frames: 220545024. Throughput: 0: 40279.2. Samples: 220648260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) +[2024-06-17 23:33:06,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:33:11,517][12883] Updated weights for policy 0, policy_version 13470 (0.0038) +[2024-06-17 23:33:11,994][12645] Fps is (10 sec: 36045.2, 60 sec: 39594.8, 300 sec: 40043.9). Total num frames: 220692480. Throughput: 0: 40394.2. Samples: 220891600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) +[2024-06-17 23:33:11,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:33:14,782][12883] Updated weights for policy 0, policy_version 13480 (0.0044) +[2024-06-17 23:33:16,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40413.7, 300 sec: 40265.8). Total num frames: 220938240. Throughput: 0: 40215.8. Samples: 221006640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) +[2024-06-17 23:33:16,995][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:33:19,891][12883] Updated weights for policy 0, policy_version 13490 (0.0037) +[2024-06-17 23:33:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39867.8, 300 sec: 40099.1). Total num frames: 221102080. Throughput: 0: 40484.4. Samples: 221255760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 23:33:21,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:33:22,811][12883] Updated weights for policy 0, policy_version 13500 (0.0036) +[2024-06-17 23:33:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40140.7, 300 sec: 40210.2). Total num frames: 221331456. Throughput: 0: 40322.5. Samples: 221489260. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 23:33:26,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:33:27,765][12883] Updated weights for policy 0, policy_version 13510 (0.0034) +[2024-06-17 23:33:31,124][12883] Updated weights for policy 0, policy_version 13520 (0.0037) +[2024-06-17 23:33:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 40686.8, 300 sec: 40265.9). Total num frames: 221544448. Throughput: 0: 40474.6. Samples: 221614840. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-17 23:33:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:33:35,540][12883] Updated weights for policy 0, policy_version 13530 (0.0039) +[2024-06-17 23:33:36,994][12645] Fps is (10 sec: 37683.8, 60 sec: 39867.7, 300 sec: 40155.0). Total num frames: 221708288. Throughput: 0: 40165.7. Samples: 221849660. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-17 23:33:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:33:39,297][12883] Updated weights for policy 0, policy_version 13540 (0.0048) +[2024-06-17 23:33:41,999][12645] Fps is (10 sec: 39300.0, 60 sec: 40137.0, 300 sec: 40209.7). Total num frames: 221937664. Throughput: 0: 40042.7. Samples: 222086380. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) +[2024-06-17 23:33:42,000][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:33:43,494][12883] Updated weights for policy 0, policy_version 13550 (0.0036) +[2024-06-17 23:33:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 222117888. Throughput: 0: 40154.8. Samples: 222208200. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) +[2024-06-17 23:33:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:33:47,761][12883] Updated weights for policy 0, policy_version 13560 (0.0038) +[2024-06-17 23:33:48,413][12862] Signal inference workers to stop experience collection... 
(3150 times) +[2024-06-17 23:33:48,423][12862] Signal inference workers to resume experience collection... (3150 times) +[2024-06-17 23:33:48,456][12883] InferenceWorker_p0-w0: stopping experience collection (3150 times) +[2024-06-17 23:33:48,457][12883] InferenceWorker_p0-w0: resuming experience collection (3150 times) +[2024-06-17 23:33:51,943][12883] Updated weights for policy 0, policy_version 13570 (0.0032) +[2024-06-17 23:33:51,994][12645] Fps is (10 sec: 39343.4, 60 sec: 39867.6, 300 sec: 40210.4). Total num frames: 222330880. Throughput: 0: 39872.0. Samples: 222442500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-17 23:33:51,998][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:33:56,094][12883] Updated weights for policy 0, policy_version 13580 (0.0041) +[2024-06-17 23:33:57,000][12645] Fps is (10 sec: 42571.6, 60 sec: 40409.8, 300 sec: 40264.9). Total num frames: 222543872. Throughput: 0: 39774.5. Samples: 222681700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-17 23:33:57,000][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:33:59,820][12883] Updated weights for policy 0, policy_version 13590 (0.0043) +[2024-06-17 23:34:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39594.7, 300 sec: 40099.2). Total num frames: 222707712. Throughput: 0: 39938.8. Samples: 222803880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-17 23:34:01,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:34:04,293][12883] Updated weights for policy 0, policy_version 13600 (0.0043) +[2024-06-17 23:34:06,994][12645] Fps is (10 sec: 40985.1, 60 sec: 40140.7, 300 sec: 40210.2). Total num frames: 222953472. Throughput: 0: 39784.8. Samples: 223046080. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-17 23:34:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:34:07,715][12883] Updated weights for policy 0, policy_version 13610 (0.0030) +[2024-06-17 23:34:11,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 223117312. Throughput: 0: 39987.1. Samples: 223288680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 23:34:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:34:12,279][12883] Updated weights for policy 0, policy_version 13620 (0.0030) +[2024-06-17 23:34:15,720][12883] Updated weights for policy 0, policy_version 13630 (0.0040) +[2024-06-17 23:34:16,996][12645] Fps is (10 sec: 36037.0, 60 sec: 39593.3, 300 sec: 40154.4). Total num frames: 223313920. Throughput: 0: 39838.5. Samples: 223407660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 23:34:16,996][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:34:20,252][12883] Updated weights for policy 0, policy_version 13640 (0.0033) +[2024-06-17 23:34:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40686.9, 300 sec: 40210.2). Total num frames: 223543296. Throughput: 0: 39981.3. Samples: 223648820. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-17 23:34:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:34:24,276][12883] Updated weights for policy 0, policy_version 13650 (0.0026) +[2024-06-17 23:34:26,994][12645] Fps is (10 sec: 39330.1, 60 sec: 39594.7, 300 sec: 40154.7). Total num frames: 223707136. Throughput: 0: 40193.4. Samples: 223894860. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-17 23:34:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:34:28,479][12883] Updated weights for policy 0, policy_version 13660 (0.0028) +[2024-06-17 23:34:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39867.8, 300 sec: 40210.5). Total num frames: 223936512. Throughput: 0: 40011.0. Samples: 224008700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-17 23:34:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:34:32,559][12883] Updated weights for policy 0, policy_version 13670 (0.0033) +[2024-06-17 23:34:36,633][12883] Updated weights for policy 0, policy_version 13680 (0.0039) +[2024-06-17 23:34:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40413.8, 300 sec: 40099.1). Total num frames: 224133120. Throughput: 0: 40262.6. Samples: 224254320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-17 23:34:36,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:34:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013680_224133120.pth... +[2024-06-17 23:34:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013093_214515712.pth +[2024-06-17 23:34:40,628][12883] Updated weights for policy 0, policy_version 13690 (0.0030) +[2024-06-17 23:34:41,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39871.5, 300 sec: 40154.7). Total num frames: 224329728. Throughput: 0: 40267.4. Samples: 224493480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-17 23:34:41,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:34:44,935][12883] Updated weights for policy 0, policy_version 13700 (0.0025) +[2024-06-17 23:34:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 224542720. Throughput: 0: 40124.8. Samples: 224609500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-17 23:34:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:34:49,237][12883] Updated weights for policy 0, policy_version 13710 (0.0035) +[2024-06-17 23:34:51,994][12645] Fps is (10 sec: 37682.7, 60 sec: 39594.6, 300 sec: 40043.6). Total num frames: 224706560. Throughput: 0: 40151.1. Samples: 224852880. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-17 23:34:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:34:53,057][12883] Updated weights for policy 0, policy_version 13720 (0.0039) +[2024-06-17 23:34:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39871.9, 300 sec: 40154.7). Total num frames: 224935936. Throughput: 0: 40009.9. Samples: 225089120. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-17 23:34:56,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:34:57,212][12883] Updated weights for policy 0, policy_version 13730 (0.0043) +[2024-06-17 23:35:01,554][12883] Updated weights for policy 0, policy_version 13740 (0.0040) +[2024-06-17 23:35:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 40686.9, 300 sec: 40154.7). Total num frames: 225148928. Throughput: 0: 40072.2. Samples: 225210820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 23:35:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:35:05,739][12883] Updated weights for policy 0, policy_version 13750 (0.0045) +[2024-06-17 23:35:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39594.7, 300 sec: 40099.2). Total num frames: 225329152. Throughput: 0: 40040.4. Samples: 225450640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 23:35:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:35:09,560][12883] Updated weights for policy 0, policy_version 13760 (0.0043) +[2024-06-17 23:35:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40687.0, 300 sec: 40210.2). Total num frames: 225558528. Throughput: 0: 39837.0. Samples: 225687520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-17 23:35:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:35:14,017][12883] Updated weights for policy 0, policy_version 13770 (0.0038) +[2024-06-17 23:35:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40142.3, 300 sec: 40043.6). Total num frames: 225722368. Throughput: 0: 40036.0. Samples: 225810320. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-17 23:35:16,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:35:17,366][12883] Updated weights for policy 0, policy_version 13780 (0.0036) +[2024-06-17 23:35:21,994][12645] Fps is (10 sec: 36044.4, 60 sec: 39594.6, 300 sec: 39877.0). Total num frames: 225918976. Throughput: 0: 39861.0. Samples: 226048060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-17 23:35:21,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:35:22,136][12883] Updated weights for policy 0, policy_version 13790 (0.0037) +[2024-06-17 23:35:23,206][12862] Signal inference workers to stop experience collection... (3200 times) +[2024-06-17 23:35:23,268][12883] InferenceWorker_p0-w0: stopping experience collection (3200 times) +[2024-06-17 23:35:23,329][12862] Signal inference workers to resume experience collection... (3200 times) +[2024-06-17 23:35:23,329][12883] InferenceWorker_p0-w0: resuming experience collection (3200 times) +[2024-06-17 23:35:25,272][12883] Updated weights for policy 0, policy_version 13800 (0.0048) +[2024-06-17 23:35:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.9, 300 sec: 40099.3). Total num frames: 226131968. Throughput: 0: 40036.9. Samples: 226295140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-17 23:35:26,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:35:30,164][12883] Updated weights for policy 0, policy_version 13810 (0.0039) +[2024-06-17 23:35:31,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39321.7, 300 sec: 39932.5). Total num frames: 226295808. Throughput: 0: 40105.0. Samples: 226414220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-17 23:35:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:35:33,804][12883] Updated weights for policy 0, policy_version 13820 (0.0036) +[2024-06-17 23:35:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40140.9, 300 sec: 40099.1). Total num frames: 226541568. 
Throughput: 0: 39860.0. Samples: 226646580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-17 23:35:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:35:38,228][12883] Updated weights for policy 0, policy_version 13830 (0.0031) +[2024-06-17 23:35:41,750][12883] Updated weights for policy 0, policy_version 13840 (0.0034) +[2024-06-17 23:35:41,994][12645] Fps is (10 sec: 45874.4, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 226754560. Throughput: 0: 39974.1. Samples: 226887960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-17 23:35:41,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-17 23:35:46,757][12883] Updated weights for policy 0, policy_version 13850 (0.0032) +[2024-06-17 23:35:46,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39594.6, 300 sec: 39988.1). Total num frames: 226918400. Throughput: 0: 39987.5. Samples: 227010260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-17 23:35:46,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-17 23:35:49,785][12883] Updated weights for policy 0, policy_version 13860 (0.0043) +[2024-06-17 23:35:51,994][12645] Fps is (10 sec: 37683.6, 60 sec: 40413.9, 300 sec: 40043.6). Total num frames: 227131392. Throughput: 0: 39936.0. Samples: 227247760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-17 23:35:51,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:35:54,805][12883] Updated weights for policy 0, policy_version 13870 (0.0022) +[2024-06-17 23:35:56,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39321.6, 300 sec: 39932.5). Total num frames: 227295232. Throughput: 0: 40188.9. Samples: 227496020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 23:35:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:35:58,331][12883] Updated weights for policy 0, policy_version 13880 (0.0041) +[2024-06-17 23:36:01,994][12645] Fps is (10 sec: 40959.4, 60 sec: 39867.6, 300 sec: 40099.1). Total num frames: 227540992. 
Throughput: 0: 39990.6. Samples: 227609900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 23:36:01,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:36:03,038][12883] Updated weights for policy 0, policy_version 13890 (0.0030) +[2024-06-17 23:36:06,662][12883] Updated weights for policy 0, policy_version 13900 (0.0028) +[2024-06-17 23:36:06,996][12645] Fps is (10 sec: 45865.0, 60 sec: 40412.4, 300 sec: 40209.9). Total num frames: 227753984. Throughput: 0: 40051.8. Samples: 227850480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 23:36:06,997][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:36:11,481][12883] Updated weights for policy 0, policy_version 13910 (0.0037) +[2024-06-17 23:36:11,994][12645] Fps is (10 sec: 37684.0, 60 sec: 39321.6, 300 sec: 39988.1). Total num frames: 227917824. Throughput: 0: 39970.7. Samples: 228093820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 23:36:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:36:14,738][12883] Updated weights for policy 0, policy_version 13920 (0.0039) +[2024-06-17 23:36:16,994][12645] Fps is (10 sec: 37691.3, 60 sec: 40140.8, 300 sec: 40099.1). Total num frames: 228130816. Throughput: 0: 39843.4. Samples: 228207180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:36:16,995][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:36:19,608][12883] Updated weights for policy 0, policy_version 13930 (0.0046) +[2024-06-17 23:36:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 40687.0, 300 sec: 40210.2). Total num frames: 228360192. Throughput: 0: 40073.4. Samples: 228449880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:36:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:36:22,827][12883] Updated weights for policy 0, policy_version 13940 (0.0037) +[2024-06-17 23:36:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39867.7, 300 sec: 40043.6). Total num frames: 228524032. 
Throughput: 0: 39942.7. Samples: 228685380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 23:36:26,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:36:27,821][12883] Updated weights for policy 0, policy_version 13950 (0.0043) +[2024-06-17 23:36:31,288][12883] Updated weights for policy 0, policy_version 13960 (0.0040) +[2024-06-17 23:36:31,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40686.9, 300 sec: 40154.7). Total num frames: 228737024. Throughput: 0: 39862.7. Samples: 228804080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 23:36:31,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-17 23:36:35,674][12883] Updated weights for policy 0, policy_version 13970 (0.0040) +[2024-06-17 23:36:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39594.8, 300 sec: 39988.1). Total num frames: 228917248. Throughput: 0: 40074.7. Samples: 229051120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-17 23:36:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:36:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013973_228933632.pth... +[2024-06-17 23:36:37,095][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013386_219316224.pth +[2024-06-17 23:36:39,344][12883] Updated weights for policy 0, policy_version 13980 (0.0036) +[2024-06-17 23:36:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39594.7, 300 sec: 40099.2). Total num frames: 229130240. Throughput: 0: 39885.7. Samples: 229290880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-17 23:36:41,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:36:43,645][12883] Updated weights for policy 0, policy_version 13990 (0.0031) +[2024-06-17 23:36:45,161][12862] Signal inference workers to stop experience collection... (3250 times) +[2024-06-17 23:36:45,161][12862] Signal inference workers to resume experience collection... 
(3250 times) +[2024-06-17 23:36:45,211][12883] InferenceWorker_p0-w0: stopping experience collection (3250 times) +[2024-06-17 23:36:45,211][12883] InferenceWorker_p0-w0: resuming experience collection (3250 times) +[2024-06-17 23:36:46,994][12645] Fps is (10 sec: 42597.8, 60 sec: 40413.9, 300 sec: 40099.1). Total num frames: 229343232. Throughput: 0: 40070.3. Samples: 229413060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 23:36:46,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:36:47,979][12883] Updated weights for policy 0, policy_version 14000 (0.0028) +[2024-06-17 23:36:51,519][12883] Updated weights for policy 0, policy_version 14010 (0.0027) +[2024-06-17 23:36:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40140.8, 300 sec: 40043.6). Total num frames: 229539840. Throughput: 0: 40073.2. Samples: 229653680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 23:36:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:36:55,968][12883] Updated weights for policy 0, policy_version 14020 (0.0038) +[2024-06-17 23:36:56,994][12645] Fps is (10 sec: 36045.1, 60 sec: 40140.8, 300 sec: 39988.1). Total num frames: 229703680. Throughput: 0: 40079.5. Samples: 229897400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 23:36:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:36:59,799][12883] Updated weights for policy 0, policy_version 14030 (0.0033) +[2024-06-17 23:37:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 229949440. Throughput: 0: 40138.3. Samples: 230013400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 23:37:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:37:03,982][12883] Updated weights for policy 0, policy_version 14040 (0.0029) +[2024-06-17 23:37:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 39869.2, 300 sec: 40099.2). Total num frames: 230146048. Throughput: 0: 40221.3. Samples: 230259840. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-17 23:37:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:37:08,326][12883] Updated weights for policy 0, policy_version 14050 (0.0025) +[2024-06-17 23:37:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40686.9, 300 sec: 40154.7). Total num frames: 230359040. Throughput: 0: 40122.8. Samples: 230490900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-17 23:37:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:37:12,004][12883] Updated weights for policy 0, policy_version 14060 (0.0035) +[2024-06-17 23:37:16,393][12883] Updated weights for policy 0, policy_version 14070 (0.0039) +[2024-06-17 23:37:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40140.8, 300 sec: 40099.2). Total num frames: 230539264. Throughput: 0: 40239.6. Samples: 230614860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 27.0) +[2024-06-17 23:37:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:37:20,645][12883] Updated weights for policy 0, policy_version 14080 (0.0041) +[2024-06-17 23:37:21,994][12645] Fps is (10 sec: 34406.2, 60 sec: 39048.6, 300 sec: 39932.5). Total num frames: 230703104. Throughput: 0: 40161.8. Samples: 230858400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 27.0) +[2024-06-17 23:37:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:37:24,426][12883] Updated weights for policy 0, policy_version 14090 (0.0039) +[2024-06-17 23:37:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 230948864. Throughput: 0: 40043.0. Samples: 231092820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-17 23:37:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:37:28,770][12883] Updated weights for policy 0, policy_version 14100 (0.0040) +[2024-06-17 23:37:31,994][12645] Fps is (10 sec: 44236.1, 60 sec: 40140.7, 300 sec: 40099.1). Total num frames: 231145472. Throughput: 0: 40090.6. Samples: 231217140. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-17 23:37:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:37:32,623][12883] Updated weights for policy 0, policy_version 14110 (0.0035) +[2024-06-17 23:37:36,777][12883] Updated weights for policy 0, policy_version 14120 (0.0039) +[2024-06-17 23:37:36,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40413.8, 300 sec: 40043.6). Total num frames: 231342080. Throughput: 0: 40053.7. Samples: 231456100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 23:37:36,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:37:40,969][12883] Updated weights for policy 0, policy_version 14130 (0.0037) +[2024-06-17 23:37:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40140.7, 300 sec: 40099.1). Total num frames: 231538688. Throughput: 0: 40169.2. Samples: 231705020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 23:37:41,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:37:44,799][12883] Updated weights for policy 0, policy_version 14140 (0.0034) +[2024-06-17 23:37:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39867.8, 300 sec: 39988.1). Total num frames: 231735296. Throughput: 0: 40200.5. Samples: 231822420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 23:37:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:37:49,050][12883] Updated weights for policy 0, policy_version 14150 (0.0040) +[2024-06-17 23:37:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40140.7, 300 sec: 40099.2). Total num frames: 231948288. Throughput: 0: 39990.2. Samples: 232059400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) +[2024-06-17 23:37:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:37:52,695][12883] Updated weights for policy 0, policy_version 14160 (0.0042) +[2024-06-17 23:37:56,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.8, 300 sec: 39988.1). Total num frames: 232128512. Throughput: 0: 40205.2. Samples: 232300140. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) +[2024-06-17 23:37:56,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-17 23:37:57,554][12883] Updated weights for policy 0, policy_version 14170 (0.0039) +[2024-06-17 23:38:00,782][12883] Updated weights for policy 0, policy_version 14180 (0.0047) +[2024-06-17 23:38:01,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39594.7, 300 sec: 39932.5). Total num frames: 232325120. Throughput: 0: 40032.5. Samples: 232416320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) +[2024-06-17 23:38:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:38:05,618][12883] Updated weights for policy 0, policy_version 14190 (0.0044) +[2024-06-17 23:38:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39867.8, 300 sec: 40154.7). Total num frames: 232538112. Throughput: 0: 40072.9. Samples: 232661680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) +[2024-06-17 23:38:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:38:09,589][12883] Updated weights for policy 0, policy_version 14200 (0.0029) +[2024-06-17 23:38:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.7, 300 sec: 40043.6). Total num frames: 232751104. Throughput: 0: 40226.4. Samples: 232903000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) +[2024-06-17 23:38:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:38:13,941][12883] Updated weights for policy 0, policy_version 14210 (0.0037) +[2024-06-17 23:38:16,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 232947712. Throughput: 0: 40113.0. Samples: 233022220. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) +[2024-06-17 23:38:16,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:38:17,512][12883] Updated weights for policy 0, policy_version 14220 (0.0045) +[2024-06-17 23:38:21,766][12883] Updated weights for policy 0, policy_version 14230 (0.0028) +[2024-06-17 23:38:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40686.9, 300 sec: 40043.6). Total num frames: 233144320. Throughput: 0: 40196.9. Samples: 233264960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 25.0) +[2024-06-17 23:38:21,995][12645] Avg episode reward: [(0, '0.011')] +[2024-06-17 23:38:25,383][12883] Updated weights for policy 0, policy_version 14240 (0.0037) +[2024-06-17 23:38:26,996][12645] Fps is (10 sec: 39312.8, 60 sec: 39866.3, 300 sec: 39987.8). Total num frames: 233340928. Throughput: 0: 40074.1. Samples: 233508440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 25.0) +[2024-06-17 23:38:26,996][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:38:30,184][12883] Updated weights for policy 0, policy_version 14250 (0.0032) +[2024-06-17 23:38:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 233553920. Throughput: 0: 40123.5. Samples: 233627980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) +[2024-06-17 23:38:31,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:38:33,406][12883] Updated weights for policy 0, policy_version 14260 (0.0034) +[2024-06-17 23:38:36,994][12645] Fps is (10 sec: 40968.9, 60 sec: 40140.8, 300 sec: 40044.4). Total num frames: 233750528. Throughput: 0: 40395.1. Samples: 233877180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) +[2024-06-17 23:38:36,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:38:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000014267_233750528.pth... 
+[2024-06-17 23:38:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013680_224133120.pth +[2024-06-17 23:38:38,150][12883] Updated weights for policy 0, policy_version 14270 (0.0039) +[2024-06-17 23:38:41,477][12862] Signal inference workers to stop experience collection... (3300 times) +[2024-06-17 23:38:41,478][12862] Signal inference workers to resume experience collection... (3300 times) +[2024-06-17 23:38:41,528][12883] InferenceWorker_p0-w0: stopping experience collection (3300 times) +[2024-06-17 23:38:41,528][12883] InferenceWorker_p0-w0: resuming experience collection (3300 times) +[2024-06-17 23:38:41,607][12883] Updated weights for policy 0, policy_version 14280 (0.0028) +[2024-06-17 23:38:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40414.0, 300 sec: 40154.7). Total num frames: 233963520. Throughput: 0: 40224.6. Samples: 234110240. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-17 23:38:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:38:46,092][12883] Updated weights for policy 0, policy_version 14290 (0.0034) +[2024-06-17 23:38:46,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40413.9, 300 sec: 40099.2). Total num frames: 234160128. Throughput: 0: 40491.1. Samples: 234238420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-17 23:38:46,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:38:49,767][12883] Updated weights for policy 0, policy_version 14300 (0.0040) +[2024-06-17 23:38:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40140.9, 300 sec: 40044.5). Total num frames: 234356736. Throughput: 0: 40259.5. Samples: 234473360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 23:38:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:38:53,960][12883] Updated weights for policy 0, policy_version 14310 (0.0028) +[2024-06-17 23:38:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40687.0, 300 sec: 40210.2). Total num frames: 234569728. 
Throughput: 0: 40322.1. Samples: 234717500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-17 23:38:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:38:57,659][12883] Updated weights for policy 0, policy_version 14320 (0.0047) +[2024-06-17 23:39:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40686.9, 300 sec: 40043.6). Total num frames: 234766336. Throughput: 0: 40448.0. Samples: 234842380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:39:01,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-17 23:39:02,371][12883] Updated weights for policy 0, policy_version 14330 (0.0035) +[2024-06-17 23:39:06,060][12883] Updated weights for policy 0, policy_version 14340 (0.0032) +[2024-06-17 23:39:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40413.7, 300 sec: 40154.7). Total num frames: 234962944. Throughput: 0: 40411.5. Samples: 235083480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:39:06,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:39:10,451][12883] Updated weights for policy 0, policy_version 14350 (0.0029) +[2024-06-17 23:39:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40413.8, 300 sec: 40210.5). Total num frames: 235175936. Throughput: 0: 40426.4. Samples: 235327540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-17 23:39:11,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:39:14,071][12883] Updated weights for policy 0, policy_version 14360 (0.0027) +[2024-06-17 23:39:16,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40140.8, 300 sec: 40043.6). Total num frames: 235356160. Throughput: 0: 40432.0. Samples: 235447420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-17 23:39:16,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:39:18,432][12883] Updated weights for policy 0, policy_version 14370 (0.0037) +[2024-06-17 23:39:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40686.9, 300 sec: 40265.8). Total num frames: 235585536. 
Throughput: 0: 40330.7. Samples: 235692060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-17 23:39:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:39:22,163][12883] Updated weights for policy 0, policy_version 14380 (0.0032) +[2024-06-17 23:39:26,606][12883] Updated weights for policy 0, policy_version 14390 (0.0040) +[2024-06-17 23:39:26,996][12645] Fps is (10 sec: 42588.9, 60 sec: 40687.0, 300 sec: 40154.4). Total num frames: 235782144. Throughput: 0: 40476.6. Samples: 235931780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-17 23:39:26,996][12645] Avg episode reward: [(0, '0.020')] +[2024-06-17 23:39:30,203][12883] Updated weights for policy 0, policy_version 14400 (0.0040) +[2024-06-17 23:39:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 235978752. Throughput: 0: 40357.8. Samples: 236054520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-17 23:39:31,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-17 23:39:34,585][12883] Updated weights for policy 0, policy_version 14410 (0.0039) +[2024-06-17 23:39:36,994][12645] Fps is (10 sec: 39329.9, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 236175360. Throughput: 0: 40515.0. Samples: 236296540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) +[2024-06-17 23:39:36,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:39:38,293][12883] Updated weights for policy 0, policy_version 14420 (0.0035) +[2024-06-17 23:39:41,994][12645] Fps is (10 sec: 37682.6, 60 sec: 39867.6, 300 sec: 40043.6). Total num frames: 236355584. Throughput: 0: 40607.5. Samples: 236544840. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) +[2024-06-17 23:39:41,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:39:42,743][12883] Updated weights for policy 0, policy_version 14430 (0.0032) +[2024-06-17 23:39:46,246][12883] Updated weights for policy 0, policy_version 14440 (0.0035) +[2024-06-17 23:39:47,000][12645] Fps is (10 sec: 44209.5, 60 sec: 40955.7, 300 sec: 40376.0). Total num frames: 236617728. Throughput: 0: 40430.3. Samples: 236662000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 23:39:47,001][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:39:50,968][12883] Updated weights for policy 0, policy_version 14450 (0.0046) +[2024-06-17 23:39:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 236781568. Throughput: 0: 40549.9. Samples: 236908220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 23:39:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:39:54,457][12883] Updated weights for policy 0, policy_version 14460 (0.0037) +[2024-06-17 23:39:56,994][12645] Fps is (10 sec: 34427.5, 60 sec: 39867.7, 300 sec: 40043.6). Total num frames: 236961792. Throughput: 0: 40428.8. Samples: 237146840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-17 23:39:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:39:59,263][12883] Updated weights for policy 0, policy_version 14470 (0.0032) +[2024-06-17 23:40:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 237191168. Throughput: 0: 40403.1. Samples: 237265560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-17 23:40:01,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:40:02,493][12883] Updated weights for policy 0, policy_version 14480 (0.0042) +[2024-06-17 23:40:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40140.9, 300 sec: 40043.6). Total num frames: 237371392. Throughput: 0: 40419.1. Samples: 237510920. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:40:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:40:07,291][12883] Updated weights for policy 0, policy_version 14490 (0.0030) +[2024-06-17 23:40:10,424][12862] Signal inference workers to stop experience collection... (3350 times) +[2024-06-17 23:40:10,424][12862] Signal inference workers to resume experience collection... (3350 times) +[2024-06-17 23:40:10,461][12883] InferenceWorker_p0-w0: stopping experience collection (3350 times) +[2024-06-17 23:40:10,462][12883] InferenceWorker_p0-w0: resuming experience collection (3350 times) +[2024-06-17 23:40:10,559][12883] Updated weights for policy 0, policy_version 14500 (0.0052) +[2024-06-17 23:40:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40687.0, 300 sec: 40321.3). Total num frames: 237617152. Throughput: 0: 40295.3. Samples: 237744980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:40:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:40:15,447][12883] Updated weights for policy 0, policy_version 14510 (0.0032) +[2024-06-17 23:40:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40686.9, 300 sec: 40265.8). Total num frames: 237797376. Throughput: 0: 40406.1. Samples: 237872800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) +[2024-06-17 23:40:16,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:40:18,833][12883] Updated weights for policy 0, policy_version 14520 (0.0036) +[2024-06-17 23:40:21,994][12645] Fps is (10 sec: 36045.1, 60 sec: 39867.8, 300 sec: 40154.7). Total num frames: 237977600. Throughput: 0: 40227.7. Samples: 238106780. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) +[2024-06-17 23:40:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:40:23,697][12883] Updated weights for policy 0, policy_version 14530 (0.0030) +[2024-06-17 23:40:26,693][12883] Updated weights for policy 0, policy_version 14540 (0.0036) +[2024-06-17 23:40:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40688.4, 300 sec: 40432.4). Total num frames: 238223360. Throughput: 0: 40162.7. Samples: 238352160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) +[2024-06-17 23:40:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:40:31,817][12883] Updated weights for policy 0, policy_version 14550 (0.0033) +[2024-06-17 23:40:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 238387200. Throughput: 0: 40410.1. Samples: 238480200. Policy #0 lag: (min: 0.0, avg: 13.2, max: 24.0) +[2024-06-17 23:40:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:40:34,984][12883] Updated weights for policy 0, policy_version 14560 (0.0052) +[2024-06-17 23:40:37,000][12645] Fps is (10 sec: 36022.4, 60 sec: 40136.7, 300 sec: 40098.3). Total num frames: 238583808. Throughput: 0: 40052.2. Samples: 238710820. Policy #0 lag: (min: 0.0, avg: 13.2, max: 24.0) +[2024-06-17 23:40:37,001][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:40:37,125][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000014563_238600192.pth... +[2024-06-17 23:40:37,187][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013973_228933632.pth +[2024-06-17 23:40:40,264][12883] Updated weights for policy 0, policy_version 14570 (0.0048) +[2024-06-17 23:40:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40687.0, 300 sec: 40265.8). Total num frames: 238796800. Throughput: 0: 40224.6. Samples: 238956940. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-17 23:40:41,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:40:43,103][12883] Updated weights for policy 0, policy_version 14580 (0.0047) +[2024-06-17 23:40:46,994][12645] Fps is (10 sec: 39346.0, 60 sec: 39325.7, 300 sec: 40154.7). Total num frames: 238977024. Throughput: 0: 40284.4. Samples: 239078360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-17 23:40:46,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-17 23:40:48,326][12883] Updated weights for policy 0, policy_version 14590 (0.0039) +[2024-06-17 23:40:51,286][12883] Updated weights for policy 0, policy_version 14600 (0.0028) +[2024-06-17 23:40:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40686.9, 300 sec: 40432.4). Total num frames: 239222784. Throughput: 0: 40147.1. Samples: 239317540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-17 23:40:51,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:40:56,617][12883] Updated weights for policy 0, policy_version 14610 (0.0039) +[2024-06-17 23:40:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 239386624. Throughput: 0: 40429.7. Samples: 239564320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-17 23:40:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:40:59,921][12883] Updated weights for policy 0, policy_version 14620 (0.0042) +[2024-06-17 23:41:01,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40140.8, 300 sec: 40155.0). Total num frames: 239599616. Throughput: 0: 40235.7. Samples: 239683400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 23:41:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:41:04,380][12883] Updated weights for policy 0, policy_version 14630 (0.0035) +[2024-06-17 23:41:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 40960.0, 300 sec: 40376.8). Total num frames: 239828992. Throughput: 0: 40653.6. Samples: 239936200. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-17 23:41:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:41:07,763][12883] Updated weights for policy 0, policy_version 14640 (0.0036) +[2024-06-17 23:41:11,993][12645] Fps is (10 sec: 40960.5, 60 sec: 39867.8, 300 sec: 40265.8). Total num frames: 240009216. Throughput: 0: 40459.8. Samples: 240172840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-17 23:41:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:41:12,281][12883] Updated weights for policy 0, policy_version 14650 (0.0032) +[2024-06-17 23:41:15,763][12883] Updated weights for policy 0, policy_version 14660 (0.0030) +[2024-06-17 23:41:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40413.9, 300 sec: 40210.2). Total num frames: 240222208. Throughput: 0: 40344.8. Samples: 240295720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-17 23:41:16,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:41:20,274][12883] Updated weights for policy 0, policy_version 14670 (0.0024) +[2024-06-17 23:41:21,994][12645] Fps is (10 sec: 39320.8, 60 sec: 40413.8, 300 sec: 40265.8). Total num frames: 240402432. Throughput: 0: 40736.8. Samples: 240543720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 23:41:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:41:24,211][12883] Updated weights for policy 0, policy_version 14680 (0.0028) +[2024-06-17 23:41:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40140.8, 300 sec: 40321.3). Total num frames: 240631808. Throughput: 0: 40603.5. Samples: 240784100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 23:41:26,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:41:28,109][12883] Updated weights for policy 0, policy_version 14690 (0.0048) +[2024-06-17 23:41:31,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40687.0, 300 sec: 40376.8). Total num frames: 240828416. Throughput: 0: 40602.8. Samples: 240905480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 23:41:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:41:32,069][12883] Updated weights for policy 0, policy_version 14700 (0.0032) +[2024-06-17 23:41:36,144][12883] Updated weights for policy 0, policy_version 14710 (0.0048) +[2024-06-17 23:41:36,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40691.2, 300 sec: 40321.3). Total num frames: 241025024. Throughput: 0: 40800.0. Samples: 241153540. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-17 23:41:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:41:39,993][12883] Updated weights for policy 0, policy_version 14720 (0.0039) +[2024-06-17 23:41:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40686.9, 300 sec: 40321.3). Total num frames: 241238016. Throughput: 0: 40609.9. Samples: 241391760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-17 23:41:41,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:41:43,537][12862] Signal inference workers to stop experience collection... (3400 times) +[2024-06-17 23:41:43,537][12862] Signal inference workers to resume experience collection... (3400 times) +[2024-06-17 23:41:43,551][12883] InferenceWorker_p0-w0: stopping experience collection (3400 times) +[2024-06-17 23:41:43,552][12883] InferenceWorker_p0-w0: resuming experience collection (3400 times) +[2024-06-17 23:41:44,542][12883] Updated weights for policy 0, policy_version 14730 (0.0033) +[2024-06-17 23:41:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40687.0, 300 sec: 40265.7). Total num frames: 241418240. Throughput: 0: 40683.9. Samples: 241514180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 23:41:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:41:47,973][12883] Updated weights for policy 0, policy_version 14740 (0.0043) +[2024-06-17 23:41:51,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40140.7, 300 sec: 40432.4). Total num frames: 241631232. 
Throughput: 0: 40460.0. Samples: 241756900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 23:41:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:41:52,545][12883] Updated weights for policy 0, policy_version 14750 (0.0034) +[2024-06-17 23:41:56,536][12883] Updated weights for policy 0, policy_version 14760 (0.0038) +[2024-06-17 23:41:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40960.0, 300 sec: 40321.3). Total num frames: 241844224. Throughput: 0: 40585.6. Samples: 241999200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-17 23:41:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:42:00,672][12883] Updated weights for policy 0, policy_version 14770 (0.0034) +[2024-06-17 23:42:01,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40413.8, 300 sec: 40265.8). Total num frames: 242024448. Throughput: 0: 40555.2. Samples: 242120700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-17 23:42:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:42:04,585][12883] Updated weights for policy 0, policy_version 14780 (0.0035) +[2024-06-17 23:42:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40140.8, 300 sec: 40265.7). Total num frames: 242237440. Throughput: 0: 40350.6. Samples: 242359500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-17 23:42:06,995][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:42:08,724][12883] Updated weights for policy 0, policy_version 14790 (0.0033) +[2024-06-17 23:42:11,994][12645] Fps is (10 sec: 42597.4, 60 sec: 40686.7, 300 sec: 40376.8). Total num frames: 242450432. Throughput: 0: 40547.8. Samples: 242608760. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-17 23:42:11,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:42:12,614][12883] Updated weights for policy 0, policy_version 14800 (0.0038) +[2024-06-17 23:42:16,733][12883] Updated weights for policy 0, policy_version 14810 (0.0032) +[2024-06-17 23:42:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40413.9, 300 sec: 40487.9). Total num frames: 242647040. Throughput: 0: 40477.8. Samples: 242726980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 23:42:16,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-17 23:42:20,543][12883] Updated weights for policy 0, policy_version 14820 (0.0031) +[2024-06-17 23:42:21,994][12645] Fps is (10 sec: 40961.1, 60 sec: 40960.0, 300 sec: 40376.9). Total num frames: 242860032. Throughput: 0: 40425.4. Samples: 242972680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 23:42:21,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:42:24,740][12883] Updated weights for policy 0, policy_version 14830 (0.0044) +[2024-06-17 23:42:26,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40413.8, 300 sec: 40376.8). Total num frames: 243056640. Throughput: 0: 40647.9. Samples: 243220920. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) +[2024-06-17 23:42:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:42:28,607][12883] Updated weights for policy 0, policy_version 14840 (0.0038) +[2024-06-17 23:42:31,994][12645] Fps is (10 sec: 37682.4, 60 sec: 40140.6, 300 sec: 40321.3). Total num frames: 243236864. Throughput: 0: 40460.8. Samples: 243334920. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) +[2024-06-17 23:42:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:42:33,278][12883] Updated weights for policy 0, policy_version 14850 (0.0035) +[2024-06-17 23:42:36,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40413.8, 300 sec: 40376.8). Total num frames: 243449856. Throughput: 0: 40524.4. Samples: 243580500. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) +[2024-06-17 23:42:36,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-17 23:42:37,027][12883] Updated weights for policy 0, policy_version 14860 (0.0037) +[2024-06-17 23:42:37,034][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000014860_243466240.pth... +[2024-06-17 23:42:37,097][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000014267_233750528.pth +[2024-06-17 23:42:41,323][12883] Updated weights for policy 0, policy_version 14870 (0.0032) +[2024-06-17 23:42:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40413.8, 300 sec: 40432.4). Total num frames: 243662848. Throughput: 0: 40588.4. Samples: 243825680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-17 23:42:41,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:42:45,060][12883] Updated weights for policy 0, policy_version 14880 (0.0056) +[2024-06-17 23:42:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40960.0, 300 sec: 40432.4). Total num frames: 243875840. Throughput: 0: 40561.7. Samples: 243945980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-17 23:42:46,996][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:42:49,341][12883] Updated weights for policy 0, policy_version 14890 (0.0037) +[2024-06-17 23:42:51,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40414.0, 300 sec: 40432.4). Total num frames: 244056064. Throughput: 0: 40715.7. Samples: 244191700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 23:42:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:42:52,965][12883] Updated weights for policy 0, policy_version 14900 (0.0033) +[2024-06-17 23:42:56,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40140.9, 300 sec: 40432.4). Total num frames: 244252672. Throughput: 0: 40647.8. Samples: 244437900. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 23:42:56,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:42:57,252][12883] Updated weights for policy 0, policy_version 14910 (0.0044) +[2024-06-17 23:43:00,965][12883] Updated weights for policy 0, policy_version 14920 (0.0040) +[2024-06-17 23:43:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41233.0, 300 sec: 40543.4). Total num frames: 244498432. Throughput: 0: 40725.2. Samples: 244559620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) +[2024-06-17 23:43:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:43:05,321][12883] Updated weights for policy 0, policy_version 14930 (0.0049) +[2024-06-17 23:43:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40413.9, 300 sec: 40376.8). Total num frames: 244662272. Throughput: 0: 40556.0. Samples: 244797700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) +[2024-06-17 23:43:06,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-17 23:43:09,514][12862] Signal inference workers to stop experience collection... (3450 times) +[2024-06-17 23:43:09,547][12883] InferenceWorker_p0-w0: stopping experience collection (3450 times) +[2024-06-17 23:43:09,574][12862] Signal inference workers to resume experience collection... (3450 times) +[2024-06-17 23:43:09,575][12883] InferenceWorker_p0-w0: resuming experience collection (3450 times) +[2024-06-17 23:43:09,579][12883] Updated weights for policy 0, policy_version 14940 (0.0045) +[2024-06-17 23:43:11,996][12645] Fps is (10 sec: 37675.3, 60 sec: 40412.6, 300 sec: 40432.1). Total num frames: 244875264. Throughput: 0: 40309.7. Samples: 245034940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-17 23:43:11,996][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:43:13,460][12883] Updated weights for policy 0, policy_version 14950 (0.0035) +[2024-06-17 23:43:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.8, 300 sec: 40376.8). Total num frames: 245055488. 
Throughput: 0: 40480.2. Samples: 245156520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-17 23:43:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:43:17,511][12883] Updated weights for policy 0, policy_version 14960 (0.0036) +[2024-06-17 23:43:21,525][12883] Updated weights for policy 0, policy_version 14970 (0.0035) +[2024-06-17 23:43:21,994][12645] Fps is (10 sec: 40968.6, 60 sec: 40413.8, 300 sec: 40488.2). Total num frames: 245284864. Throughput: 0: 40428.9. Samples: 245399800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-17 23:43:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:43:25,402][12883] Updated weights for policy 0, policy_version 14980 (0.0035) +[2024-06-17 23:43:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40414.0, 300 sec: 40432.4). Total num frames: 245481472. Throughput: 0: 40426.8. Samples: 245644880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 23:43:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:43:29,370][12883] Updated weights for policy 0, policy_version 14990 (0.0030) +[2024-06-17 23:43:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40687.0, 300 sec: 40432.4). Total num frames: 245678080. Throughput: 0: 40358.2. Samples: 245762100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 23:43:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:43:33,481][12883] Updated weights for policy 0, policy_version 15000 (0.0034) +[2024-06-17 23:43:36,996][12645] Fps is (10 sec: 39312.3, 60 sec: 40412.4, 300 sec: 40376.5). Total num frames: 245874688. Throughput: 0: 40287.2. Samples: 246004720. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) +[2024-06-17 23:43:36,997][12645] Avg episode reward: [(0, '0.011')] +[2024-06-17 23:43:37,394][12883] Updated weights for policy 0, policy_version 15010 (0.0028) +[2024-06-17 23:43:41,582][12883] Updated weights for policy 0, policy_version 15020 (0.0041) +[2024-06-17 23:43:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40686.9, 300 sec: 40487.9). Total num frames: 246104064. Throughput: 0: 40193.7. Samples: 246246620. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) +[2024-06-17 23:43:41,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:43:45,354][12883] Updated weights for policy 0, policy_version 15030 (0.0035) +[2024-06-17 23:43:46,994][12645] Fps is (10 sec: 40969.0, 60 sec: 40140.8, 300 sec: 40432.4). Total num frames: 246284288. Throughput: 0: 40285.8. Samples: 246372480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-17 23:43:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:43:49,754][12883] Updated weights for policy 0, policy_version 15040 (0.0039) +[2024-06-17 23:43:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40686.8, 300 sec: 40432.4). Total num frames: 246497280. Throughput: 0: 40306.6. Samples: 246611500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-17 23:43:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:43:53,619][12883] Updated weights for policy 0, policy_version 15050 (0.0048) +[2024-06-17 23:43:56,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40687.0, 300 sec: 40432.4). Total num frames: 246693888. Throughput: 0: 40544.7. Samples: 246859360. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-17 23:43:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:43:57,873][12883] Updated weights for policy 0, policy_version 15060 (0.0038) +[2024-06-17 23:44:01,648][12883] Updated weights for policy 0, policy_version 15070 (0.0032) +[2024-06-17 23:44:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40140.8, 300 sec: 40487.9). Total num frames: 246906880. Throughput: 0: 40504.0. Samples: 246979200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-17 23:44:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:44:05,793][12883] Updated weights for policy 0, policy_version 15080 (0.0029) +[2024-06-17 23:44:06,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40959.9, 300 sec: 40487.9). Total num frames: 247119872. Throughput: 0: 40535.1. Samples: 247223880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-17 23:44:06,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-17 23:44:09,758][12883] Updated weights for policy 0, policy_version 15090 (0.0028) +[2024-06-17 23:44:11,996][12645] Fps is (10 sec: 37674.5, 60 sec: 40140.7, 300 sec: 40432.1). Total num frames: 247283712. Throughput: 0: 40545.9. Samples: 247469540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-17 23:44:11,997][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:44:13,906][12883] Updated weights for policy 0, policy_version 15100 (0.0043) +[2024-06-17 23:44:16,994][12645] Fps is (10 sec: 39322.3, 60 sec: 40960.0, 300 sec: 40432.4). Total num frames: 247513088. Throughput: 0: 40379.2. Samples: 247579160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-17 23:44:16,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-17 23:44:17,898][12883] Updated weights for policy 0, policy_version 15110 (0.0029) +[2024-06-17 23:44:21,994][12645] Fps is (10 sec: 42608.6, 60 sec: 40414.0, 300 sec: 40432.7). Total num frames: 247709696. Throughput: 0: 40476.3. Samples: 247826060. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-17 23:44:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:44:22,074][12883] Updated weights for policy 0, policy_version 15120 (0.0034) +[2024-06-17 23:44:23,851][12862] Signal inference workers to stop experience collection... (3500 times) +[2024-06-17 23:44:23,851][12862] Signal inference workers to resume experience collection... (3500 times) +[2024-06-17 23:44:23,864][12883] InferenceWorker_p0-w0: stopping experience collection (3500 times) +[2024-06-17 23:44:23,890][12883] InferenceWorker_p0-w0: resuming experience collection (3500 times) +[2024-06-17 23:44:26,230][12883] Updated weights for policy 0, policy_version 15130 (0.0030) +[2024-06-17 23:44:26,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40140.7, 300 sec: 40376.8). Total num frames: 247889920. Throughput: 0: 40513.8. Samples: 248069740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-17 23:44:26,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:44:29,975][12883] Updated weights for policy 0, policy_version 15140 (0.0034) +[2024-06-17 23:44:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40960.0, 300 sec: 40543.5). Total num frames: 248135680. Throughput: 0: 40373.8. Samples: 248189300. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) +[2024-06-17 23:44:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:44:34,233][12883] Updated weights for policy 0, policy_version 15150 (0.0025) +[2024-06-17 23:44:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40142.4, 300 sec: 40432.4). Total num frames: 248283136. Throughput: 0: 40453.5. Samples: 248431900. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) +[2024-06-17 23:44:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:44:37,067][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000015155_248299520.pth... 
+[2024-06-17 23:44:37,130][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000014563_238600192.pth +[2024-06-17 23:44:38,173][12883] Updated weights for policy 0, policy_version 15160 (0.0044) +[2024-06-17 23:44:41,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40140.8, 300 sec: 40322.2). Total num frames: 248512512. Throughput: 0: 40259.0. Samples: 248671020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-17 23:44:41,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:44:42,833][12883] Updated weights for policy 0, policy_version 15170 (0.0060) +[2024-06-17 23:44:46,662][12883] Updated weights for policy 0, policy_version 15180 (0.0046) +[2024-06-17 23:44:46,994][12645] Fps is (10 sec: 44235.6, 60 sec: 40686.9, 300 sec: 40487.9). Total num frames: 248725504. Throughput: 0: 40408.7. Samples: 248797600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-17 23:44:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:44:50,971][12883] Updated weights for policy 0, policy_version 15190 (0.0044) +[2024-06-17 23:44:51,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39867.7, 300 sec: 40432.4). Total num frames: 248889344. Throughput: 0: 40333.8. Samples: 249038900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-17 23:44:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:44:54,803][12883] Updated weights for policy 0, policy_version 15200 (0.0039) +[2024-06-17 23:44:56,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40413.8, 300 sec: 40432.4). Total num frames: 249118720. Throughput: 0: 40056.7. Samples: 249272000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:44:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:45:00,057][12883] Updated weights for policy 0, policy_version 15210 (0.0030) +[2024-06-17 23:45:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 40413.9, 300 sec: 40543.5). Total num frames: 249331712. Throughput: 0: 40399.9. Samples: 249397160. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:45:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:45:02,859][12883] Updated weights for policy 0, policy_version 15220 (0.0036) +[2024-06-17 23:45:06,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39594.7, 300 sec: 40265.7). Total num frames: 249495552. Throughput: 0: 40182.1. Samples: 249634260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 19.0) +[2024-06-17 23:45:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:45:07,733][12883] Updated weights for policy 0, policy_version 15230 (0.0045) +[2024-06-17 23:45:10,831][12883] Updated weights for policy 0, policy_version 15240 (0.0034) +[2024-06-17 23:45:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40961.6, 300 sec: 40487.9). Total num frames: 249741312. Throughput: 0: 40099.2. Samples: 249874200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 19.0) +[2024-06-17 23:45:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:45:15,694][12883] Updated weights for policy 0, policy_version 15250 (0.0030) +[2024-06-17 23:45:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 40140.8, 300 sec: 40487.9). Total num frames: 249921536. Throughput: 0: 40335.6. Samples: 250004400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 23:45:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:45:18,623][12883] Updated weights for policy 0, policy_version 15260 (0.0029) +[2024-06-17 23:45:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40140.8, 300 sec: 40321.3). Total num frames: 250118144. Throughput: 0: 40135.1. Samples: 250237980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-17 23:45:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:45:23,766][12883] Updated weights for policy 0, policy_version 15270 (0.0031) +[2024-06-17 23:45:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40686.9, 300 sec: 40487.9). Total num frames: 250331136. Throughput: 0: 40275.5. Samples: 250483420. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 23:45:26,995][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:45:27,214][12883] Updated weights for policy 0, policy_version 15280 (0.0044) +[2024-06-17 23:45:31,809][12883] Updated weights for policy 0, policy_version 15290 (0.0029) +[2024-06-17 23:45:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39594.7, 300 sec: 40433.2). Total num frames: 250511360. Throughput: 0: 40122.4. Samples: 250603100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 23:45:31,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:45:35,335][12883] Updated weights for policy 0, policy_version 15300 (0.0043) +[2024-06-17 23:45:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40959.9, 300 sec: 40487.9). Total num frames: 250740736. Throughput: 0: 40213.8. Samples: 250848520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 23:45:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:45:39,972][12883] Updated weights for policy 0, policy_version 15310 (0.0034) +[2024-06-17 23:45:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40413.9, 300 sec: 40543.5). Total num frames: 250937344. Throughput: 0: 40450.7. Samples: 251092280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 23:45:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:45:43,523][12883] Updated weights for policy 0, policy_version 15320 (0.0048) +[2024-06-17 23:45:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40140.9, 300 sec: 40376.8). Total num frames: 251133952. Throughput: 0: 40286.2. Samples: 251210040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 23:45:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:45:47,919][12883] Updated weights for policy 0, policy_version 15330 (0.0043) +[2024-06-17 23:45:49,668][12862] Signal inference workers to stop experience collection... 
(3550 times) +[2024-06-17 23:45:49,708][12883] InferenceWorker_p0-w0: stopping experience collection (3550 times) +[2024-06-17 23:45:49,717][12862] Signal inference workers to resume experience collection... (3550 times) +[2024-06-17 23:45:49,727][12883] InferenceWorker_p0-w0: resuming experience collection (3550 times) +[2024-06-17 23:45:51,623][12883] Updated weights for policy 0, policy_version 15340 (0.0033) +[2024-06-17 23:45:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 40543.5). Total num frames: 251346944. Throughput: 0: 40398.3. Samples: 251452180. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) +[2024-06-17 23:45:51,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-17 23:45:55,799][12883] Updated weights for policy 0, policy_version 15350 (0.0040) +[2024-06-17 23:45:56,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.8, 300 sec: 40376.8). Total num frames: 251510784. Throughput: 0: 40467.1. Samples: 251695220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) +[2024-06-17 23:45:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:45:59,902][12883] Updated weights for policy 0, policy_version 15360 (0.0036) +[2024-06-17 23:46:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40140.8, 300 sec: 40376.9). Total num frames: 251740160. Throughput: 0: 40207.6. Samples: 251813740. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) +[2024-06-17 23:46:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:46:03,823][12883] Updated weights for policy 0, policy_version 15370 (0.0037) +[2024-06-17 23:46:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40687.0, 300 sec: 40432.4). Total num frames: 251936768. Throughput: 0: 40410.1. Samples: 252056440. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) +[2024-06-17 23:46:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:46:07,845][12883] Updated weights for policy 0, policy_version 15380 (0.0045) +[2024-06-17 23:46:11,968][12883] Updated weights for policy 0, policy_version 15390 (0.0038) +[2024-06-17 23:46:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40140.8, 300 sec: 40432.4). Total num frames: 252149760. Throughput: 0: 40267.1. Samples: 252295440. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) +[2024-06-17 23:46:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:46:16,501][12883] Updated weights for policy 0, policy_version 15400 (0.0034) +[2024-06-17 23:46:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40140.8, 300 sec: 40432.4). Total num frames: 252329984. Throughput: 0: 40152.4. Samples: 252409960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:46:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:46:20,258][12883] Updated weights for policy 0, policy_version 15410 (0.0043) +[2024-06-17 23:46:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40140.7, 300 sec: 40321.3). Total num frames: 252526592. Throughput: 0: 39949.4. Samples: 252646240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:46:22,003][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:46:24,780][12883] Updated weights for policy 0, policy_version 15420 (0.0041) +[2024-06-17 23:46:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.9, 300 sec: 40376.8). Total num frames: 252739584. Throughput: 0: 39795.2. Samples: 252883060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 23:46:27,003][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:46:28,565][12883] Updated weights for policy 0, policy_version 15430 (0.0057) +[2024-06-17 23:46:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.7, 300 sec: 40321.3). Total num frames: 252919808. Throughput: 0: 39928.0. Samples: 253006800. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 23:46:31,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:46:32,798][12883] Updated weights for policy 0, policy_version 15440 (0.0037) +[2024-06-17 23:46:36,908][12883] Updated weights for policy 0, policy_version 15450 (0.0042) +[2024-06-17 23:46:36,996][12645] Fps is (10 sec: 39312.2, 60 sec: 39866.3, 300 sec: 40321.0). Total num frames: 253132800. Throughput: 0: 39874.0. Samples: 253246600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 23:46:36,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:46:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000015450_253132800.pth... +[2024-06-17 23:46:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000014860_243466240.pth +[2024-06-17 23:46:41,165][12883] Updated weights for policy 0, policy_version 15460 (0.0032) +[2024-06-17 23:46:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39867.7, 300 sec: 40376.8). Total num frames: 253329408. Throughput: 0: 39698.2. Samples: 253481640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 23:46:41,996][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:46:45,176][12883] Updated weights for policy 0, policy_version 15470 (0.0038) +[2024-06-17 23:46:46,994][12645] Fps is (10 sec: 37692.2, 60 sec: 39594.7, 300 sec: 40265.8). Total num frames: 253509632. Throughput: 0: 39724.5. Samples: 253601340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-17 23:46:46,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:46:49,108][12883] Updated weights for policy 0, policy_version 15480 (0.0054) +[2024-06-17 23:46:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39594.6, 300 sec: 40265.8). Total num frames: 253722624. Throughput: 0: 39713.7. Samples: 253843560. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 23:46:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:46:53,308][12883] Updated weights for policy 0, policy_version 15490 (0.0051) +[2024-06-17 23:46:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40140.8, 300 sec: 40321.3). Total num frames: 253919232. Throughput: 0: 39895.2. Samples: 254090720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 23:46:56,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:46:57,431][12883] Updated weights for policy 0, policy_version 15500 (0.0051) +[2024-06-17 23:47:01,265][12883] Updated weights for policy 0, policy_version 15510 (0.0041) +[2024-06-17 23:47:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39867.7, 300 sec: 40321.3). Total num frames: 254132224. Throughput: 0: 39923.1. Samples: 254206500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 23:47:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:47:05,440][12883] Updated weights for policy 0, policy_version 15520 (0.0043) +[2024-06-17 23:47:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40140.9, 300 sec: 40321.3). Total num frames: 254345216. Throughput: 0: 40048.5. Samples: 254448420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 23:47:06,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:47:09,681][12883] Updated weights for policy 0, policy_version 15530 (0.0032) +[2024-06-17 23:47:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39594.7, 300 sec: 40265.8). Total num frames: 254525440. Throughput: 0: 40251.9. Samples: 254694400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-17 23:47:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:47:13,455][12883] Updated weights for policy 0, policy_version 15540 (0.0031) +[2024-06-17 23:47:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40140.8, 300 sec: 40265.8). Total num frames: 254738432. Throughput: 0: 40115.6. Samples: 254812000. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-17 23:47:16,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:47:17,573][12883] Updated weights for policy 0, policy_version 15550 (0.0040) +[2024-06-17 23:47:21,523][12883] Updated weights for policy 0, policy_version 15560 (0.0043) +[2024-06-17 23:47:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40413.8, 300 sec: 40321.3). Total num frames: 254951424. Throughput: 0: 40331.8. Samples: 255061440. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-17 23:47:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:47:25,558][12883] Updated weights for policy 0, policy_version 15570 (0.0058) +[2024-06-17 23:47:26,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39867.7, 300 sec: 40321.3). Total num frames: 255131648. Throughput: 0: 40345.0. Samples: 255297160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-17 23:47:26,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-17 23:47:29,601][12883] Updated weights for policy 0, policy_version 15580 (0.0040) +[2024-06-17 23:47:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40686.9, 300 sec: 40376.8). Total num frames: 255361024. Throughput: 0: 40387.9. Samples: 255418800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-17 23:47:31,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:47:34,087][12883] Updated weights for policy 0, policy_version 15590 (0.0030) +[2024-06-17 23:47:36,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39596.2, 300 sec: 40154.7). Total num frames: 255508480. Throughput: 0: 40419.7. Samples: 255662440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-17 23:47:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:47:38,130][12883] Updated weights for policy 0, policy_version 15600 (0.0046) +[2024-06-17 23:47:38,877][12862] Signal inference workers to stop experience collection... 
(3600 times) +[2024-06-17 23:47:38,933][12883] InferenceWorker_p0-w0: stopping experience collection (3600 times) +[2024-06-17 23:47:38,932][12862] Signal inference workers to resume experience collection... (3600 times) +[2024-06-17 23:47:38,951][12883] InferenceWorker_p0-w0: resuming experience collection (3600 times) +[2024-06-17 23:47:41,994][12645] Fps is (10 sec: 37683.0, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 255737856. Throughput: 0: 40248.8. Samples: 255901920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-17 23:47:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:47:42,078][12883] Updated weights for policy 0, policy_version 15610 (0.0034) +[2024-06-17 23:47:46,423][12883] Updated weights for policy 0, policy_version 15620 (0.0041) +[2024-06-17 23:47:46,994][12645] Fps is (10 sec: 42597.7, 60 sec: 40413.7, 300 sec: 40265.7). Total num frames: 255934464. Throughput: 0: 40378.5. Samples: 256023540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 23:47:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:47:50,467][12883] Updated weights for policy 0, policy_version 15630 (0.0045) +[2024-06-17 23:47:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40413.9, 300 sec: 40321.3). Total num frames: 256147456. Throughput: 0: 40350.2. Samples: 256264180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-17 23:47:51,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-17 23:47:54,542][12883] Updated weights for policy 0, policy_version 15640 (0.0040) +[2024-06-17 23:47:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 256344064. Throughput: 0: 40141.4. Samples: 256500760. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) +[2024-06-17 23:47:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:47:58,519][12883] Updated weights for policy 0, policy_version 15650 (0.0038) +[2024-06-17 23:48:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40140.7, 300 sec: 40265.7). Total num frames: 256540672. Throughput: 0: 40218.5. Samples: 256621840. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) +[2024-06-17 23:48:01,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:48:02,969][12883] Updated weights for policy 0, policy_version 15660 (0.0034) +[2024-06-17 23:48:06,570][12883] Updated weights for policy 0, policy_version 15670 (0.0040) +[2024-06-17 23:48:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40140.8, 300 sec: 40266.1). Total num frames: 256753664. Throughput: 0: 39994.8. Samples: 256861200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 23:48:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:48:11,174][12883] Updated weights for policy 0, policy_version 15680 (0.0035) +[2024-06-17 23:48:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.8, 300 sec: 40321.3). Total num frames: 256950272. Throughput: 0: 40245.6. Samples: 257108220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 23:48:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:48:14,382][12883] Updated weights for policy 0, policy_version 15690 (0.0038) +[2024-06-17 23:48:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 257146880. Throughput: 0: 40121.0. Samples: 257224240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 23:48:16,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:48:18,987][12883] Updated weights for policy 0, policy_version 15700 (0.0038) +[2024-06-17 23:48:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40140.8, 300 sec: 40265.7). Total num frames: 257359872. Throughput: 0: 40190.2. Samples: 257471000. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) +[2024-06-17 23:48:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:48:22,649][12883] Updated weights for policy 0, policy_version 15710 (0.0052) +[2024-06-17 23:48:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 257540096. Throughput: 0: 40156.1. Samples: 257708940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) +[2024-06-17 23:48:26,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:48:27,349][12883] Updated weights for policy 0, policy_version 15720 (0.0041) +[2024-06-17 23:48:30,558][12883] Updated weights for policy 0, policy_version 15730 (0.0030) +[2024-06-17 23:48:31,996][12645] Fps is (10 sec: 39313.0, 60 sec: 39866.3, 300 sec: 40265.8). Total num frames: 257753088. Throughput: 0: 40088.3. Samples: 257827600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-17 23:48:31,996][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:48:35,275][12883] Updated weights for policy 0, policy_version 15740 (0.0033) +[2024-06-17 23:48:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40686.9, 300 sec: 40154.7). Total num frames: 257949696. Throughput: 0: 40209.4. Samples: 258073600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-17 23:48:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:48:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000015744_257949696.pth... +[2024-06-17 23:48:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000015155_248299520.pth +[2024-06-17 23:48:38,549][12883] Updated weights for policy 0, policy_version 15750 (0.0035) +[2024-06-17 23:48:41,994][12645] Fps is (10 sec: 39330.4, 60 sec: 40140.9, 300 sec: 40210.2). Total num frames: 258146304. Throughput: 0: 40277.8. Samples: 258313260. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:48:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:48:43,466][12883] Updated weights for policy 0, policy_version 15760 (0.0048) +[2024-06-17 23:48:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 258342912. Throughput: 0: 40288.0. Samples: 258434800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:48:46,996][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:48:47,374][12883] Updated weights for policy 0, policy_version 15770 (0.0044) +[2024-06-17 23:48:51,643][12883] Updated weights for policy 0, policy_version 15780 (0.0030) +[2024-06-17 23:48:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 258555904. Throughput: 0: 40365.3. Samples: 258677640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:48:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:48:55,355][12883] Updated weights for policy 0, policy_version 15790 (0.0047) +[2024-06-17 23:48:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39867.7, 300 sec: 40099.1). Total num frames: 258736128. Throughput: 0: 40244.5. Samples: 258919220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-17 23:48:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:48:59,673][12883] Updated weights for policy 0, policy_version 15800 (0.0033) +[2024-06-17 23:49:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40414.0, 300 sec: 40154.7). Total num frames: 258965504. Throughput: 0: 40301.7. Samples: 259037820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-17 23:49:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:49:03,497][12883] Updated weights for policy 0, policy_version 15810 (0.0046) +[2024-06-17 23:49:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.7, 300 sec: 40210.5). Total num frames: 259145728. Throughput: 0: 40168.0. Samples: 259278560. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 24.0) +[2024-06-17 23:49:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:49:07,667][12883] Updated weights for policy 0, policy_version 15820 (0.0037) +[2024-06-17 23:49:11,112][12862] Signal inference workers to stop experience collection... (3650 times) +[2024-06-17 23:49:11,113][12862] Signal inference workers to resume experience collection... (3650 times) +[2024-06-17 23:49:11,135][12883] InferenceWorker_p0-w0: stopping experience collection (3650 times) +[2024-06-17 23:49:11,136][12883] InferenceWorker_p0-w0: resuming experience collection (3650 times) +[2024-06-17 23:49:11,421][12883] Updated weights for policy 0, policy_version 15830 (0.0033) +[2024-06-17 23:49:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40413.9, 300 sec: 40210.2). Total num frames: 259375104. Throughput: 0: 40211.9. Samples: 259518480. Policy #0 lag: (min: 1.0, avg: 12.9, max: 24.0) +[2024-06-17 23:49:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:49:16,256][12883] Updated weights for policy 0, policy_version 15840 (0.0042) +[2024-06-17 23:49:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 259571712. Throughput: 0: 40334.4. Samples: 259642560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-17 23:49:16,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:49:19,421][12883] Updated weights for policy 0, policy_version 15850 (0.0036) +[2024-06-17 23:49:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39867.7, 300 sec: 40210.2). Total num frames: 259751936. Throughput: 0: 40210.6. Samples: 259883080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-17 23:49:21,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:49:24,225][12883] Updated weights for policy 0, policy_version 15860 (0.0041) +[2024-06-17 23:49:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40687.0, 300 sec: 40154.7). Total num frames: 259981312. 
Throughput: 0: 40276.9. Samples: 260125720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-17 23:49:26,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:49:27,499][12883] Updated weights for policy 0, policy_version 15870 (0.0034) +[2024-06-17 23:49:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40142.3, 300 sec: 40265.8). Total num frames: 260161536. Throughput: 0: 40306.4. Samples: 260248580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-17 23:49:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:49:32,128][12883] Updated weights for policy 0, policy_version 15880 (0.0038) +[2024-06-17 23:49:35,799][12883] Updated weights for policy 0, policy_version 15890 (0.0042) +[2024-06-17 23:49:36,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 260374528. Throughput: 0: 40213.8. Samples: 260487260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-17 23:49:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:49:40,117][12883] Updated weights for policy 0, policy_version 15900 (0.0041) +[2024-06-17 23:49:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 260571136. Throughput: 0: 40106.6. Samples: 260724020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-17 23:49:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:49:44,328][12883] Updated weights for policy 0, policy_version 15910 (0.0036) +[2024-06-17 23:49:46,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40140.9, 300 sec: 40210.3). Total num frames: 260751360. Throughput: 0: 40103.2. Samples: 260842460. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-17 23:49:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:49:48,407][12883] Updated weights for policy 0, policy_version 15920 (0.0038) +[2024-06-17 23:49:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 260964352. 
Throughput: 0: 40167.2. Samples: 261086080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 23:49:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:49:52,327][12883] Updated weights for policy 0, policy_version 15930 (0.0036) +[2024-06-17 23:49:56,963][12883] Updated weights for policy 0, policy_version 15940 (0.0038) +[2024-06-17 23:49:56,994][12645] Fps is (10 sec: 40959.1, 60 sec: 40413.8, 300 sec: 40099.1). Total num frames: 261160960. Throughput: 0: 40305.3. Samples: 261332220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-17 23:49:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:50:00,323][12883] Updated weights for policy 0, policy_version 15950 (0.0038) +[2024-06-17 23:50:01,994][12645] Fps is (10 sec: 42597.4, 60 sec: 40413.7, 300 sec: 40321.3). Total num frames: 261390336. Throughput: 0: 40160.4. Samples: 261449780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-17 23:50:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:50:04,881][12883] Updated weights for policy 0, policy_version 15960 (0.0035) +[2024-06-17 23:50:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40413.7, 300 sec: 40099.1). Total num frames: 261570560. Throughput: 0: 40265.6. Samples: 261695040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-17 23:50:06,995][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:50:08,252][12883] Updated weights for policy 0, policy_version 15970 (0.0031) +[2024-06-17 23:50:11,994][12645] Fps is (10 sec: 37683.9, 60 sec: 39867.8, 300 sec: 40154.7). Total num frames: 261767168. Throughput: 0: 40132.8. Samples: 261931700. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-17 23:50:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:50:12,841][12883] Updated weights for policy 0, policy_version 15980 (0.0037) +[2024-06-17 23:50:16,238][12883] Updated weights for policy 0, policy_version 15990 (0.0048) +[2024-06-17 23:50:16,994][12645] Fps is (10 sec: 40961.1, 60 sec: 40140.9, 300 sec: 40210.2). Total num frames: 261980160. Throughput: 0: 40148.0. Samples: 262055240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 23:50:16,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:50:21,439][12883] Updated weights for policy 0, policy_version 16000 (0.0050) +[2024-06-17 23:50:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40140.8, 300 sec: 40099.2). Total num frames: 262160384. Throughput: 0: 40039.6. Samples: 262289040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-17 23:50:21,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:50:24,560][12883] Updated weights for policy 0, policy_version 16010 (0.0033) +[2024-06-17 23:50:26,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40140.7, 300 sec: 40265.7). Total num frames: 262389760. Throughput: 0: 40249.7. Samples: 262535260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 23:50:26,995][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:50:29,423][12883] Updated weights for policy 0, policy_version 16020 (0.0039) +[2024-06-17 23:50:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40140.8, 300 sec: 40099.2). Total num frames: 262569984. Throughput: 0: 40357.2. Samples: 262658540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 23:50:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:50:32,883][12883] Updated weights for policy 0, policy_version 16030 (0.0045) +[2024-06-17 23:50:36,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 262782976. Throughput: 0: 40235.5. Samples: 262896680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:50:36,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:50:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016039_262782976.pth... +[2024-06-17 23:50:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000015450_253132800.pth +[2024-06-17 23:50:37,340][12883] Updated weights for policy 0, policy_version 16040 (0.0035) +[2024-06-17 23:50:39,426][12862] Signal inference workers to stop experience collection... (3700 times) +[2024-06-17 23:50:39,448][12883] InferenceWorker_p0-w0: stopping experience collection (3700 times) +[2024-06-17 23:50:39,485][12862] Signal inference workers to resume experience collection... (3700 times) +[2024-06-17 23:50:39,485][12883] InferenceWorker_p0-w0: resuming experience collection (3700 times) +[2024-06-17 23:50:41,048][12883] Updated weights for policy 0, policy_version 16050 (0.0033) +[2024-06-17 23:50:41,996][12645] Fps is (10 sec: 42588.9, 60 sec: 40412.4, 300 sec: 40209.9). Total num frames: 262995968. Throughput: 0: 40106.1. Samples: 263137080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:50:41,997][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:50:45,373][12883] Updated weights for policy 0, policy_version 16060 (0.0048) +[2024-06-17 23:50:46,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40140.6, 300 sec: 40043.6). Total num frames: 263159808. Throughput: 0: 40256.4. Samples: 263261320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:50:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:50:49,239][12883] Updated weights for policy 0, policy_version 16070 (0.0038) +[2024-06-17 23:50:51,996][12645] Fps is (10 sec: 39321.7, 60 sec: 40412.3, 300 sec: 40265.5). Total num frames: 263389184. Throughput: 0: 40096.9. Samples: 263499480. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-17 23:50:51,996][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:50:53,499][12883] Updated weights for policy 0, policy_version 16080 (0.0041) +[2024-06-17 23:50:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40140.8, 300 sec: 40099.1). Total num frames: 263569408. Throughput: 0: 40273.7. Samples: 263744020. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-17 23:50:56,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:50:57,859][12883] Updated weights for policy 0, policy_version 16090 (0.0048) +[2024-06-17 23:51:01,511][12883] Updated weights for policy 0, policy_version 16100 (0.0030) +[2024-06-17 23:51:01,994][12645] Fps is (10 sec: 39330.7, 60 sec: 39867.9, 300 sec: 40154.7). Total num frames: 263782400. Throughput: 0: 40024.5. Samples: 263856340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-17 23:51:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:51:05,951][12883] Updated weights for policy 0, policy_version 16110 (0.0041) +[2024-06-17 23:51:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 40687.1, 300 sec: 40210.2). Total num frames: 264011776. Throughput: 0: 40302.6. Samples: 264102660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-17 23:51:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:51:09,516][12883] Updated weights for policy 0, policy_version 16120 (0.0050) +[2024-06-17 23:51:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 264175616. Throughput: 0: 40169.0. Samples: 264342860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-17 23:51:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:51:13,956][12883] Updated weights for policy 0, policy_version 16130 (0.0051) +[2024-06-17 23:51:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40413.8, 300 sec: 40265.8). Total num frames: 264404992. Throughput: 0: 40094.6. Samples: 264462800. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-17 23:51:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:51:17,618][12883] Updated weights for policy 0, policy_version 16140 (0.0028) +[2024-06-17 23:51:21,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39867.7, 300 sec: 40043.6). Total num frames: 264552448. Throughput: 0: 40189.7. Samples: 264705220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-17 23:51:21,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-17 23:51:22,488][12883] Updated weights for policy 0, policy_version 16150 (0.0028) +[2024-06-17 23:51:26,260][12883] Updated weights for policy 0, policy_version 16160 (0.0037) +[2024-06-17 23:51:26,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39867.7, 300 sec: 40210.2). Total num frames: 264781824. Throughput: 0: 40156.5. Samples: 264944040. Policy #0 lag: (min: 2.0, avg: 10.4, max: 25.0) +[2024-06-17 23:51:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:51:30,589][12883] Updated weights for policy 0, policy_version 16170 (0.0040) +[2024-06-17 23:51:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 40686.9, 300 sec: 40266.1). Total num frames: 265011200. Throughput: 0: 40218.3. Samples: 265071140. Policy #0 lag: (min: 2.0, avg: 10.4, max: 25.0) +[2024-06-17 23:51:31,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:51:34,443][12883] Updated weights for policy 0, policy_version 16180 (0.0026) +[2024-06-17 23:51:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39867.7, 300 sec: 40154.7). Total num frames: 265175040. Throughput: 0: 40219.8. Samples: 265309280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 23:51:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:51:38,707][12883] Updated weights for policy 0, policy_version 16190 (0.0040) +[2024-06-17 23:51:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40142.3, 300 sec: 40321.3). Total num frames: 265404416. Throughput: 0: 39920.9. Samples: 265540460. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 23:51:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:51:42,451][12883] Updated weights for policy 0, policy_version 16200 (0.0046) +[2024-06-17 23:51:46,793][12883] Updated weights for policy 0, policy_version 16210 (0.0038) +[2024-06-17 23:51:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40414.0, 300 sec: 40210.2). Total num frames: 265584640. Throughput: 0: 40227.9. Samples: 265666600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-17 23:51:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:51:50,961][12883] Updated weights for policy 0, policy_version 16220 (0.0033) +[2024-06-17 23:51:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39869.2, 300 sec: 40210.2). Total num frames: 265781248. Throughput: 0: 40127.6. Samples: 265908400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-17 23:51:52,000][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:51:54,852][12883] Updated weights for policy 0, policy_version 16230 (0.0047) +[2024-06-17 23:51:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40686.9, 300 sec: 40265.7). Total num frames: 266010624. Throughput: 0: 40061.2. Samples: 266145620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-17 23:51:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:51:58,934][12883] Updated weights for policy 0, policy_version 16240 (0.0042) +[2024-06-17 23:52:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39867.7, 300 sec: 40099.1). Total num frames: 266174464. Throughput: 0: 40268.5. Samples: 266274880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-17 23:52:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:52:02,823][12862] Signal inference workers to stop experience collection... 
(3750 times) +[2024-06-17 23:52:02,882][12883] InferenceWorker_p0-w0: stopping experience collection (3750 times) +[2024-06-17 23:52:02,891][12862] Signal inference workers to resume experience collection... (3750 times) +[2024-06-17 23:52:02,895][12883] InferenceWorker_p0-w0: resuming experience collection (3750 times) +[2024-06-17 23:52:03,027][12883] Updated weights for policy 0, policy_version 16250 (0.0039) +[2024-06-17 23:52:06,988][12883] Updated weights for policy 0, policy_version 16260 (0.0034) +[2024-06-17 23:52:06,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39867.8, 300 sec: 40265.8). Total num frames: 266403840. Throughput: 0: 40094.8. Samples: 266509480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-17 23:52:06,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-17 23:52:11,081][12883] Updated weights for policy 0, policy_version 16270 (0.0045) +[2024-06-17 23:52:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40413.9, 300 sec: 40210.2). Total num frames: 266600448. Throughput: 0: 40102.0. Samples: 266748620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 23:52:11,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:52:15,734][12883] Updated weights for policy 0, policy_version 16280 (0.0054) +[2024-06-17 23:52:16,994][12645] Fps is (10 sec: 37682.3, 60 sec: 39594.6, 300 sec: 40099.1). Total num frames: 266780672. Throughput: 0: 39914.2. Samples: 266867280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 23:52:16,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:52:19,289][12883] Updated weights for policy 0, policy_version 16290 (0.0035) +[2024-06-17 23:52:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40687.0, 300 sec: 40210.2). Total num frames: 266993664. Throughput: 0: 39968.1. Samples: 267107840. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 23:52:21,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-17 23:52:23,879][12883] Updated weights for policy 0, policy_version 16300 (0.0032) +[2024-06-17 23:52:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40140.9, 300 sec: 40099.2). Total num frames: 267190272. Throughput: 0: 40263.1. Samples: 267352300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 23:52:26,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:52:27,647][12883] Updated weights for policy 0, policy_version 16310 (0.0035) +[2024-06-17 23:52:31,842][12883] Updated weights for policy 0, policy_version 16320 (0.0050) +[2024-06-17 23:52:31,996][12645] Fps is (10 sec: 39312.7, 60 sec: 39593.3, 300 sec: 40265.5). Total num frames: 267386880. Throughput: 0: 39901.6. Samples: 267462260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-17 23:52:31,996][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:52:35,727][12883] Updated weights for policy 0, policy_version 16330 (0.0037) +[2024-06-17 23:52:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40686.8, 300 sec: 40265.8). Total num frames: 267616256. Throughput: 0: 40078.1. Samples: 267711920. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) +[2024-06-17 23:52:36,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:52:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016334_267616256.pth... +[2024-06-17 23:52:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000015744_257949696.pth +[2024-06-17 23:52:39,977][12883] Updated weights for policy 0, policy_version 16340 (0.0041) +[2024-06-17 23:52:41,994][12645] Fps is (10 sec: 40968.6, 60 sec: 39867.7, 300 sec: 40210.2). Total num frames: 267796480. Throughput: 0: 40114.2. Samples: 267950760. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) +[2024-06-17 23:52:41,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:52:43,954][12883] Updated weights for policy 0, policy_version 16350 (0.0034) +[2024-06-17 23:52:46,994][12645] Fps is (10 sec: 36045.3, 60 sec: 39867.7, 300 sec: 40099.1). Total num frames: 267976704. Throughput: 0: 39838.7. Samples: 268067620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-17 23:52:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:52:47,939][12883] Updated weights for policy 0, policy_version 16360 (0.0026) +[2024-06-17 23:52:51,994][12645] Fps is (10 sec: 39322.3, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 268189696. Throughput: 0: 40127.6. Samples: 268315220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-17 23:52:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:52:52,039][12883] Updated weights for policy 0, policy_version 16370 (0.0030) +[2024-06-17 23:52:56,188][12883] Updated weights for policy 0, policy_version 16380 (0.0039) +[2024-06-17 23:52:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39594.7, 300 sec: 40154.7). Total num frames: 268386304. Throughput: 0: 40207.0. Samples: 268557940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-17 23:52:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:52:59,830][12883] Updated weights for policy 0, policy_version 16390 (0.0042) +[2024-06-17 23:53:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 268599296. Throughput: 0: 40081.5. Samples: 268670940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-17 23:53:01,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-17 23:53:04,201][12883] Updated weights for policy 0, policy_version 16400 (0.0033) +[2024-06-17 23:53:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 268812288. Throughput: 0: 40324.8. Samples: 268922460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-17 23:53:06,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:53:07,949][12883] Updated weights for policy 0, policy_version 16410 (0.0035) +[2024-06-17 23:53:11,994][12645] Fps is (10 sec: 39320.6, 60 sec: 39867.6, 300 sec: 40154.7). Total num frames: 268992512. Throughput: 0: 40291.8. Samples: 269165440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-17 23:53:12,003][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:53:12,363][12883] Updated weights for policy 0, policy_version 16420 (0.0044) +[2024-06-17 23:53:16,089][12883] Updated weights for policy 0, policy_version 16430 (0.0033) +[2024-06-17 23:53:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40414.0, 300 sec: 40154.7). Total num frames: 269205504. Throughput: 0: 40422.9. Samples: 269281200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-17 23:53:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:53:20,903][12883] Updated weights for policy 0, policy_version 16440 (0.0036) +[2024-06-17 23:53:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40413.8, 300 sec: 40265.7). Total num frames: 269418496. Throughput: 0: 40429.8. Samples: 269531260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 23:53:21,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:53:24,027][12883] Updated weights for policy 0, policy_version 16450 (0.0030) +[2024-06-17 23:53:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40140.8, 300 sec: 40155.0). Total num frames: 269598720. Throughput: 0: 40300.0. Samples: 269764260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-17 23:53:26,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-17 23:53:28,977][12883] Updated weights for policy 0, policy_version 16460 (0.0036) +[2024-06-17 23:53:31,996][12645] Fps is (10 sec: 40951.0, 60 sec: 40686.9, 300 sec: 40265.4). Total num frames: 269828096. Throughput: 0: 40497.1. Samples: 269890080. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-17 23:53:31,997][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:53:32,165][12883] Updated weights for policy 0, policy_version 16470 (0.0037) +[2024-06-17 23:53:36,854][12883] Updated weights for policy 0, policy_version 16480 (0.0028) +[2024-06-17 23:53:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39867.8, 300 sec: 40210.2). Total num frames: 270008320. Throughput: 0: 40408.3. Samples: 270133600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-17 23:53:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:53:40,419][12883] Updated weights for policy 0, policy_version 16490 (0.0032) +[2024-06-17 23:53:41,994][12645] Fps is (10 sec: 39330.5, 60 sec: 40413.9, 300 sec: 40265.8). Total num frames: 270221312. Throughput: 0: 40298.3. Samples: 270371360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-17 23:53:41,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-17 23:53:44,943][12883] Updated weights for policy 0, policy_version 16500 (0.0037) +[2024-06-17 23:53:45,710][12862] Signal inference workers to stop experience collection... (3800 times) +[2024-06-17 23:53:45,762][12883] InferenceWorker_p0-w0: stopping experience collection (3800 times) +[2024-06-17 23:53:45,772][12862] Signal inference workers to resume experience collection... (3800 times) +[2024-06-17 23:53:45,779][12883] InferenceWorker_p0-w0: resuming experience collection (3800 times) +[2024-06-17 23:53:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40686.9, 300 sec: 40210.2). Total num frames: 270417920. Throughput: 0: 40623.5. Samples: 270499000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-17 23:53:47,000][12645] Avg episode reward: [(0, '0.012')] +[2024-06-17 23:53:48,380][12883] Updated weights for policy 0, policy_version 16510 (0.0038) +[2024-06-17 23:53:51,996][12645] Fps is (10 sec: 37674.7, 60 sec: 40139.2, 300 sec: 40209.9). Total num frames: 270598144. 
Throughput: 0: 40298.0. Samples: 270735960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-17 23:53:51,997][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:53:53,643][12883] Updated weights for policy 0, policy_version 16520 (0.0033) +[2024-06-17 23:53:56,653][12883] Updated weights for policy 0, policy_version 16530 (0.0037) +[2024-06-17 23:53:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 40265.7). Total num frames: 270843904. Throughput: 0: 40117.9. Samples: 270970740. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) +[2024-06-17 23:53:56,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:54:01,661][12883] Updated weights for policy 0, policy_version 16540 (0.0042) +[2024-06-17 23:54:01,994][12645] Fps is (10 sec: 39330.4, 60 sec: 39867.7, 300 sec: 40154.7). Total num frames: 270991360. Throughput: 0: 40242.2. Samples: 271092100. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) +[2024-06-17 23:54:01,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:54:04,641][12883] Updated weights for policy 0, policy_version 16550 (0.0041) +[2024-06-17 23:54:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 271237120. Throughput: 0: 40044.5. Samples: 271333260. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) +[2024-06-17 23:54:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:54:09,691][12883] Updated weights for policy 0, policy_version 16560 (0.0033) +[2024-06-17 23:54:11,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 271417344. Throughput: 0: 40247.5. Samples: 271575400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 23:54:11,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-17 23:54:12,955][12883] Updated weights for policy 0, policy_version 16570 (0.0056) +[2024-06-17 23:54:16,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 271613952. 
Throughput: 0: 40158.4. Samples: 271697120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-17 23:54:16,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:54:18,098][12883] Updated weights for policy 0, policy_version 16580 (0.0034) +[2024-06-17 23:54:21,562][12883] Updated weights for policy 0, policy_version 16590 (0.0029) +[2024-06-17 23:54:21,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 271826944. Throughput: 0: 40018.8. Samples: 271934440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) +[2024-06-17 23:54:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:54:25,888][12883] Updated weights for policy 0, policy_version 16600 (0.0040) +[2024-06-17 23:54:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 272007168. Throughput: 0: 40246.7. Samples: 272182460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) +[2024-06-17 23:54:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-17 23:54:29,568][12883] Updated weights for policy 0, policy_version 16610 (0.0034) +[2024-06-17 23:54:31,994][12645] Fps is (10 sec: 39320.7, 60 sec: 39869.1, 300 sec: 40154.7). Total num frames: 272220160. Throughput: 0: 40003.8. Samples: 272299180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) +[2024-06-17 23:54:31,995][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:54:33,778][12883] Updated weights for policy 0, policy_version 16620 (0.0041) +[2024-06-17 23:54:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40413.9, 300 sec: 40210.2). Total num frames: 272433152. Throughput: 0: 40074.5. Samples: 272539220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-17 23:54:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:54:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016628_272433152.pth... 
+[2024-06-17 23:54:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016039_262782976.pth +[2024-06-17 23:54:37,572][12883] Updated weights for policy 0, policy_version 16630 (0.0040) +[2024-06-17 23:54:41,994][12645] Fps is (10 sec: 39322.5, 60 sec: 39867.8, 300 sec: 40210.2). Total num frames: 272613376. Throughput: 0: 40248.6. Samples: 272781920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-17 23:54:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:54:42,089][12883] Updated weights for policy 0, policy_version 16640 (0.0055) +[2024-06-17 23:54:45,602][12883] Updated weights for policy 0, policy_version 16650 (0.0046) +[2024-06-17 23:54:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 272826368. Throughput: 0: 40222.2. Samples: 272902100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-17 23:54:46,995][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:54:50,359][12883] Updated weights for policy 0, policy_version 16660 (0.0031) +[2024-06-17 23:54:51,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40142.3, 300 sec: 40154.7). Total num frames: 273006592. Throughput: 0: 40353.3. Samples: 273149160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-17 23:54:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:54:53,434][12883] Updated weights for policy 0, policy_version 16670 (0.0039) +[2024-06-17 23:54:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39594.7, 300 sec: 40099.2). Total num frames: 273219584. Throughput: 0: 40387.2. Samples: 273392820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-17 23:54:56,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:54:58,119][12883] Updated weights for policy 0, policy_version 16680 (0.0028) +[2024-06-17 23:55:01,234][12883] Updated weights for policy 0, policy_version 16690 (0.0046) +[2024-06-17 23:55:01,996][12645] Fps is (10 sec: 44227.3, 60 sec: 40958.5, 300 sec: 40265.5). Total num frames: 273448960. Throughput: 0: 40434.5. Samples: 273516760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-17 23:55:01,996][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:55:06,112][12883] Updated weights for policy 0, policy_version 16700 (0.0042) +[2024-06-17 23:55:06,995][12645] Fps is (10 sec: 39318.6, 60 sec: 39594.1, 300 sec: 40154.6). Total num frames: 273612800. Throughput: 0: 40425.9. Samples: 273753640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-17 23:55:06,995][12645] Avg episode reward: [(0, '0.011')] +[2024-06-17 23:55:09,662][12883] Updated weights for policy 0, policy_version 16710 (0.0044) +[2024-06-17 23:55:11,994][12645] Fps is (10 sec: 36052.7, 60 sec: 39867.8, 300 sec: 40099.1). Total num frames: 273809408. Throughput: 0: 40345.2. Samples: 273998000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-17 23:55:11,996][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:55:12,457][12862] Signal inference workers to stop experience collection... (3850 times) +[2024-06-17 23:55:12,512][12883] InferenceWorker_p0-w0: stopping experience collection (3850 times) +[2024-06-17 23:55:12,516][12862] Signal inference workers to resume experience collection... (3850 times) +[2024-06-17 23:55:12,530][12883] InferenceWorker_p0-w0: resuming experience collection (3850 times) +[2024-06-17 23:55:14,759][12883] Updated weights for policy 0, policy_version 16720 (0.0025) +[2024-06-17 23:55:16,994][12645] Fps is (10 sec: 44240.1, 60 sec: 40686.9, 300 sec: 40321.3). Total num frames: 274055168. 
Throughput: 0: 40472.5. Samples: 274120440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-17 23:55:16,995][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:55:17,636][12883] Updated weights for policy 0, policy_version 16730 (0.0048) +[2024-06-17 23:55:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 274251776. Throughput: 0: 40557.7. Samples: 274364320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:55:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:55:22,577][12883] Updated weights for policy 0, policy_version 16740 (0.0033) +[2024-06-17 23:55:25,861][12883] Updated weights for policy 0, policy_version 16750 (0.0035) +[2024-06-17 23:55:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40686.8, 300 sec: 40265.8). Total num frames: 274448384. Throughput: 0: 40359.0. Samples: 274598080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:55:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:55:30,604][12883] Updated weights for policy 0, policy_version 16760 (0.0037) +[2024-06-17 23:55:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40413.9, 300 sec: 40210.2). Total num frames: 274644992. Throughput: 0: 40489.7. Samples: 274724140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-17 23:55:31,995][12645] Avg episode reward: [(0, '0.011')] +[2024-06-17 23:55:33,884][12883] Updated weights for policy 0, policy_version 16770 (0.0040) +[2024-06-17 23:55:36,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39867.7, 300 sec: 40099.5). Total num frames: 274825216. Throughput: 0: 40441.4. Samples: 274969020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 23:55:36,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-17 23:55:38,484][12883] Updated weights for policy 0, policy_version 16780 (0.0047) +[2024-06-17 23:55:41,994][12645] Fps is (10 sec: 42599.3, 60 sec: 40960.0, 300 sec: 40376.9). Total num frames: 275070976. 
Throughput: 0: 40306.8. Samples: 275206620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-17 23:55:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:55:42,002][12883] Updated weights for policy 0, policy_version 16790 (0.0026) +[2024-06-17 23:55:46,369][12883] Updated weights for policy 0, policy_version 16800 (0.0038) +[2024-06-17 23:55:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 40686.9, 300 sec: 40266.1). Total num frames: 275267584. Throughput: 0: 40430.8. Samples: 275336060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 23:55:46,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:55:50,401][12883] Updated weights for policy 0, policy_version 16810 (0.0042) +[2024-06-17 23:55:51,994][12645] Fps is (10 sec: 37682.5, 60 sec: 40686.9, 300 sec: 40265.7). Total num frames: 275447808. Throughput: 0: 40374.0. Samples: 275570440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 23:55:51,995][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:55:54,318][12883] Updated weights for policy 0, policy_version 16820 (0.0027) +[2024-06-17 23:55:56,997][12645] Fps is (10 sec: 40947.3, 60 sec: 40957.9, 300 sec: 40320.9). Total num frames: 275677184. Throughput: 0: 40394.5. Samples: 275815880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:55:56,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:55:58,623][12883] Updated weights for policy 0, policy_version 16830 (0.0034) +[2024-06-17 23:56:01,994][12645] Fps is (10 sec: 40960.9, 60 sec: 40142.3, 300 sec: 40154.7). Total num frames: 275857408. Throughput: 0: 40413.1. Samples: 275939020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:56:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:56:02,938][12883] Updated weights for policy 0, policy_version 16840 (0.0036) +[2024-06-17 23:56:06,994][12645] Fps is (10 sec: 37695.0, 60 sec: 40687.5, 300 sec: 40265.8). Total num frames: 276054016. 
Throughput: 0: 40313.0. Samples: 276178400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:56:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:56:07,153][12883] Updated weights for policy 0, policy_version 16850 (0.0040) +[2024-06-17 23:56:11,060][12883] Updated weights for policy 0, policy_version 16860 (0.0039) +[2024-06-17 23:56:11,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40686.9, 300 sec: 40154.7). Total num frames: 276250624. Throughput: 0: 40467.6. Samples: 276419120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-17 23:56:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:56:15,236][12883] Updated weights for policy 0, policy_version 16870 (0.0041) +[2024-06-17 23:56:16,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40140.8, 300 sec: 40376.8). Total num frames: 276463616. Throughput: 0: 40430.7. Samples: 276543520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-17 23:56:16,995][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:56:19,318][12883] Updated weights for policy 0, policy_version 16880 (0.0040) +[2024-06-17 23:56:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40141.0, 300 sec: 40265.8). Total num frames: 276660224. Throughput: 0: 40390.3. Samples: 276786580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 23:56:21,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:56:23,157][12883] Updated weights for policy 0, policy_version 16890 (0.0034) +[2024-06-17 23:56:26,994][12645] Fps is (10 sec: 39322.6, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 276856832. Throughput: 0: 40605.4. Samples: 277033860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 23:56:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:56:27,019][12862] Signal inference workers to stop experience collection... (3900 times) +[2024-06-17 23:56:27,020][12862] Signal inference workers to resume experience collection... 
(3900 times) +[2024-06-17 23:56:27,046][12883] InferenceWorker_p0-w0: stopping experience collection (3900 times) +[2024-06-17 23:56:27,078][12883] InferenceWorker_p0-w0: resuming experience collection (3900 times) +[2024-06-17 23:56:27,166][12883] Updated weights for policy 0, policy_version 16900 (0.0034) +[2024-06-17 23:56:30,876][12883] Updated weights for policy 0, policy_version 16910 (0.0050) +[2024-06-17 23:56:31,996][12645] Fps is (10 sec: 40950.3, 60 sec: 40412.5, 300 sec: 40321.0). Total num frames: 277069824. Throughput: 0: 40291.8. Samples: 277149280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-17 23:56:31,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:56:35,272][12883] Updated weights for policy 0, policy_version 16920 (0.0037) +[2024-06-17 23:56:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40687.0, 300 sec: 40210.2). Total num frames: 277266432. Throughput: 0: 40596.6. Samples: 277397280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 23:56:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:56:37,081][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016924_277282816.pth... +[2024-06-17 23:56:37,134][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016334_267616256.pth +[2024-06-17 23:56:38,931][12883] Updated weights for policy 0, policy_version 16930 (0.0036) +[2024-06-17 23:56:41,994][12645] Fps is (10 sec: 39330.8, 60 sec: 39867.7, 300 sec: 40265.8). Total num frames: 277463040. Throughput: 0: 40411.8. Samples: 277634280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-17 23:56:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:56:43,769][12883] Updated weights for policy 0, policy_version 16940 (0.0034) +[2024-06-17 23:56:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40413.9, 300 sec: 40376.8). Total num frames: 277692416. Throughput: 0: 40422.2. Samples: 277758020. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 23:56:46,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-17 23:56:47,294][12883] Updated weights for policy 0, policy_version 16950 (0.0033) +[2024-06-17 23:56:51,963][12883] Updated weights for policy 0, policy_version 16960 (0.0041) +[2024-06-17 23:56:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40414.0, 300 sec: 40210.2). Total num frames: 277872640. Throughput: 0: 40316.9. Samples: 277992660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-17 23:56:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:56:55,530][12883] Updated weights for policy 0, policy_version 16970 (0.0043) +[2024-06-17 23:56:56,994][12645] Fps is (10 sec: 39320.8, 60 sec: 40142.8, 300 sec: 40376.8). Total num frames: 278085632. Throughput: 0: 40213.7. Samples: 278228740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-17 23:56:56,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:57:00,026][12883] Updated weights for policy 0, policy_version 16980 (0.0036) +[2024-06-17 23:57:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40140.7, 300 sec: 40210.2). Total num frames: 278265856. Throughput: 0: 40228.9. Samples: 278353820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-17 23:57:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:57:04,049][12883] Updated weights for policy 0, policy_version 16990 (0.0044) +[2024-06-17 23:57:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40686.9, 300 sec: 40321.3). Total num frames: 278495232. Throughput: 0: 40168.2. Samples: 278594160. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-17 23:57:06,995][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:57:08,616][12883] Updated weights for policy 0, policy_version 17000 (0.0039) +[2024-06-17 23:57:11,951][12883] Updated weights for policy 0, policy_version 17010 (0.0035) +[2024-06-17 23:57:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40686.9, 300 sec: 40376.9). Total num frames: 278691840. Throughput: 0: 39920.3. Samples: 278830280. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-17 23:57:11,996][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:57:16,568][12883] Updated weights for policy 0, policy_version 17020 (0.0052) +[2024-06-17 23:57:16,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40140.9, 300 sec: 40265.7). Total num frames: 278872064. Throughput: 0: 40030.4. Samples: 278950560. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-17 23:57:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:57:20,122][12883] Updated weights for policy 0, policy_version 17030 (0.0050) +[2024-06-17 23:57:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40413.8, 300 sec: 40321.3). Total num frames: 279085056. Throughput: 0: 39889.3. Samples: 279192300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:57:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:57:24,583][12883] Updated weights for policy 0, policy_version 17040 (0.0035) +[2024-06-17 23:57:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40140.8, 300 sec: 40266.1). Total num frames: 279265280. Throughput: 0: 40063.1. Samples: 279437120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:57:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:57:28,052][12883] Updated weights for policy 0, policy_version 17050 (0.0041) +[2024-06-17 23:57:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40142.2, 300 sec: 40210.2). Total num frames: 279478272. Throughput: 0: 39919.4. Samples: 279554400. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-17 23:57:31,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:57:33,058][12883] Updated weights for policy 0, policy_version 17060 (0.0052) +[2024-06-17 23:57:36,940][12883] Updated weights for policy 0, policy_version 17070 (0.0044) +[2024-06-17 23:57:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40140.8, 300 sec: 40265.8). Total num frames: 279674880. Throughput: 0: 39984.5. Samples: 279791960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-17 23:57:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:57:40,980][12883] Updated weights for policy 0, policy_version 17080 (0.0049) +[2024-06-17 23:57:41,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39867.7, 300 sec: 40265.8). Total num frames: 279855104. Throughput: 0: 40015.2. Samples: 280029420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-17 23:57:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:57:45,000][12883] Updated weights for policy 0, policy_version 17090 (0.0039) +[2024-06-17 23:57:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39594.6, 300 sec: 40265.7). Total num frames: 280068096. Throughput: 0: 39940.9. Samples: 280151160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 23:57:46,995][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:57:49,202][12883] Updated weights for policy 0, policy_version 17100 (0.0034) +[2024-06-17 23:57:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39594.6, 300 sec: 40210.2). Total num frames: 280248320. Throughput: 0: 40078.2. Samples: 280397680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 23:57:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:57:52,973][12883] Updated weights for policy 0, policy_version 17110 (0.0039) +[2024-06-17 23:57:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39594.8, 300 sec: 40210.2). Total num frames: 280461312. Throughput: 0: 40218.8. Samples: 280640120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-17 23:57:56,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-17 23:57:57,178][12883] Updated weights for policy 0, policy_version 17120 (0.0035) +[2024-06-17 23:58:01,093][12883] Updated weights for policy 0, policy_version 17130 (0.0035) +[2024-06-17 23:58:01,994][12645] Fps is (10 sec: 45875.5, 60 sec: 40687.0, 300 sec: 40321.3). Total num frames: 280707072. Throughput: 0: 40216.9. Samples: 280760320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 23:58:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:58:05,324][12883] Updated weights for policy 0, policy_version 17140 (0.0036) +[2024-06-17 23:58:06,994][12645] Fps is (10 sec: 40959.3, 60 sec: 39594.6, 300 sec: 40265.8). Total num frames: 280870912. Throughput: 0: 40140.4. Samples: 280998620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-17 23:58:06,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:58:07,579][12862] Signal inference workers to stop experience collection... (3950 times) +[2024-06-17 23:58:07,626][12883] InferenceWorker_p0-w0: stopping experience collection (3950 times) +[2024-06-17 23:58:07,689][12862] Signal inference workers to resume experience collection... (3950 times) +[2024-06-17 23:58:07,689][12883] InferenceWorker_p0-w0: resuming experience collection (3950 times) +[2024-06-17 23:58:09,370][12883] Updated weights for policy 0, policy_version 17150 (0.0026) +[2024-06-17 23:58:11,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.8, 300 sec: 40265.8). Total num frames: 281083904. Throughput: 0: 40041.7. Samples: 281239000. Policy #0 lag: (min: 2.0, avg: 14.1, max: 29.0) +[2024-06-17 23:58:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:58:13,467][12883] Updated weights for policy 0, policy_version 17160 (0.0032) +[2024-06-17 23:58:16,994][12645] Fps is (10 sec: 39322.4, 60 sec: 39867.8, 300 sec: 40154.7). Total num frames: 281264128. 
Throughput: 0: 40263.3. Samples: 281366240. Policy #0 lag: (min: 2.0, avg: 14.1, max: 29.0) +[2024-06-17 23:58:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:58:17,373][12883] Updated weights for policy 0, policy_version 17170 (0.0043) +[2024-06-17 23:58:21,591][12883] Updated weights for policy 0, policy_version 17180 (0.0044) +[2024-06-17 23:58:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40140.8, 300 sec: 40321.3). Total num frames: 281493504. Throughput: 0: 40377.3. Samples: 281608940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-17 23:58:21,995][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:58:25,468][12883] Updated weights for policy 0, policy_version 17190 (0.0045) +[2024-06-17 23:58:26,994][12645] Fps is (10 sec: 45875.0, 60 sec: 40960.0, 300 sec: 40321.6). Total num frames: 281722880. Throughput: 0: 40382.3. Samples: 281846620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-17 23:58:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:58:29,527][12883] Updated weights for policy 0, policy_version 17200 (0.0051) +[2024-06-17 23:58:32,000][12645] Fps is (10 sec: 39297.2, 60 sec: 40136.7, 300 sec: 40264.9). Total num frames: 281886720. Throughput: 0: 40491.8. Samples: 281973540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-17 23:58:32,000][12645] Avg episode reward: [(0, '0.002')] +[2024-06-17 23:58:33,469][12883] Updated weights for policy 0, policy_version 17210 (0.0041) +[2024-06-17 23:58:36,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40413.9, 300 sec: 40265.8). Total num frames: 282099712. Throughput: 0: 40423.6. Samples: 282216740. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) +[2024-06-17 23:58:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:58:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000017218_282099712.pth... 
+[2024-06-17 23:58:37,052][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016628_272433152.pth +[2024-06-17 23:58:38,068][12883] Updated weights for policy 0, policy_version 17220 (0.0040) +[2024-06-17 23:58:41,614][12883] Updated weights for policy 0, policy_version 17230 (0.0051) +[2024-06-17 23:58:41,994][12645] Fps is (10 sec: 44263.9, 60 sec: 41233.0, 300 sec: 40376.8). Total num frames: 282329088. Throughput: 0: 40315.0. Samples: 282454300. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) +[2024-06-17 23:58:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:58:46,086][12883] Updated weights for policy 0, policy_version 17240 (0.0042) +[2024-06-17 23:58:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40686.9, 300 sec: 40377.1). Total num frames: 282509312. Throughput: 0: 40479.1. Samples: 282581880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-17 23:58:46,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-17 23:58:49,555][12883] Updated weights for policy 0, policy_version 17250 (0.0047) +[2024-06-17 23:58:51,994][12645] Fps is (10 sec: 37684.1, 60 sec: 40960.2, 300 sec: 40210.3). Total num frames: 282705920. Throughput: 0: 40461.1. Samples: 282819360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-17 23:58:52,000][12645] Avg episode reward: [(0, '0.013')] +[2024-06-17 23:58:54,300][12883] Updated weights for policy 0, policy_version 17260 (0.0028) +[2024-06-17 23:58:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40960.0, 300 sec: 40432.4). Total num frames: 282918912. Throughput: 0: 40740.9. Samples: 283072340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-17 23:58:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:58:57,611][12883] Updated weights for policy 0, policy_version 17270 (0.0031) +[2024-06-17 23:59:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39867.8, 300 sec: 40210.2). Total num frames: 283099136. Throughput: 0: 40504.0. Samples: 283188920. 
Policy #0 lag: (min: 0.0, avg: 13.2, max: 25.0) +[2024-06-17 23:59:01,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:59:02,383][12883] Updated weights for policy 0, policy_version 17280 (0.0031) +[2024-06-17 23:59:05,598][12883] Updated weights for policy 0, policy_version 17290 (0.0048) +[2024-06-17 23:59:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 40376.9). Total num frames: 283328512. Throughput: 0: 40552.0. Samples: 283433780. Policy #0 lag: (min: 0.0, avg: 13.2, max: 25.0) +[2024-06-17 23:59:06,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-17 23:59:10,375][12883] Updated weights for policy 0, policy_version 17300 (0.0028) +[2024-06-17 23:59:11,996][12645] Fps is (10 sec: 40950.5, 60 sec: 40412.3, 300 sec: 40321.0). Total num frames: 283508736. Throughput: 0: 40820.1. Samples: 283683620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-17 23:59:11,997][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:59:13,797][12883] Updated weights for policy 0, policy_version 17310 (0.0026) +[2024-06-17 23:59:16,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40686.8, 300 sec: 40265.7). Total num frames: 283705344. Throughput: 0: 40670.0. Samples: 283803440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-17 23:59:16,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:59:18,604][12883] Updated weights for policy 0, policy_version 17320 (0.0036) +[2024-06-17 23:59:21,742][12883] Updated weights for policy 0, policy_version 17330 (0.0040) +[2024-06-17 23:59:22,000][12645] Fps is (10 sec: 42581.1, 60 sec: 40682.7, 300 sec: 40431.5). Total num frames: 283934720. Throughput: 0: 40673.9. Samples: 284047320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-17 23:59:22,001][12645] Avg episode reward: [(0, '0.003')] +[2024-06-17 23:59:25,504][12862] Signal inference workers to stop experience collection... 
(4000 times) +[2024-06-17 23:59:25,552][12883] InferenceWorker_p0-w0: stopping experience collection (4000 times) +[2024-06-17 23:59:25,558][12862] Signal inference workers to resume experience collection... (4000 times) +[2024-06-17 23:59:25,579][12883] InferenceWorker_p0-w0: resuming experience collection (4000 times) +[2024-06-17 23:59:26,519][12883] Updated weights for policy 0, policy_version 17340 (0.0051) +[2024-06-17 23:59:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.7, 300 sec: 40321.3). Total num frames: 284114944. Throughput: 0: 40834.7. Samples: 284291860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-17 23:59:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-17 23:59:29,726][12883] Updated weights for policy 0, policy_version 17350 (0.0035) +[2024-06-17 23:59:31,994][12645] Fps is (10 sec: 42624.9, 60 sec: 41237.3, 300 sec: 40432.4). Total num frames: 284360704. Throughput: 0: 40516.9. Samples: 284405140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-17 23:59:31,995][12645] Avg episode reward: [(0, '0.010')] +[2024-06-17 23:59:34,564][12883] Updated weights for policy 0, policy_version 17360 (0.0038) +[2024-06-17 23:59:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.8, 300 sec: 40376.8). Total num frames: 284524544. Throughput: 0: 40755.0. Samples: 284653340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-17 23:59:36,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-17 23:59:38,083][12883] Updated weights for policy 0, policy_version 17370 (0.0039) +[2024-06-17 23:59:41,994][12645] Fps is (10 sec: 36044.8, 60 sec: 39867.7, 300 sec: 40321.3). Total num frames: 284721152. Throughput: 0: 40514.5. Samples: 284895500. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-17 23:59:41,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-17 23:59:42,624][12883] Updated weights for policy 0, policy_version 17380 (0.0046) +[2024-06-17 23:59:46,017][12883] Updated weights for policy 0, policy_version 17390 (0.0053) +[2024-06-17 23:59:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40686.9, 300 sec: 40487.9). Total num frames: 284950528. Throughput: 0: 40621.6. Samples: 285016900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-17 23:59:46,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-17 23:59:50,534][12883] Updated weights for policy 0, policy_version 17400 (0.0042) +[2024-06-17 23:59:51,996][12645] Fps is (10 sec: 40951.4, 60 sec: 40412.3, 300 sec: 40376.6). Total num frames: 285130752. Throughput: 0: 40535.3. Samples: 285257960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-17 23:59:51,996][12645] Avg episode reward: [(0, '0.007')] +[2024-06-17 23:59:54,121][12883] Updated weights for policy 0, policy_version 17410 (0.0036) +[2024-06-17 23:59:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40413.8, 300 sec: 40321.6). Total num frames: 285343744. Throughput: 0: 40420.6. Samples: 285502460. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-17 23:59:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-17 23:59:58,457][12883] Updated weights for policy 0, policy_version 17420 (0.0035) +[2024-06-18 00:00:01,994][12645] Fps is (10 sec: 42607.5, 60 sec: 40959.9, 300 sec: 40488.0). Total num frames: 285556736. Throughput: 0: 40524.4. Samples: 285627040. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 00:00:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:00:02,147][12883] Updated weights for policy 0, policy_version 17430 (0.0041) +[2024-06-18 00:00:06,912][12883] Updated weights for policy 0, policy_version 17440 (0.0046) +[2024-06-18 00:00:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40140.8, 300 sec: 40432.4). Total num frames: 285736960. Throughput: 0: 40506.1. Samples: 285869840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 00:00:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:00:10,379][12883] Updated weights for policy 0, policy_version 17450 (0.0034) +[2024-06-18 00:00:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40688.4, 300 sec: 40321.3). Total num frames: 285949952. Throughput: 0: 40219.6. Samples: 286101740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 00:00:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-18 00:00:14,820][12883] Updated weights for policy 0, policy_version 17460 (0.0037) +[2024-06-18 00:00:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40687.0, 300 sec: 40321.3). Total num frames: 286146560. Throughput: 0: 40468.1. Samples: 286226200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 00:00:16,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:00:18,483][12883] Updated weights for policy 0, policy_version 17470 (0.0025) +[2024-06-18 00:00:21,994][12645] Fps is (10 sec: 36045.0, 60 sec: 39598.8, 300 sec: 40210.2). Total num frames: 286310400. Throughput: 0: 40269.8. Samples: 286465480. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 00:00:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:00:22,932][12883] Updated weights for policy 0, policy_version 17480 (0.0043) +[2024-06-18 00:00:26,284][12883] Updated weights for policy 0, policy_version 17490 (0.0043) +[2024-06-18 00:00:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40687.0, 300 sec: 40376.9). Total num frames: 286556160. Throughput: 0: 40245.4. Samples: 286706540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) +[2024-06-18 00:00:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:00:30,761][12883] Updated weights for policy 0, policy_version 17500 (0.0027) +[2024-06-18 00:00:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 39867.8, 300 sec: 40432.4). Total num frames: 286752768. Throughput: 0: 40452.1. Samples: 286837240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) +[2024-06-18 00:00:31,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:00:34,898][12883] Updated weights for policy 0, policy_version 17510 (0.0046) +[2024-06-18 00:00:36,994][12645] Fps is (10 sec: 37682.8, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 286932992. Throughput: 0: 40361.4. Samples: 287074140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) +[2024-06-18 00:00:36,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:00:37,121][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000017514_286949376.pth... +[2024-06-18 00:00:37,194][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016924_277282816.pth +[2024-06-18 00:00:39,185][12883] Updated weights for policy 0, policy_version 17520 (0.0050) +[2024-06-18 00:00:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40686.9, 300 sec: 40321.3). Total num frames: 287162368. Throughput: 0: 40245.8. Samples: 287313520. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) +[2024-06-18 00:00:41,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:00:43,171][12883] Updated weights for policy 0, policy_version 17530 (0.0040) +[2024-06-18 00:00:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40140.9, 300 sec: 40376.9). Total num frames: 287358976. Throughput: 0: 40233.4. Samples: 287437540. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) +[2024-06-18 00:00:46,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:00:47,023][12883] Updated weights for policy 0, policy_version 17540 (0.0043) +[2024-06-18 00:00:51,324][12883] Updated weights for policy 0, policy_version 17550 (0.0060) +[2024-06-18 00:00:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40688.4, 300 sec: 40321.7). Total num frames: 287571968. Throughput: 0: 40223.9. Samples: 287679920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 00:00:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:00:55,171][12883] Updated weights for policy 0, policy_version 17560 (0.0036) +[2024-06-18 00:00:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40687.0, 300 sec: 40432.4). Total num frames: 287784960. Throughput: 0: 40541.4. Samples: 287926100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 00:00:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:00:59,167][12883] Updated weights for policy 0, policy_version 17570 (0.0036) +[2024-06-18 00:01:01,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.7, 300 sec: 40321.3). Total num frames: 287948800. Throughput: 0: 40454.2. Samples: 288046640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 00:01:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:01:03,296][12883] Updated weights for policy 0, policy_version 17580 (0.0040) +[2024-06-18 00:01:04,886][12862] Signal inference workers to stop experience collection... 
(4050 times) +[2024-06-18 00:01:04,886][12862] Signal inference workers to resume experience collection... (4050 times) +[2024-06-18 00:01:04,903][12883] InferenceWorker_p0-w0: stopping experience collection (4050 times) +[2024-06-18 00:01:04,903][12883] InferenceWorker_p0-w0: resuming experience collection (4050 times) +[2024-06-18 00:01:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40686.9, 300 sec: 40432.4). Total num frames: 288178176. Throughput: 0: 40468.4. Samples: 288286560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) +[2024-06-18 00:01:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:01:07,779][12883] Updated weights for policy 0, policy_version 17590 (0.0037) +[2024-06-18 00:01:11,358][12883] Updated weights for policy 0, policy_version 17600 (0.0038) +[2024-06-18 00:01:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40413.9, 300 sec: 40376.9). Total num frames: 288374784. Throughput: 0: 40433.3. Samples: 288526040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) +[2024-06-18 00:01:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:01:15,738][12883] Updated weights for policy 0, policy_version 17610 (0.0031) +[2024-06-18 00:01:16,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40140.8, 300 sec: 40321.3). Total num frames: 288555008. Throughput: 0: 40335.2. Samples: 288652320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 00:01:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:01:19,408][12883] Updated weights for policy 0, policy_version 17620 (0.0042) +[2024-06-18 00:01:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 40487.9). Total num frames: 288800768. Throughput: 0: 40481.4. Samples: 288895800. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 00:01:21,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:01:23,714][12883] Updated weights for policy 0, policy_version 17630 (0.0026) +[2024-06-18 00:01:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40413.9, 300 sec: 40377.2). Total num frames: 288980992. Throughput: 0: 40759.2. Samples: 289147680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 00:01:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:01:27,377][12883] Updated weights for policy 0, policy_version 17640 (0.0026) +[2024-06-18 00:01:31,531][12883] Updated weights for policy 0, policy_version 17650 (0.0044) +[2024-06-18 00:01:31,996][12645] Fps is (10 sec: 37674.7, 60 sec: 40412.3, 300 sec: 40376.5). Total num frames: 289177600. Throughput: 0: 40627.7. Samples: 289265880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 00:01:31,997][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:01:35,263][12883] Updated weights for policy 0, policy_version 17660 (0.0029) +[2024-06-18 00:01:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41506.2, 300 sec: 40543.4). Total num frames: 289423360. Throughput: 0: 40708.9. Samples: 289511820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 00:01:36,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:01:39,553][12883] Updated weights for policy 0, policy_version 17670 (0.0035) +[2024-06-18 00:01:41,994][12645] Fps is (10 sec: 40969.3, 60 sec: 40413.9, 300 sec: 40321.3). Total num frames: 289587200. Throughput: 0: 40716.4. Samples: 289758340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 00:01:41,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:01:43,159][12883] Updated weights for policy 0, policy_version 17680 (0.0033) +[2024-06-18 00:01:46,994][12645] Fps is (10 sec: 36044.5, 60 sec: 40413.8, 300 sec: 40376.8). Total num frames: 289783808. Throughput: 0: 40703.9. Samples: 289878320. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 00:01:46,995][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:01:47,722][12883] Updated weights for policy 0, policy_version 17690 (0.0037) +[2024-06-18 00:01:51,080][12883] Updated weights for policy 0, policy_version 17700 (0.0055) +[2024-06-18 00:01:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40687.0, 300 sec: 40432.4). Total num frames: 290013184. Throughput: 0: 40876.5. Samples: 290126000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 00:01:51,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:01:56,188][12883] Updated weights for policy 0, policy_version 17710 (0.0042) +[2024-06-18 00:01:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40413.8, 300 sec: 40487.9). Total num frames: 290209792. Throughput: 0: 41011.5. Samples: 290371560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 00:01:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:01:59,429][12883] Updated weights for policy 0, policy_version 17720 (0.0045) +[2024-06-18 00:02:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.2, 300 sec: 40432.4). Total num frames: 290422784. Throughput: 0: 40818.3. Samples: 290489140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 00:02:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:02:04,105][12883] Updated weights for policy 0, policy_version 17730 (0.0037) +[2024-06-18 00:02:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40686.9, 300 sec: 40432.4). Total num frames: 290619392. Throughput: 0: 40942.2. Samples: 290738200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 00:02:06,995][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:02:07,487][12883] Updated weights for policy 0, policy_version 17740 (0.0042) +[2024-06-18 00:02:11,994][12645] Fps is (10 sec: 36044.7, 60 sec: 40140.9, 300 sec: 40376.9). Total num frames: 290783232. Throughput: 0: 40806.7. Samples: 290983980. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 00:02:11,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:02:12,289][12883] Updated weights for policy 0, policy_version 17750 (0.0041) +[2024-06-18 00:02:15,566][12883] Updated weights for policy 0, policy_version 17760 (0.0047) +[2024-06-18 00:02:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41506.0, 300 sec: 40543.4). Total num frames: 291045376. Throughput: 0: 40748.2. Samples: 291099460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 00:02:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:02:20,286][12883] Updated weights for policy 0, policy_version 17770 (0.0044) +[2024-06-18 00:02:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 40413.9, 300 sec: 40543.5). Total num frames: 291225600. Throughput: 0: 40848.5. Samples: 291350000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 00:02:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:02:23,353][12883] Updated weights for policy 0, policy_version 17780 (0.0034) +[2024-06-18 00:02:26,994][12645] Fps is (10 sec: 36045.0, 60 sec: 40413.8, 300 sec: 40432.4). Total num frames: 291405824. Throughput: 0: 40858.2. Samples: 291596960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 00:02:26,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:02:28,233][12883] Updated weights for policy 0, policy_version 17790 (0.0042) +[2024-06-18 00:02:31,591][12883] Updated weights for policy 0, policy_version 17800 (0.0037) +[2024-06-18 00:02:31,994][12645] Fps is (10 sec: 44235.9, 60 sec: 41507.6, 300 sec: 40654.5). Total num frames: 291667968. Throughput: 0: 40757.3. Samples: 291712400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) +[2024-06-18 00:02:31,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:02:36,134][12883] Updated weights for policy 0, policy_version 17810 (0.0038) +[2024-06-18 00:02:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39867.7, 300 sec: 40543.4). Total num frames: 291815424. Throughput: 0: 40675.4. Samples: 291956400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) +[2024-06-18 00:02:36,995][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:02:37,134][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000017812_291831808.pth... +[2024-06-18 00:02:37,198][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000017218_282099712.pth +[2024-06-18 00:02:39,612][12883] Updated weights for policy 0, policy_version 17820 (0.0034) +[2024-06-18 00:02:41,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40960.0, 300 sec: 40599.0). Total num frames: 292044800. Throughput: 0: 40568.5. Samples: 292197140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) +[2024-06-18 00:02:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:02:43,887][12862] Signal inference workers to stop experience collection... (4100 times) +[2024-06-18 00:02:43,889][12862] Signal inference workers to resume experience collection... (4100 times) +[2024-06-18 00:02:43,923][12883] InferenceWorker_p0-w0: stopping experience collection (4100 times) +[2024-06-18 00:02:43,923][12883] InferenceWorker_p0-w0: resuming experience collection (4100 times) +[2024-06-18 00:02:44,043][12883] Updated weights for policy 0, policy_version 17830 (0.0041) +[2024-06-18 00:02:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40960.1, 300 sec: 40654.5). Total num frames: 292241408. Throughput: 0: 40853.7. Samples: 292327560. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) +[2024-06-18 00:02:46,996][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:02:47,672][12883] Updated weights for policy 0, policy_version 17840 (0.0033) +[2024-06-18 00:02:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40413.9, 300 sec: 40599.0). Total num frames: 292438016. Throughput: 0: 40640.9. Samples: 292567040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) +[2024-06-18 00:02:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:02:52,045][12883] Updated weights for policy 0, policy_version 17850 (0.0039) +[2024-06-18 00:02:55,761][12883] Updated weights for policy 0, policy_version 17860 (0.0029) +[2024-06-18 00:02:56,994][12645] Fps is (10 sec: 42597.5, 60 sec: 40959.9, 300 sec: 40543.4). Total num frames: 292667392. Throughput: 0: 40367.3. Samples: 292800520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) +[2024-06-18 00:02:56,995][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:03:00,150][12883] Updated weights for policy 0, policy_version 17870 (0.0050) +[2024-06-18 00:03:01,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.7, 300 sec: 40487.9). Total num frames: 292814848. Throughput: 0: 40535.7. Samples: 292923560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) +[2024-06-18 00:03:01,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:03:04,078][12883] Updated weights for policy 0, policy_version 17880 (0.0042) +[2024-06-18 00:03:06,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40686.9, 300 sec: 40599.0). Total num frames: 293060608. Throughput: 0: 40362.1. Samples: 293166300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) +[2024-06-18 00:03:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:03:08,200][12883] Updated weights for policy 0, policy_version 17890 (0.0032) +[2024-06-18 00:03:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 41233.0, 300 sec: 40654.5). Total num frames: 293257216. Throughput: 0: 40359.6. Samples: 293413140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:03:11,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:03:12,186][12883] Updated weights for policy 0, policy_version 17900 (0.0028) +[2024-06-18 00:03:16,137][12883] Updated weights for policy 0, policy_version 17910 (0.0032) +[2024-06-18 00:03:16,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39867.8, 300 sec: 40487.9). Total num frames: 293437440. Throughput: 0: 40411.6. Samples: 293530920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:03:16,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:03:20,174][12883] Updated weights for policy 0, policy_version 17920 (0.0035) +[2024-06-18 00:03:21,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40686.8, 300 sec: 40487.9). Total num frames: 293666816. Throughput: 0: 40385.8. Samples: 293773760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:03:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:03:24,076][12883] Updated weights for policy 0, policy_version 17930 (0.0028) +[2024-06-18 00:03:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40687.0, 300 sec: 40544.3). Total num frames: 293847040. Throughput: 0: 40628.1. Samples: 294025400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) +[2024-06-18 00:03:26,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:03:28,241][12883] Updated weights for policy 0, policy_version 17940 (0.0033) +[2024-06-18 00:03:31,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39867.8, 300 sec: 40543.5). Total num frames: 294060032. Throughput: 0: 40280.9. Samples: 294140200. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) +[2024-06-18 00:03:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:03:32,350][12883] Updated weights for policy 0, policy_version 17950 (0.0036) +[2024-06-18 00:03:36,209][12883] Updated weights for policy 0, policy_version 17960 (0.0032) +[2024-06-18 00:03:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41233.1, 300 sec: 40543.5). Total num frames: 294289408. Throughput: 0: 40560.4. Samples: 294392260. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) +[2024-06-18 00:03:36,996][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:03:40,423][12883] Updated weights for policy 0, policy_version 17970 (0.0038) +[2024-06-18 00:03:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.8, 300 sec: 40487.9). Total num frames: 294453248. Throughput: 0: 40747.3. Samples: 294634140. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) +[2024-06-18 00:03:41,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:03:44,500][12883] Updated weights for policy 0, policy_version 17980 (0.0041) +[2024-06-18 00:03:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40686.9, 300 sec: 40599.0). Total num frames: 294682624. Throughput: 0: 40632.3. Samples: 294752020. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) +[2024-06-18 00:03:46,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 00:03:48,661][12883] Updated weights for policy 0, policy_version 17990 (0.0039) +[2024-06-18 00:03:51,994][12645] Fps is (10 sec: 42597.8, 60 sec: 40686.8, 300 sec: 40543.4). Total num frames: 294879232. Throughput: 0: 40737.3. Samples: 294999480. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) +[2024-06-18 00:03:51,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:03:52,639][12883] Updated weights for policy 0, policy_version 18000 (0.0047) +[2024-06-18 00:03:56,483][12883] Updated weights for policy 0, policy_version 18010 (0.0043) +[2024-06-18 00:03:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40414.0, 300 sec: 40654.5). Total num frames: 295092224. Throughput: 0: 40571.6. Samples: 295238860. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) +[2024-06-18 00:03:56,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:04:00,832][12862] Signal inference workers to stop experience collection... (4150 times) +[2024-06-18 00:04:00,877][12883] InferenceWorker_p0-w0: stopping experience collection (4150 times) +[2024-06-18 00:04:00,878][12862] Signal inference workers to resume experience collection... (4150 times) +[2024-06-18 00:04:00,887][12883] InferenceWorker_p0-w0: resuming experience collection (4150 times) +[2024-06-18 00:04:00,894][12883] Updated weights for policy 0, policy_version 18020 (0.0043) +[2024-06-18 00:04:01,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41233.1, 300 sec: 40543.5). Total num frames: 295288832. Throughput: 0: 40650.3. Samples: 295360180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 00:04:01,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:04:04,716][12883] Updated weights for policy 0, policy_version 18030 (0.0032) +[2024-06-18 00:04:06,994][12645] Fps is (10 sec: 37682.6, 60 sec: 40140.8, 300 sec: 40543.7). Total num frames: 295469056. Throughput: 0: 40598.2. Samples: 295600680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 00:04:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:04:08,894][12883] Updated weights for policy 0, policy_version 18040 (0.0038) +[2024-06-18 00:04:11,994][12645] Fps is (10 sec: 37682.8, 60 sec: 40140.8, 300 sec: 40543.5). Total num frames: 295665664. 
Throughput: 0: 40464.8. Samples: 295846320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 00:04:11,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:04:12,859][12883] Updated weights for policy 0, policy_version 18050 (0.0033) +[2024-06-18 00:04:16,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.8, 300 sec: 40433.2). Total num frames: 295862272. Throughput: 0: 40546.0. Samples: 295964780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 00:04:16,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:04:17,661][12883] Updated weights for policy 0, policy_version 18060 (0.0041) +[2024-06-18 00:04:21,018][12883] Updated weights for policy 0, policy_version 18070 (0.0034) +[2024-06-18 00:04:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40140.9, 300 sec: 40543.5). Total num frames: 296075264. Throughput: 0: 40291.6. Samples: 296205380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 00:04:21,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:04:25,617][12883] Updated weights for policy 0, policy_version 18080 (0.0035) +[2024-06-18 00:04:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 40959.9, 300 sec: 40487.9). Total num frames: 296304640. Throughput: 0: 40344.8. Samples: 296449660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 00:04:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:04:29,198][12883] Updated weights for policy 0, policy_version 18090 (0.0034) +[2024-06-18 00:04:31,996][12645] Fps is (10 sec: 42589.2, 60 sec: 40685.4, 300 sec: 40598.7). Total num frames: 296501248. Throughput: 0: 40360.8. Samples: 296568340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 00:04:31,996][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:04:33,407][12883] Updated weights for policy 0, policy_version 18100 (0.0039) +[2024-06-18 00:04:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40140.8, 300 sec: 40599.0). Total num frames: 296697856. 
Throughput: 0: 40406.4. Samples: 296817760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 00:04:36,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 00:04:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000018109_296697856.pth... +[2024-06-18 00:04:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000017514_286949376.pth +[2024-06-18 00:04:37,224][12883] Updated weights for policy 0, policy_version 18110 (0.0040) +[2024-06-18 00:04:41,112][12883] Updated weights for policy 0, policy_version 18120 (0.0055) +[2024-06-18 00:04:41,994][12645] Fps is (10 sec: 39329.9, 60 sec: 40686.9, 300 sec: 40487.9). Total num frames: 296894464. Throughput: 0: 40411.9. Samples: 297057400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:04:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:04:45,283][12883] Updated weights for policy 0, policy_version 18130 (0.0044) +[2024-06-18 00:04:47,000][12645] Fps is (10 sec: 42571.9, 60 sec: 40682.8, 300 sec: 40654.0). Total num frames: 297123840. Throughput: 0: 40683.7. Samples: 297191200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:04:47,000][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:04:49,105][12883] Updated weights for policy 0, policy_version 18140 (0.0042) +[2024-06-18 00:04:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40140.9, 300 sec: 40487.9). Total num frames: 297287680. Throughput: 0: 40676.1. Samples: 297431100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:04:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-18 00:04:53,476][12883] Updated weights for policy 0, policy_version 18150 (0.0043) +[2024-06-18 00:04:56,890][12883] Updated weights for policy 0, policy_version 18160 (0.0041) +[2024-06-18 00:04:56,994][12645] Fps is (10 sec: 40985.4, 60 sec: 40686.9, 300 sec: 40599.0). Total num frames: 297533440. Throughput: 0: 40550.7. Samples: 297671100. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 00:04:57,000][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:05:01,304][12883] Updated weights for policy 0, policy_version 18170 (0.0032) +[2024-06-18 00:05:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 40686.9, 300 sec: 40654.5). Total num frames: 297730048. Throughput: 0: 40704.2. Samples: 297796460. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 00:05:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:05:04,587][12883] Updated weights for policy 0, policy_version 18180 (0.0043) +[2024-06-18 00:05:06,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40687.1, 300 sec: 40543.5). Total num frames: 297910272. Throughput: 0: 40778.3. Samples: 298040400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:05:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:05:09,613][12883] Updated weights for policy 0, policy_version 18190 (0.0045) +[2024-06-18 00:05:11,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41233.2, 300 sec: 40654.6). Total num frames: 298139648. Throughput: 0: 40647.3. Samples: 298278780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:05:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:05:12,480][12883] Updated weights for policy 0, policy_version 18200 (0.0034) +[2024-06-18 00:05:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40687.2, 300 sec: 40654.5). Total num frames: 298303488. Throughput: 0: 40811.9. Samples: 298404780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:05:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:05:17,481][12862] Signal inference workers to stop experience collection... (4200 times) +[2024-06-18 00:05:17,545][12883] InferenceWorker_p0-w0: stopping experience collection (4200 times) +[2024-06-18 00:05:17,601][12862] Signal inference workers to resume experience collection... 
(4200 times) +[2024-06-18 00:05:17,601][12883] InferenceWorker_p0-w0: resuming experience collection (4200 times) +[2024-06-18 00:05:17,731][12883] Updated weights for policy 0, policy_version 18210 (0.0030) +[2024-06-18 00:05:20,596][12883] Updated weights for policy 0, policy_version 18220 (0.0036) +[2024-06-18 00:05:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40960.0, 300 sec: 40599.0). Total num frames: 298532864. Throughput: 0: 40528.4. Samples: 298641540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 00:05:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:05:25,687][12883] Updated weights for policy 0, policy_version 18230 (0.0034) +[2024-06-18 00:05:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 40687.1, 300 sec: 40654.5). Total num frames: 298745856. Throughput: 0: 40785.4. Samples: 298892740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 00:05:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:05:29,049][12883] Updated weights for policy 0, policy_version 18240 (0.0028) +[2024-06-18 00:05:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40415.3, 300 sec: 40654.5). Total num frames: 298926080. Throughput: 0: 40599.3. Samples: 299017920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 00:05:31,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:05:33,673][12883] Updated weights for policy 0, policy_version 18250 (0.0056) +[2024-06-18 00:05:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 40654.6). Total num frames: 299155456. Throughput: 0: 40556.5. Samples: 299256140. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 00:05:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:05:37,023][12883] Updated weights for policy 0, policy_version 18260 (0.0036) +[2024-06-18 00:05:41,715][12883] Updated weights for policy 0, policy_version 18270 (0.0030) +[2024-06-18 00:05:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40686.9, 300 sec: 40599.0). Total num frames: 299335680. Throughput: 0: 40668.8. Samples: 299501200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 00:05:41,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:05:45,038][12883] Updated weights for policy 0, policy_version 18280 (0.0042) +[2024-06-18 00:05:46,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40418.0, 300 sec: 40599.0). Total num frames: 299548672. Throughput: 0: 40508.4. Samples: 299619340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 00:05:46,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:05:49,725][12883] Updated weights for policy 0, policy_version 18290 (0.0041) +[2024-06-18 00:05:51,994][12645] Fps is (10 sec: 42599.3, 60 sec: 41233.2, 300 sec: 40599.0). Total num frames: 299761664. Throughput: 0: 40656.9. Samples: 299869960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 00:05:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:05:53,044][12883] Updated weights for policy 0, policy_version 18300 (0.0033) +[2024-06-18 00:05:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40413.8, 300 sec: 40710.1). Total num frames: 299958272. Throughput: 0: 40755.8. Samples: 300112800. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 00:05:56,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:05:57,555][12883] Updated weights for policy 0, policy_version 18310 (0.0042) +[2024-06-18 00:06:01,073][12883] Updated weights for policy 0, policy_version 18320 (0.0035) +[2024-06-18 00:06:01,994][12645] Fps is (10 sec: 39320.6, 60 sec: 40413.8, 300 sec: 40599.0). Total num frames: 300154880. Throughput: 0: 40570.4. Samples: 300230460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 00:06:01,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:06:06,150][12883] Updated weights for policy 0, policy_version 18330 (0.0044) +[2024-06-18 00:06:06,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 300367872. Throughput: 0: 40631.1. Samples: 300469940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 00:06:06,996][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:06:09,867][12883] Updated weights for policy 0, policy_version 18340 (0.0033) +[2024-06-18 00:06:11,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.6, 300 sec: 40599.0). Total num frames: 300531712. Throughput: 0: 40564.8. Samples: 300718160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 00:06:11,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:06:14,086][12883] Updated weights for policy 0, policy_version 18350 (0.0032) +[2024-06-18 00:06:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 40543.5). Total num frames: 300761088. Throughput: 0: 40397.0. Samples: 300835780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 00:06:16,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:06:17,862][12883] Updated weights for policy 0, policy_version 18360 (0.0052) +[2024-06-18 00:06:21,903][12883] Updated weights for policy 0, policy_version 18370 (0.0030) +[2024-06-18 00:06:21,999][12645] Fps is (10 sec: 44214.4, 60 sec: 40683.4, 300 sec: 40653.8). Total num frames: 300974080. Throughput: 0: 40625.1. Samples: 301084480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 00:06:21,999][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 00:06:26,040][12883] Updated weights for policy 0, policy_version 18380 (0.0033) +[2024-06-18 00:06:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40686.9, 300 sec: 40710.4). Total num frames: 301187072. Throughput: 0: 40545.4. Samples: 301325740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:06:26,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 00:06:29,841][12883] Updated weights for policy 0, policy_version 18390 (0.0037) +[2024-06-18 00:06:31,994][12645] Fps is (10 sec: 39342.1, 60 sec: 40687.0, 300 sec: 40487.9). Total num frames: 301367296. Throughput: 0: 40577.5. Samples: 301445320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:06:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:06:33,917][12883] Updated weights for policy 0, policy_version 18400 (0.0046) +[2024-06-18 00:06:36,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40140.8, 300 sec: 40599.0). Total num frames: 301563904. Throughput: 0: 40511.5. Samples: 301692980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 00:06:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:06:37,095][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000018407_301580288.pth... 
+[2024-06-18 00:06:37,162][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000017812_291831808.pth +[2024-06-18 00:06:37,714][12883] Updated weights for policy 0, policy_version 18410 (0.0034) +[2024-06-18 00:06:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40414.0, 300 sec: 40599.0). Total num frames: 301760512. Throughput: 0: 40605.0. Samples: 301940020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 00:06:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:06:42,237][12883] Updated weights for policy 0, policy_version 18420 (0.0035) +[2024-06-18 00:06:45,587][12883] Updated weights for policy 0, policy_version 18430 (0.0039) +[2024-06-18 00:06:46,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40413.9, 300 sec: 40543.4). Total num frames: 301973504. Throughput: 0: 40691.1. Samples: 302061560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 00:06:46,998][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:06:50,115][12883] Updated weights for policy 0, policy_version 18440 (0.0047) +[2024-06-18 00:06:51,996][12645] Fps is (10 sec: 42588.6, 60 sec: 40412.3, 300 sec: 40598.7). Total num frames: 302186496. Throughput: 0: 40858.4. Samples: 302308660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 00:06:51,997][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:06:53,945][12883] Updated weights for policy 0, policy_version 18450 (0.0035) +[2024-06-18 00:06:56,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40140.9, 300 sec: 40487.9). Total num frames: 302366720. Throughput: 0: 40729.9. Samples: 302551000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 00:06:56,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:06:58,060][12883] Updated weights for policy 0, policy_version 18460 (0.0039) +[2024-06-18 00:06:59,024][12862] Signal inference workers to stop experience collection... 
(4250 times) +[2024-06-18 00:06:59,056][12883] InferenceWorker_p0-w0: stopping experience collection (4250 times) +[2024-06-18 00:06:59,083][12862] Signal inference workers to resume experience collection... (4250 times) +[2024-06-18 00:06:59,084][12883] InferenceWorker_p0-w0: resuming experience collection (4250 times) +[2024-06-18 00:07:01,838][12883] Updated weights for policy 0, policy_version 18470 (0.0029) +[2024-06-18 00:07:01,994][12645] Fps is (10 sec: 42607.8, 60 sec: 40960.1, 300 sec: 40654.5). Total num frames: 302612480. Throughput: 0: 40727.0. Samples: 302668500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 00:07:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:07:06,364][12883] Updated weights for policy 0, policy_version 18480 (0.0042) +[2024-06-18 00:07:06,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40413.8, 300 sec: 40710.1). Total num frames: 302792704. Throughput: 0: 40681.5. Samples: 302914940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 00:07:06,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:07:10,284][12883] Updated weights for policy 0, policy_version 18490 (0.0038) +[2024-06-18 00:07:11,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40960.0, 300 sec: 40487.9). Total num frames: 302989312. Throughput: 0: 40745.7. Samples: 303159300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 00:07:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:07:14,568][12883] Updated weights for policy 0, policy_version 18500 (0.0039) +[2024-06-18 00:07:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40686.9, 300 sec: 40599.0). Total num frames: 303202304. Throughput: 0: 40742.2. Samples: 303278720. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 00:07:16,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:07:18,041][12883] Updated weights for policy 0, policy_version 18510 (0.0045) +[2024-06-18 00:07:21,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40144.3, 300 sec: 40599.0). Total num frames: 303382528. Throughput: 0: 40631.2. Samples: 303521380. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 00:07:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:07:22,335][12883] Updated weights for policy 0, policy_version 18520 (0.0035) +[2024-06-18 00:07:25,790][12883] Updated weights for policy 0, policy_version 18530 (0.0037) +[2024-06-18 00:07:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40413.8, 300 sec: 40487.9). Total num frames: 303611904. Throughput: 0: 40592.3. Samples: 303766680. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 00:07:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-18 00:07:30,750][12883] Updated weights for policy 0, policy_version 18540 (0.0059) +[2024-06-18 00:07:31,996][12645] Fps is (10 sec: 44226.6, 60 sec: 40958.4, 300 sec: 40709.8). Total num frames: 303824896. Throughput: 0: 40788.7. Samples: 303897140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 00:07:31,997][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:07:34,280][12883] Updated weights for policy 0, policy_version 18550 (0.0031) +[2024-06-18 00:07:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40686.8, 300 sec: 40543.4). Total num frames: 304005120. Throughput: 0: 40656.1. Samples: 304138100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 00:07:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:07:38,798][12883] Updated weights for policy 0, policy_version 18560 (0.0043) +[2024-06-18 00:07:41,994][12645] Fps is (10 sec: 40969.1, 60 sec: 41233.0, 300 sec: 40654.5). Total num frames: 304234496. Throughput: 0: 40615.5. Samples: 304378700. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 00:07:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:07:42,357][12883] Updated weights for policy 0, policy_version 18570 (0.0027) +[2024-06-18 00:07:46,592][12883] Updated weights for policy 0, policy_version 18580 (0.0034) +[2024-06-18 00:07:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 304431104. Throughput: 0: 40836.4. Samples: 304506140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:07:46,995][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:07:50,166][12883] Updated weights for policy 0, policy_version 18590 (0.0037) +[2024-06-18 00:07:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40688.5, 300 sec: 40543.5). Total num frames: 304627712. Throughput: 0: 40676.5. Samples: 304745380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:07:51,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 00:07:54,593][12883] Updated weights for policy 0, policy_version 18600 (0.0027) +[2024-06-18 00:07:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 40821.1). Total num frames: 304857088. Throughput: 0: 40614.8. Samples: 304986960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 00:07:56,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:07:58,573][12883] Updated weights for policy 0, policy_version 18610 (0.0042) +[2024-06-18 00:08:01,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39867.7, 300 sec: 40487.9). Total num frames: 305004544. Throughput: 0: 40939.0. Samples: 305120980. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 00:08:01,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:08:02,543][12883] Updated weights for policy 0, policy_version 18620 (0.0034) +[2024-06-18 00:08:06,641][12883] Updated weights for policy 0, policy_version 18630 (0.0037) +[2024-06-18 00:08:06,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40687.0, 300 sec: 40599.0). Total num frames: 305233920. Throughput: 0: 40823.0. Samples: 305358420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 00:08:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:08:10,704][12883] Updated weights for policy 0, policy_version 18640 (0.0035) +[2024-06-18 00:08:11,994][12645] Fps is (10 sec: 45876.0, 60 sec: 41233.2, 300 sec: 40765.6). Total num frames: 305463296. Throughput: 0: 40817.5. Samples: 305603460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 00:08:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:08:14,525][12883] Updated weights for policy 0, policy_version 18650 (0.0032) +[2024-06-18 00:08:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40413.8, 300 sec: 40543.5). Total num frames: 305627136. Throughput: 0: 40599.8. Samples: 305724040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 00:08:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:08:18,670][12883] Updated weights for policy 0, policy_version 18660 (0.0032) +[2024-06-18 00:08:21,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41233.0, 300 sec: 40710.1). Total num frames: 305856512. Throughput: 0: 40678.8. Samples: 305968640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 00:08:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:08:22,232][12862] Signal inference workers to stop experience collection... 
(4300 times) +[2024-06-18 00:08:22,286][12883] InferenceWorker_p0-w0: stopping experience collection (4300 times) +[2024-06-18 00:08:22,302][12862] Signal inference workers to resume experience collection... (4300 times) +[2024-06-18 00:08:22,302][12883] InferenceWorker_p0-w0: resuming experience collection (4300 times) +[2024-06-18 00:08:22,453][12883] Updated weights for policy 0, policy_version 18670 (0.0051) +[2024-06-18 00:08:26,724][12883] Updated weights for policy 0, policy_version 18680 (0.0047) +[2024-06-18 00:08:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40687.0, 300 sec: 40654.5). Total num frames: 306053120. Throughput: 0: 40876.0. Samples: 306218120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 00:08:26,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:08:30,351][12883] Updated weights for policy 0, policy_version 18690 (0.0027) +[2024-06-18 00:08:31,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40142.3, 300 sec: 40487.9). Total num frames: 306233344. Throughput: 0: 40762.3. Samples: 306340440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 00:08:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:08:34,766][12883] Updated weights for policy 0, policy_version 18700 (0.0048) +[2024-06-18 00:08:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.2, 300 sec: 40765.6). Total num frames: 306479104. Throughput: 0: 40831.1. Samples: 306582780. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) +[2024-06-18 00:08:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:08:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000018706_306479104.pth... +[2024-06-18 00:08:37,080][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000018109_296697856.pth +[2024-06-18 00:08:38,789][12883] Updated weights for policy 0, policy_version 18710 (0.0049) +[2024-06-18 00:08:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 40686.9, 300 sec: 40654.5). 
Total num frames: 306675712. Throughput: 0: 40980.9. Samples: 306831100. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) +[2024-06-18 00:08:41,995][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:08:42,522][12883] Updated weights for policy 0, policy_version 18720 (0.0036) +[2024-06-18 00:08:46,805][12883] Updated weights for policy 0, policy_version 18730 (0.0045) +[2024-06-18 00:08:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40687.0, 300 sec: 40654.6). Total num frames: 306872320. Throughput: 0: 40609.8. Samples: 306948420. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) +[2024-06-18 00:08:46,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:08:50,762][12883] Updated weights for policy 0, policy_version 18740 (0.0047) +[2024-06-18 00:08:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 307085312. Throughput: 0: 40861.4. Samples: 307197180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 00:08:51,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 00:08:54,827][12883] Updated weights for policy 0, policy_version 18750 (0.0038) +[2024-06-18 00:08:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 307298304. Throughput: 0: 40821.7. Samples: 307440440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 00:08:56,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:08:58,830][12883] Updated weights for policy 0, policy_version 18760 (0.0038) +[2024-06-18 00:09:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 40710.1). Total num frames: 307478528. Throughput: 0: 40797.8. Samples: 307559940. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 00:09:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:09:03,018][12883] Updated weights for policy 0, policy_version 18770 (0.0038) +[2024-06-18 00:09:06,788][12883] Updated weights for policy 0, policy_version 18780 (0.0038) +[2024-06-18 00:09:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 40821.2). Total num frames: 307707904. Throughput: 0: 40913.8. Samples: 307809760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 00:09:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:09:11,331][12883] Updated weights for policy 0, policy_version 18790 (0.0041) +[2024-06-18 00:09:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40413.9, 300 sec: 40765.7). Total num frames: 307888128. Throughput: 0: 40712.1. Samples: 308050160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 00:09:11,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:09:14,777][12883] Updated weights for policy 0, policy_version 18800 (0.0046) +[2024-06-18 00:09:16,996][12645] Fps is (10 sec: 39312.9, 60 sec: 41231.6, 300 sec: 40765.3). Total num frames: 308101120. Throughput: 0: 40610.9. Samples: 308168020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:09:16,996][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:09:19,427][12883] Updated weights for policy 0, policy_version 18810 (0.0040) +[2024-06-18 00:09:21,996][12645] Fps is (10 sec: 40951.5, 60 sec: 40685.6, 300 sec: 40654.3). Total num frames: 308297728. Throughput: 0: 40717.4. Samples: 308415140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:09:21,996][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:09:22,650][12883] Updated weights for policy 0, policy_version 18820 (0.0034) +[2024-06-18 00:09:26,994][12645] Fps is (10 sec: 37691.3, 60 sec: 40413.8, 300 sec: 40599.3). Total num frames: 308477952. Throughput: 0: 40609.3. Samples: 308658520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:09:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:09:27,335][12883] Updated weights for policy 0, policy_version 18830 (0.0043) +[2024-06-18 00:09:28,742][12862] Signal inference workers to stop experience collection... (4350 times) +[2024-06-18 00:09:28,742][12862] Signal inference workers to resume experience collection... (4350 times) +[2024-06-18 00:09:28,758][12883] InferenceWorker_p0-w0: stopping experience collection (4350 times) +[2024-06-18 00:09:28,759][12883] InferenceWorker_p0-w0: resuming experience collection (4350 times) +[2024-06-18 00:09:30,352][12883] Updated weights for policy 0, policy_version 18840 (0.0044) +[2024-06-18 00:09:31,994][12645] Fps is (10 sec: 39329.5, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 308690944. Throughput: 0: 40689.0. Samples: 308779420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:09:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:09:35,411][12883] Updated weights for policy 0, policy_version 18850 (0.0040) +[2024-06-18 00:09:36,994][12645] Fps is (10 sec: 42599.1, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 308903936. Throughput: 0: 40588.5. Samples: 309023660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:09:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:09:38,837][12883] Updated weights for policy 0, policy_version 18860 (0.0027) +[2024-06-18 00:09:41,996][12645] Fps is (10 sec: 39312.4, 60 sec: 40139.3, 300 sec: 40544.0). Total num frames: 309084160. Throughput: 0: 40509.1. Samples: 309263440. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 00:09:41,997][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:09:43,347][12883] Updated weights for policy 0, policy_version 18870 (0.0031) +[2024-06-18 00:09:46,727][12883] Updated weights for policy 0, policy_version 18880 (0.0043) +[2024-06-18 00:09:46,994][12645] Fps is (10 sec: 42597.5, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 309329920. Throughput: 0: 40586.6. Samples: 309386340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 00:09:46,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:09:51,347][12883] Updated weights for policy 0, policy_version 18890 (0.0028) +[2024-06-18 00:09:51,994][12645] Fps is (10 sec: 42608.1, 60 sec: 40413.8, 300 sec: 40599.0). Total num frames: 309510144. Throughput: 0: 40480.9. Samples: 309631400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 00:09:51,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:09:55,088][12883] Updated weights for policy 0, policy_version 18900 (0.0045) +[2024-06-18 00:09:56,994][12645] Fps is (10 sec: 36045.3, 60 sec: 39867.7, 300 sec: 40543.5). Total num frames: 309690368. Throughput: 0: 40459.5. Samples: 309870840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 00:09:56,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:09:59,343][12883] Updated weights for policy 0, policy_version 18910 (0.0043) +[2024-06-18 00:10:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 309919744. Throughput: 0: 40420.7. Samples: 309986860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 00:10:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:10:03,088][12883] Updated weights for policy 0, policy_version 18920 (0.0030) +[2024-06-18 00:10:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39867.8, 300 sec: 40543.5). Total num frames: 310099968. Throughput: 0: 40512.1. Samples: 310238100. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 00:10:06,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:10:07,993][12883] Updated weights for policy 0, policy_version 18930 (0.0042) +[2024-06-18 00:10:11,281][12883] Updated weights for policy 0, policy_version 18940 (0.0024) +[2024-06-18 00:10:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40686.8, 300 sec: 40765.6). Total num frames: 310329344. Throughput: 0: 40320.5. Samples: 310472940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) +[2024-06-18 00:10:11,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:10:15,926][12883] Updated weights for policy 0, policy_version 18950 (0.0040) +[2024-06-18 00:10:16,994][12645] Fps is (10 sec: 44235.9, 60 sec: 40688.4, 300 sec: 40710.1). Total num frames: 310542336. Throughput: 0: 40568.3. Samples: 310605000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) +[2024-06-18 00:10:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:10:19,195][12883] Updated weights for policy 0, policy_version 18960 (0.0038) +[2024-06-18 00:10:21,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39869.0, 300 sec: 40487.9). Total num frames: 310689792. Throughput: 0: 40550.9. Samples: 310848460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-18 00:10:21,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:10:23,961][12883] Updated weights for policy 0, policy_version 18970 (0.0051) +[2024-06-18 00:10:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40960.1, 300 sec: 40710.1). Total num frames: 310935552. Throughput: 0: 40519.0. Samples: 311086700. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-18 00:10:26,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:10:27,508][12883] Updated weights for policy 0, policy_version 18980 (0.0049) +[2024-06-18 00:10:31,891][12883] Updated weights for policy 0, policy_version 18990 (0.0034) +[2024-06-18 00:10:31,994][12645] Fps is (10 sec: 44237.8, 60 sec: 40687.0, 300 sec: 40599.0). Total num frames: 311132160. Throughput: 0: 40740.6. Samples: 311219660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-18 00:10:31,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:10:35,818][12883] Updated weights for policy 0, policy_version 19000 (0.0051) +[2024-06-18 00:10:36,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40140.7, 300 sec: 40599.0). Total num frames: 311312384. Throughput: 0: 40668.8. Samples: 311461500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 00:10:36,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:10:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019001_311312384.pth... +[2024-06-18 00:10:37,072][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000018407_301580288.pth +[2024-06-18 00:10:39,746][12883] Updated weights for policy 0, policy_version 19010 (0.0026) +[2024-06-18 00:10:40,598][12862] Signal inference workers to stop experience collection... (4400 times) +[2024-06-18 00:10:40,646][12883] InferenceWorker_p0-w0: stopping experience collection (4400 times) +[2024-06-18 00:10:40,654][12862] Signal inference workers to resume experience collection... (4400 times) +[2024-06-18 00:10:40,661][12883] InferenceWorker_p0-w0: resuming experience collection (4400 times) +[2024-06-18 00:10:41,995][12645] Fps is (10 sec: 42591.3, 60 sec: 41233.6, 300 sec: 40709.9). Total num frames: 311558144. Throughput: 0: 40659.4. Samples: 311700580. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 00:10:41,996][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:10:43,783][12883] Updated weights for policy 0, policy_version 19020 (0.0042) +[2024-06-18 00:10:46,994][12645] Fps is (10 sec: 42599.2, 60 sec: 40140.9, 300 sec: 40599.0). Total num frames: 311738368. Throughput: 0: 40989.8. Samples: 311831400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 00:10:46,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:10:47,587][12883] Updated weights for policy 0, policy_version 19030 (0.0030) +[2024-06-18 00:10:51,994][12645] Fps is (10 sec: 37689.1, 60 sec: 40413.9, 300 sec: 40599.0). Total num frames: 311934976. Throughput: 0: 40851.5. Samples: 312076420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 00:10:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:10:52,125][12883] Updated weights for policy 0, policy_version 19040 (0.0046) +[2024-06-18 00:10:55,584][12883] Updated weights for policy 0, policy_version 19050 (0.0047) +[2024-06-18 00:10:56,994][12645] Fps is (10 sec: 47512.9, 60 sec: 42052.2, 300 sec: 40876.7). Total num frames: 312213504. Throughput: 0: 40868.9. Samples: 312312040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 00:10:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:10:59,975][12883] Updated weights for policy 0, policy_version 19060 (0.0034) +[2024-06-18 00:11:01,996][12645] Fps is (10 sec: 42588.7, 60 sec: 40685.3, 300 sec: 40654.2). Total num frames: 312360960. Throughput: 0: 40934.9. Samples: 312447160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 00:11:01,997][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 00:11:03,523][12883] Updated weights for policy 0, policy_version 19070 (0.0032) +[2024-06-18 00:11:06,994][12645] Fps is (10 sec: 36044.9, 60 sec: 41233.0, 300 sec: 40821.2). Total num frames: 312573952. Throughput: 0: 40782.3. Samples: 312683660. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 00:11:06,994][12645] Avg episode reward: [(0, '0.038')] +[2024-06-18 00:11:07,004][12862] Saving new best policy, reward=0.038! +[2024-06-18 00:11:07,773][12883] Updated weights for policy 0, policy_version 19080 (0.0035) +[2024-06-18 00:11:11,444][12883] Updated weights for policy 0, policy_version 19090 (0.0040) +[2024-06-18 00:11:11,994][12645] Fps is (10 sec: 42608.6, 60 sec: 40960.1, 300 sec: 40765.6). Total num frames: 312786944. Throughput: 0: 40994.3. Samples: 312931440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 00:11:11,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 00:11:16,327][12883] Updated weights for policy 0, policy_version 19100 (0.0028) +[2024-06-18 00:11:16,994][12645] Fps is (10 sec: 37683.6, 60 sec: 40140.9, 300 sec: 40599.7). Total num frames: 312950784. Throughput: 0: 40824.0. Samples: 313056740. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) +[2024-06-18 00:11:16,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:11:19,436][12883] Updated weights for policy 0, policy_version 19110 (0.0046) +[2024-06-18 00:11:21,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41779.2, 300 sec: 40710.1). Total num frames: 313196544. Throughput: 0: 40694.2. Samples: 313292740. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) +[2024-06-18 00:11:21,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:11:24,565][12883] Updated weights for policy 0, policy_version 19120 (0.0035) +[2024-06-18 00:11:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 313376768. Throughput: 0: 41033.0. Samples: 313547000. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) +[2024-06-18 00:11:26,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:11:27,442][12883] Updated weights for policy 0, policy_version 19130 (0.0029) +[2024-06-18 00:11:31,994][12645] Fps is (10 sec: 36045.2, 60 sec: 40413.8, 300 sec: 40654.5). 
Total num frames: 313556992. Throughput: 0: 40598.2. Samples: 313658320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 00:11:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:11:32,640][12883] Updated weights for policy 0, policy_version 19140 (0.0041) +[2024-06-18 00:11:35,465][12883] Updated weights for policy 0, policy_version 19150 (0.0037) +[2024-06-18 00:11:36,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42052.3, 300 sec: 40932.2). Total num frames: 313835520. Throughput: 0: 40753.3. Samples: 313910320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 00:11:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:11:40,713][12883] Updated weights for policy 0, policy_version 19160 (0.0037) +[2024-06-18 00:11:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40414.9, 300 sec: 40710.1). Total num frames: 313982976. Throughput: 0: 41181.7. Samples: 314165220. Policy #0 lag: (min: 0.0, avg: 7.3, max: 21.0) +[2024-06-18 00:11:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:11:43,606][12883] Updated weights for policy 0, policy_version 19170 (0.0037) +[2024-06-18 00:11:46,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41233.0, 300 sec: 40765.9). Total num frames: 314212352. Throughput: 0: 40559.0. Samples: 314272220. Policy #0 lag: (min: 0.0, avg: 7.3, max: 21.0) +[2024-06-18 00:11:46,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:11:48,631][12883] Updated weights for policy 0, policy_version 19180 (0.0041) +[2024-06-18 00:11:51,434][12883] Updated weights for policy 0, policy_version 19190 (0.0032) +[2024-06-18 00:11:51,994][12645] Fps is (10 sec: 44237.5, 60 sec: 41506.2, 300 sec: 40876.7). Total num frames: 314425344. Throughput: 0: 40957.4. Samples: 314526740. Policy #0 lag: (min: 0.0, avg: 7.3, max: 21.0) +[2024-06-18 00:11:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:11:52,201][12862] Signal inference workers to stop experience collection... 
(4450 times) +[2024-06-18 00:11:52,250][12883] InferenceWorker_p0-w0: stopping experience collection (4450 times) +[2024-06-18 00:11:52,259][12862] Signal inference workers to resume experience collection... (4450 times) +[2024-06-18 00:11:52,264][12883] InferenceWorker_p0-w0: resuming experience collection (4450 times) +[2024-06-18 00:11:56,495][12883] Updated weights for policy 0, policy_version 19200 (0.0040) +[2024-06-18 00:11:56,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39594.7, 300 sec: 40599.0). Total num frames: 314589184. Throughput: 0: 41084.8. Samples: 314780260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:11:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:11:59,294][12883] Updated weights for policy 0, policy_version 19210 (0.0039) +[2024-06-18 00:12:01,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40961.5, 300 sec: 40765.6). Total num frames: 314818560. Throughput: 0: 40861.7. Samples: 314895520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:12:01,995][12645] Avg episode reward: [(0, '0.000')] +[2024-06-18 00:12:04,341][12883] Updated weights for policy 0, policy_version 19220 (0.0038) +[2024-06-18 00:12:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 40960.0, 300 sec: 40821.2). Total num frames: 315031552. Throughput: 0: 41195.6. Samples: 315146540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:12:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:12:07,505][12883] Updated weights for policy 0, policy_version 19230 (0.0031) +[2024-06-18 00:12:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40413.8, 300 sec: 40710.1). Total num frames: 315211776. Throughput: 0: 41041.3. Samples: 315393860. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 00:12:11,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:12:12,295][12883] Updated weights for policy 0, policy_version 19240 (0.0031) +[2024-06-18 00:12:15,455][12883] Updated weights for policy 0, policy_version 19250 (0.0030) +[2024-06-18 00:12:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 40876.7). Total num frames: 315441152. Throughput: 0: 41164.8. Samples: 315510740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 00:12:16,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 00:12:20,162][12883] Updated weights for policy 0, policy_version 19260 (0.0033) +[2024-06-18 00:12:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 315637760. Throughput: 0: 41074.1. Samples: 315758660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 00:12:21,995][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:12:23,185][12883] Updated weights for policy 0, policy_version 19270 (0.0041) +[2024-06-18 00:12:26,994][12645] Fps is (10 sec: 37683.0, 60 sec: 40686.9, 300 sec: 40654.8). Total num frames: 315817984. Throughput: 0: 40867.6. Samples: 316004260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 00:12:26,996][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:12:28,426][12883] Updated weights for policy 0, policy_version 19280 (0.0044) +[2024-06-18 00:12:31,028][12883] Updated weights for policy 0, policy_version 19290 (0.0037) +[2024-06-18 00:12:31,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41779.2, 300 sec: 40876.7). Total num frames: 316063744. Throughput: 0: 41123.6. Samples: 316122780. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 00:12:31,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:12:36,346][12883] Updated weights for policy 0, policy_version 19300 (0.0033) +[2024-06-18 00:12:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 39867.8, 300 sec: 40654.5). Total num frames: 316227584. Throughput: 0: 40929.7. Samples: 316368580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 00:12:36,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:12:37,001][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019302_316243968.pth... +[2024-06-18 00:12:37,085][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000018706_306479104.pth +[2024-06-18 00:12:39,455][12883] Updated weights for policy 0, policy_version 19310 (0.0032) +[2024-06-18 00:12:41,994][12645] Fps is (10 sec: 37682.2, 60 sec: 40959.9, 300 sec: 40710.1). Total num frames: 316440576. Throughput: 0: 40751.9. Samples: 316614100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 00:12:41,995][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:12:44,195][12883] Updated weights for policy 0, policy_version 19320 (0.0036) +[2024-06-18 00:12:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 316669952. Throughput: 0: 40865.5. Samples: 316734460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 00:12:46,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:12:47,367][12883] Updated weights for policy 0, policy_version 19330 (0.0025) +[2024-06-18 00:12:51,994][12645] Fps is (10 sec: 40960.9, 60 sec: 40413.8, 300 sec: 40654.5). Total num frames: 316850176. Throughput: 0: 40842.2. Samples: 316984440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:12:51,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:12:52,081][12883] Updated weights for policy 0, policy_version 19340 (0.0041) +[2024-06-18 00:12:55,646][12883] Updated weights for policy 0, policy_version 19350 (0.0040) +[2024-06-18 00:12:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.2, 300 sec: 40932.2). Total num frames: 317079552. Throughput: 0: 40528.9. Samples: 317217660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:12:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:13:00,085][12883] Updated weights for policy 0, policy_version 19360 (0.0033) +[2024-06-18 00:13:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40960.0, 300 sec: 40821.2). Total num frames: 317276160. Throughput: 0: 40843.6. Samples: 317348700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:13:01,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:13:03,454][12883] Updated weights for policy 0, policy_version 19370 (0.0036) +[2024-06-18 00:13:06,994][12645] Fps is (10 sec: 36044.6, 60 sec: 40140.8, 300 sec: 40599.0). Total num frames: 317440000. Throughput: 0: 40649.0. Samples: 317587860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 00:13:06,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 00:13:08,503][12883] Updated weights for policy 0, policy_version 19380 (0.0043) +[2024-06-18 00:13:11,358][12883] Updated weights for policy 0, policy_version 19390 (0.0037) +[2024-06-18 00:13:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 40876.7). Total num frames: 317685760. Throughput: 0: 40447.2. Samples: 317824380. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 00:13:11,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:13:16,591][12883] Updated weights for policy 0, policy_version 19400 (0.0031) +[2024-06-18 00:13:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 317865984. Throughput: 0: 40716.4. Samples: 317955020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 00:13:16,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:13:20,074][12883] Updated weights for policy 0, policy_version 19410 (0.0044) +[2024-06-18 00:13:21,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 318062592. Throughput: 0: 40568.8. Samples: 318194180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 00:13:21,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 00:13:24,559][12883] Updated weights for policy 0, policy_version 19420 (0.0043) +[2024-06-18 00:13:27,000][12645] Fps is (10 sec: 42571.5, 60 sec: 41228.8, 300 sec: 40875.8). Total num frames: 318291968. Throughput: 0: 40465.2. Samples: 318435280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 00:13:27,000][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:13:28,010][12883] Updated weights for policy 0, policy_version 19430 (0.0040) +[2024-06-18 00:13:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40140.7, 300 sec: 40654.5). Total num frames: 318472192. Throughput: 0: 40606.1. Samples: 318561740. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) +[2024-06-18 00:13:31,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:13:32,960][12883] Updated weights for policy 0, policy_version 19440 (0.0034) +[2024-06-18 00:13:32,996][12862] Signal inference workers to stop experience collection... 
(4500 times) +[2024-06-18 00:13:33,045][12883] InferenceWorker_p0-w0: stopping experience collection (4500 times) +[2024-06-18 00:13:33,110][12862] Signal inference workers to resume experience collection... (4500 times) +[2024-06-18 00:13:33,110][12883] InferenceWorker_p0-w0: resuming experience collection (4500 times) +[2024-06-18 00:13:35,931][12883] Updated weights for policy 0, policy_version 19450 (0.0033) +[2024-06-18 00:13:36,994][12645] Fps is (10 sec: 39346.3, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 318685184. Throughput: 0: 40346.2. Samples: 318800020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) +[2024-06-18 00:13:36,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:13:40,719][12883] Updated weights for policy 0, policy_version 19460 (0.0030) +[2024-06-18 00:13:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41233.2, 300 sec: 40821.2). Total num frames: 318914560. Throughput: 0: 40709.3. Samples: 319049580. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) +[2024-06-18 00:13:41,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:13:43,954][12883] Updated weights for policy 0, policy_version 19470 (0.0027) +[2024-06-18 00:13:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 319111168. Throughput: 0: 40528.9. Samples: 319172500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 00:13:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:13:48,746][12883] Updated weights for policy 0, policy_version 19480 (0.0050) +[2024-06-18 00:13:51,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40960.1, 300 sec: 40710.1). Total num frames: 319307776. Throughput: 0: 40616.2. Samples: 319415580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 00:13:51,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:13:52,011][12883] Updated weights for policy 0, policy_version 19490 (0.0042) +[2024-06-18 00:13:56,703][12883] Updated weights for policy 0, policy_version 19500 (0.0035) +[2024-06-18 00:13:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40413.9, 300 sec: 40765.6). Total num frames: 319504384. Throughput: 0: 41022.2. Samples: 319670380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 00:13:56,994][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 00:14:00,252][12883] Updated weights for policy 0, policy_version 19510 (0.0043) +[2024-06-18 00:14:01,994][12645] Fps is (10 sec: 39320.6, 60 sec: 40413.8, 300 sec: 40654.5). Total num frames: 319700992. Throughput: 0: 40795.0. Samples: 319790800. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) +[2024-06-18 00:14:01,995][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 00:14:04,362][12883] Updated weights for policy 0, policy_version 19520 (0.0035) +[2024-06-18 00:14:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 40765.6). Total num frames: 319913984. Throughput: 0: 40848.1. Samples: 320032340. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) +[2024-06-18 00:14:06,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:14:08,597][12883] Updated weights for policy 0, policy_version 19530 (0.0034) +[2024-06-18 00:14:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40686.9, 300 sec: 40765.9). Total num frames: 320126976. Throughput: 0: 40987.5. Samples: 320279460. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) +[2024-06-18 00:14:11,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:14:12,194][12883] Updated weights for policy 0, policy_version 19540 (0.0036) +[2024-06-18 00:14:16,527][12883] Updated weights for policy 0, policy_version 19550 (0.0023) +[2024-06-18 00:14:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 40765.9). Total num frames: 320323584. Throughput: 0: 40907.6. Samples: 320402580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 00:14:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:14:20,223][12883] Updated weights for policy 0, policy_version 19560 (0.0029) +[2024-06-18 00:14:21,995][12645] Fps is (10 sec: 39315.6, 60 sec: 40959.0, 300 sec: 40821.0). Total num frames: 320520192. Throughput: 0: 41097.7. Samples: 320649480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 00:14:21,996][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:14:24,527][12883] Updated weights for policy 0, policy_version 19570 (0.0047) +[2024-06-18 00:14:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40691.1, 300 sec: 40821.1). Total num frames: 320733184. Throughput: 0: 40969.3. Samples: 320893200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 00:14:26,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:14:28,212][12883] Updated weights for policy 0, policy_version 19580 (0.0035) +[2024-06-18 00:14:31,994][12645] Fps is (10 sec: 40965.9, 60 sec: 40960.0, 300 sec: 40765.6). Total num frames: 320929792. Throughput: 0: 40967.4. Samples: 321016040. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 00:14:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:14:32,522][12883] Updated weights for policy 0, policy_version 19590 (0.0034) +[2024-06-18 00:14:36,284][12883] Updated weights for policy 0, policy_version 19600 (0.0030) +[2024-06-18 00:14:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41233.0, 300 sec: 40932.5). Total num frames: 321159168. Throughput: 0: 41000.6. Samples: 321260620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 00:14:36,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:14:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019602_321159168.pth... +[2024-06-18 00:14:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019001_311312384.pth +[2024-06-18 00:14:40,293][12883] Updated weights for policy 0, policy_version 19610 (0.0045) +[2024-06-18 00:14:41,996][12645] Fps is (10 sec: 42589.5, 60 sec: 40685.5, 300 sec: 40765.3). Total num frames: 321355776. Throughput: 0: 40833.6. Samples: 321507980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 00:14:41,996][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:14:44,020][12883] Updated weights for policy 0, policy_version 19620 (0.0034) +[2024-06-18 00:14:46,994][12645] Fps is (10 sec: 37683.9, 60 sec: 40413.9, 300 sec: 40765.6). Total num frames: 321536000. Throughput: 0: 40833.9. Samples: 321628320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 00:14:46,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 00:14:48,085][12883] Updated weights for policy 0, policy_version 19630 (0.0036) +[2024-06-18 00:14:51,994][12645] Fps is (10 sec: 40969.3, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 321765376. Throughput: 0: 41046.7. Samples: 321879440. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 00:14:51,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:14:52,016][12883] Updated weights for policy 0, policy_version 19640 (0.0036) +[2024-06-18 00:14:56,061][12883] Updated weights for policy 0, policy_version 19650 (0.0046) +[2024-06-18 00:14:56,994][12645] Fps is (10 sec: 42597.7, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 321961984. Throughput: 0: 40913.3. Samples: 322120560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) +[2024-06-18 00:14:56,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 00:15:00,056][12883] Updated weights for policy 0, policy_version 19660 (0.0038) +[2024-06-18 00:15:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40960.1, 300 sec: 40876.7). Total num frames: 322158592. Throughput: 0: 40912.0. Samples: 322243620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) +[2024-06-18 00:15:01,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:15:04,254][12883] Updated weights for policy 0, policy_version 19670 (0.0034) +[2024-06-18 00:15:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 322371584. Throughput: 0: 40881.3. Samples: 322489080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) +[2024-06-18 00:15:06,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:15:08,537][12883] Updated weights for policy 0, policy_version 19680 (0.0044) +[2024-06-18 00:15:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40687.0, 300 sec: 40765.6). Total num frames: 322568192. Throughput: 0: 40843.7. Samples: 322731160. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 00:15:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:15:12,334][12883] Updated weights for policy 0, policy_version 19690 (0.0041) +[2024-06-18 00:15:16,597][12883] Updated weights for policy 0, policy_version 19700 (0.0034) +[2024-06-18 00:15:16,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40687.0, 300 sec: 40932.3). Total num frames: 322764800. Throughput: 0: 40793.1. Samples: 322851720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 00:15:16,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 00:15:20,390][12883] Updated weights for policy 0, policy_version 19710 (0.0029) +[2024-06-18 00:15:21,999][12645] Fps is (10 sec: 40939.5, 60 sec: 40957.7, 300 sec: 40820.5). Total num frames: 322977792. Throughput: 0: 40738.7. Samples: 323094060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 00:15:21,999][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:15:22,559][12862] Signal inference workers to stop experience collection... (4550 times) +[2024-06-18 00:15:22,559][12862] Signal inference workers to resume experience collection... (4550 times) +[2024-06-18 00:15:22,577][12883] InferenceWorker_p0-w0: stopping experience collection (4550 times) +[2024-06-18 00:15:22,577][12883] InferenceWorker_p0-w0: resuming experience collection (4550 times) +[2024-06-18 00:15:24,693][12883] Updated weights for policy 0, policy_version 19720 (0.0035) +[2024-06-18 00:15:26,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40686.9, 300 sec: 40821.1). Total num frames: 323174400. Throughput: 0: 40766.8. Samples: 323342400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 00:15:26,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:15:28,671][12883] Updated weights for policy 0, policy_version 19730 (0.0032) +[2024-06-18 00:15:31,994][12645] Fps is (10 sec: 42619.7, 60 sec: 41233.2, 300 sec: 40987.8). Total num frames: 323403776. 
Throughput: 0: 40734.2. Samples: 323461360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 00:15:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:15:32,640][12883] Updated weights for policy 0, policy_version 19740 (0.0030) +[2024-06-18 00:15:36,503][12883] Updated weights for policy 0, policy_version 19750 (0.0045) +[2024-06-18 00:15:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40414.0, 300 sec: 40765.8). Total num frames: 323584000. Throughput: 0: 40716.8. Samples: 323711700. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 00:15:36,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:15:40,887][12883] Updated weights for policy 0, policy_version 19760 (0.0039) +[2024-06-18 00:15:41,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40688.3, 300 sec: 40876.7). Total num frames: 323796992. Throughput: 0: 40662.7. Samples: 323950380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 00:15:41,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:15:44,585][12883] Updated weights for policy 0, policy_version 19770 (0.0043) +[2024-06-18 00:15:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40686.8, 300 sec: 40821.1). Total num frames: 323977216. Throughput: 0: 40530.5. Samples: 324067500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 00:15:46,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:15:49,092][12883] Updated weights for policy 0, policy_version 19780 (0.0029) +[2024-06-18 00:15:51,994][12645] Fps is (10 sec: 37683.8, 60 sec: 40140.8, 300 sec: 40543.5). Total num frames: 324173824. Throughput: 0: 40447.3. Samples: 324309200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 00:15:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:15:52,709][12883] Updated weights for policy 0, policy_version 19790 (0.0047) +[2024-06-18 00:15:56,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40414.0, 300 sec: 40765.9). Total num frames: 324386816. 
Throughput: 0: 40609.8. Samples: 324558600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 00:15:56,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:15:57,097][12883] Updated weights for policy 0, policy_version 19800 (0.0038) +[2024-06-18 00:16:01,076][12883] Updated weights for policy 0, policy_version 19810 (0.0040) +[2024-06-18 00:16:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 324616192. Throughput: 0: 40573.2. Samples: 324677520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 00:16:01,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:16:05,045][12883] Updated weights for policy 0, policy_version 19820 (0.0042) +[2024-06-18 00:16:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40687.0, 300 sec: 40765.6). Total num frames: 324812800. Throughput: 0: 40692.0. Samples: 324925000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 00:16:06,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 00:16:09,002][12883] Updated weights for policy 0, policy_version 19830 (0.0040) +[2024-06-18 00:16:11,997][12645] Fps is (10 sec: 37671.1, 60 sec: 40411.6, 300 sec: 40820.7). Total num frames: 324993024. Throughput: 0: 40545.1. Samples: 325167060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 00:16:11,998][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:16:12,978][12883] Updated weights for policy 0, policy_version 19840 (0.0051) +[2024-06-18 00:16:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40413.8, 300 sec: 40654.6). Total num frames: 325189632. Throughput: 0: 40523.1. Samples: 325284900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 00:16:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:16:17,286][12883] Updated weights for policy 0, policy_version 19850 (0.0042) +[2024-06-18 00:16:21,101][12883] Updated weights for policy 0, policy_version 19860 (0.0035) +[2024-06-18 00:16:21,994][12645] Fps is (10 sec: 44251.4, 60 sec: 40963.4, 300 sec: 40876.7). Total num frames: 325435392. Throughput: 0: 40458.2. Samples: 325532320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) +[2024-06-18 00:16:21,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:16:25,477][12883] Updated weights for policy 0, policy_version 19870 (0.0042) +[2024-06-18 00:16:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40687.0, 300 sec: 40876.7). Total num frames: 325615616. Throughput: 0: 40416.2. Samples: 325769100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) +[2024-06-18 00:16:26,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:16:29,230][12883] Updated weights for policy 0, policy_version 19880 (0.0047) +[2024-06-18 00:16:31,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40140.7, 300 sec: 40599.0). Total num frames: 325812224. Throughput: 0: 40447.6. Samples: 325887640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 00:16:31,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:16:33,483][12883] Updated weights for policy 0, policy_version 19890 (0.0034) +[2024-06-18 00:16:36,994][12645] Fps is (10 sec: 39320.8, 60 sec: 40413.8, 300 sec: 40765.6). Total num frames: 326008832. Throughput: 0: 40534.9. Samples: 326133280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 00:16:36,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:16:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019898_326008832.pth... 
+[2024-06-18 00:16:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019302_316243968.pth +[2024-06-18 00:16:37,605][12883] Updated weights for policy 0, policy_version 19900 (0.0028) +[2024-06-18 00:16:41,519][12883] Updated weights for policy 0, policy_version 19910 (0.0031) +[2024-06-18 00:16:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 326221824. Throughput: 0: 40448.8. Samples: 326378800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 00:16:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:16:45,499][12883] Updated weights for policy 0, policy_version 19920 (0.0038) +[2024-06-18 00:16:46,994][12645] Fps is (10 sec: 42599.2, 60 sec: 40960.1, 300 sec: 40710.1). Total num frames: 326434816. Throughput: 0: 40589.0. Samples: 326504020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 00:16:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:16:49,475][12883] Updated weights for policy 0, policy_version 19930 (0.0031) +[2024-06-18 00:16:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40686.8, 300 sec: 40765.6). Total num frames: 326615040. Throughput: 0: 40516.8. Samples: 326748260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 00:16:51,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:16:53,614][12883] Updated weights for policy 0, policy_version 19940 (0.0032) +[2024-06-18 00:16:56,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40686.8, 300 sec: 40710.1). Total num frames: 326828032. Throughput: 0: 40455.4. Samples: 326987420. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 00:16:56,996][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:16:57,505][12883] Updated weights for policy 0, policy_version 19950 (0.0034) +[2024-06-18 00:17:01,755][12883] Updated weights for policy 0, policy_version 19960 (0.0038) +[2024-06-18 00:17:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 327041024. Throughput: 0: 40601.7. Samples: 327111980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 00:17:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:17:05,825][12883] Updated weights for policy 0, policy_version 19970 (0.0033) +[2024-06-18 00:17:06,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40140.9, 300 sec: 40710.1). Total num frames: 327221248. Throughput: 0: 40473.9. Samples: 327353640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 00:17:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:17:09,585][12883] Updated weights for policy 0, policy_version 19980 (0.0042) +[2024-06-18 00:17:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40962.2, 300 sec: 40710.1). Total num frames: 327450624. Throughput: 0: 40558.6. Samples: 327594240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 00:17:11,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:17:14,010][12862] Signal inference workers to stop experience collection... (4600 times) +[2024-06-18 00:17:14,020][12862] Signal inference workers to resume experience collection... (4600 times) +[2024-06-18 00:17:14,024][12883] InferenceWorker_p0-w0: stopping experience collection (4600 times) +[2024-06-18 00:17:14,034][12883] InferenceWorker_p0-w0: resuming experience collection (4600 times) +[2024-06-18 00:17:14,169][12883] Updated weights for policy 0, policy_version 19990 (0.0030) +[2024-06-18 00:17:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 327647232. Throughput: 0: 40753.5. 
Samples: 327721540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 00:17:16,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:17:17,499][12883] Updated weights for policy 0, policy_version 20000 (0.0023) +[2024-06-18 00:17:21,989][12883] Updated weights for policy 0, policy_version 20010 (0.0038) +[2024-06-18 00:17:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40140.8, 300 sec: 40765.6). Total num frames: 327843840. Throughput: 0: 40761.5. Samples: 327967540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 00:17:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:17:25,748][12883] Updated weights for policy 0, policy_version 20020 (0.0039) +[2024-06-18 00:17:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 328073216. Throughput: 0: 40485.4. Samples: 328200640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 00:17:26,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:17:30,053][12883] Updated weights for policy 0, policy_version 20030 (0.0032) +[2024-06-18 00:17:31,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40140.9, 300 sec: 40654.5). Total num frames: 328220672. Throughput: 0: 40626.3. Samples: 328332200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 00:17:31,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:17:33,745][12883] Updated weights for policy 0, policy_version 20040 (0.0037) +[2024-06-18 00:17:36,994][12645] Fps is (10 sec: 37682.8, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 328450048. Throughput: 0: 40378.3. Samples: 328565280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 00:17:36,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 00:17:37,930][12883] Updated weights for policy 0, policy_version 20050 (0.0028) +[2024-06-18 00:17:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 40686.9, 300 sec: 40654.5). Total num frames: 328663040. Throughput: 0: 40734.3. 
Samples: 328820460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 00:17:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:17:41,995][12883] Updated weights for policy 0, policy_version 20060 (0.0033) +[2024-06-18 00:17:45,833][12883] Updated weights for policy 0, policy_version 20070 (0.0037) +[2024-06-18 00:17:46,996][12645] Fps is (10 sec: 39312.9, 60 sec: 40139.3, 300 sec: 40654.2). Total num frames: 328843264. Throughput: 0: 40491.8. Samples: 328934200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) +[2024-06-18 00:17:46,997][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:17:50,138][12883] Updated weights for policy 0, policy_version 20080 (0.0039) +[2024-06-18 00:17:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 40710.1). Total num frames: 329089024. Throughput: 0: 40506.1. Samples: 329176420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) +[2024-06-18 00:17:51,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 00:17:53,808][12883] Updated weights for policy 0, policy_version 20090 (0.0038) +[2024-06-18 00:17:56,994][12645] Fps is (10 sec: 39330.8, 60 sec: 40140.9, 300 sec: 40543.5). Total num frames: 329236480. Throughput: 0: 40798.8. Samples: 329430180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 00:17:56,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:17:58,229][12883] Updated weights for policy 0, policy_version 20100 (0.0033) +[2024-06-18 00:18:01,826][12883] Updated weights for policy 0, policy_version 20110 (0.0045) +[2024-06-18 00:18:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 329482240. Throughput: 0: 40428.9. Samples: 329540840. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 00:18:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:18:06,287][12883] Updated weights for policy 0, policy_version 20120 (0.0041) +[2024-06-18 00:18:06,994][12645] Fps is (10 sec: 45874.5, 60 sec: 41233.0, 300 sec: 40710.1). Total num frames: 329695232. Throughput: 0: 40574.1. Samples: 329793380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 00:18:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:18:09,953][12883] Updated weights for policy 0, policy_version 20130 (0.0042) +[2024-06-18 00:18:11,519][12862] Signal inference workers to stop experience collection... (4650 times) +[2024-06-18 00:18:11,519][12862] Signal inference workers to resume experience collection... (4650 times) +[2024-06-18 00:18:11,559][12883] InferenceWorker_p0-w0: stopping experience collection (4650 times) +[2024-06-18 00:18:11,559][12883] InferenceWorker_p0-w0: resuming experience collection (4650 times) +[2024-06-18 00:18:11,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40140.8, 300 sec: 40654.5). Total num frames: 329859072. Throughput: 0: 40879.0. Samples: 330040200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 00:18:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:18:14,337][12883] Updated weights for policy 0, policy_version 20140 (0.0039) +[2024-06-18 00:18:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 40821.2). Total num frames: 330104832. Throughput: 0: 40614.5. Samples: 330159860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 00:18:16,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:18:17,877][12883] Updated weights for policy 0, policy_version 20150 (0.0035) +[2024-06-18 00:18:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40413.9, 300 sec: 40599.9). Total num frames: 330268672. Throughput: 0: 40844.1. Samples: 330403260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 00:18:21,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 00:18:22,408][12883] Updated weights for policy 0, policy_version 20160 (0.0043) +[2024-06-18 00:18:25,706][12883] Updated weights for policy 0, policy_version 20170 (0.0038) +[2024-06-18 00:18:26,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40140.8, 300 sec: 40710.1). Total num frames: 330481664. Throughput: 0: 40705.0. Samples: 330652180. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) +[2024-06-18 00:18:26,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 00:18:30,374][12883] Updated weights for policy 0, policy_version 20180 (0.0053) +[2024-06-18 00:18:31,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41506.0, 300 sec: 40765.6). Total num frames: 330711040. Throughput: 0: 40934.0. Samples: 330776140. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) +[2024-06-18 00:18:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:18:33,473][12883] Updated weights for policy 0, policy_version 20190 (0.0036) +[2024-06-18 00:18:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40413.9, 300 sec: 40543.5). Total num frames: 330874880. Throughput: 0: 40908.5. Samples: 331017300. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) +[2024-06-18 00:18:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:18:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000020195_330874880.pth... +[2024-06-18 00:18:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019602_321159168.pth +[2024-06-18 00:18:38,548][12883] Updated weights for policy 0, policy_version 20200 (0.0042) +[2024-06-18 00:18:41,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40686.9, 300 sec: 40654.5). Total num frames: 331104256. Throughput: 0: 40534.1. Samples: 331254220. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) +[2024-06-18 00:18:41,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:18:42,040][12883] Updated weights for policy 0, policy_version 20210 (0.0047) +[2024-06-18 00:18:46,594][12883] Updated weights for policy 0, policy_version 20220 (0.0042) +[2024-06-18 00:18:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41234.6, 300 sec: 40710.1). Total num frames: 331317248. Throughput: 0: 40929.8. Samples: 331382680. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) +[2024-06-18 00:18:46,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:18:49,976][12883] Updated weights for policy 0, policy_version 20230 (0.0043) +[2024-06-18 00:18:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.8, 300 sec: 40654.5). Total num frames: 331497472. Throughput: 0: 40559.5. Samples: 331618560. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) +[2024-06-18 00:18:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:18:54,582][12883] Updated weights for policy 0, policy_version 20240 (0.0046) +[2024-06-18 00:18:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41233.0, 300 sec: 40710.1). Total num frames: 331710464. Throughput: 0: 40379.6. Samples: 331857280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) +[2024-06-18 00:18:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:18:58,577][12883] Updated weights for policy 0, policy_version 20250 (0.0040) +[2024-06-18 00:19:01,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.7, 300 sec: 40543.5). Total num frames: 331874304. Throughput: 0: 40541.4. Samples: 331984220. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) +[2024-06-18 00:19:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:19:02,983][12883] Updated weights for policy 0, policy_version 20260 (0.0043) +[2024-06-18 00:19:06,639][12883] Updated weights for policy 0, policy_version 20270 (0.0033) +[2024-06-18 00:19:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40140.8, 300 sec: 40599.0). Total num frames: 332103680. Throughput: 0: 40428.8. Samples: 332222560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 00:19:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:19:11,051][12883] Updated weights for policy 0, policy_version 20280 (0.0025) +[2024-06-18 00:19:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 40960.1, 300 sec: 40654.5). Total num frames: 332316672. Throughput: 0: 40313.3. Samples: 332466280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 00:19:11,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:19:14,673][12883] Updated weights for policy 0, policy_version 20290 (0.0041) +[2024-06-18 00:19:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40140.8, 300 sec: 40654.7). Total num frames: 332513280. Throughput: 0: 40364.0. Samples: 332592520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 00:19:16,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:19:18,863][12883] Updated weights for policy 0, policy_version 20300 (0.0026) +[2024-06-18 00:19:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40959.9, 300 sec: 40654.5). Total num frames: 332726272. Throughput: 0: 40407.9. Samples: 332835660. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 00:19:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:19:22,557][12883] Updated weights for policy 0, policy_version 20310 (0.0035) +[2024-06-18 00:19:26,890][12883] Updated weights for policy 0, policy_version 20320 (0.0040) +[2024-06-18 00:19:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40686.7, 300 sec: 40654.5). Total num frames: 332922880. Throughput: 0: 40757.2. Samples: 333088300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 00:19:26,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:19:30,470][12883] Updated weights for policy 0, policy_version 20330 (0.0042) +[2024-06-18 00:19:31,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40413.9, 300 sec: 40599.0). Total num frames: 333135872. Throughput: 0: 40516.5. Samples: 333205920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 00:19:31,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:19:34,774][12883] Updated weights for policy 0, policy_version 20340 (0.0022) +[2024-06-18 00:19:36,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41233.0, 300 sec: 40654.8). Total num frames: 333348864. Throughput: 0: 40762.2. Samples: 333452860. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 00:19:36,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:19:38,744][12883] Updated weights for policy 0, policy_version 20350 (0.0039) +[2024-06-18 00:19:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 333545472. Throughput: 0: 41028.4. Samples: 333703560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 00:19:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:19:42,818][12883] Updated weights for policy 0, policy_version 20360 (0.0033) +[2024-06-18 00:19:45,401][12862] Signal inference workers to stop experience collection... 
(4700 times) +[2024-06-18 00:19:45,401][12862] Signal inference workers to resume experience collection... (4700 times) +[2024-06-18 00:19:45,414][12883] InferenceWorker_p0-w0: stopping experience collection (4700 times) +[2024-06-18 00:19:45,414][12883] InferenceWorker_p0-w0: resuming experience collection (4700 times) +[2024-06-18 00:19:46,703][12883] Updated weights for policy 0, policy_version 20370 (0.0046) +[2024-06-18 00:19:47,000][12645] Fps is (10 sec: 39297.3, 60 sec: 40409.7, 300 sec: 40598.1). Total num frames: 333742080. Throughput: 0: 40785.5. Samples: 333819820. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 00:19:47,000][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:19:50,756][12883] Updated weights for policy 0, policy_version 20380 (0.0037) +[2024-06-18 00:19:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40960.1, 300 sec: 40654.6). Total num frames: 333955072. Throughput: 0: 40997.0. Samples: 334067420. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-18 00:19:51,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:19:54,745][12883] Updated weights for policy 0, policy_version 20390 (0.0048) +[2024-06-18 00:19:57,000][12645] Fps is (10 sec: 40959.8, 60 sec: 40682.7, 300 sec: 40653.7). Total num frames: 334151680. Throughput: 0: 40921.8. Samples: 334308020. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-18 00:19:57,000][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 00:19:58,725][12883] Updated weights for policy 0, policy_version 20400 (0.0048) +[2024-06-18 00:20:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 40654.6). Total num frames: 334364672. Throughput: 0: 40908.1. Samples: 334433380. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-18 00:20:01,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 00:20:02,602][12883] Updated weights for policy 0, policy_version 20410 (0.0049) +[2024-06-18 00:20:06,849][12883] Updated weights for policy 0, policy_version 20420 (0.0040) +[2024-06-18 00:20:06,994][12645] Fps is (10 sec: 40985.7, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 334561280. Throughput: 0: 41015.6. Samples: 334681360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 00:20:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:20:11,045][12883] Updated weights for policy 0, policy_version 20430 (0.0035) +[2024-06-18 00:20:11,994][12645] Fps is (10 sec: 39320.8, 60 sec: 40686.8, 300 sec: 40654.5). Total num frames: 334757888. Throughput: 0: 40795.6. Samples: 334924100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 00:20:11,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:20:14,978][12883] Updated weights for policy 0, policy_version 20440 (0.0028) +[2024-06-18 00:20:16,996][12645] Fps is (10 sec: 42588.9, 60 sec: 41231.6, 300 sec: 40710.4). Total num frames: 334987264. Throughput: 0: 40925.5. Samples: 335047660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 00:20:16,997][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:20:18,937][12883] Updated weights for policy 0, policy_version 20450 (0.0035) +[2024-06-18 00:20:21,994][12645] Fps is (10 sec: 42595.7, 60 sec: 40959.5, 300 sec: 40710.0). Total num frames: 335183872. Throughput: 0: 40842.0. Samples: 335290780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 00:20:21,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:20:23,033][12883] Updated weights for policy 0, policy_version 20460 (0.0034) +[2024-06-18 00:20:26,994][12645] Fps is (10 sec: 37692.0, 60 sec: 40687.1, 300 sec: 40543.5). Total num frames: 335364096. Throughput: 0: 40834.4. Samples: 335541100. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 00:20:26,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:20:27,103][12883] Updated weights for policy 0, policy_version 20470 (0.0037) +[2024-06-18 00:20:31,082][12883] Updated weights for policy 0, policy_version 20480 (0.0034) +[2024-06-18 00:20:31,994][12645] Fps is (10 sec: 40963.1, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 335593472. Throughput: 0: 40999.0. Samples: 335664520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 00:20:31,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:20:35,070][12883] Updated weights for policy 0, policy_version 20490 (0.0034) +[2024-06-18 00:20:36,994][12645] Fps is (10 sec: 44236.2, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 335806464. Throughput: 0: 40954.5. Samples: 335910380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 00:20:36,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:20:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000020496_335806464.pth... +[2024-06-18 00:20:37,072][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019898_326008832.pth +[2024-06-18 00:20:38,909][12883] Updated weights for policy 0, policy_version 20500 (0.0043) +[2024-06-18 00:20:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 335986688. Throughput: 0: 41070.2. Samples: 336155920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 00:20:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:20:42,981][12883] Updated weights for policy 0, policy_version 20510 (0.0038) +[2024-06-18 00:20:46,942][12883] Updated weights for policy 0, policy_version 20520 (0.0031) +[2024-06-18 00:20:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40964.2, 300 sec: 40765.6). Total num frames: 336199680. Throughput: 0: 40967.4. Samples: 336276920. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 00:20:47,000][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:20:50,857][12883] Updated weights for policy 0, policy_version 20530 (0.0042) +[2024-06-18 00:20:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40960.0, 300 sec: 40765.6). Total num frames: 336412672. Throughput: 0: 41102.3. Samples: 336530960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 00:20:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:20:54,878][12883] Updated weights for policy 0, policy_version 20540 (0.0035) +[2024-06-18 00:20:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40964.3, 300 sec: 40654.6). Total num frames: 336609280. Throughput: 0: 40952.6. Samples: 336766960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 00:20:56,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:20:59,046][12883] Updated weights for policy 0, policy_version 20550 (0.0031) +[2024-06-18 00:21:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 336822272. Throughput: 0: 40846.1. Samples: 336885640. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) +[2024-06-18 00:21:01,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:21:02,847][12883] Updated weights for policy 0, policy_version 20560 (0.0037) +[2024-06-18 00:21:06,962][12883] Updated weights for policy 0, policy_version 20570 (0.0033) +[2024-06-18 00:21:06,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40959.9, 300 sec: 40766.1). Total num frames: 337018880. Throughput: 0: 40909.9. Samples: 337131700. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) +[2024-06-18 00:21:06,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:21:10,629][12883] Updated weights for policy 0, policy_version 20580 (0.0034) +[2024-06-18 00:21:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40960.1, 300 sec: 40765.6). Total num frames: 337215488. Throughput: 0: 40764.3. Samples: 337375500. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) +[2024-06-18 00:21:11,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:21:15,175][12883] Updated weights for policy 0, policy_version 20590 (0.0049) +[2024-06-18 00:21:17,000][12645] Fps is (10 sec: 40934.7, 60 sec: 40684.2, 300 sec: 40653.7). Total num frames: 337428480. Throughput: 0: 40752.1. Samples: 337498620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 00:21:17,001][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:21:18,994][12883] Updated weights for policy 0, policy_version 20600 (0.0046) +[2024-06-18 00:21:21,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40414.3, 300 sec: 40654.5). Total num frames: 337608704. Throughput: 0: 40716.8. Samples: 337742640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 00:21:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:21:23,177][12883] Updated weights for policy 0, policy_version 20610 (0.0030) +[2024-06-18 00:21:26,994][12645] Fps is (10 sec: 39346.7, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 337821696. Throughput: 0: 40709.4. Samples: 337987840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 00:21:26,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:21:27,025][12883] Updated weights for policy 0, policy_version 20620 (0.0042) +[2024-06-18 00:21:30,509][12862] Signal inference workers to stop experience collection... (4750 times) +[2024-06-18 00:21:30,509][12862] Signal inference workers to resume experience collection... (4750 times) +[2024-06-18 00:21:30,545][12883] InferenceWorker_p0-w0: stopping experience collection (4750 times) +[2024-06-18 00:21:30,545][12883] InferenceWorker_p0-w0: resuming experience collection (4750 times) +[2024-06-18 00:21:31,313][12883] Updated weights for policy 0, policy_version 20630 (0.0035) +[2024-06-18 00:21:31,998][12645] Fps is (10 sec: 44217.4, 60 sec: 40956.9, 300 sec: 40820.5). Total num frames: 338051072. 
Throughput: 0: 40792.4. Samples: 338112760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 00:21:31,999][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:21:34,919][12883] Updated weights for policy 0, policy_version 20640 (0.0034) +[2024-06-18 00:21:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 338231296. Throughput: 0: 40530.2. Samples: 338354820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 00:21:36,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:21:39,174][12883] Updated weights for policy 0, policy_version 20650 (0.0037) +[2024-06-18 00:21:41,994][12645] Fps is (10 sec: 39339.9, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 338444288. Throughput: 0: 40698.7. Samples: 338598400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 00:21:41,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:21:43,047][12883] Updated weights for policy 0, policy_version 20660 (0.0030) +[2024-06-18 00:21:46,983][12883] Updated weights for policy 0, policy_version 20670 (0.0035) +[2024-06-18 00:21:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 338657280. Throughput: 0: 40933.3. Samples: 338727640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 00:21:46,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:21:51,101][12883] Updated weights for policy 0, policy_version 20680 (0.0043) +[2024-06-18 00:21:51,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40686.8, 300 sec: 40765.6). Total num frames: 338853888. Throughput: 0: 40915.5. Samples: 338972900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 00:21:51,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:21:55,000][12883] Updated weights for policy 0, policy_version 20690 (0.0036) +[2024-06-18 00:21:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 339050496. 
Throughput: 0: 40838.8. Samples: 339213240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) +[2024-06-18 00:21:56,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:21:58,961][12883] Updated weights for policy 0, policy_version 20700 (0.0042) +[2024-06-18 00:22:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40140.7, 300 sec: 40710.1). Total num frames: 339230720. Throughput: 0: 40831.4. Samples: 339335780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) +[2024-06-18 00:22:01,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 00:22:02,799][12883] Updated weights for policy 0, policy_version 20710 (0.0039) +[2024-06-18 00:22:06,983][12883] Updated weights for policy 0, policy_version 20720 (0.0037) +[2024-06-18 00:22:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40960.1, 300 sec: 40765.6). Total num frames: 339476480. Throughput: 0: 40925.1. Samples: 339584260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) +[2024-06-18 00:22:06,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:22:10,787][12883] Updated weights for policy 0, policy_version 20730 (0.0036) +[2024-06-18 00:22:11,994][12645] Fps is (10 sec: 45875.1, 60 sec: 41233.0, 300 sec: 40821.1). Total num frames: 339689472. Throughput: 0: 40797.2. Samples: 339823720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 00:22:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:22:15,002][12883] Updated weights for policy 0, policy_version 20740 (0.0028) +[2024-06-18 00:22:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40691.2, 300 sec: 40765.6). Total num frames: 339869696. Throughput: 0: 40916.2. Samples: 339953800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 00:22:16,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:22:18,722][12883] Updated weights for policy 0, policy_version 20750 (0.0040) +[2024-06-18 00:22:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40960.2, 300 sec: 40654.5). Total num frames: 340066304. 
Throughput: 0: 40844.4. Samples: 340192820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 00:22:21,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 00:22:23,591][12883] Updated weights for policy 0, policy_version 20760 (0.0034) +[2024-06-18 00:22:26,669][12883] Updated weights for policy 0, policy_version 20770 (0.0036) +[2024-06-18 00:22:26,996][12645] Fps is (10 sec: 44226.8, 60 sec: 41504.5, 300 sec: 40987.4). Total num frames: 340312064. Throughput: 0: 40972.6. Samples: 340442260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 00:22:26,996][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:22:31,429][12883] Updated weights for policy 0, policy_version 20780 (0.0039) +[2024-06-18 00:22:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40690.0, 300 sec: 40821.1). Total num frames: 340492288. Throughput: 0: 40886.1. Samples: 340567520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 00:22:31,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:22:34,677][12883] Updated weights for policy 0, policy_version 20790 (0.0042) +[2024-06-18 00:22:36,994][12645] Fps is (10 sec: 39330.0, 60 sec: 41233.0, 300 sec: 40821.1). Total num frames: 340705280. Throughput: 0: 40839.1. Samples: 340810660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 00:22:36,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:22:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000020795_340705280.pth... +[2024-06-18 00:22:37,070][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000020195_330874880.pth +[2024-06-18 00:22:39,419][12883] Updated weights for policy 0, policy_version 20800 (0.0029) +[2024-06-18 00:22:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40686.8, 300 sec: 40821.5). Total num frames: 340885504. Throughput: 0: 40947.9. Samples: 341055900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 00:22:41,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:22:42,548][12862] Signal inference workers to stop experience collection... (4800 times) +[2024-06-18 00:22:42,596][12883] InferenceWorker_p0-w0: stopping experience collection (4800 times) +[2024-06-18 00:22:42,667][12862] Signal inference workers to resume experience collection... (4800 times) +[2024-06-18 00:22:42,668][12883] InferenceWorker_p0-w0: resuming experience collection (4800 times) +[2024-06-18 00:22:42,799][12883] Updated weights for policy 0, policy_version 20810 (0.0044) +[2024-06-18 00:22:46,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40413.8, 300 sec: 40654.5). Total num frames: 341082112. Throughput: 0: 40770.6. Samples: 341170460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 00:22:46,995][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:22:47,530][12883] Updated weights for policy 0, policy_version 20820 (0.0043) +[2024-06-18 00:22:51,209][12883] Updated weights for policy 0, policy_version 20830 (0.0031) +[2024-06-18 00:22:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.1, 300 sec: 40932.2). Total num frames: 341311488. Throughput: 0: 40924.4. Samples: 341425860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 00:22:51,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:22:55,336][12883] Updated weights for policy 0, policy_version 20840 (0.0035) +[2024-06-18 00:22:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40959.9, 300 sec: 40765.6). Total num frames: 341508096. Throughput: 0: 40814.2. Samples: 341660360. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) +[2024-06-18 00:22:56,998][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:22:59,443][12883] Updated weights for policy 0, policy_version 20850 (0.0042) +[2024-06-18 00:23:01,994][12645] Fps is (10 sec: 37683.0, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 341688320. 
Throughput: 0: 40623.9. Samples: 341781880. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) +[2024-06-18 00:23:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:23:03,852][12883] Updated weights for policy 0, policy_version 20860 (0.0026) +[2024-06-18 00:23:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40413.8, 300 sec: 40821.2). Total num frames: 341901312. Throughput: 0: 40593.2. Samples: 342019520. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) +[2024-06-18 00:23:06,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:23:07,373][12883] Updated weights for policy 0, policy_version 20870 (0.0040) +[2024-06-18 00:23:11,865][12883] Updated weights for policy 0, policy_version 20880 (0.0036) +[2024-06-18 00:23:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40140.8, 300 sec: 40654.5). Total num frames: 342097920. Throughput: 0: 40515.8. Samples: 342265380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) +[2024-06-18 00:23:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:23:15,301][12883] Updated weights for policy 0, policy_version 20890 (0.0039) +[2024-06-18 00:23:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 342327296. Throughput: 0: 40424.4. Samples: 342386620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) +[2024-06-18 00:23:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:23:19,755][12883] Updated weights for policy 0, policy_version 20900 (0.0030) +[2024-06-18 00:23:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 40821.1). Total num frames: 342523904. Throughput: 0: 40649.5. Samples: 342639880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) +[2024-06-18 00:23:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:23:23,125][12883] Updated weights for policy 0, policy_version 20910 (0.0045) +[2024-06-18 00:23:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40142.2, 300 sec: 40710.1). Total num frames: 342720512. 
Throughput: 0: 40635.1. Samples: 342884480. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) +[2024-06-18 00:23:26,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:23:27,696][12883] Updated weights for policy 0, policy_version 20920 (0.0040) +[2024-06-18 00:23:31,144][12883] Updated weights for policy 0, policy_version 20930 (0.0033) +[2024-06-18 00:23:31,996][12645] Fps is (10 sec: 40950.5, 60 sec: 40685.5, 300 sec: 40876.4). Total num frames: 342933504. Throughput: 0: 40715.3. Samples: 343002740. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) +[2024-06-18 00:23:31,997][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:23:35,553][12883] Updated weights for policy 0, policy_version 20940 (0.0040) +[2024-06-18 00:23:36,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40414.0, 300 sec: 40765.6). Total num frames: 343130112. Throughput: 0: 40434.3. Samples: 343245400. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) +[2024-06-18 00:23:36,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:23:39,167][12883] Updated weights for policy 0, policy_version 20950 (0.0029) +[2024-06-18 00:23:41,994][12645] Fps is (10 sec: 39330.5, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 343326720. Throughput: 0: 40797.4. Samples: 343496240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) +[2024-06-18 00:23:41,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:23:43,491][12883] Updated weights for policy 0, policy_version 20960 (0.0033) +[2024-06-18 00:23:47,000][12645] Fps is (10 sec: 42571.6, 60 sec: 41228.8, 300 sec: 40875.8). Total num frames: 343556096. Throughput: 0: 40852.6. Samples: 343620500. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) +[2024-06-18 00:23:47,001][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:23:47,178][12883] Updated weights for policy 0, policy_version 20970 (0.0037) +[2024-06-18 00:23:51,426][12883] Updated weights for policy 0, policy_version 20980 (0.0025) +[2024-06-18 00:23:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40413.9, 300 sec: 40765.6). Total num frames: 343736320. Throughput: 0: 41070.4. Samples: 343867680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) +[2024-06-18 00:23:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:23:55,170][12883] Updated weights for policy 0, policy_version 20990 (0.0052) +[2024-06-18 00:23:56,994][12645] Fps is (10 sec: 39346.0, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 343949312. Throughput: 0: 40846.6. Samples: 344103480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 00:23:56,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:23:59,525][12883] Updated weights for policy 0, policy_version 21000 (0.0049) +[2024-06-18 00:24:01,994][12645] Fps is (10 sec: 40959.1, 60 sec: 40960.0, 300 sec: 40821.1). Total num frames: 344145920. Throughput: 0: 40949.8. Samples: 344229360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 00:24:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:24:03,201][12883] Updated weights for policy 0, policy_version 21010 (0.0029) +[2024-06-18 00:24:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40960.0, 300 sec: 40821.1). Total num frames: 344358912. Throughput: 0: 40877.2. Samples: 344479360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 00:24:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:24:07,779][12883] Updated weights for policy 0, policy_version 21020 (0.0032) +[2024-06-18 00:24:11,135][12883] Updated weights for policy 0, policy_version 21030 (0.0045) +[2024-06-18 00:24:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41233.0, 300 sec: 40876.7). Total num frames: 344571904. Throughput: 0: 40834.7. Samples: 344722040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 00:24:11,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:24:15,673][12883] Updated weights for policy 0, policy_version 21040 (0.0025) +[2024-06-18 00:24:16,363][12862] Signal inference workers to stop experience collection... (4850 times) +[2024-06-18 00:24:16,408][12883] InferenceWorker_p0-w0: stopping experience collection (4850 times) +[2024-06-18 00:24:16,412][12862] Signal inference workers to resume experience collection... (4850 times) +[2024-06-18 00:24:16,427][12883] InferenceWorker_p0-w0: resuming experience collection (4850 times) +[2024-06-18 00:24:16,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40413.8, 300 sec: 40765.6). Total num frames: 344752128. Throughput: 0: 40919.7. Samples: 344844040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 00:24:16,995][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:24:19,447][12883] Updated weights for policy 0, policy_version 21050 (0.0039) +[2024-06-18 00:24:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 40821.2). Total num frames: 344965120. Throughput: 0: 41005.7. Samples: 345090660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:24:21,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:24:23,375][12883] Updated weights for policy 0, policy_version 21060 (0.0035) +[2024-06-18 00:24:26,994][12645] Fps is (10 sec: 42599.5, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 345178112. 
Throughput: 0: 40856.5. Samples: 345334780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:24:26,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:24:27,199][12883] Updated weights for policy 0, policy_version 21070 (0.0029) +[2024-06-18 00:24:31,430][12883] Updated weights for policy 0, policy_version 21080 (0.0032) +[2024-06-18 00:24:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40961.5, 300 sec: 40821.1). Total num frames: 345391104. Throughput: 0: 40982.5. Samples: 345464460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:24:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:24:35,353][12883] Updated weights for policy 0, policy_version 21090 (0.0026) +[2024-06-18 00:24:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 345587712. Throughput: 0: 40746.5. Samples: 345701280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 00:24:36,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-18 00:24:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021093_345587712.pth... +[2024-06-18 00:24:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000020496_335806464.pth +[2024-06-18 00:24:39,543][12883] Updated weights for policy 0, policy_version 21100 (0.0029) +[2024-06-18 00:24:41,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40960.1, 300 sec: 40822.0). Total num frames: 345784320. Throughput: 0: 41065.4. Samples: 345951420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 00:24:41,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:24:43,399][12883] Updated weights for policy 0, policy_version 21110 (0.0043) +[2024-06-18 00:24:46,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40418.1, 300 sec: 40765.6). Total num frames: 345980928. Throughput: 0: 40923.3. Samples: 346070900. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 00:24:46,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:24:47,428][12883] Updated weights for policy 0, policy_version 21120 (0.0032) +[2024-06-18 00:24:51,218][12883] Updated weights for policy 0, policy_version 21130 (0.0025) +[2024-06-18 00:24:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40959.9, 300 sec: 40822.0). Total num frames: 346193920. Throughput: 0: 40725.8. Samples: 346312020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:24:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:24:56,342][12883] Updated weights for policy 0, policy_version 21140 (0.0038) +[2024-06-18 00:24:56,994][12645] Fps is (10 sec: 39318.3, 60 sec: 40413.4, 300 sec: 40710.0). Total num frames: 346374144. Throughput: 0: 40831.8. Samples: 346559500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:24:56,995][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:24:59,453][12883] Updated weights for policy 0, policy_version 21150 (0.0035) +[2024-06-18 00:25:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 346603520. Throughput: 0: 40652.6. Samples: 346673400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:25:01,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:25:04,303][12883] Updated weights for policy 0, policy_version 21160 (0.0028) +[2024-06-18 00:25:07,000][12645] Fps is (10 sec: 44212.8, 60 sec: 40955.8, 300 sec: 40875.9). Total num frames: 346816512. Throughput: 0: 40808.2. Samples: 346927280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 00:25:07,000][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:25:07,718][12883] Updated weights for policy 0, policy_version 21170 (0.0029) +[2024-06-18 00:25:11,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40414.0, 300 sec: 40710.4). Total num frames: 346996736. Throughput: 0: 40802.3. Samples: 347170880. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 00:25:11,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:25:12,111][12883] Updated weights for policy 0, policy_version 21180 (0.0034) +[2024-06-18 00:25:16,000][12883] Updated weights for policy 0, policy_version 21190 (0.0036) +[2024-06-18 00:25:16,994][12645] Fps is (10 sec: 39345.5, 60 sec: 40960.0, 300 sec: 40765.7). Total num frames: 347209728. Throughput: 0: 40568.0. Samples: 347290020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 00:25:16,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:25:19,898][12883] Updated weights for policy 0, policy_version 21200 (0.0046) +[2024-06-18 00:25:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 347406336. Throughput: 0: 40740.6. Samples: 347534600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 00:25:21,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:25:23,921][12883] Updated weights for policy 0, policy_version 21210 (0.0038) +[2024-06-18 00:25:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 347619328. Throughput: 0: 40576.8. Samples: 347777380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 00:25:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:25:27,906][12883] Updated weights for policy 0, policy_version 21220 (0.0029) +[2024-06-18 00:25:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 347815936. Throughput: 0: 40622.1. Samples: 347898900. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 00:25:31,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:25:32,175][12883] Updated weights for policy 0, policy_version 21230 (0.0042) +[2024-06-18 00:25:35,995][12883] Updated weights for policy 0, policy_version 21240 (0.0038) +[2024-06-18 00:25:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40413.9, 300 sec: 40765.6). Total num frames: 348012544. Throughput: 0: 40644.5. Samples: 348141020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 00:25:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:25:40,129][12883] Updated weights for policy 0, policy_version 21250 (0.0044) +[2024-06-18 00:25:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40959.9, 300 sec: 40821.2). Total num frames: 348241920. Throughput: 0: 40462.0. Samples: 348380260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 00:25:41,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:25:44,078][12883] Updated weights for policy 0, policy_version 21260 (0.0038) +[2024-06-18 00:25:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40413.8, 300 sec: 40654.5). Total num frames: 348405760. Throughput: 0: 40770.7. Samples: 348508080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 00:25:46,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:25:48,250][12883] Updated weights for policy 0, policy_version 21270 (0.0040) +[2024-06-18 00:25:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 348635136. Throughput: 0: 40444.6. Samples: 348747040. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) +[2024-06-18 00:25:51,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:25:52,297][12883] Updated weights for policy 0, policy_version 21280 (0.0039) +[2024-06-18 00:25:56,178][12883] Updated weights for policy 0, policy_version 21290 (0.0047) +[2024-06-18 00:25:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40960.5, 300 sec: 40710.1). Total num frames: 348831744. Throughput: 0: 40599.0. Samples: 348997840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) +[2024-06-18 00:25:56,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:26:00,238][12883] Updated weights for policy 0, policy_version 21300 (0.0048) +[2024-06-18 00:26:01,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 349028352. Throughput: 0: 40627.7. Samples: 349118260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) +[2024-06-18 00:26:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:26:03,566][12862] Signal inference workers to stop experience collection... (4900 times) +[2024-06-18 00:26:03,607][12883] InferenceWorker_p0-w0: stopping experience collection (4900 times) +[2024-06-18 00:26:03,612][12862] Signal inference workers to resume experience collection... (4900 times) +[2024-06-18 00:26:03,626][12883] InferenceWorker_p0-w0: resuming experience collection (4900 times) +[2024-06-18 00:26:04,365][12883] Updated weights for policy 0, policy_version 21310 (0.0039) +[2024-06-18 00:26:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40418.0, 300 sec: 40765.6). Total num frames: 349241344. Throughput: 0: 40496.4. Samples: 349356940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 00:26:06,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:26:08,212][12883] Updated weights for policy 0, policy_version 21320 (0.0043) +[2024-06-18 00:26:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40686.8, 300 sec: 40710.9). Total num frames: 349437952. Throughput: 0: 40737.7. 
Samples: 349610580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 00:26:11,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:26:12,496][12883] Updated weights for policy 0, policy_version 21330 (0.0028) +[2024-06-18 00:26:16,207][12883] Updated weights for policy 0, policy_version 21340 (0.0034) +[2024-06-18 00:26:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 349650944. Throughput: 0: 40681.8. Samples: 349729580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 00:26:16,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:26:20,292][12883] Updated weights for policy 0, policy_version 21350 (0.0048) +[2024-06-18 00:26:21,994][12645] Fps is (10 sec: 44237.5, 60 sec: 41233.1, 300 sec: 40876.7). Total num frames: 349880320. Throughput: 0: 40736.5. Samples: 349974160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 00:26:21,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:26:24,094][12883] Updated weights for policy 0, policy_version 21360 (0.0047) +[2024-06-18 00:26:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40413.9, 300 sec: 40655.2). Total num frames: 350044160. Throughput: 0: 41031.2. Samples: 350226660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 00:26:26,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:26:28,250][12883] Updated weights for policy 0, policy_version 21370 (0.0046) +[2024-06-18 00:26:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 350273536. Throughput: 0: 40764.0. Samples: 350342460. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 00:26:31,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:26:32,088][12883] Updated weights for policy 0, policy_version 21380 (0.0036) +[2024-06-18 00:26:36,086][12883] Updated weights for policy 0, policy_version 21390 (0.0031) +[2024-06-18 00:26:36,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40959.9, 300 sec: 40765.6). Total num frames: 350470144. Throughput: 0: 40971.1. Samples: 350590740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 00:26:36,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:26:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021392_350486528.pth... +[2024-06-18 00:26:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000020795_340705280.pth +[2024-06-18 00:26:40,551][12883] Updated weights for policy 0, policy_version 21400 (0.0031) +[2024-06-18 00:26:41,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40413.8, 300 sec: 40710.1). Total num frames: 350666752. Throughput: 0: 40906.1. Samples: 350838620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 00:26:41,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:26:43,904][12883] Updated weights for policy 0, policy_version 21410 (0.0034) +[2024-06-18 00:26:46,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41779.2, 300 sec: 40876.7). Total num frames: 350912512. Throughput: 0: 40882.2. Samples: 350957960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 00:26:46,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:26:49,283][12883] Updated weights for policy 0, policy_version 21420 (0.0049) +[2024-06-18 00:26:51,944][12883] Updated weights for policy 0, policy_version 21430 (0.0034) +[2024-06-18 00:26:51,994][12645] Fps is (10 sec: 44237.7, 60 sec: 41233.2, 300 sec: 40876.7). Total num frames: 351109120. Throughput: 0: 40949.0. Samples: 351199640. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 00:26:51,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:26:56,994][12645] Fps is (10 sec: 34406.5, 60 sec: 40413.9, 300 sec: 40765.6). Total num frames: 351256576. Throughput: 0: 40719.7. Samples: 351442960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 00:26:56,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:26:57,061][12883] Updated weights for policy 0, policy_version 21440 (0.0033) +[2024-06-18 00:26:59,873][12883] Updated weights for policy 0, policy_version 21450 (0.0038) +[2024-06-18 00:27:01,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 351485952. Throughput: 0: 40647.1. Samples: 351558700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 00:27:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:27:05,003][12883] Updated weights for policy 0, policy_version 21460 (0.0024) +[2024-06-18 00:27:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40687.0, 300 sec: 40654.6). Total num frames: 351682560. Throughput: 0: 40826.2. Samples: 351811340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 00:27:06,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:27:08,171][12883] Updated weights for policy 0, policy_version 21470 (0.0039) +[2024-06-18 00:27:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 351879168. Throughput: 0: 40507.9. Samples: 352049520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 00:27:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:27:13,033][12883] Updated weights for policy 0, policy_version 21480 (0.0047) +[2024-06-18 00:27:14,668][12862] Signal inference workers to stop experience collection... 
(4950 times) +[2024-06-18 00:27:14,703][12883] InferenceWorker_p0-w0: stopping experience collection (4950 times) +[2024-06-18 00:27:14,785][12862] Signal inference workers to resume experience collection... (4950 times) +[2024-06-18 00:27:14,785][12883] InferenceWorker_p0-w0: resuming experience collection (4950 times) +[2024-06-18 00:27:16,111][12883] Updated weights for policy 0, policy_version 21490 (0.0032) +[2024-06-18 00:27:16,996][12645] Fps is (10 sec: 42588.4, 60 sec: 40958.5, 300 sec: 40820.8). Total num frames: 352108544. Throughput: 0: 40673.9. Samples: 352172880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) +[2024-06-18 00:27:16,997][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:27:20,897][12883] Updated weights for policy 0, policy_version 21500 (0.0034) +[2024-06-18 00:27:21,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39867.7, 300 sec: 40543.8). Total num frames: 352272384. Throughput: 0: 40680.6. Samples: 352421360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) +[2024-06-18 00:27:21,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:27:24,417][12883] Updated weights for policy 0, policy_version 21510 (0.0042) +[2024-06-18 00:27:26,994][12645] Fps is (10 sec: 39330.6, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 352501760. Throughput: 0: 40462.8. Samples: 352659440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) +[2024-06-18 00:27:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:27:29,224][12883] Updated weights for policy 0, policy_version 21520 (0.0047) +[2024-06-18 00:27:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 352714752. Throughput: 0: 40532.5. Samples: 352781920. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 00:27:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:27:32,437][12883] Updated weights for policy 0, policy_version 21530 (0.0052) +[2024-06-18 00:27:36,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 352894976. Throughput: 0: 40539.4. Samples: 353023920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 00:27:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:27:37,250][12883] Updated weights for policy 0, policy_version 21540 (0.0038) +[2024-06-18 00:27:40,584][12883] Updated weights for policy 0, policy_version 21550 (0.0040) +[2024-06-18 00:27:41,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40414.0, 300 sec: 40710.1). Total num frames: 353091584. Throughput: 0: 40496.5. Samples: 353265300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 00:27:41,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:27:45,419][12883] Updated weights for policy 0, policy_version 21560 (0.0042) +[2024-06-18 00:27:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39867.6, 300 sec: 40654.5). Total num frames: 353304576. Throughput: 0: 40703.5. Samples: 353390360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:27:46,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:27:48,970][12883] Updated weights for policy 0, policy_version 21570 (0.0033) +[2024-06-18 00:27:51,994][12645] Fps is (10 sec: 40958.9, 60 sec: 39867.6, 300 sec: 40654.5). Total num frames: 353501184. Throughput: 0: 40359.7. Samples: 353627540. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:27:51,995][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:27:53,403][12883] Updated weights for policy 0, policy_version 21580 (0.0030) +[2024-06-18 00:27:56,847][12883] Updated weights for policy 0, policy_version 21590 (0.0035) +[2024-06-18 00:27:56,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41233.1, 300 sec: 40821.2). Total num frames: 353730560. Throughput: 0: 40500.1. Samples: 353872020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:27:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:28:01,292][12883] Updated weights for policy 0, policy_version 21600 (0.0032) +[2024-06-18 00:28:01,996][12645] Fps is (10 sec: 40951.6, 60 sec: 40412.4, 300 sec: 40709.8). Total num frames: 353910784. Throughput: 0: 40483.6. Samples: 353994640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 00:28:01,996][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:28:04,677][12883] Updated weights for policy 0, policy_version 21610 (0.0050) +[2024-06-18 00:28:06,994][12645] Fps is (10 sec: 37682.8, 60 sec: 40413.7, 300 sec: 40710.1). Total num frames: 354107392. Throughput: 0: 40424.8. Samples: 354240480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 00:28:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:28:09,313][12883] Updated weights for policy 0, policy_version 21620 (0.0052) +[2024-06-18 00:28:11,994][12645] Fps is (10 sec: 42607.4, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 354336768. Throughput: 0: 40575.4. Samples: 354485340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 00:28:11,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-18 00:28:12,997][12883] Updated weights for policy 0, policy_version 21630 (0.0054) +[2024-06-18 00:28:16,996][12645] Fps is (10 sec: 40952.6, 60 sec: 40141.0, 300 sec: 40654.3). Total num frames: 354516992. Throughput: 0: 40517.4. Samples: 354605280. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 00:28:16,996][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:28:17,503][12883] Updated weights for policy 0, policy_version 21640 (0.0038) +[2024-06-18 00:28:20,792][12883] Updated weights for policy 0, policy_version 21650 (0.0037) +[2024-06-18 00:28:21,998][12645] Fps is (10 sec: 40941.5, 60 sec: 41229.9, 300 sec: 40765.0). Total num frames: 354746368. Throughput: 0: 40622.1. Samples: 354852100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 00:28:21,999][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:28:25,667][12883] Updated weights for policy 0, policy_version 21660 (0.0037) +[2024-06-18 00:28:26,994][12645] Fps is (10 sec: 40967.8, 60 sec: 40413.9, 300 sec: 40654.8). Total num frames: 354926592. Throughput: 0: 40739.5. Samples: 355098580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 00:28:26,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:28:28,660][12883] Updated weights for policy 0, policy_version 21670 (0.0038) +[2024-06-18 00:28:31,996][12645] Fps is (10 sec: 39330.9, 60 sec: 40412.3, 300 sec: 40709.8). Total num frames: 355139584. Throughput: 0: 40690.1. Samples: 355221500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 00:28:31,996][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:28:33,738][12883] Updated weights for policy 0, policy_version 21680 (0.0045) +[2024-06-18 00:28:36,870][12883] Updated weights for policy 0, policy_version 21690 (0.0039) +[2024-06-18 00:28:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41233.1, 300 sec: 40821.2). Total num frames: 355368960. Throughput: 0: 40936.2. Samples: 355469660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 00:28:36,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:28:37,070][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021691_355385344.pth... 
+[2024-06-18 00:28:37,117][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021093_345587712.pth +[2024-06-18 00:28:41,638][12883] Updated weights for policy 0, policy_version 21700 (0.0044) +[2024-06-18 00:28:41,994][12645] Fps is (10 sec: 42608.3, 60 sec: 41233.1, 300 sec: 40710.9). Total num frames: 355565568. Throughput: 0: 41015.2. Samples: 355717700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 00:28:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:28:44,983][12883] Updated weights for policy 0, policy_version 21710 (0.0039) +[2024-06-18 00:28:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 40765.6). Total num frames: 355762176. Throughput: 0: 40946.4. Samples: 355837140. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 00:28:46,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:28:49,325][12883] Updated weights for policy 0, policy_version 21720 (0.0045) +[2024-06-18 00:28:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.2, 300 sec: 40710.1). Total num frames: 355958784. Throughput: 0: 40858.4. Samples: 356079100. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 00:28:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:28:52,930][12883] Updated weights for policy 0, policy_version 21730 (0.0037) +[2024-06-18 00:28:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40413.8, 300 sec: 40710.1). Total num frames: 356155392. Throughput: 0: 41055.6. Samples: 356332840. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 00:28:56,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:28:57,292][12883] Updated weights for policy 0, policy_version 21740 (0.0031) +[2024-06-18 00:29:00,819][12883] Updated weights for policy 0, policy_version 21750 (0.0028) +[2024-06-18 00:29:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40961.6, 300 sec: 40710.1). Total num frames: 356368384. Throughput: 0: 41038.6. Samples: 356451940. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 00:29:01,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:29:02,758][12862] Signal inference workers to stop experience collection... (5000 times) +[2024-06-18 00:29:02,758][12862] Signal inference workers to resume experience collection... (5000 times) +[2024-06-18 00:29:02,807][12883] InferenceWorker_p0-w0: stopping experience collection (5000 times) +[2024-06-18 00:29:02,807][12883] InferenceWorker_p0-w0: resuming experience collection (5000 times) +[2024-06-18 00:29:05,590][12883] Updated weights for policy 0, policy_version 21760 (0.0050) +[2024-06-18 00:29:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41233.2, 300 sec: 40710.1). Total num frames: 356581376. Throughput: 0: 41047.0. Samples: 356699020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 00:29:06,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:29:08,969][12883] Updated weights for policy 0, policy_version 21770 (0.0039) +[2024-06-18 00:29:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 356761600. Throughput: 0: 40811.1. Samples: 356935080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 00:29:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:29:13,590][12883] Updated weights for policy 0, policy_version 21780 (0.0035) +[2024-06-18 00:29:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41234.3, 300 sec: 40765.6). Total num frames: 356990976. Throughput: 0: 40866.4. Samples: 357060400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 00:29:16,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:29:17,335][12883] Updated weights for policy 0, policy_version 21790 (0.0034) +[2024-06-18 00:29:21,731][12883] Updated weights for policy 0, policy_version 21800 (0.0039) +[2024-06-18 00:29:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40417.0, 300 sec: 40654.5). Total num frames: 357171200. 
Throughput: 0: 40736.0. Samples: 357302780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 00:29:21,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 00:29:25,235][12883] Updated weights for policy 0, policy_version 21810 (0.0041) +[2024-06-18 00:29:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 40710.1). Total num frames: 357400576. Throughput: 0: 40580.3. Samples: 357543820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 00:29:26,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:29:29,695][12883] Updated weights for policy 0, policy_version 21820 (0.0035) +[2024-06-18 00:29:31,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41234.6, 300 sec: 40765.6). Total num frames: 357613568. Throughput: 0: 40771.6. Samples: 357671860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 00:29:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:29:33,233][12883] Updated weights for policy 0, policy_version 21830 (0.0034) +[2024-06-18 00:29:36,994][12645] Fps is (10 sec: 36044.8, 60 sec: 39867.6, 300 sec: 40599.0). Total num frames: 357761024. Throughput: 0: 40678.5. Samples: 357909640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 00:29:36,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:29:37,828][12883] Updated weights for policy 0, policy_version 21840 (0.0025) +[2024-06-18 00:29:41,093][12883] Updated weights for policy 0, policy_version 21850 (0.0034) +[2024-06-18 00:29:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 358006784. Throughput: 0: 40485.4. Samples: 358154680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 00:29:41,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 00:29:45,987][12883] Updated weights for policy 0, policy_version 21860 (0.0023) +[2024-06-18 00:29:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 358203392. 
Throughput: 0: 40805.3. Samples: 358288180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 00:29:46,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:29:49,113][12883] Updated weights for policy 0, policy_version 21870 (0.0040) +[2024-06-18 00:29:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 40765.7). Total num frames: 358400000. Throughput: 0: 40549.7. Samples: 358523760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 00:29:51,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 00:29:54,237][12883] Updated weights for policy 0, policy_version 21880 (0.0046) +[2024-06-18 00:29:56,951][12883] Updated weights for policy 0, policy_version 21890 (0.0038) +[2024-06-18 00:29:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41506.1, 300 sec: 40821.1). Total num frames: 358645760. Throughput: 0: 40677.8. Samples: 358765580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 00:29:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:30:01,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40140.8, 300 sec: 40544.3). Total num frames: 358776832. Throughput: 0: 40597.5. Samples: 358887280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 00:30:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:30:02,210][12883] Updated weights for policy 0, policy_version 21900 (0.0032) +[2024-06-18 00:30:05,088][12883] Updated weights for policy 0, policy_version 21910 (0.0030) +[2024-06-18 00:30:06,993][12645] Fps is (10 sec: 40960.8, 60 sec: 41233.1, 300 sec: 40876.7). Total num frames: 359055360. Throughput: 0: 40669.0. Samples: 359132880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 00:30:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:30:10,355][12883] Updated weights for policy 0, policy_version 21920 (0.0046) +[2024-06-18 00:30:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 359219200. 
Throughput: 0: 40889.9. Samples: 359383860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 00:30:11,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:30:13,343][12883] Updated weights for policy 0, policy_version 21930 (0.0040) +[2024-06-18 00:30:16,994][12645] Fps is (10 sec: 34405.6, 60 sec: 40140.8, 300 sec: 40654.5). Total num frames: 359399424. Throughput: 0: 40625.8. Samples: 359500020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 00:30:16,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 00:30:18,134][12883] Updated weights for policy 0, policy_version 21940 (0.0028) +[2024-06-18 00:30:21,287][12883] Updated weights for policy 0, policy_version 21950 (0.0036) +[2024-06-18 00:30:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41506.0, 300 sec: 40821.1). Total num frames: 359661568. Throughput: 0: 40804.5. Samples: 359745840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 00:30:21,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:30:26,153][12883] Updated weights for policy 0, policy_version 21960 (0.0046) +[2024-06-18 00:30:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 359825408. Throughput: 0: 40837.2. Samples: 359992360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 00:30:27,000][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:30:29,494][12883] Updated weights for policy 0, policy_version 21970 (0.0033) +[2024-06-18 00:30:31,995][12645] Fps is (10 sec: 37676.7, 60 sec: 40412.7, 300 sec: 40765.4). Total num frames: 360038400. Throughput: 0: 40463.8. Samples: 360109120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 00:30:31,996][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:30:34,165][12883] Updated weights for policy 0, policy_version 21980 (0.0032) +[2024-06-18 00:30:36,996][12645] Fps is (10 sec: 42589.2, 60 sec: 41504.7, 300 sec: 40709.8). Total num frames: 360251392. 
Throughput: 0: 40782.8. Samples: 360359080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 00:30:36,996][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:30:37,056][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021989_360267776.pth... +[2024-06-18 00:30:37,115][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021392_350486528.pth +[2024-06-18 00:30:37,461][12883] Updated weights for policy 0, policy_version 21990 (0.0048) +[2024-06-18 00:30:42,000][12645] Fps is (10 sec: 37666.1, 60 sec: 40136.6, 300 sec: 40709.2). Total num frames: 360415232. Throughput: 0: 40657.0. Samples: 360595400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 00:30:42,001][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 00:30:42,579][12883] Updated weights for policy 0, policy_version 22000 (0.0039) +[2024-06-18 00:30:44,476][12862] Signal inference workers to stop experience collection... (5050 times) +[2024-06-18 00:30:44,476][12862] Signal inference workers to resume experience collection... (5050 times) +[2024-06-18 00:30:44,505][12883] InferenceWorker_p0-w0: stopping experience collection (5050 times) +[2024-06-18 00:30:44,505][12883] InferenceWorker_p0-w0: resuming experience collection (5050 times) +[2024-06-18 00:30:46,132][12883] Updated weights for policy 0, policy_version 22010 (0.0042) +[2024-06-18 00:30:46,994][12645] Fps is (10 sec: 39330.7, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 360644608. Throughput: 0: 40564.4. Samples: 360712680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 00:30:46,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:30:50,613][12883] Updated weights for policy 0, policy_version 22020 (0.0056) +[2024-06-18 00:30:51,994][12645] Fps is (10 sec: 42624.7, 60 sec: 40686.8, 300 sec: 40710.0). Total num frames: 360841216. Throughput: 0: 40623.3. Samples: 360960940. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) +[2024-06-18 00:30:51,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 00:30:54,354][12883] Updated weights for policy 0, policy_version 22030 (0.0031) +[2024-06-18 00:30:56,994][12645] Fps is (10 sec: 39320.8, 60 sec: 39867.7, 300 sec: 40710.1). Total num frames: 361037824. Throughput: 0: 40255.4. Samples: 361195360. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) +[2024-06-18 00:30:56,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:30:59,176][12883] Updated weights for policy 0, policy_version 22040 (0.0033) +[2024-06-18 00:31:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40959.9, 300 sec: 40654.5). Total num frames: 361234432. Throughput: 0: 40473.3. Samples: 361321320. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) +[2024-06-18 00:31:01,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:31:02,264][12883] Updated weights for policy 0, policy_version 22050 (0.0031) +[2024-06-18 00:31:06,994][12645] Fps is (10 sec: 37684.1, 60 sec: 39321.5, 300 sec: 40599.0). Total num frames: 361414656. Throughput: 0: 40464.1. Samples: 361566720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 00:31:06,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:31:07,147][12883] Updated weights for policy 0, policy_version 22060 (0.0047) +[2024-06-18 00:31:10,411][12883] Updated weights for policy 0, policy_version 22070 (0.0036) +[2024-06-18 00:31:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 40959.9, 300 sec: 40765.6). Total num frames: 361676800. Throughput: 0: 40281.8. Samples: 361805040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 00:31:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:31:15,168][12883] Updated weights for policy 0, policy_version 22080 (0.0040) +[2024-06-18 00:31:16,994][12645] Fps is (10 sec: 44235.4, 60 sec: 40959.9, 300 sec: 40599.0). Total num frames: 361857024. Throughput: 0: 40560.9. Samples: 361934300. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 00:31:16,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:31:18,487][12883] Updated weights for policy 0, policy_version 22090 (0.0032) +[2024-06-18 00:31:21,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39594.6, 300 sec: 40654.5). Total num frames: 362037248. Throughput: 0: 40302.3. Samples: 362172600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 00:31:21,995][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 00:31:23,041][12883] Updated weights for policy 0, policy_version 22100 (0.0041) +[2024-06-18 00:31:26,643][12883] Updated weights for policy 0, policy_version 22110 (0.0036) +[2024-06-18 00:31:26,994][12645] Fps is (10 sec: 40961.3, 60 sec: 40687.0, 300 sec: 40654.5). Total num frames: 362266624. Throughput: 0: 40690.2. Samples: 362426200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 00:31:26,994][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 00:31:31,019][12883] Updated weights for policy 0, policy_version 22120 (0.0030) +[2024-06-18 00:31:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40415.0, 300 sec: 40654.5). Total num frames: 362463232. Throughput: 0: 40679.4. Samples: 362543260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 00:31:31,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:31:34,518][12883] Updated weights for policy 0, policy_version 22130 (0.0036) +[2024-06-18 00:31:36,996][12645] Fps is (10 sec: 40950.5, 60 sec: 40413.9, 300 sec: 40709.8). Total num frames: 362676224. Throughput: 0: 40561.2. Samples: 362786280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 00:31:36,996][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:31:38,700][12883] Updated weights for policy 0, policy_version 22140 (0.0038) +[2024-06-18 00:31:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40964.4, 300 sec: 40543.5). Total num frames: 362872832. Throughput: 0: 40872.6. Samples: 363034620. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 00:31:41,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:31:42,491][12883] Updated weights for policy 0, policy_version 22150 (0.0031) +[2024-06-18 00:31:46,493][12883] Updated weights for policy 0, policy_version 22160 (0.0031) +[2024-06-18 00:31:46,994][12645] Fps is (10 sec: 39330.3, 60 sec: 40413.8, 300 sec: 40543.4). Total num frames: 363069440. Throughput: 0: 40727.6. Samples: 363154060. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 00:31:46,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 00:31:50,539][12883] Updated weights for policy 0, policy_version 22170 (0.0023) +[2024-06-18 00:31:51,994][12645] Fps is (10 sec: 42597.7, 60 sec: 40960.0, 300 sec: 40821.1). Total num frames: 363298816. Throughput: 0: 40769.2. Samples: 363401340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 00:31:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:31:54,671][12883] Updated weights for policy 0, policy_version 22180 (0.0027) +[2024-06-18 00:31:56,996][12645] Fps is (10 sec: 42589.2, 60 sec: 40958.6, 300 sec: 40709.8). Total num frames: 363495424. Throughput: 0: 40916.3. Samples: 363646360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 00:31:56,996][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:31:58,428][12883] Updated weights for policy 0, policy_version 22190 (0.0037) +[2024-06-18 00:32:01,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40960.1, 300 sec: 40710.1). Total num frames: 363692032. Throughput: 0: 40789.1. Samples: 363769800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 00:32:01,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:32:02,978][12862] Signal inference workers to stop experience collection... 
(5100 times) +[2024-06-18 00:32:03,013][12883] InferenceWorker_p0-w0: stopping experience collection (5100 times) +[2024-06-18 00:32:03,043][12862] Signal inference workers to resume experience collection... (5100 times) +[2024-06-18 00:32:03,048][12883] InferenceWorker_p0-w0: resuming experience collection (5100 times) +[2024-06-18 00:32:03,052][12883] Updated weights for policy 0, policy_version 22200 (0.0041) +[2024-06-18 00:32:06,650][12883] Updated weights for policy 0, policy_version 22210 (0.0038) +[2024-06-18 00:32:06,994][12645] Fps is (10 sec: 40968.9, 60 sec: 41506.1, 300 sec: 40765.6). Total num frames: 363905024. Throughput: 0: 40968.5. Samples: 364016180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 00:32:06,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:32:10,973][12883] Updated weights for policy 0, policy_version 22220 (0.0042) +[2024-06-18 00:32:11,996][12645] Fps is (10 sec: 40950.6, 60 sec: 40412.4, 300 sec: 40654.5). Total num frames: 364101632. Throughput: 0: 40686.3. Samples: 364257180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 00:32:11,997][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:32:14,882][12883] Updated weights for policy 0, policy_version 22230 (0.0049) +[2024-06-18 00:32:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40687.1, 300 sec: 40765.6). Total num frames: 364298240. Throughput: 0: 40853.0. Samples: 364381640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 00:32:16,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:32:19,001][12883] Updated weights for policy 0, policy_version 22240 (0.0037) +[2024-06-18 00:32:21,994][12645] Fps is (10 sec: 39330.0, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 364494848. Throughput: 0: 40704.1. Samples: 364617880. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 00:32:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:32:22,858][12883] Updated weights for policy 0, policy_version 22250 (0.0024) +[2024-06-18 00:32:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40413.8, 300 sec: 40599.0). Total num frames: 364691456. Throughput: 0: 40639.5. Samples: 364863400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 00:32:26,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:32:27,094][12883] Updated weights for policy 0, policy_version 22260 (0.0036) +[2024-06-18 00:32:30,789][12883] Updated weights for policy 0, policy_version 22270 (0.0044) +[2024-06-18 00:32:31,994][12645] Fps is (10 sec: 40960.9, 60 sec: 40687.1, 300 sec: 40710.1). Total num frames: 364904448. Throughput: 0: 40667.2. Samples: 364984080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 00:32:31,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:32:35,306][12883] Updated weights for policy 0, policy_version 22280 (0.0028) +[2024-06-18 00:32:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40688.5, 300 sec: 40765.6). Total num frames: 365117440. Throughput: 0: 40720.1. Samples: 365233740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 00:32:36,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:32:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000022285_365117440.pth... +[2024-06-18 00:32:37,082][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021691_355385344.pth +[2024-06-18 00:32:38,618][12883] Updated weights for policy 0, policy_version 22290 (0.0028) +[2024-06-18 00:32:41,994][12645] Fps is (10 sec: 42597.7, 60 sec: 40959.9, 300 sec: 40765.6). Total num frames: 365330432. Throughput: 0: 40532.2. Samples: 365470220. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 00:32:41,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:32:43,248][12883] Updated weights for policy 0, policy_version 22300 (0.0048) +[2024-06-18 00:32:46,973][12883] Updated weights for policy 0, policy_version 22310 (0.0037) +[2024-06-18 00:32:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.1, 300 sec: 40765.7). Total num frames: 365527040. Throughput: 0: 40462.7. Samples: 365590620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 00:32:46,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 00:32:51,292][12883] Updated weights for policy 0, policy_version 22320 (0.0040) +[2024-06-18 00:32:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40413.9, 300 sec: 40654.5). Total num frames: 365723648. Throughput: 0: 40461.8. Samples: 365836960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 00:32:51,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 00:32:55,066][12883] Updated weights for policy 0, policy_version 22330 (0.0034) +[2024-06-18 00:32:56,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40142.3, 300 sec: 40654.8). Total num frames: 365903872. Throughput: 0: 40588.2. Samples: 366083560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 00:32:56,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:32:59,213][12883] Updated weights for policy 0, policy_version 22340 (0.0038) +[2024-06-18 00:33:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 366116864. Throughput: 0: 40481.4. Samples: 366203300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 00:33:01,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:33:03,074][12883] Updated weights for policy 0, policy_version 22350 (0.0031) +[2024-06-18 00:33:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40413.8, 300 sec: 40654.5). Total num frames: 366329856. Throughput: 0: 40666.7. Samples: 366447880. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 00:33:06,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:33:07,207][12883] Updated weights for policy 0, policy_version 22360 (0.0032) +[2024-06-18 00:33:11,347][12883] Updated weights for policy 0, policy_version 22370 (0.0034) +[2024-06-18 00:33:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40415.3, 300 sec: 40710.3). Total num frames: 366526464. Throughput: 0: 40669.2. Samples: 366693520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 00:33:11,995][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:33:14,848][12883] Updated weights for policy 0, policy_version 22380 (0.0031) +[2024-06-18 00:33:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40686.8, 300 sec: 40655.2). Total num frames: 366739456. Throughput: 0: 40694.0. Samples: 366815320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 00:33:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:33:19,443][12883] Updated weights for policy 0, policy_version 22390 (0.0037) +[2024-06-18 00:33:22,000][12645] Fps is (10 sec: 40934.7, 60 sec: 40682.8, 300 sec: 40709.2). Total num frames: 366936064. Throughput: 0: 40615.7. Samples: 367061700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 00:33:22,001][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:33:22,812][12883] Updated weights for policy 0, policy_version 22400 (0.0043) +[2024-06-18 00:33:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40686.8, 300 sec: 40654.8). Total num frames: 367132672. Throughput: 0: 40807.1. Samples: 367306540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 00:33:26,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 00:33:27,648][12883] Updated weights for policy 0, policy_version 22410 (0.0036) +[2024-06-18 00:33:30,950][12862] Signal inference workers to stop experience collection... 
(5150 times) +[2024-06-18 00:33:30,951][12862] Signal inference workers to resume experience collection... (5150 times) +[2024-06-18 00:33:30,980][12883] InferenceWorker_p0-w0: stopping experience collection (5150 times) +[2024-06-18 00:33:30,980][12883] InferenceWorker_p0-w0: resuming experience collection (5150 times) +[2024-06-18 00:33:31,131][12883] Updated weights for policy 0, policy_version 22420 (0.0037) +[2024-06-18 00:33:31,996][12645] Fps is (10 sec: 42615.3, 60 sec: 40958.4, 300 sec: 40654.2). Total num frames: 367362048. Throughput: 0: 40737.4. Samples: 367423900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 00:33:31,997][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:33:35,726][12883] Updated weights for policy 0, policy_version 22430 (0.0042) +[2024-06-18 00:33:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40413.8, 300 sec: 40599.0). Total num frames: 367542272. Throughput: 0: 40759.9. Samples: 367671160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:33:36,995][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:33:39,361][12883] Updated weights for policy 0, policy_version 22440 (0.0049) +[2024-06-18 00:33:41,994][12645] Fps is (10 sec: 37691.8, 60 sec: 40140.8, 300 sec: 40599.0). Total num frames: 367738880. Throughput: 0: 40551.6. Samples: 367908380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:33:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:33:43,783][12883] Updated weights for policy 0, policy_version 22450 (0.0034) +[2024-06-18 00:33:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40413.7, 300 sec: 40654.5). Total num frames: 367951872. Throughput: 0: 40608.3. Samples: 368030680. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:33:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:33:47,361][12883] Updated weights for policy 0, policy_version 22460 (0.0041) +[2024-06-18 00:33:51,707][12883] Updated weights for policy 0, policy_version 22470 (0.0028) +[2024-06-18 00:33:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 368164864. Throughput: 0: 40724.2. Samples: 368280460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 00:33:51,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 00:33:55,795][12883] Updated weights for policy 0, policy_version 22480 (0.0031) +[2024-06-18 00:33:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40686.9, 300 sec: 40599.0). Total num frames: 368345088. Throughput: 0: 40725.4. Samples: 368526160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 00:33:56,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:33:59,498][12883] Updated weights for policy 0, policy_version 22490 (0.0042) +[2024-06-18 00:34:01,994][12645] Fps is (10 sec: 40959.1, 60 sec: 40959.9, 300 sec: 40654.5). Total num frames: 368574464. Throughput: 0: 40651.1. Samples: 368644620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 00:34:01,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:34:04,478][12883] Updated weights for policy 0, policy_version 22500 (0.0041) +[2024-06-18 00:34:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 368771072. Throughput: 0: 40605.1. Samples: 368888680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:34:06,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:34:07,494][12883] Updated weights for policy 0, policy_version 22510 (0.0045) +[2024-06-18 00:34:11,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40413.9, 300 sec: 40543.5). Total num frames: 368951296. Throughput: 0: 40616.5. Samples: 369134280. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:34:11,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:34:12,236][12883] Updated weights for policy 0, policy_version 22520 (0.0032) +[2024-06-18 00:34:15,544][12883] Updated weights for policy 0, policy_version 22530 (0.0045) +[2024-06-18 00:34:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 369180672. Throughput: 0: 40540.7. Samples: 369248140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:34:16,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:34:20,021][12883] Updated weights for policy 0, policy_version 22540 (0.0041) +[2024-06-18 00:34:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40691.1, 300 sec: 40599.0). Total num frames: 369377280. Throughput: 0: 40718.6. Samples: 369503500. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 00:34:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:34:23,287][12883] Updated weights for policy 0, policy_version 22550 (0.0027) +[2024-06-18 00:34:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 40599.0). Total num frames: 369590272. Throughput: 0: 40889.7. Samples: 369748420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 00:34:27,003][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 00:34:28,257][12883] Updated weights for policy 0, policy_version 22560 (0.0033) +[2024-06-18 00:34:31,397][12883] Updated weights for policy 0, policy_version 22570 (0.0038) +[2024-06-18 00:34:31,994][12645] Fps is (10 sec: 42599.1, 60 sec: 40688.5, 300 sec: 40821.2). Total num frames: 369803264. Throughput: 0: 40953.1. Samples: 369873560. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 00:34:31,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:34:36,106][12883] Updated weights for policy 0, policy_version 22580 (0.0041) +[2024-06-18 00:34:36,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40413.9, 300 sec: 40543.4). Total num frames: 369967104. Throughput: 0: 40983.8. Samples: 370124740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 00:34:36,996][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:34:37,132][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000022583_369999872.pth... +[2024-06-18 00:34:37,193][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021989_360267776.pth +[2024-06-18 00:34:39,364][12883] Updated weights for policy 0, policy_version 22590 (0.0041) +[2024-06-18 00:34:41,994][12645] Fps is (10 sec: 40958.8, 60 sec: 41232.9, 300 sec: 40710.0). Total num frames: 370212864. Throughput: 0: 40656.3. Samples: 370355700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 00:34:41,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:34:44,235][12883] Updated weights for policy 0, policy_version 22600 (0.0023) +[2024-06-18 00:34:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 40960.1, 300 sec: 40710.1). Total num frames: 370409472. Throughput: 0: 41033.4. Samples: 370491120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 00:34:46,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:34:47,270][12883] Updated weights for policy 0, policy_version 22610 (0.0038) +[2024-06-18 00:34:51,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40413.7, 300 sec: 40487.9). Total num frames: 370589696. Throughput: 0: 41091.5. Samples: 370737800. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 00:34:51,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:34:52,124][12883] Updated weights for policy 0, policy_version 22620 (0.0038) +[2024-06-18 00:34:55,149][12883] Updated weights for policy 0, policy_version 22630 (0.0030) +[2024-06-18 00:34:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41506.2, 300 sec: 40876.7). Total num frames: 370835456. Throughput: 0: 40843.2. Samples: 370972220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 00:34:56,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:34:59,931][12883] Updated weights for policy 0, policy_version 22640 (0.0035) +[2024-06-18 00:35:00,775][12862] Signal inference workers to stop experience collection... (5200 times) +[2024-06-18 00:35:00,808][12883] InferenceWorker_p0-w0: stopping experience collection (5200 times) +[2024-06-18 00:35:00,889][12862] Signal inference workers to resume experience collection... (5200 times) +[2024-06-18 00:35:00,889][12883] InferenceWorker_p0-w0: resuming experience collection (5200 times) +[2024-06-18 00:35:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40687.0, 300 sec: 40543.4). Total num frames: 371015680. Throughput: 0: 41256.0. Samples: 371104660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 00:35:01,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:35:03,375][12883] Updated weights for policy 0, policy_version 22650 (0.0035) +[2024-06-18 00:35:06,994][12645] Fps is (10 sec: 36044.9, 60 sec: 40413.9, 300 sec: 40599.0). Total num frames: 371195904. Throughput: 0: 40797.5. Samples: 371339380. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 00:35:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:35:07,986][12883] Updated weights for policy 0, policy_version 22660 (0.0042) +[2024-06-18 00:35:11,386][12883] Updated weights for policy 0, policy_version 22670 (0.0034) +[2024-06-18 00:35:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 40821.2). Total num frames: 371441664. Throughput: 0: 40769.0. Samples: 371583020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 00:35:11,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 00:35:15,742][12883] Updated weights for policy 0, policy_version 22680 (0.0025) +[2024-06-18 00:35:16,994][12645] Fps is (10 sec: 42597.5, 60 sec: 40686.8, 300 sec: 40543.4). Total num frames: 371621888. Throughput: 0: 40950.0. Samples: 371716320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 00:35:16,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:35:19,496][12883] Updated weights for policy 0, policy_version 22690 (0.0030) +[2024-06-18 00:35:21,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 371834880. Throughput: 0: 40646.2. Samples: 371953820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 00:35:21,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 00:35:23,622][12883] Updated weights for policy 0, policy_version 22700 (0.0038) +[2024-06-18 00:35:26,994][12645] Fps is (10 sec: 42599.2, 60 sec: 40960.0, 300 sec: 40710.3). Total num frames: 372047872. Throughput: 0: 41011.8. Samples: 372201220. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 00:35:26,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 00:35:27,420][12883] Updated weights for policy 0, policy_version 22710 (0.0037) +[2024-06-18 00:35:31,870][12883] Updated weights for policy 0, policy_version 22720 (0.0045) +[2024-06-18 00:35:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40686.8, 300 sec: 40654.8). Total num frames: 372244480. Throughput: 0: 40719.0. Samples: 372323480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 00:35:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:35:35,762][12883] Updated weights for policy 0, policy_version 22730 (0.0031) +[2024-06-18 00:35:36,998][12645] Fps is (10 sec: 42578.1, 60 sec: 41775.9, 300 sec: 40876.9). Total num frames: 372473856. Throughput: 0: 40734.9. Samples: 372571060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 00:35:36,999][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:35:39,979][12883] Updated weights for policy 0, policy_version 22740 (0.0030) +[2024-06-18 00:35:41,994][12645] Fps is (10 sec: 42599.4, 60 sec: 40960.2, 300 sec: 40765.6). Total num frames: 372670464. Throughput: 0: 40928.0. Samples: 372813980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 00:35:41,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:35:43,806][12883] Updated weights for policy 0, policy_version 22750 (0.0029) +[2024-06-18 00:35:46,994][12645] Fps is (10 sec: 37700.6, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 372850688. Throughput: 0: 40616.7. Samples: 372932420. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 00:35:46,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:35:47,879][12883] Updated weights for policy 0, policy_version 22760 (0.0030) +[2024-06-18 00:35:51,793][12883] Updated weights for policy 0, policy_version 22770 (0.0037) +[2024-06-18 00:35:51,996][12645] Fps is (10 sec: 39312.5, 60 sec: 41231.6, 300 sec: 40765.3). Total num frames: 373063680. Throughput: 0: 40915.7. Samples: 373180680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 00:35:51,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:35:56,127][12883] Updated weights for policy 0, policy_version 22780 (0.0036) +[2024-06-18 00:35:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40686.8, 300 sec: 40821.1). Total num frames: 373276672. Throughput: 0: 40888.7. Samples: 373423020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 00:35:56,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:35:59,881][12883] Updated weights for policy 0, policy_version 22790 (0.0045) +[2024-06-18 00:36:01,994][12645] Fps is (10 sec: 40969.6, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 373473280. Throughput: 0: 40762.0. Samples: 373550600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 00:36:01,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:36:03,818][12883] Updated weights for policy 0, policy_version 22800 (0.0035) +[2024-06-18 00:36:06,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40959.9, 300 sec: 40599.0). Total num frames: 373653504. Throughput: 0: 40780.9. Samples: 373788960. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 00:36:06,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 00:36:07,853][12883] Updated weights for policy 0, policy_version 22810 (0.0033) +[2024-06-18 00:36:11,758][12883] Updated weights for policy 0, policy_version 22820 (0.0031) +[2024-06-18 00:36:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40686.8, 300 sec: 40765.6). Total num frames: 373882880. Throughput: 0: 40870.1. Samples: 374040380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 00:36:11,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:36:15,949][12883] Updated weights for policy 0, policy_version 22830 (0.0032) +[2024-06-18 00:36:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40687.1, 300 sec: 40765.6). Total num frames: 374063104. Throughput: 0: 40848.2. Samples: 374161640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 00:36:16,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:36:17,664][12862] Signal inference workers to stop experience collection... (5250 times) +[2024-06-18 00:36:17,664][12862] Signal inference workers to resume experience collection... (5250 times) +[2024-06-18 00:36:17,694][12883] InferenceWorker_p0-w0: stopping experience collection (5250 times) +[2024-06-18 00:36:17,694][12883] InferenceWorker_p0-w0: resuming experience collection (5250 times) +[2024-06-18 00:36:20,162][12883] Updated weights for policy 0, policy_version 22840 (0.0041) +[2024-06-18 00:36:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.1, 300 sec: 40765.6). Total num frames: 374292480. Throughput: 0: 40728.3. Samples: 374403640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 00:36:21,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:36:23,982][12883] Updated weights for policy 0, policy_version 22850 (0.0034) +[2024-06-18 00:36:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 374489088. 
Throughput: 0: 40750.7. Samples: 374647760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 00:36:26,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:36:28,117][12883] Updated weights for policy 0, policy_version 22860 (0.0036) +[2024-06-18 00:36:31,954][12883] Updated weights for policy 0, policy_version 22870 (0.0041) +[2024-06-18 00:36:31,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40960.0, 300 sec: 40765.9). Total num frames: 374702080. Throughput: 0: 40748.4. Samples: 374766100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 00:36:31,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:36:36,108][12883] Updated weights for policy 0, policy_version 22880 (0.0045) +[2024-06-18 00:36:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40690.2, 300 sec: 40821.2). Total num frames: 374915072. Throughput: 0: 40886.1. Samples: 375020460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 00:36:36,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:36:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000022883_374915072.pth... +[2024-06-18 00:36:37,090][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000022285_365117440.pth +[2024-06-18 00:36:40,008][12883] Updated weights for policy 0, policy_version 22890 (0.0046) +[2024-06-18 00:36:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40686.8, 300 sec: 40821.1). Total num frames: 375111680. Throughput: 0: 40842.2. Samples: 375260920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 00:36:41,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 00:36:43,865][12883] Updated weights for policy 0, policy_version 22900 (0.0038) +[2024-06-18 00:36:46,996][12645] Fps is (10 sec: 39312.3, 60 sec: 40958.5, 300 sec: 40709.8). Total num frames: 375308288. Throughput: 0: 40666.3. Samples: 375380680. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 00:36:46,997][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:36:48,277][12883] Updated weights for policy 0, policy_version 22910 (0.0040) +[2024-06-18 00:36:51,651][12883] Updated weights for policy 0, policy_version 22920 (0.0036) +[2024-06-18 00:36:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40961.5, 300 sec: 40765.9). Total num frames: 375521280. Throughput: 0: 40970.7. Samples: 375632640. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 00:36:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:36:56,175][12883] Updated weights for policy 0, policy_version 22930 (0.0040) +[2024-06-18 00:36:56,994][12645] Fps is (10 sec: 40969.8, 60 sec: 40687.1, 300 sec: 40765.6). Total num frames: 375717888. Throughput: 0: 40956.2. Samples: 375883400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 00:36:56,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 00:36:59,651][12883] Updated weights for policy 0, policy_version 22940 (0.0026) +[2024-06-18 00:37:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40959.9, 300 sec: 40765.6). Total num frames: 375930880. Throughput: 0: 40862.5. Samples: 376000460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 00:37:01,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 00:37:04,464][12883] Updated weights for policy 0, policy_version 22950 (0.0041) +[2024-06-18 00:37:06,994][12645] Fps is (10 sec: 42597.5, 60 sec: 41506.1, 300 sec: 40821.4). Total num frames: 376143872. Throughput: 0: 41028.3. Samples: 376249920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 00:37:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:37:07,390][12883] Updated weights for policy 0, policy_version 22960 (0.0029) +[2024-06-18 00:37:11,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40687.0, 300 sec: 40765.6). Total num frames: 376324096. Throughput: 0: 41202.6. Samples: 376501880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 00:37:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:37:12,435][12883] Updated weights for policy 0, policy_version 22970 (0.0032) +[2024-06-18 00:37:15,282][12883] Updated weights for policy 0, policy_version 22980 (0.0046) +[2024-06-18 00:37:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41779.2, 300 sec: 40932.3). Total num frames: 376569856. Throughput: 0: 41244.1. Samples: 376622080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 00:37:16,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:37:20,292][12883] Updated weights for policy 0, policy_version 22990 (0.0031) +[2024-06-18 00:37:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 376750080. Throughput: 0: 41151.6. Samples: 376872280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 00:37:21,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:37:23,419][12883] Updated weights for policy 0, policy_version 23000 (0.0028) +[2024-06-18 00:37:26,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 376946688. Throughput: 0: 41344.5. Samples: 377121420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 00:37:26,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:37:28,214][12883] Updated weights for policy 0, policy_version 23010 (0.0038) +[2024-06-18 00:37:31,143][12883] Updated weights for policy 0, policy_version 23020 (0.0028) +[2024-06-18 00:37:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41233.2, 300 sec: 40876.7). Total num frames: 377176064. Throughput: 0: 41326.2. Samples: 377240260. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) +[2024-06-18 00:37:31,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:37:36,318][12883] Updated weights for policy 0, policy_version 23030 (0.0029) +[2024-06-18 00:37:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 40821.2). Total num frames: 377372672. Throughput: 0: 41333.4. Samples: 377492640. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) +[2024-06-18 00:37:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:37:38,750][12862] Signal inference workers to stop experience collection... (5300 times) +[2024-06-18 00:37:38,750][12862] Signal inference workers to resume experience collection... (5300 times) +[2024-06-18 00:37:38,764][12883] InferenceWorker_p0-w0: stopping experience collection (5300 times) +[2024-06-18 00:37:38,764][12883] InferenceWorker_p0-w0: resuming experience collection (5300 times) +[2024-06-18 00:37:38,905][12883] Updated weights for policy 0, policy_version 23040 (0.0033) +[2024-06-18 00:37:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40960.1, 300 sec: 40821.1). Total num frames: 377569280. Throughput: 0: 40998.1. Samples: 377728320. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) +[2024-06-18 00:37:41,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:37:44,360][12883] Updated weights for policy 0, policy_version 23050 (0.0036) +[2024-06-18 00:37:46,794][12883] Updated weights for policy 0, policy_version 23060 (0.0042) +[2024-06-18 00:37:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41780.8, 300 sec: 40987.8). Total num frames: 377815040. Throughput: 0: 41282.7. Samples: 377858180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 00:37:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:37:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 377962496. Throughput: 0: 41151.6. Samples: 378101740. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 00:37:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:37:52,252][12883] Updated weights for policy 0, policy_version 23070 (0.0035) +[2024-06-18 00:37:55,089][12883] Updated weights for policy 0, policy_version 23080 (0.0045) +[2024-06-18 00:37:56,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41232.9, 300 sec: 40932.2). Total num frames: 378191872. Throughput: 0: 41033.2. Samples: 378348380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 00:37:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:38:00,619][12883] Updated weights for policy 0, policy_version 23090 (0.0030) +[2024-06-18 00:38:01,994][12645] Fps is (10 sec: 45875.6, 60 sec: 41506.2, 300 sec: 40987.8). Total num frames: 378421248. Throughput: 0: 41237.3. Samples: 378477760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 00:38:01,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:38:02,739][12883] Updated weights for policy 0, policy_version 23100 (0.0035) +[2024-06-18 00:38:06,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40413.9, 300 sec: 40821.2). Total num frames: 378568704. Throughput: 0: 41003.9. Samples: 378717460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 00:38:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:38:08,475][12883] Updated weights for policy 0, policy_version 23110 (0.0044) +[2024-06-18 00:38:11,194][12883] Updated weights for policy 0, policy_version 23120 (0.0036) +[2024-06-18 00:38:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 40932.3). Total num frames: 378814464. Throughput: 0: 40922.3. Samples: 378962920. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 00:38:11,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:38:16,231][12883] Updated weights for policy 0, policy_version 23130 (0.0048) +[2024-06-18 00:38:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 40686.9, 300 sec: 40933.1). Total num frames: 379011072. Throughput: 0: 41228.4. Samples: 379095540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 00:38:16,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:38:19,019][12883] Updated weights for policy 0, policy_version 23140 (0.0049) +[2024-06-18 00:38:21,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41232.9, 300 sec: 40987.8). Total num frames: 379224064. Throughput: 0: 41039.9. Samples: 379339440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 00:38:21,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:38:24,194][12883] Updated weights for policy 0, policy_version 23150 (0.0050) +[2024-06-18 00:38:26,826][12883] Updated weights for policy 0, policy_version 23160 (0.0042) +[2024-06-18 00:38:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 40988.1). Total num frames: 379453440. Throughput: 0: 41216.3. Samples: 379583060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 00:38:26,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 00:38:31,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40413.8, 300 sec: 40876.7). Total num frames: 379600896. Throughput: 0: 41084.0. Samples: 379706960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:38:31,994][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 00:38:32,107][12883] Updated weights for policy 0, policy_version 23170 (0.0050) +[2024-06-18 00:38:34,771][12883] Updated weights for policy 0, policy_version 23180 (0.0045) +[2024-06-18 00:38:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 379846656. Throughput: 0: 41008.5. Samples: 379947120. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:38:36,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:38:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000023184_379846656.pth... +[2024-06-18 00:38:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000022583_369999872.pth +[2024-06-18 00:38:39,956][12883] Updated weights for policy 0, policy_version 23190 (0.0042) +[2024-06-18 00:38:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 380043264. Throughput: 0: 40996.1. Samples: 380193200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:38:41,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:38:43,043][12883] Updated weights for policy 0, policy_version 23200 (0.0024) +[2024-06-18 00:38:46,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40140.8, 300 sec: 40876.7). Total num frames: 380223488. Throughput: 0: 40863.5. Samples: 380316620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 00:38:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:38:48,007][12883] Updated weights for policy 0, policy_version 23210 (0.0023) +[2024-06-18 00:38:51,095][12883] Updated weights for policy 0, policy_version 23220 (0.0047) +[2024-06-18 00:38:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 380452864. Throughput: 0: 41039.5. Samples: 380564240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 00:38:51,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:38:55,806][12883] Updated weights for policy 0, policy_version 23230 (0.0029) +[2024-06-18 00:38:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 380649472. Throughput: 0: 41072.3. Samples: 380811180. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 00:38:56,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:38:58,873][12862] Signal inference workers to stop experience collection... (5350 times) +[2024-06-18 00:38:58,873][12862] Signal inference workers to resume experience collection... (5350 times) +[2024-06-18 00:38:58,916][12883] InferenceWorker_p0-w0: stopping experience collection (5350 times) +[2024-06-18 00:38:58,916][12883] InferenceWorker_p0-w0: resuming experience collection (5350 times) +[2024-06-18 00:38:59,235][12883] Updated weights for policy 0, policy_version 23240 (0.0031) +[2024-06-18 00:39:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40413.8, 300 sec: 40932.2). Total num frames: 380846080. Throughput: 0: 40768.9. Samples: 380930140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 00:39:01,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:39:04,013][12883] Updated weights for policy 0, policy_version 23250 (0.0046) +[2024-06-18 00:39:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41098.9). Total num frames: 381075456. Throughput: 0: 40690.8. Samples: 381170520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 00:39:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:39:07,281][12883] Updated weights for policy 0, policy_version 23260 (0.0042) +[2024-06-18 00:39:11,962][12883] Updated weights for policy 0, policy_version 23270 (0.0047) +[2024-06-18 00:39:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 381255680. Throughput: 0: 40838.3. Samples: 381420780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 00:39:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:39:15,288][12883] Updated weights for policy 0, policy_version 23280 (0.0030) +[2024-06-18 00:39:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 381485056. Throughput: 0: 40780.0. 
Samples: 381542060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 00:39:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:39:19,979][12883] Updated weights for policy 0, policy_version 23290 (0.0045) +[2024-06-18 00:39:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 381681664. Throughput: 0: 41063.0. Samples: 381794960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 00:39:21,997][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:39:23,191][12883] Updated weights for policy 0, policy_version 23300 (0.0040) +[2024-06-18 00:39:26,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40140.9, 300 sec: 40876.7). Total num frames: 381861888. Throughput: 0: 40952.5. Samples: 382036060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 00:39:26,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:39:27,975][12883] Updated weights for policy 0, policy_version 23310 (0.0031) +[2024-06-18 00:39:31,399][12883] Updated weights for policy 0, policy_version 23320 (0.0039) +[2024-06-18 00:39:32,000][12645] Fps is (10 sec: 42572.1, 60 sec: 41774.9, 300 sec: 41153.5). Total num frames: 382107648. Throughput: 0: 40945.0. Samples: 382159400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 00:39:32,000][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:39:36,153][12883] Updated weights for policy 0, policy_version 23330 (0.0033) +[2024-06-18 00:39:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40413.8, 300 sec: 40876.7). Total num frames: 382271488. Throughput: 0: 41103.6. Samples: 382413900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:39:36,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:39:39,167][12883] Updated weights for policy 0, policy_version 23340 (0.0036) +[2024-06-18 00:39:41,994][12645] Fps is (10 sec: 39346.4, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 382500864. Throughput: 0: 40971.2. 
Samples: 382654880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:39:41,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:39:44,670][12883] Updated weights for policy 0, policy_version 23350 (0.0036) +[2024-06-18 00:39:46,944][12883] Updated weights for policy 0, policy_version 23360 (0.0031) +[2024-06-18 00:39:46,994][12645] Fps is (10 sec: 45875.0, 60 sec: 41779.1, 300 sec: 41154.4). Total num frames: 382730240. Throughput: 0: 41244.4. Samples: 382786140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:39:46,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:39:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40414.0, 300 sec: 40821.2). Total num frames: 382877696. Throughput: 0: 41056.5. Samples: 383018060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 00:39:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:39:52,323][12883] Updated weights for policy 0, policy_version 23370 (0.0043) +[2024-06-18 00:39:55,055][12883] Updated weights for policy 0, policy_version 23380 (0.0039) +[2024-06-18 00:39:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 383123456. Throughput: 0: 40971.5. Samples: 383264500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 00:39:56,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:39:59,953][12883] Updated weights for policy 0, policy_version 23390 (0.0047) +[2024-06-18 00:40:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 383320064. Throughput: 0: 41201.7. Samples: 383396140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 00:40:01,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 00:40:02,901][12883] Updated weights for policy 0, policy_version 23400 (0.0033) +[2024-06-18 00:40:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 383516672. Throughput: 0: 40917.4. 
Samples: 383636240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 00:40:06,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 00:40:07,807][12883] Updated weights for policy 0, policy_version 23410 (0.0043) +[2024-06-18 00:40:10,694][12883] Updated weights for policy 0, policy_version 23420 (0.0032) +[2024-06-18 00:40:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 383729664. Throughput: 0: 41046.7. Samples: 383883160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:40:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:40:15,562][12883] Updated weights for policy 0, policy_version 23430 (0.0041) +[2024-06-18 00:40:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 383926272. Throughput: 0: 40982.6. Samples: 384003360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:40:16,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:40:19,276][12883] Updated weights for policy 0, policy_version 23440 (0.0034) +[2024-06-18 00:40:20,836][12862] Signal inference workers to stop experience collection... (5400 times) +[2024-06-18 00:40:20,837][12862] Signal inference workers to resume experience collection... (5400 times) +[2024-06-18 00:40:20,887][12883] InferenceWorker_p0-w0: stopping experience collection (5400 times) +[2024-06-18 00:40:20,887][12883] InferenceWorker_p0-w0: resuming experience collection (5400 times) +[2024-06-18 00:40:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 384139264. Throughput: 0: 41049.8. Samples: 384261140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:40:21,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:40:23,444][12883] Updated weights for policy 0, policy_version 23450 (0.0043) +[2024-06-18 00:40:26,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.0, 300 sec: 41043.3). Total num frames: 384352256. 
Throughput: 0: 41051.4. Samples: 384502200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:40:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:40:27,002][12883] Updated weights for policy 0, policy_version 23460 (0.0047) +[2024-06-18 00:40:31,726][12883] Updated weights for policy 0, policy_version 23470 (0.0028) +[2024-06-18 00:40:31,994][12645] Fps is (10 sec: 39322.3, 60 sec: 40418.2, 300 sec: 40877.4). Total num frames: 384532480. Throughput: 0: 41000.6. Samples: 384631160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:40:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:40:35,426][12883] Updated weights for policy 0, policy_version 23480 (0.0033) +[2024-06-18 00:40:36,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 40932.2). Total num frames: 384745472. Throughput: 0: 41342.1. Samples: 384878460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 00:40:36,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:40:37,115][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000023484_384761856.pth... +[2024-06-18 00:40:37,171][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000022883_374915072.pth +[2024-06-18 00:40:39,477][12883] Updated weights for policy 0, policy_version 23490 (0.0039) +[2024-06-18 00:40:41,994][12645] Fps is (10 sec: 45874.3, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 384991232. Throughput: 0: 41173.8. Samples: 385117320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 00:40:41,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:40:43,608][12883] Updated weights for policy 0, policy_version 23500 (0.0047) +[2024-06-18 00:40:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40413.9, 300 sec: 40988.1). Total num frames: 385155072. Throughput: 0: 41236.9. Samples: 385251800. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 00:40:46,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:40:47,323][12883] Updated weights for policy 0, policy_version 23510 (0.0042) +[2024-06-18 00:40:51,289][12883] Updated weights for policy 0, policy_version 23520 (0.0042) +[2024-06-18 00:40:51,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41506.0, 300 sec: 40987.8). Total num frames: 385368064. Throughput: 0: 41201.3. Samples: 385490300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 00:40:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:40:55,078][12883] Updated weights for policy 0, policy_version 23530 (0.0046) +[2024-06-18 00:40:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 385581056. Throughput: 0: 41133.2. Samples: 385734160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 00:40:56,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:40:59,326][12883] Updated weights for policy 0, policy_version 23540 (0.0038) +[2024-06-18 00:41:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 385777664. Throughput: 0: 41150.6. Samples: 385855140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 00:41:01,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 00:41:03,281][12883] Updated weights for policy 0, policy_version 23550 (0.0039) +[2024-06-18 00:41:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 385974272. Throughput: 0: 40867.5. Samples: 386100180. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 00:41:06,996][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:41:07,303][12883] Updated weights for policy 0, policy_version 23560 (0.0036) +[2024-06-18 00:41:11,086][12883] Updated weights for policy 0, policy_version 23570 (0.0043) +[2024-06-18 00:41:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 386187264. Throughput: 0: 41084.1. Samples: 386350980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-18 00:41:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:41:15,320][12883] Updated weights for policy 0, policy_version 23580 (0.0047) +[2024-06-18 00:41:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 386383872. Throughput: 0: 40970.5. Samples: 386474840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-18 00:41:16,996][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:41:19,214][12883] Updated weights for policy 0, policy_version 23590 (0.0045) +[2024-06-18 00:41:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 386596864. Throughput: 0: 40843.6. Samples: 386716420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-18 00:41:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:41:23,500][12883] Updated weights for policy 0, policy_version 23600 (0.0038) +[2024-06-18 00:41:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 386809856. Throughput: 0: 41005.9. Samples: 386962580. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 00:41:26,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:41:27,097][12883] Updated weights for policy 0, policy_version 23610 (0.0051) +[2024-06-18 00:41:31,454][12883] Updated weights for policy 0, policy_version 23620 (0.0034) +[2024-06-18 00:41:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41232.9, 300 sec: 40987.7). Total num frames: 387006464. Throughput: 0: 40812.4. Samples: 387088360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 00:41:31,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:41:35,194][12883] Updated weights for policy 0, policy_version 23630 (0.0037) +[2024-06-18 00:41:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 387219456. Throughput: 0: 40898.8. Samples: 387330740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 00:41:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:41:39,405][12883] Updated weights for policy 0, policy_version 23640 (0.0039) +[2024-06-18 00:41:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40413.8, 300 sec: 41043.6). Total num frames: 387416064. Throughput: 0: 40952.8. Samples: 387577040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 00:41:41,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:41:43,061][12883] Updated weights for policy 0, policy_version 23650 (0.0034) +[2024-06-18 00:41:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 387612672. Throughput: 0: 40905.8. Samples: 387695900. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 00:41:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:41:47,576][12883] Updated weights for policy 0, policy_version 23660 (0.0026) +[2024-06-18 00:41:51,526][12883] Updated weights for policy 0, policy_version 23670 (0.0035) +[2024-06-18 00:41:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 387825664. Throughput: 0: 40951.5. Samples: 387943000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 00:41:51,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 00:41:54,028][12862] Signal inference workers to stop experience collection... (5450 times) +[2024-06-18 00:41:54,075][12862] Signal inference workers to resume experience collection... (5450 times) +[2024-06-18 00:41:54,075][12883] InferenceWorker_p0-w0: stopping experience collection (5450 times) +[2024-06-18 00:41:54,095][12883] InferenceWorker_p0-w0: resuming experience collection (5450 times) +[2024-06-18 00:41:55,382][12883] Updated weights for policy 0, policy_version 23680 (0.0033) +[2024-06-18 00:41:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 388038656. Throughput: 0: 40837.4. Samples: 388188660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 00:41:56,994][12645] Avg episode reward: [(0, '0.027')] +[2024-06-18 00:41:59,495][12883] Updated weights for policy 0, policy_version 23690 (0.0040) +[2024-06-18 00:42:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 388235264. Throughput: 0: 40858.2. Samples: 388313460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 00:42:01,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:42:03,121][12883] Updated weights for policy 0, policy_version 23700 (0.0047) +[2024-06-18 00:42:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 388431872. 
Throughput: 0: 41007.1. Samples: 388561740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 00:42:06,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:42:07,623][12883] Updated weights for policy 0, policy_version 23710 (0.0039) +[2024-06-18 00:42:11,302][12883] Updated weights for policy 0, policy_version 23720 (0.0038) +[2024-06-18 00:42:11,996][12645] Fps is (10 sec: 40951.0, 60 sec: 40958.5, 300 sec: 40931.9). Total num frames: 388644864. Throughput: 0: 40923.7. Samples: 388804240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 00:42:11,997][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:42:15,611][12883] Updated weights for policy 0, policy_version 23730 (0.0036) +[2024-06-18 00:42:16,996][12645] Fps is (10 sec: 40950.9, 60 sec: 40958.4, 300 sec: 40987.4). Total num frames: 388841472. Throughput: 0: 40965.6. Samples: 388931900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 00:42:16,997][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:42:19,196][12883] Updated weights for policy 0, policy_version 23740 (0.0035) +[2024-06-18 00:42:21,996][12645] Fps is (10 sec: 40960.0, 60 sec: 40958.5, 300 sec: 41043.0). Total num frames: 389054464. Throughput: 0: 40847.8. Samples: 389168980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 00:42:21,996][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:42:23,474][12883] Updated weights for policy 0, policy_version 23750 (0.0038) +[2024-06-18 00:42:26,994][12645] Fps is (10 sec: 40969.7, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 389251072. Throughput: 0: 41111.3. Samples: 389427040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 00:42:26,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 00:42:27,154][12883] Updated weights for policy 0, policy_version 23760 (0.0034) +[2024-06-18 00:42:31,551][12883] Updated weights for policy 0, policy_version 23770 (0.0035) +[2024-06-18 00:42:31,994][12645] Fps is (10 sec: 40968.6, 60 sec: 40960.0, 300 sec: 40987.7). Total num frames: 389464064. Throughput: 0: 41144.8. Samples: 389547420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 00:42:31,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:42:35,263][12883] Updated weights for policy 0, policy_version 23780 (0.0044) +[2024-06-18 00:42:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 389693440. Throughput: 0: 40984.0. Samples: 389787280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 00:42:36,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 00:42:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000023785_389693440.pth... +[2024-06-18 00:42:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000023184_379846656.pth +[2024-06-18 00:42:39,597][12883] Updated weights for policy 0, policy_version 23790 (0.0029) +[2024-06-18 00:42:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40960.1, 300 sec: 40876.7). Total num frames: 389873664. Throughput: 0: 41036.0. Samples: 390035280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 00:42:41,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:42:43,240][12883] Updated weights for policy 0, policy_version 23800 (0.0033) +[2024-06-18 00:42:46,996][12645] Fps is (10 sec: 37675.3, 60 sec: 40958.5, 300 sec: 41043.0). Total num frames: 390070272. Throughput: 0: 40923.4. Samples: 390155100. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 00:42:46,996][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:42:47,585][12883] Updated weights for policy 0, policy_version 23810 (0.0031) +[2024-06-18 00:42:51,271][12883] Updated weights for policy 0, policy_version 23820 (0.0036) +[2024-06-18 00:42:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 390283264. Throughput: 0: 40905.5. Samples: 390402480. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 00:42:51,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:42:55,671][12883] Updated weights for policy 0, policy_version 23830 (0.0029) +[2024-06-18 00:42:56,994][12645] Fps is (10 sec: 40968.7, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 390479872. Throughput: 0: 41058.9. Samples: 390651800. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 00:42:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:42:59,428][12883] Updated weights for policy 0, policy_version 23840 (0.0054) +[2024-06-18 00:43:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 390692864. Throughput: 0: 40790.0. Samples: 390767360. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 00:43:01,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:43:03,794][12883] Updated weights for policy 0, policy_version 23850 (0.0036) +[2024-06-18 00:43:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 390889472. Throughput: 0: 40970.9. Samples: 391012580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 00:43:06,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:43:07,301][12883] Updated weights for policy 0, policy_version 23860 (0.0043) +[2024-06-18 00:43:12,000][12645] Fps is (10 sec: 37660.0, 60 sec: 40411.2, 300 sec: 40875.8). Total num frames: 391069696. Throughput: 0: 40715.2. Samples: 391259480. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 00:43:12,000][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:43:12,208][12883] Updated weights for policy 0, policy_version 23870 (0.0030) +[2024-06-18 00:43:12,658][12862] Signal inference workers to stop experience collection... (5500 times) +[2024-06-18 00:43:12,695][12883] InferenceWorker_p0-w0: stopping experience collection (5500 times) +[2024-06-18 00:43:12,704][12862] Signal inference workers to resume experience collection... (5500 times) +[2024-06-18 00:43:12,707][12883] InferenceWorker_p0-w0: resuming experience collection (5500 times) +[2024-06-18 00:43:15,094][12883] Updated weights for policy 0, policy_version 23880 (0.0038) +[2024-06-18 00:43:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40961.5, 300 sec: 40932.2). Total num frames: 391299072. Throughput: 0: 40648.5. Samples: 391376600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 00:43:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:43:19,895][12883] Updated weights for policy 0, policy_version 23890 (0.0033) +[2024-06-18 00:43:21,994][12645] Fps is (10 sec: 45903.3, 60 sec: 41234.5, 300 sec: 40932.2). Total num frames: 391528448. Throughput: 0: 40894.2. Samples: 391627520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 00:43:21,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:43:22,999][12883] Updated weights for policy 0, policy_version 23900 (0.0029) +[2024-06-18 00:43:26,994][12645] Fps is (10 sec: 42599.3, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 391725056. Throughput: 0: 40815.6. Samples: 391871980. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 00:43:26,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:43:27,856][12883] Updated weights for policy 0, policy_version 23910 (0.0036) +[2024-06-18 00:43:31,107][12883] Updated weights for policy 0, policy_version 23920 (0.0032) +[2024-06-18 00:43:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 391921664. Throughput: 0: 40838.3. Samples: 391992740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 00:43:31,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:43:35,691][12883] Updated weights for policy 0, policy_version 23930 (0.0040) +[2024-06-18 00:43:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 392134656. Throughput: 0: 40919.0. Samples: 392243840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 00:43:36,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:43:39,086][12883] Updated weights for policy 0, policy_version 23940 (0.0032) +[2024-06-18 00:43:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 392314880. Throughput: 0: 40758.7. Samples: 392485940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 00:43:41,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 00:43:43,564][12883] Updated weights for policy 0, policy_version 23950 (0.0030) +[2024-06-18 00:43:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41234.6, 300 sec: 40987.8). Total num frames: 392544256. Throughput: 0: 40870.3. Samples: 392606520. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 00:43:46,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:43:47,102][12883] Updated weights for policy 0, policy_version 23960 (0.0036) +[2024-06-18 00:43:51,493][12883] Updated weights for policy 0, policy_version 23970 (0.0045) +[2024-06-18 00:43:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 392740864. Throughput: 0: 40833.5. Samples: 392850080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 00:43:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:43:55,187][12883] Updated weights for policy 0, policy_version 23980 (0.0033) +[2024-06-18 00:43:56,994][12645] Fps is (10 sec: 37682.4, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 392921088. Throughput: 0: 40902.4. Samples: 393099840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 00:43:56,995][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:43:59,380][12883] Updated weights for policy 0, policy_version 23990 (0.0028) +[2024-06-18 00:44:01,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40687.0, 300 sec: 40876.7). Total num frames: 393134080. Throughput: 0: 40989.9. Samples: 393221140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 00:44:01,995][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:44:03,162][12883] Updated weights for policy 0, policy_version 24000 (0.0028) +[2024-06-18 00:44:06,994][12645] Fps is (10 sec: 40961.1, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 393330688. Throughput: 0: 40874.8. Samples: 393466880. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 00:44:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:44:07,638][12883] Updated weights for policy 0, policy_version 24010 (0.0038) +[2024-06-18 00:44:11,351][12883] Updated weights for policy 0, policy_version 24020 (0.0022) +[2024-06-18 00:44:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41510.4, 300 sec: 40932.2). Total num frames: 393560064. Throughput: 0: 40828.8. Samples: 393709280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 00:44:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:44:15,774][12883] Updated weights for policy 0, policy_version 24030 (0.0037) +[2024-06-18 00:44:16,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 393756672. Throughput: 0: 40847.2. Samples: 393830860. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 00:44:16,995][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:44:19,393][12883] Updated weights for policy 0, policy_version 24040 (0.0039) +[2024-06-18 00:44:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40413.9, 300 sec: 40987.8). Total num frames: 393953280. Throughput: 0: 40733.8. Samples: 394076860. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 00:44:21,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:44:23,765][12883] Updated weights for policy 0, policy_version 24050 (0.0031) +[2024-06-18 00:44:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40959.8, 300 sec: 40933.1). Total num frames: 394182656. Throughput: 0: 40730.2. Samples: 394318800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 00:44:26,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:44:27,513][12883] Updated weights for policy 0, policy_version 24060 (0.0046) +[2024-06-18 00:44:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40414.0, 300 sec: 40932.2). Total num frames: 394346496. Throughput: 0: 40738.7. Samples: 394439760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 00:44:31,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:44:32,053][12883] Updated weights for policy 0, policy_version 24070 (0.0037) +[2024-06-18 00:44:35,620][12883] Updated weights for policy 0, policy_version 24080 (0.0040) +[2024-06-18 00:44:36,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40686.8, 300 sec: 40932.2). Total num frames: 394575872. Throughput: 0: 40694.8. Samples: 394681360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 00:44:36,995][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:44:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024083_394575872.pth... +[2024-06-18 00:44:37,082][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000023484_384761856.pth +[2024-06-18 00:44:40,193][12883] Updated weights for policy 0, policy_version 24090 (0.0043) +[2024-06-18 00:44:42,000][12645] Fps is (10 sec: 42571.5, 60 sec: 40955.8, 300 sec: 40820.3). Total num frames: 394772480. Throughput: 0: 40645.2. Samples: 394929120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) +[2024-06-18 00:44:42,001][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:44:43,634][12883] Updated weights for policy 0, policy_version 24100 (0.0043) +[2024-06-18 00:44:46,994][12645] Fps is (10 sec: 37684.0, 60 sec: 40140.8, 300 sec: 40932.2). Total num frames: 394952704. Throughput: 0: 40592.4. Samples: 395047800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) +[2024-06-18 00:44:47,000][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:44:48,121][12883] Updated weights for policy 0, policy_version 24110 (0.0038) +[2024-06-18 00:44:51,555][12883] Updated weights for policy 0, policy_version 24120 (0.0029) +[2024-06-18 00:44:51,994][12645] Fps is (10 sec: 42625.4, 60 sec: 40960.0, 300 sec: 40932.3). Total num frames: 395198464. Throughput: 0: 40588.4. Samples: 395293360. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) +[2024-06-18 00:44:51,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:44:56,002][12883] Updated weights for policy 0, policy_version 24130 (0.0034) +[2024-06-18 00:44:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40960.1, 300 sec: 40876.7). Total num frames: 395378688. Throughput: 0: 40755.5. Samples: 395543280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) +[2024-06-18 00:44:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:44:58,128][12862] Signal inference workers to stop experience collection... (5550 times) +[2024-06-18 00:44:58,128][12862] Signal inference workers to resume experience collection... (5550 times) +[2024-06-18 00:44:58,152][12883] InferenceWorker_p0-w0: stopping experience collection (5550 times) +[2024-06-18 00:44:58,153][12883] InferenceWorker_p0-w0: resuming experience collection (5550 times) +[2024-06-18 00:44:59,686][12883] Updated weights for policy 0, policy_version 24140 (0.0034) +[2024-06-18 00:45:01,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 395575296. Throughput: 0: 40714.3. Samples: 395663000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) +[2024-06-18 00:45:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:45:04,174][12883] Updated weights for policy 0, policy_version 24150 (0.0028) +[2024-06-18 00:45:07,000][12645] Fps is (10 sec: 40934.6, 60 sec: 40955.7, 300 sec: 40875.8). Total num frames: 395788288. Throughput: 0: 40800.5. Samples: 395913140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) +[2024-06-18 00:45:07,000][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:45:07,661][12883] Updated weights for policy 0, policy_version 24160 (0.0035) +[2024-06-18 00:45:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40413.9, 300 sec: 40876.7). Total num frames: 395984896. Throughput: 0: 40852.6. Samples: 396157160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) +[2024-06-18 00:45:11,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 00:45:12,166][12883] Updated weights for policy 0, policy_version 24170 (0.0033) +[2024-06-18 00:45:15,656][12883] Updated weights for policy 0, policy_version 24180 (0.0034) +[2024-06-18 00:45:16,994][12645] Fps is (10 sec: 42624.5, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 396214272. Throughput: 0: 40898.9. Samples: 396280220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 00:45:16,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:45:20,229][12883] Updated weights for policy 0, policy_version 24190 (0.0040) +[2024-06-18 00:45:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 396410880. Throughput: 0: 40828.2. Samples: 396518620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 00:45:21,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:45:23,674][12883] Updated weights for policy 0, policy_version 24200 (0.0038) +[2024-06-18 00:45:26,994][12645] Fps is (10 sec: 37683.9, 60 sec: 40140.9, 300 sec: 40876.7). Total num frames: 396591104. Throughput: 0: 40848.4. Samples: 396767040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 00:45:26,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:45:28,233][12883] Updated weights for policy 0, policy_version 24210 (0.0029) +[2024-06-18 00:45:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 396804096. Throughput: 0: 40861.4. Samples: 396886560. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 00:45:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:45:32,085][12883] Updated weights for policy 0, policy_version 24220 (0.0037) +[2024-06-18 00:45:36,015][12883] Updated weights for policy 0, policy_version 24230 (0.0035) +[2024-06-18 00:45:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40687.1, 300 sec: 40765.6). Total num frames: 397017088. Throughput: 0: 40852.9. Samples: 397131740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 00:45:36,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:45:39,998][12883] Updated weights for policy 0, policy_version 24240 (0.0041) +[2024-06-18 00:45:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40691.2, 300 sec: 40876.7). Total num frames: 397213696. Throughput: 0: 40701.9. Samples: 397374860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 00:45:41,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:45:43,846][12883] Updated weights for policy 0, policy_version 24250 (0.0032) +[2024-06-18 00:45:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 397410304. Throughput: 0: 40648.2. Samples: 397492160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 00:45:46,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:45:48,250][12883] Updated weights for policy 0, policy_version 24260 (0.0028) +[2024-06-18 00:45:51,884][12883] Updated weights for policy 0, policy_version 24270 (0.0040) +[2024-06-18 00:45:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 397639680. Throughput: 0: 40585.6. Samples: 397739240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 00:45:51,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 00:45:56,283][12883] Updated weights for policy 0, policy_version 24280 (0.0037) +[2024-06-18 00:45:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 397819904. Throughput: 0: 40554.3. Samples: 397982100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 00:45:56,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:46:00,423][12883] Updated weights for policy 0, policy_version 24290 (0.0041) +[2024-06-18 00:46:01,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 398016512. Throughput: 0: 40495.7. Samples: 398102520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 00:46:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:46:04,175][12883] Updated weights for policy 0, policy_version 24300 (0.0030) +[2024-06-18 00:46:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40691.2, 300 sec: 40821.2). Total num frames: 398229504. Throughput: 0: 40667.7. Samples: 398348660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 00:46:06,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:46:08,371][12883] Updated weights for policy 0, policy_version 24310 (0.0038) +[2024-06-18 00:46:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40686.9, 300 sec: 40821.1). Total num frames: 398426112. Throughput: 0: 40660.8. Samples: 398596780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 00:46:11,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:46:12,213][12883] Updated weights for policy 0, policy_version 24320 (0.0038) +[2024-06-18 00:46:16,143][12883] Updated weights for policy 0, policy_version 24330 (0.0034) +[2024-06-18 00:46:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40687.1, 300 sec: 40876.7). Total num frames: 398655488. Throughput: 0: 40612.1. Samples: 398714100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 00:46:16,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:46:20,623][12883] Updated weights for policy 0, policy_version 24340 (0.0033) +[2024-06-18 00:46:21,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40140.9, 300 sec: 40710.1). Total num frames: 398819328. Throughput: 0: 40676.5. Samples: 398962180. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 00:46:21,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:46:24,003][12862] Signal inference workers to stop experience collection... (5600 times) +[2024-06-18 00:46:24,004][12862] Signal inference workers to resume experience collection... (5600 times) +[2024-06-18 00:46:24,019][12883] InferenceWorker_p0-w0: stopping experience collection (5600 times) +[2024-06-18 00:46:24,019][12883] InferenceWorker_p0-w0: resuming experience collection (5600 times) +[2024-06-18 00:46:24,200][12883] Updated weights for policy 0, policy_version 24350 (0.0041) +[2024-06-18 00:46:26,994][12645] Fps is (10 sec: 36044.4, 60 sec: 40413.8, 300 sec: 40710.1). Total num frames: 399015936. Throughput: 0: 40602.2. Samples: 399201960. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 00:46:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:46:28,556][12883] Updated weights for policy 0, policy_version 24360 (0.0042) +[2024-06-18 00:46:31,994][12645] Fps is (10 sec: 44235.8, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 399261696. Throughput: 0: 40751.3. Samples: 399325980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 00:46:31,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:46:32,205][12883] Updated weights for policy 0, policy_version 24370 (0.0034) +[2024-06-18 00:46:36,873][12883] Updated weights for policy 0, policy_version 24380 (0.0036) +[2024-06-18 00:46:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40413.8, 300 sec: 40765.6). Total num frames: 399441920. 
Throughput: 0: 40644.5. Samples: 399568240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 00:46:36,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:46:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024381_399458304.pth... +[2024-06-18 00:46:37,055][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000023785_389693440.pth +[2024-06-18 00:46:40,628][12883] Updated weights for policy 0, policy_version 24390 (0.0037) +[2024-06-18 00:46:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40686.8, 300 sec: 40821.2). Total num frames: 399654912. Throughput: 0: 40541.7. Samples: 399806480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 00:46:41,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:46:44,658][12883] Updated weights for policy 0, policy_version 24400 (0.0034) +[2024-06-18 00:46:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 399851520. Throughput: 0: 40636.4. Samples: 399931160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 00:46:46,994][12645] Avg episode reward: [(0, '0.037')] +[2024-06-18 00:46:48,685][12883] Updated weights for policy 0, policy_version 24410 (0.0036) +[2024-06-18 00:46:51,994][12645] Fps is (10 sec: 37683.9, 60 sec: 39867.8, 300 sec: 40654.5). Total num frames: 400031744. Throughput: 0: 40574.7. Samples: 400174520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 00:46:51,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:46:52,574][12883] Updated weights for policy 0, policy_version 24420 (0.0035) +[2024-06-18 00:46:56,870][12883] Updated weights for policy 0, policy_version 24430 (0.0034) +[2024-06-18 00:46:56,994][12645] Fps is (10 sec: 40959.2, 60 sec: 40686.8, 300 sec: 40765.6). Total num frames: 400261120. Throughput: 0: 40540.3. Samples: 400421100. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) +[2024-06-18 00:46:56,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 00:47:00,795][12883] Updated weights for policy 0, policy_version 24440 (0.0037) +[2024-06-18 00:47:01,994][12645] Fps is (10 sec: 44235.8, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 400474112. Throughput: 0: 40781.6. Samples: 400549280. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) +[2024-06-18 00:47:01,995][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:47:05,018][12883] Updated weights for policy 0, policy_version 24450 (0.0029) +[2024-06-18 00:47:06,994][12645] Fps is (10 sec: 40961.1, 60 sec: 40687.0, 300 sec: 40765.9). Total num frames: 400670720. Throughput: 0: 40454.2. Samples: 400782620. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) +[2024-06-18 00:47:06,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 00:47:08,649][12883] Updated weights for policy 0, policy_version 24460 (0.0037) +[2024-06-18 00:47:11,994][12645] Fps is (10 sec: 37684.0, 60 sec: 40414.0, 300 sec: 40710.4). Total num frames: 400850944. Throughput: 0: 40664.1. Samples: 401031840. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) +[2024-06-18 00:47:11,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:47:13,072][12883] Updated weights for policy 0, policy_version 24470 (0.0034) +[2024-06-18 00:47:16,486][12883] Updated weights for policy 0, policy_version 24480 (0.0041) +[2024-06-18 00:47:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40413.8, 300 sec: 40765.9). Total num frames: 401080320. Throughput: 0: 40509.9. Samples: 401148920. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) +[2024-06-18 00:47:16,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:47:20,914][12883] Updated weights for policy 0, policy_version 24490 (0.0026) +[2024-06-18 00:47:21,994][12645] Fps is (10 sec: 44236.0, 60 sec: 41233.0, 300 sec: 40821.1). Total num frames: 401293312. Throughput: 0: 40772.8. Samples: 401403020. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) +[2024-06-18 00:47:21,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:47:24,363][12883] Updated weights for policy 0, policy_version 24500 (0.0034) +[2024-06-18 00:47:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 401473536. Throughput: 0: 40913.3. Samples: 401647580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 00:47:26,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:47:29,038][12883] Updated weights for policy 0, policy_version 24510 (0.0033) +[2024-06-18 00:47:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40687.1, 300 sec: 40710.1). Total num frames: 401702912. Throughput: 0: 40896.5. Samples: 401771500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 00:47:31,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:47:32,182][12883] Updated weights for policy 0, policy_version 24520 (0.0029) +[2024-06-18 00:47:36,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 401883136. Throughput: 0: 40987.1. Samples: 402018940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 00:47:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:47:37,179][12883] Updated weights for policy 0, policy_version 24530 (0.0042) +[2024-06-18 00:47:40,222][12883] Updated weights for policy 0, policy_version 24540 (0.0041) +[2024-06-18 00:47:41,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40413.8, 300 sec: 40710.4). Total num frames: 402079744. Throughput: 0: 40864.1. Samples: 402259980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 00:47:41,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 00:47:45,334][12883] Updated weights for policy 0, policy_version 24550 (0.0034) +[2024-06-18 00:47:46,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41233.1, 300 sec: 40821.1). Total num frames: 402325504. Throughput: 0: 40859.2. Samples: 402387940. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:47:46,994][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 00:47:48,730][12883] Updated weights for policy 0, policy_version 24560 (0.0034) +[2024-06-18 00:47:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40959.9, 300 sec: 40710.1). Total num frames: 402489344. Throughput: 0: 40925.3. Samples: 402624260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:47:51,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:47:53,463][12883] Updated weights for policy 0, policy_version 24570 (0.0042) +[2024-06-18 00:47:53,962][12862] Signal inference workers to stop experience collection... (5650 times) +[2024-06-18 00:47:53,962][12862] Signal inference workers to resume experience collection... (5650 times) +[2024-06-18 00:47:53,993][12883] InferenceWorker_p0-w0: stopping experience collection (5650 times) +[2024-06-18 00:47:53,993][12883] InferenceWorker_p0-w0: resuming experience collection (5650 times) +[2024-06-18 00:47:56,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40687.1, 300 sec: 40710.1). Total num frames: 402702336. Throughput: 0: 40846.6. Samples: 402869940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:47:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:47:57,062][12883] Updated weights for policy 0, policy_version 24580 (0.0039) +[2024-06-18 00:48:01,396][12883] Updated weights for policy 0, policy_version 24590 (0.0032) +[2024-06-18 00:48:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 402915328. Throughput: 0: 40982.6. Samples: 402993140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 00:48:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:48:04,842][12883] Updated weights for policy 0, policy_version 24600 (0.0031) +[2024-06-18 00:48:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40959.9, 300 sec: 40877.6). Total num frames: 403128320. 
Throughput: 0: 40815.2. Samples: 403239700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 00:48:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:48:09,231][12883] Updated weights for policy 0, policy_version 24610 (0.0043) +[2024-06-18 00:48:11,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 403308544. Throughput: 0: 40694.8. Samples: 403478840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 00:48:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:48:13,309][12883] Updated weights for policy 0, policy_version 24620 (0.0030) +[2024-06-18 00:48:16,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40413.9, 300 sec: 40599.0). Total num frames: 403505152. Throughput: 0: 40604.4. Samples: 403598700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 00:48:16,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:48:17,257][12883] Updated weights for policy 0, policy_version 24630 (0.0040) +[2024-06-18 00:48:21,128][12883] Updated weights for policy 0, policy_version 24640 (0.0051) +[2024-06-18 00:48:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 403734528. Throughput: 0: 40606.5. Samples: 403846240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 00:48:21,998][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:48:25,110][12883] Updated weights for policy 0, policy_version 24650 (0.0038) +[2024-06-18 00:48:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41233.1, 300 sec: 40765.6). Total num frames: 403947520. Throughput: 0: 40715.6. Samples: 404092180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 00:48:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:48:28,849][12883] Updated weights for policy 0, policy_version 24660 (0.0045) +[2024-06-18 00:48:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40413.8, 300 sec: 40654.5). Total num frames: 404127744. 
Throughput: 0: 40607.5. Samples: 404215280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 00:48:31,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:48:33,053][12883] Updated weights for policy 0, policy_version 24670 (0.0048) +[2024-06-18 00:48:36,874][12883] Updated weights for policy 0, policy_version 24680 (0.0034) +[2024-06-18 00:48:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41233.0, 300 sec: 40821.2). Total num frames: 404357120. Throughput: 0: 40793.4. Samples: 404459960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 00:48:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:48:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024680_404357120.pth... +[2024-06-18 00:48:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024083_394575872.pth +[2024-06-18 00:48:41,402][12883] Updated weights for policy 0, policy_version 24690 (0.0037) +[2024-06-18 00:48:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 404537344. Throughput: 0: 40651.5. Samples: 404699260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 00:48:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:48:45,496][12883] Updated weights for policy 0, policy_version 24700 (0.0047) +[2024-06-18 00:48:46,994][12645] Fps is (10 sec: 39320.8, 60 sec: 40413.7, 300 sec: 40710.0). Total num frames: 404750336. Throughput: 0: 40552.4. Samples: 404818000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 00:48:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:48:49,303][12883] Updated weights for policy 0, policy_version 24710 (0.0036) +[2024-06-18 00:48:51,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 404930560. Throughput: 0: 40547.6. Samples: 405064340. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) +[2024-06-18 00:48:51,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:48:53,575][12883] Updated weights for policy 0, policy_version 24720 (0.0030) +[2024-06-18 00:48:56,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40960.0, 300 sec: 40765.6). Total num frames: 405159936. Throughput: 0: 40611.1. Samples: 405306340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) +[2024-06-18 00:48:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:48:57,097][12883] Updated weights for policy 0, policy_version 24730 (0.0031) +[2024-06-18 00:49:01,511][12883] Updated weights for policy 0, policy_version 24740 (0.0041) +[2024-06-18 00:49:01,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 405356544. Throughput: 0: 40884.8. Samples: 405438520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) +[2024-06-18 00:49:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:49:05,127][12883] Updated weights for policy 0, policy_version 24750 (0.0043) +[2024-06-18 00:49:06,994][12645] Fps is (10 sec: 37682.5, 60 sec: 40140.7, 300 sec: 40599.0). Total num frames: 405536768. Throughput: 0: 40717.2. Samples: 405678520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 00:49:06,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:49:09,449][12883] Updated weights for policy 0, policy_version 24760 (0.0037) +[2024-06-18 00:49:11,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41233.0, 300 sec: 40765.6). Total num frames: 405782528. Throughput: 0: 40528.1. Samples: 405915940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 00:49:11,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 00:49:13,119][12883] Updated weights for policy 0, policy_version 24770 (0.0040) +[2024-06-18 00:49:16,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 405962752. Throughput: 0: 40750.2. Samples: 406049040. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 00:49:16,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:49:17,475][12883] Updated weights for policy 0, policy_version 24780 (0.0033) +[2024-06-18 00:49:21,049][12883] Updated weights for policy 0, policy_version 24790 (0.0035) +[2024-06-18 00:49:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40686.9, 300 sec: 40654.5). Total num frames: 406175744. Throughput: 0: 40596.8. Samples: 406286820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 00:49:21,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:49:25,446][12883] Updated weights for policy 0, policy_version 24800 (0.0041) +[2024-06-18 00:49:26,996][12645] Fps is (10 sec: 42588.8, 60 sec: 40685.4, 300 sec: 40820.8). Total num frames: 406388736. Throughput: 0: 40953.0. Samples: 406542240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 00:49:26,997][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:49:28,905][12883] Updated weights for policy 0, policy_version 24810 (0.0036) +[2024-06-18 00:49:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40687.0, 300 sec: 40654.6). Total num frames: 406568960. Throughput: 0: 41020.6. Samples: 406663920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 00:49:31,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 00:49:33,385][12883] Updated weights for policy 0, policy_version 24820 (0.0045) +[2024-06-18 00:49:34,799][12862] Signal inference workers to stop experience collection... (5700 times) +[2024-06-18 00:49:34,799][12862] Signal inference workers to resume experience collection... 
(5700 times) +[2024-06-18 00:49:34,815][12883] InferenceWorker_p0-w0: stopping experience collection (5700 times) +[2024-06-18 00:49:34,846][12883] InferenceWorker_p0-w0: resuming experience collection (5700 times) +[2024-06-18 00:49:36,786][12883] Updated weights for policy 0, policy_version 24830 (0.0040) +[2024-06-18 00:49:36,994][12645] Fps is (10 sec: 42608.4, 60 sec: 40960.0, 300 sec: 40822.0). Total num frames: 406814720. Throughput: 0: 40949.7. Samples: 406907080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 00:49:36,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:49:41,298][12883] Updated weights for policy 0, policy_version 24840 (0.0038) +[2024-06-18 00:49:42,000][12645] Fps is (10 sec: 42571.6, 60 sec: 40955.8, 300 sec: 40820.3). Total num frames: 406994944. Throughput: 0: 41060.0. Samples: 407154300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 00:49:42,001][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:49:45,311][12883] Updated weights for policy 0, policy_version 24850 (0.0034) +[2024-06-18 00:49:46,994][12645] Fps is (10 sec: 37682.6, 60 sec: 40687.0, 300 sec: 40654.5). Total num frames: 407191552. Throughput: 0: 40766.6. Samples: 407273020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 00:49:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:49:49,643][12883] Updated weights for policy 0, policy_version 24860 (0.0029) +[2024-06-18 00:49:51,994][12645] Fps is (10 sec: 44264.1, 60 sec: 41779.1, 300 sec: 40876.7). Total num frames: 407437312. Throughput: 0: 40991.2. Samples: 407523120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 00:49:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:49:53,215][12883] Updated weights for policy 0, policy_version 24870 (0.0044) +[2024-06-18 00:49:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 407601152. Throughput: 0: 41217.2. Samples: 407770720. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 00:49:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:49:57,737][12883] Updated weights for policy 0, policy_version 24880 (0.0038) +[2024-06-18 00:50:01,061][12883] Updated weights for policy 0, policy_version 24890 (0.0033) +[2024-06-18 00:50:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40960.0, 300 sec: 40766.5). Total num frames: 407814144. Throughput: 0: 40744.0. Samples: 407882520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 00:50:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:50:05,701][12883] Updated weights for policy 0, policy_version 24900 (0.0040) +[2024-06-18 00:50:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41506.1, 300 sec: 40821.1). Total num frames: 408027136. Throughput: 0: 41202.1. Samples: 408140920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 00:50:06,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:50:08,907][12883] Updated weights for policy 0, policy_version 24910 (0.0036) +[2024-06-18 00:50:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40413.8, 300 sec: 40654.6). Total num frames: 408207360. Throughput: 0: 41000.8. Samples: 408387180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 00:50:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:50:14,188][12883] Updated weights for policy 0, policy_version 24920 (0.0036) +[2024-06-18 00:50:16,514][12883] Updated weights for policy 0, policy_version 24930 (0.0032) +[2024-06-18 00:50:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 40821.1). Total num frames: 408453120. Throughput: 0: 41059.8. Samples: 408511620. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 00:50:16,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:50:21,962][12883] Updated weights for policy 0, policy_version 24940 (0.0030) +[2024-06-18 00:50:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 408616960. Throughput: 0: 41051.5. Samples: 408754400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 00:50:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:50:24,539][12883] Updated weights for policy 0, policy_version 24950 (0.0043) +[2024-06-18 00:50:26,996][12645] Fps is (10 sec: 37675.4, 60 sec: 40687.0, 300 sec: 40765.3). Total num frames: 408829952. Throughput: 0: 41064.1. Samples: 409002020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 00:50:26,996][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:50:29,766][12883] Updated weights for policy 0, policy_version 24960 (0.0041) +[2024-06-18 00:50:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 41779.2, 300 sec: 40876.7). Total num frames: 409075712. Throughput: 0: 41227.2. Samples: 409128240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 00:50:31,995][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:50:32,600][12883] Updated weights for policy 0, policy_version 24970 (0.0035) +[2024-06-18 00:50:36,994][12645] Fps is (10 sec: 40969.5, 60 sec: 40413.9, 300 sec: 40765.6). Total num frames: 409239552. Throughput: 0: 41105.5. Samples: 409372860. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 00:50:36,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:50:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024978_409239552.pth... +[2024-06-18 00:50:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024381_399458304.pth +[2024-06-18 00:50:37,378][12862] Signal inference workers to stop experience collection... 
(5750 times) +[2024-06-18 00:50:37,413][12883] InferenceWorker_p0-w0: stopping experience collection (5750 times) +[2024-06-18 00:50:37,435][12862] Signal inference workers to resume experience collection... (5750 times) +[2024-06-18 00:50:37,436][12883] InferenceWorker_p0-w0: resuming experience collection (5750 times) +[2024-06-18 00:50:37,574][12883] Updated weights for policy 0, policy_version 24980 (0.0033) +[2024-06-18 00:50:40,631][12883] Updated weights for policy 0, policy_version 24990 (0.0044) +[2024-06-18 00:50:41,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40964.3, 300 sec: 40821.1). Total num frames: 409452544. Throughput: 0: 41061.9. Samples: 409618500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 00:50:41,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:50:45,618][12883] Updated weights for policy 0, policy_version 25000 (0.0035) +[2024-06-18 00:50:46,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41506.2, 300 sec: 40821.2). Total num frames: 409681920. Throughput: 0: 41386.7. Samples: 409744920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 00:50:46,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:50:49,166][12883] Updated weights for policy 0, policy_version 25010 (0.0051) +[2024-06-18 00:50:51,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40413.9, 300 sec: 40821.1). Total num frames: 409862144. Throughput: 0: 41136.5. Samples: 409992060. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 00:50:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:50:53,557][12883] Updated weights for policy 0, policy_version 25020 (0.0037) +[2024-06-18 00:50:56,793][12883] Updated weights for policy 0, policy_version 25030 (0.0032) +[2024-06-18 00:50:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 40932.2). Total num frames: 410091520. Throughput: 0: 41111.9. Samples: 410237220. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 00:50:56,996][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:51:01,414][12883] Updated weights for policy 0, policy_version 25040 (0.0040) +[2024-06-18 00:51:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 410271744. Throughput: 0: 41049.1. Samples: 410358820. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 00:51:01,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:51:04,763][12883] Updated weights for policy 0, policy_version 25050 (0.0032) +[2024-06-18 00:51:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41233.2, 300 sec: 40932.2). Total num frames: 410501120. Throughput: 0: 41152.5. Samples: 410606260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 00:51:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:51:09,831][12883] Updated weights for policy 0, policy_version 25060 (0.0036) +[2024-06-18 00:51:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41779.2, 300 sec: 40876.7). Total num frames: 410714112. Throughput: 0: 41100.7. Samples: 410851460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 00:51:11,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 00:51:12,644][12883] Updated weights for policy 0, policy_version 25070 (0.0031) +[2024-06-18 00:51:16,994][12645] Fps is (10 sec: 36045.2, 60 sec: 40141.0, 300 sec: 40821.2). Total num frames: 410861568. Throughput: 0: 41055.2. Samples: 410975720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 00:51:16,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 00:51:17,437][12883] Updated weights for policy 0, policy_version 25080 (0.0044) +[2024-06-18 00:51:20,907][12883] Updated weights for policy 0, policy_version 25090 (0.0044) +[2024-06-18 00:51:21,994][12645] Fps is (10 sec: 37683.2, 60 sec: 41233.1, 300 sec: 40932.2). Total num frames: 411090944. Throughput: 0: 41009.2. Samples: 411218280. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 00:51:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:51:25,276][12883] Updated weights for policy 0, policy_version 25100 (0.0032) +[2024-06-18 00:51:26,994][12645] Fps is (10 sec: 45875.2, 60 sec: 41507.7, 300 sec: 40876.7). Total num frames: 411320320. Throughput: 0: 41024.4. Samples: 411464600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 00:51:26,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:51:28,912][12883] Updated weights for policy 0, policy_version 25110 (0.0043) +[2024-06-18 00:51:31,996][12645] Fps is (10 sec: 42588.9, 60 sec: 40685.4, 300 sec: 40931.9). Total num frames: 411516928. Throughput: 0: 41035.3. Samples: 411591600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 00:51:31,997][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:51:32,898][12883] Updated weights for policy 0, policy_version 25120 (0.0041) +[2024-06-18 00:51:36,763][12883] Updated weights for policy 0, policy_version 25130 (0.0056) +[2024-06-18 00:51:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41506.0, 300 sec: 40932.2). Total num frames: 411729920. Throughput: 0: 41024.0. Samples: 411838140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 00:51:37,000][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:51:40,842][12883] Updated weights for policy 0, policy_version 25140 (0.0036) +[2024-06-18 00:51:41,994][12645] Fps is (10 sec: 39330.0, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 411910144. Throughput: 0: 41079.9. Samples: 412085820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:51:41,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:51:44,883][12883] Updated weights for policy 0, policy_version 25150 (0.0033) +[2024-06-18 00:51:46,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40413.8, 300 sec: 40932.2). Total num frames: 412106752. Throughput: 0: 41087.9. Samples: 412207780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:51:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:51:48,739][12883] Updated weights for policy 0, policy_version 25160 (0.0035) +[2024-06-18 00:51:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 412336128. Throughput: 0: 41012.3. Samples: 412451820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 00:51:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:51:53,117][12883] Updated weights for policy 0, policy_version 25170 (0.0042) +[2024-06-18 00:51:56,635][12862] Signal inference workers to stop experience collection... (5800 times) +[2024-06-18 00:51:56,635][12862] Signal inference workers to resume experience collection... (5800 times) +[2024-06-18 00:51:56,649][12883] InferenceWorker_p0-w0: stopping experience collection (5800 times) +[2024-06-18 00:51:56,650][12883] InferenceWorker_p0-w0: resuming experience collection (5800 times) +[2024-06-18 00:51:56,779][12883] Updated weights for policy 0, policy_version 25180 (0.0029) +[2024-06-18 00:51:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 40960.1, 300 sec: 40932.3). Total num frames: 412549120. Throughput: 0: 41086.7. Samples: 412700360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 00:51:56,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 00:52:01,373][12883] Updated weights for policy 0, policy_version 25190 (0.0035) +[2024-06-18 00:52:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 412729344. Throughput: 0: 41092.8. Samples: 412824900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 00:52:01,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:52:04,643][12883] Updated weights for policy 0, policy_version 25200 (0.0045) +[2024-06-18 00:52:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 412958720. Throughput: 0: 41091.1. 
Samples: 413067380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 00:52:06,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:52:09,202][12883] Updated weights for policy 0, policy_version 25210 (0.0036) +[2024-06-18 00:52:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 413155328. Throughput: 0: 41100.3. Samples: 413314120. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) +[2024-06-18 00:52:11,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:52:12,837][12883] Updated weights for policy 0, policy_version 25220 (0.0039) +[2024-06-18 00:52:16,994][12645] Fps is (10 sec: 37683.2, 60 sec: 41233.0, 300 sec: 40821.2). Total num frames: 413335552. Throughput: 0: 40985.6. Samples: 413435860. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) +[2024-06-18 00:52:16,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:52:17,196][12883] Updated weights for policy 0, policy_version 25230 (0.0040) +[2024-06-18 00:52:21,017][12883] Updated weights for policy 0, policy_version 25240 (0.0038) +[2024-06-18 00:52:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 413581312. Throughput: 0: 40900.0. Samples: 413678640. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) +[2024-06-18 00:52:21,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:52:25,261][12883] Updated weights for policy 0, policy_version 25250 (0.0048) +[2024-06-18 00:52:26,996][12645] Fps is (10 sec: 40950.9, 60 sec: 40412.3, 300 sec: 40820.8). Total num frames: 413745152. Throughput: 0: 40837.2. Samples: 413923580. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) +[2024-06-18 00:52:26,996][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 00:52:29,030][12883] Updated weights for policy 0, policy_version 25260 (0.0040) +[2024-06-18 00:52:31,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40688.5, 300 sec: 40932.2). Total num frames: 413958144. Throughput: 0: 40699.6. 
Samples: 414039260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 00:52:31,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 00:52:33,242][12883] Updated weights for policy 0, policy_version 25270 (0.0038) +[2024-06-18 00:52:36,994][12645] Fps is (10 sec: 42607.7, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 414171136. Throughput: 0: 40753.9. Samples: 414285740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 00:52:36,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:52:37,001][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000025280_414187520.pth... +[2024-06-18 00:52:37,008][12883] Updated weights for policy 0, policy_version 25280 (0.0050) +[2024-06-18 00:52:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024680_404357120.pth +[2024-06-18 00:52:41,632][12883] Updated weights for policy 0, policy_version 25290 (0.0040) +[2024-06-18 00:52:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 40821.1). Total num frames: 414367744. Throughput: 0: 40855.9. Samples: 414538880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 00:52:41,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:52:44,787][12883] Updated weights for policy 0, policy_version 25300 (0.0031) +[2024-06-18 00:52:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 414580736. Throughput: 0: 40740.4. Samples: 414658220. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 00:52:46,995][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:52:49,543][12883] Updated weights for policy 0, policy_version 25310 (0.0033) +[2024-06-18 00:52:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 414793728. Throughput: 0: 40963.6. Samples: 414910740. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 00:52:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:52:52,660][12883] Updated weights for policy 0, policy_version 25320 (0.0045) +[2024-06-18 00:52:57,000][12645] Fps is (10 sec: 39297.5, 60 sec: 40409.6, 300 sec: 40875.8). Total num frames: 414973952. Throughput: 0: 40911.3. Samples: 415155380. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 00:52:57,000][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:52:57,415][12883] Updated weights for policy 0, policy_version 25330 (0.0043) +[2024-06-18 00:53:00,434][12883] Updated weights for policy 0, policy_version 25340 (0.0037) +[2024-06-18 00:53:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 40987.8). Total num frames: 415219712. Throughput: 0: 40879.1. Samples: 415275420. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 00:53:01,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:53:05,867][12883] Updated weights for policy 0, policy_version 25350 (0.0028) +[2024-06-18 00:53:06,994][12645] Fps is (10 sec: 40984.9, 60 sec: 40413.8, 300 sec: 40932.2). Total num frames: 415383552. Throughput: 0: 41076.8. Samples: 415527100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 00:53:06,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:53:08,274][12883] Updated weights for policy 0, policy_version 25360 (0.0047) +[2024-06-18 00:53:11,994][12645] Fps is (10 sec: 37682.8, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 415596544. Throughput: 0: 41009.5. Samples: 415768920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 00:53:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:53:13,673][12883] Updated weights for policy 0, policy_version 25370 (0.0031) +[2024-06-18 00:53:15,713][12862] Signal inference workers to stop experience collection... 
(5850 times) +[2024-06-18 00:53:15,713][12862] Signal inference workers to resume experience collection... (5850 times) +[2024-06-18 00:53:15,728][12883] InferenceWorker_p0-w0: stopping experience collection (5850 times) +[2024-06-18 00:53:15,728][12883] InferenceWorker_p0-w0: resuming experience collection (5850 times) +[2024-06-18 00:53:16,668][12883] Updated weights for policy 0, policy_version 25380 (0.0051) +[2024-06-18 00:53:16,994][12645] Fps is (10 sec: 45875.9, 60 sec: 41779.2, 300 sec: 41043.3). Total num frames: 415842304. Throughput: 0: 41132.0. Samples: 415890200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 00:53:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:53:21,409][12883] Updated weights for policy 0, policy_version 25390 (0.0048) +[2024-06-18 00:53:21,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40140.8, 300 sec: 40821.2). Total num frames: 415989760. Throughput: 0: 41116.5. Samples: 416135980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 00:53:21,995][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:53:24,598][12883] Updated weights for policy 0, policy_version 25400 (0.0035) +[2024-06-18 00:53:26,994][12645] Fps is (10 sec: 36044.8, 60 sec: 40961.5, 300 sec: 40932.2). Total num frames: 416202752. Throughput: 0: 40964.9. Samples: 416382300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 00:53:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:53:29,667][12883] Updated weights for policy 0, policy_version 25410 (0.0037) +[2024-06-18 00:53:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 416415744. Throughput: 0: 41185.3. Samples: 416511560. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 00:53:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:53:32,548][12883] Updated weights for policy 0, policy_version 25420 (0.0036) +[2024-06-18 00:53:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 416612352. Throughput: 0: 41048.0. Samples: 416757900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 00:53:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:53:37,419][12883] Updated weights for policy 0, policy_version 25430 (0.0038) +[2024-06-18 00:53:40,494][12883] Updated weights for policy 0, policy_version 25440 (0.0041) +[2024-06-18 00:53:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 416858112. Throughput: 0: 40822.9. Samples: 416992160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 00:53:42,003][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:53:45,010][12883] Updated weights for policy 0, policy_version 25450 (0.0029) +[2024-06-18 00:53:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 417038336. Throughput: 0: 41079.1. Samples: 417123980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 00:53:46,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:53:48,780][12883] Updated weights for policy 0, policy_version 25460 (0.0030) +[2024-06-18 00:53:51,994][12645] Fps is (10 sec: 36044.8, 60 sec: 40413.8, 300 sec: 40876.7). Total num frames: 417218560. Throughput: 0: 40923.6. Samples: 417368660. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 00:53:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:53:52,954][12883] Updated weights for policy 0, policy_version 25470 (0.0038) +[2024-06-18 00:53:56,474][12883] Updated weights for policy 0, policy_version 25480 (0.0030) +[2024-06-18 00:53:56,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41510.4, 300 sec: 41043.3). Total num frames: 417464320. Throughput: 0: 40964.0. Samples: 417612300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 00:53:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:54:01,171][12883] Updated weights for policy 0, policy_version 25490 (0.0030) +[2024-06-18 00:54:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 40687.0, 300 sec: 41098.9). Total num frames: 417660928. Throughput: 0: 41212.1. Samples: 417744740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 00:54:01,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:54:04,289][12883] Updated weights for policy 0, policy_version 25500 (0.0041) +[2024-06-18 00:54:06,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 417841152. Throughput: 0: 41081.3. Samples: 417984640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 00:54:06,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 00:54:09,081][12883] Updated weights for policy 0, policy_version 25510 (0.0030) +[2024-06-18 00:54:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 41098.9). Total num frames: 418086912. Throughput: 0: 41095.2. Samples: 418231580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 00:54:11,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 00:54:12,415][12883] Updated weights for policy 0, policy_version 25520 (0.0033) +[2024-06-18 00:54:16,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40140.8, 300 sec: 40932.2). Total num frames: 418250752. Throughput: 0: 41069.5. Samples: 418359680. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 00:54:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:54:17,198][12883] Updated weights for policy 0, policy_version 25530 (0.0035) +[2024-06-18 00:54:20,447][12883] Updated weights for policy 0, policy_version 25540 (0.0042) +[2024-06-18 00:54:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 40988.1). Total num frames: 418480128. Throughput: 0: 40774.6. Samples: 418592760. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 00:54:21,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 00:54:25,124][12883] Updated weights for policy 0, policy_version 25550 (0.0037) +[2024-06-18 00:54:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41506.1, 300 sec: 41098.8). Total num frames: 418693120. Throughput: 0: 41114.7. Samples: 418842320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 00:54:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:54:28,571][12883] Updated weights for policy 0, policy_version 25560 (0.0028) +[2024-06-18 00:54:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 418873344. Throughput: 0: 40899.0. Samples: 418964440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 00:54:31,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 00:54:32,946][12883] Updated weights for policy 0, policy_version 25570 (0.0037) +[2024-06-18 00:54:36,515][12883] Updated weights for policy 0, policy_version 25580 (0.0041) +[2024-06-18 00:54:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 41044.2). Total num frames: 419102720. Throughput: 0: 40925.0. Samples: 419210280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 00:54:36,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:54:37,029][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000025581_419119104.pth... 
+[2024-06-18 00:54:37,092][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024978_409239552.pth +[2024-06-18 00:54:40,762][12883] Updated weights for policy 0, policy_version 25590 (0.0040) +[2024-06-18 00:54:41,994][12645] Fps is (10 sec: 40960.9, 60 sec: 40414.0, 300 sec: 40987.8). Total num frames: 419282944. Throughput: 0: 41127.8. Samples: 419463040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 00:54:41,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:54:42,550][12862] Signal inference workers to stop experience collection... (5900 times) +[2024-06-18 00:54:42,551][12862] Signal inference workers to resume experience collection... (5900 times) +[2024-06-18 00:54:42,595][12883] InferenceWorker_p0-w0: stopping experience collection (5900 times) +[2024-06-18 00:54:42,595][12883] InferenceWorker_p0-w0: resuming experience collection (5900 times) +[2024-06-18 00:54:44,791][12883] Updated weights for policy 0, policy_version 25600 (0.0034) +[2024-06-18 00:54:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 419495936. Throughput: 0: 40870.1. Samples: 419583900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 00:54:46,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:54:48,602][12883] Updated weights for policy 0, policy_version 25610 (0.0030) +[2024-06-18 00:54:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 419692544. Throughput: 0: 40938.8. Samples: 419826880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 00:54:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:54:52,994][12883] Updated weights for policy 0, policy_version 25620 (0.0049) +[2024-06-18 00:54:56,946][12883] Updated weights for policy 0, policy_version 25630 (0.0035) +[2024-06-18 00:54:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 419921920. 
Throughput: 0: 40882.1. Samples: 420071280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 00:54:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:55:00,896][12883] Updated weights for policy 0, policy_version 25640 (0.0033) +[2024-06-18 00:55:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 420134912. Throughput: 0: 40676.4. Samples: 420190120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 00:55:01,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 00:55:05,423][12883] Updated weights for policy 0, policy_version 25650 (0.0038) +[2024-06-18 00:55:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41098.8). Total num frames: 420331520. Throughput: 0: 41067.5. Samples: 420440800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 00:55:06,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 00:55:08,907][12883] Updated weights for policy 0, policy_version 25660 (0.0033) +[2024-06-18 00:55:11,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40413.9, 300 sec: 40876.7). Total num frames: 420511744. Throughput: 0: 40864.5. Samples: 420681220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 00:55:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:55:13,483][12883] Updated weights for policy 0, policy_version 25670 (0.0036) +[2024-06-18 00:55:16,977][12883] Updated weights for policy 0, policy_version 25680 (0.0032) +[2024-06-18 00:55:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41506.1, 300 sec: 41098.9). Total num frames: 420741120. Throughput: 0: 40767.2. Samples: 420798960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 00:55:16,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:55:21,179][12883] Updated weights for policy 0, policy_version 25690 (0.0037) +[2024-06-18 00:55:21,996][12645] Fps is (10 sec: 40950.8, 60 sec: 40685.4, 300 sec: 40987.8). Total num frames: 420921344. 
Throughput: 0: 40842.3. Samples: 421048280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 00:55:21,997][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:55:25,146][12883] Updated weights for policy 0, policy_version 25700 (0.0034) +[2024-06-18 00:55:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 421150720. Throughput: 0: 40572.4. Samples: 421288800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 00:55:27,000][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:55:28,993][12883] Updated weights for policy 0, policy_version 25710 (0.0038) +[2024-06-18 00:55:31,994][12645] Fps is (10 sec: 39330.2, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 421314560. Throughput: 0: 40616.0. Samples: 421411620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 00:55:31,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:55:32,981][12883] Updated weights for policy 0, policy_version 25720 (0.0043) +[2024-06-18 00:55:36,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40686.8, 300 sec: 40987.7). Total num frames: 421543936. Throughput: 0: 40740.3. Samples: 421660200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 00:55:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:55:37,091][12883] Updated weights for policy 0, policy_version 25730 (0.0030) +[2024-06-18 00:55:41,025][12883] Updated weights for policy 0, policy_version 25740 (0.0043) +[2024-06-18 00:55:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40959.8, 300 sec: 40876.7). Total num frames: 421740544. Throughput: 0: 40729.7. Samples: 421904120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 00:55:41,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:55:44,985][12883] Updated weights for policy 0, policy_version 25750 (0.0030) +[2024-06-18 00:55:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 421953536. 
Throughput: 0: 40699.9. Samples: 422021620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 00:55:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:55:48,869][12883] Updated weights for policy 0, policy_version 25760 (0.0034) +[2024-06-18 00:55:51,994][12645] Fps is (10 sec: 39322.3, 60 sec: 40686.9, 300 sec: 40821.2). Total num frames: 422133760. Throughput: 0: 40672.6. Samples: 422271060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 00:55:51,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:55:53,163][12883] Updated weights for policy 0, policy_version 25770 (0.0050) +[2024-06-18 00:55:56,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40414.0, 300 sec: 40932.2). Total num frames: 422346752. Throughput: 0: 40802.7. Samples: 422517340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 00:55:56,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:55:57,246][12883] Updated weights for policy 0, policy_version 25780 (0.0034) +[2024-06-18 00:56:00,695][12862] Signal inference workers to stop experience collection... (5950 times) +[2024-06-18 00:56:00,728][12883] InferenceWorker_p0-w0: stopping experience collection (5950 times) +[2024-06-18 00:56:00,752][12862] Signal inference workers to resume experience collection... (5950 times) +[2024-06-18 00:56:00,753][12883] InferenceWorker_p0-w0: resuming experience collection (5950 times) +[2024-06-18 00:56:01,061][12883] Updated weights for policy 0, policy_version 25790 (0.0033) +[2024-06-18 00:56:02,000][12645] Fps is (10 sec: 44209.1, 60 sec: 40682.7, 300 sec: 40931.4). Total num frames: 422576128. Throughput: 0: 40992.5. Samples: 422643880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 00:56:02,000][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:56:05,149][12883] Updated weights for policy 0, policy_version 25800 (0.0042) +[2024-06-18 00:56:06,996][12645] Fps is (10 sec: 40950.5, 60 sec: 40412.4, 300 sec: 40820.8). 
Total num frames: 422756352. Throughput: 0: 40709.8. Samples: 422880220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 00:56:06,997][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:56:08,968][12883] Updated weights for policy 0, policy_version 25810 (0.0026) +[2024-06-18 00:56:11,994][12645] Fps is (10 sec: 37706.7, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 422952960. Throughput: 0: 40843.1. Samples: 423126740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 00:56:11,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:56:13,284][12883] Updated weights for policy 0, policy_version 25820 (0.0035) +[2024-06-18 00:56:16,994][12645] Fps is (10 sec: 40969.3, 60 sec: 40413.9, 300 sec: 40932.2). Total num frames: 423165952. Throughput: 0: 40753.8. Samples: 423245540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 00:56:16,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:56:17,364][12883] Updated weights for policy 0, policy_version 25830 (0.0030) +[2024-06-18 00:56:21,546][12883] Updated weights for policy 0, policy_version 25840 (0.0039) +[2024-06-18 00:56:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41234.5, 300 sec: 40932.2). Total num frames: 423395328. Throughput: 0: 40676.9. Samples: 423490660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 00:56:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:56:25,309][12883] Updated weights for policy 0, policy_version 25850 (0.0042) +[2024-06-18 00:56:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.9, 300 sec: 40877.0). Total num frames: 423575552. Throughput: 0: 40592.2. Samples: 423730760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 00:56:26,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-18 00:56:29,519][12883] Updated weights for policy 0, policy_version 25860 (0.0035) +[2024-06-18 00:56:31,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40960.0, 300 sec: 40821.2). 
Total num frames: 423772160. Throughput: 0: 40741.4. Samples: 423854980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:56:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:56:33,690][12883] Updated weights for policy 0, policy_version 25870 (0.0033) +[2024-06-18 00:56:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40414.0, 300 sec: 40876.7). Total num frames: 423968768. Throughput: 0: 40672.0. Samples: 424101300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:56:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:56:37,034][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000025878_423985152.pth... +[2024-06-18 00:56:37,092][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000025280_414187520.pth +[2024-06-18 00:56:37,670][12883] Updated weights for policy 0, policy_version 25880 (0.0045) +[2024-06-18 00:56:41,509][12883] Updated weights for policy 0, policy_version 25890 (0.0026) +[2024-06-18 00:56:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 424198144. Throughput: 0: 40653.7. Samples: 424346760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:56:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 00:56:45,557][12883] Updated weights for policy 0, policy_version 25900 (0.0030) +[2024-06-18 00:56:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 424394752. Throughput: 0: 40658.0. Samples: 424473240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 00:56:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:56:49,288][12883] Updated weights for policy 0, policy_version 25910 (0.0031) +[2024-06-18 00:56:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.0, 300 sec: 40932.2). Total num frames: 424624128. Throughput: 0: 41015.3. Samples: 424725820. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:56:51,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:56:53,250][12883] Updated weights for policy 0, policy_version 25920 (0.0046) +[2024-06-18 00:56:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 424804352. Throughput: 0: 40905.8. Samples: 424967500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:56:56,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:56:57,319][12883] Updated weights for policy 0, policy_version 25930 (0.0038) +[2024-06-18 00:57:00,955][12883] Updated weights for policy 0, policy_version 25940 (0.0040) +[2024-06-18 00:57:01,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40691.2, 300 sec: 40876.7). Total num frames: 425017344. Throughput: 0: 41047.1. Samples: 425092660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 00:57:01,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:57:05,299][12883] Updated weights for policy 0, policy_version 25950 (0.0039) +[2024-06-18 00:57:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40961.5, 300 sec: 40876.7). Total num frames: 425213952. Throughput: 0: 41034.3. Samples: 425337200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 00:57:06,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:57:09,172][12883] Updated weights for policy 0, policy_version 25960 (0.0052) +[2024-06-18 00:57:11,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40959.9, 300 sec: 40932.2). Total num frames: 425410560. Throughput: 0: 41029.6. Samples: 425577100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 00:57:11,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:57:13,876][12883] Updated weights for policy 0, policy_version 25970 (0.0043) +[2024-06-18 00:57:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 40821.2). Total num frames: 425623552. Throughput: 0: 41026.7. Samples: 425701180. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 00:57:16,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 00:57:17,535][12883] Updated weights for policy 0, policy_version 25980 (0.0036) +[2024-06-18 00:57:21,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40140.9, 300 sec: 40877.0). Total num frames: 425803776. Throughput: 0: 40985.4. Samples: 425945640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 00:57:21,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 00:57:22,113][12883] Updated weights for policy 0, policy_version 25990 (0.0053) +[2024-06-18 00:57:25,814][12883] Updated weights for policy 0, policy_version 26000 (0.0027) +[2024-06-18 00:57:26,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40959.9, 300 sec: 40932.2). Total num frames: 426033152. Throughput: 0: 40803.5. Samples: 426182920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 00:57:26,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 00:57:27,098][12862] Signal inference workers to stop experience collection... (6000 times) +[2024-06-18 00:57:27,125][12883] InferenceWorker_p0-w0: stopping experience collection (6000 times) +[2024-06-18 00:57:27,161][12862] Signal inference workers to resume experience collection... (6000 times) +[2024-06-18 00:57:27,164][12883] InferenceWorker_p0-w0: resuming experience collection (6000 times) +[2024-06-18 00:57:30,144][12883] Updated weights for policy 0, policy_version 26010 (0.0032) +[2024-06-18 00:57:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 426229760. Throughput: 0: 40784.5. Samples: 426308540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 00:57:31,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:57:33,700][12883] Updated weights for policy 0, policy_version 26020 (0.0049) +[2024-06-18 00:57:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41233.1, 300 sec: 40932.2). Total num frames: 426442752. 
Throughput: 0: 40702.3. Samples: 426557420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 00:57:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:57:38,150][12883] Updated weights for policy 0, policy_version 26030 (0.0047) +[2024-06-18 00:57:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40413.9, 300 sec: 40821.2). Total num frames: 426622976. Throughput: 0: 40482.7. Samples: 426789220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 00:57:41,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:57:42,216][12883] Updated weights for policy 0, policy_version 26040 (0.0034) +[2024-06-18 00:57:46,109][12883] Updated weights for policy 0, policy_version 26050 (0.0040) +[2024-06-18 00:57:46,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40414.0, 300 sec: 40765.6). Total num frames: 426819584. Throughput: 0: 40371.6. Samples: 426909380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 00:57:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:57:50,074][12883] Updated weights for policy 0, policy_version 26060 (0.0045) +[2024-06-18 00:57:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40413.9, 300 sec: 40933.1). Total num frames: 427048960. Throughput: 0: 40424.1. Samples: 427156280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 00:57:51,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-18 00:57:54,051][12883] Updated weights for policy 0, policy_version 26070 (0.0025) +[2024-06-18 00:57:56,999][12645] Fps is (10 sec: 42575.0, 60 sec: 40683.2, 300 sec: 40764.9). Total num frames: 427245568. Throughput: 0: 40417.8. Samples: 427396120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 00:57:57,000][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:57:57,947][12883] Updated weights for policy 0, policy_version 26080 (0.0057) +[2024-06-18 00:58:01,996][12645] Fps is (10 sec: 39312.7, 60 sec: 40412.3, 300 sec: 40876.4). Total num frames: 427442176. 
Throughput: 0: 40403.3. Samples: 427519420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 00:58:01,996][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:58:02,132][12883] Updated weights for policy 0, policy_version 26090 (0.0027) +[2024-06-18 00:58:06,004][12883] Updated weights for policy 0, policy_version 26100 (0.0042) +[2024-06-18 00:58:06,996][12645] Fps is (10 sec: 40973.1, 60 sec: 40685.4, 300 sec: 40876.4). Total num frames: 427655168. Throughput: 0: 40461.0. Samples: 427766480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 00:58:06,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:58:10,138][12883] Updated weights for policy 0, policy_version 26110 (0.0034) +[2024-06-18 00:58:11,994][12645] Fps is (10 sec: 39330.1, 60 sec: 40413.9, 300 sec: 40654.5). Total num frames: 427835392. Throughput: 0: 40680.5. Samples: 428013540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 00:58:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:58:14,029][12883] Updated weights for policy 0, policy_version 26120 (0.0038) +[2024-06-18 00:58:16,994][12645] Fps is (10 sec: 39330.2, 60 sec: 40413.8, 300 sec: 40876.7). Total num frames: 428048384. Throughput: 0: 40453.3. Samples: 428128940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 00:58:16,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 00:58:18,242][12883] Updated weights for policy 0, policy_version 26130 (0.0041) +[2024-06-18 00:58:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 428261376. Throughput: 0: 40411.6. Samples: 428375940. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 00:58:21,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 00:58:22,083][12883] Updated weights for policy 0, policy_version 26140 (0.0046) +[2024-06-18 00:58:26,383][12883] Updated weights for policy 0, policy_version 26150 (0.0029) +[2024-06-18 00:58:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40140.9, 300 sec: 40765.6). Total num frames: 428441600. Throughput: 0: 40588.4. Samples: 428615700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 00:58:26,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:58:30,091][12883] Updated weights for policy 0, policy_version 26160 (0.0023) +[2024-06-18 00:58:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 428687360. Throughput: 0: 40575.1. Samples: 428735260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) +[2024-06-18 00:58:31,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:58:34,631][12883] Updated weights for policy 0, policy_version 26170 (0.0042) +[2024-06-18 00:58:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40413.8, 300 sec: 40710.1). Total num frames: 428867584. Throughput: 0: 40634.6. Samples: 428984840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) +[2024-06-18 00:58:36,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:58:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000026176_428867584.pth... +[2024-06-18 00:58:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000025581_419119104.pth +[2024-06-18 00:58:38,038][12883] Updated weights for policy 0, policy_version 26180 (0.0038) +[2024-06-18 00:58:41,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 429064192. Throughput: 0: 40687.1. Samples: 429226820. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) +[2024-06-18 00:58:41,998][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:58:42,942][12883] Updated weights for policy 0, policy_version 26190 (0.0038) +[2024-06-18 00:58:46,127][12883] Updated weights for policy 0, policy_version 26200 (0.0045) +[2024-06-18 00:58:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 429293568. Throughput: 0: 40866.0. Samples: 429358300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) +[2024-06-18 00:58:46,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:58:50,537][12883] Updated weights for policy 0, policy_version 26210 (0.0032) +[2024-06-18 00:58:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40140.8, 300 sec: 40654.6). Total num frames: 429457408. Throughput: 0: 40852.3. Samples: 429604740. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) +[2024-06-18 00:58:51,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:58:52,557][12862] Signal inference workers to stop experience collection... (6050 times) +[2024-06-18 00:58:52,599][12883] InferenceWorker_p0-w0: stopping experience collection (6050 times) +[2024-06-18 00:58:52,676][12862] Signal inference workers to resume experience collection... (6050 times) +[2024-06-18 00:58:52,676][12883] InferenceWorker_p0-w0: resuming experience collection (6050 times) +[2024-06-18 00:58:54,158][12883] Updated weights for policy 0, policy_version 26220 (0.0031) +[2024-06-18 00:58:56,996][12645] Fps is (10 sec: 42589.0, 60 sec: 41235.3, 300 sec: 40876.4). Total num frames: 429719552. Throughput: 0: 40813.2. Samples: 429850220. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) +[2024-06-18 00:58:56,996][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 00:58:58,742][12883] Updated weights for policy 0, policy_version 26230 (0.0041) +[2024-06-18 00:59:01,994][12645] Fps is (10 sec: 44236.4, 60 sec: 40961.5, 300 sec: 40876.7). Total num frames: 429899776. 
Throughput: 0: 41163.1. Samples: 429981280. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) +[2024-06-18 00:59:01,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:59:02,055][12883] Updated weights for policy 0, policy_version 26240 (0.0031) +[2024-06-18 00:59:06,638][12883] Updated weights for policy 0, policy_version 26250 (0.0045) +[2024-06-18 00:59:06,994][12645] Fps is (10 sec: 36053.0, 60 sec: 40415.4, 300 sec: 40654.5). Total num frames: 430080000. Throughput: 0: 40992.0. Samples: 430220580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 00:59:06,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:59:10,158][12883] Updated weights for policy 0, policy_version 26260 (0.0048) +[2024-06-18 00:59:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 40987.8). Total num frames: 430342144. Throughput: 0: 40952.8. Samples: 430458580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 00:59:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 00:59:14,478][12883] Updated weights for policy 0, policy_version 26270 (0.0023) +[2024-06-18 00:59:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 430489600. Throughput: 0: 41101.7. Samples: 430584840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 00:59:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:59:18,236][12883] Updated weights for policy 0, policy_version 26280 (0.0032) +[2024-06-18 00:59:21,994][12645] Fps is (10 sec: 37682.8, 60 sec: 40959.9, 300 sec: 40765.6). Total num frames: 430718976. Throughput: 0: 40917.7. Samples: 430826140. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 00:59:21,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 00:59:22,709][12883] Updated weights for policy 0, policy_version 26290 (0.0042) +[2024-06-18 00:59:26,323][12883] Updated weights for policy 0, policy_version 26300 (0.0042) +[2024-06-18 00:59:26,994][12645] Fps is (10 sec: 47514.3, 60 sec: 42052.3, 300 sec: 40987.8). Total num frames: 430964736. Throughput: 0: 41138.3. Samples: 431078040. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-18 00:59:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 00:59:30,499][12883] Updated weights for policy 0, policy_version 26310 (0.0033) +[2024-06-18 00:59:31,994][12645] Fps is (10 sec: 37683.9, 60 sec: 40140.8, 300 sec: 40654.5). Total num frames: 431095808. Throughput: 0: 41029.8. Samples: 431204640. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-18 00:59:31,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 00:59:34,127][12883] Updated weights for policy 0, policy_version 26320 (0.0030) +[2024-06-18 00:59:36,994][12645] Fps is (10 sec: 37683.2, 60 sec: 41233.1, 300 sec: 40876.7). Total num frames: 431341568. Throughput: 0: 40896.9. Samples: 431445100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-18 00:59:36,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 00:59:38,262][12883] Updated weights for policy 0, policy_version 26330 (0.0028) +[2024-06-18 00:59:41,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41233.0, 300 sec: 40821.1). Total num frames: 431538176. Throughput: 0: 41158.8. Samples: 431702280. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 00:59:41,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:59:42,129][12883] Updated weights for policy 0, policy_version 26340 (0.0041) +[2024-06-18 00:59:46,014][12883] Updated weights for policy 0, policy_version 26350 (0.0038) +[2024-06-18 00:59:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40686.9, 300 sec: 40821.1). Total num frames: 431734784. Throughput: 0: 40895.1. Samples: 431821560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 00:59:46,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:59:50,149][12883] Updated weights for policy 0, policy_version 26360 (0.0032) +[2024-06-18 00:59:51,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 40876.7). Total num frames: 431980544. Throughput: 0: 41059.1. Samples: 432068240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 00:59:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 00:59:53,688][12862] Signal inference workers to stop experience collection... (6100 times) +[2024-06-18 00:59:53,688][12862] Signal inference workers to resume experience collection... (6100 times) +[2024-06-18 00:59:53,746][12883] InferenceWorker_p0-w0: stopping experience collection (6100 times) +[2024-06-18 00:59:53,746][12883] InferenceWorker_p0-w0: resuming experience collection (6100 times) +[2024-06-18 00:59:54,080][12883] Updated weights for policy 0, policy_version 26370 (0.0035) +[2024-06-18 00:59:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40415.3, 300 sec: 40710.1). Total num frames: 432144384. Throughput: 0: 41340.3. Samples: 432318900. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 00:59:56,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 00:59:58,217][12883] Updated weights for policy 0, policy_version 26380 (0.0034) +[2024-06-18 01:00:01,827][12883] Updated weights for policy 0, policy_version 26390 (0.0031) +[2024-06-18 01:00:01,995][12645] Fps is (10 sec: 39316.2, 60 sec: 41232.2, 300 sec: 40821.0). Total num frames: 432373760. Throughput: 0: 41086.8. Samples: 432433800. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) +[2024-06-18 01:00:01,996][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:00:06,055][12883] Updated weights for policy 0, policy_version 26400 (0.0036) +[2024-06-18 01:00:06,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41506.1, 300 sec: 40876.7). Total num frames: 432570368. Throughput: 0: 41406.0. Samples: 432689400. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) +[2024-06-18 01:00:06,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-18 01:00:09,837][12883] Updated weights for policy 0, policy_version 26410 (0.0029) +[2024-06-18 01:00:11,996][12645] Fps is (10 sec: 39318.0, 60 sec: 40412.3, 300 sec: 40765.3). Total num frames: 432766976. Throughput: 0: 41279.6. Samples: 432935720. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) +[2024-06-18 01:00:11,997][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:00:14,267][12883] Updated weights for policy 0, policy_version 26420 (0.0042) +[2024-06-18 01:00:16,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41506.1, 300 sec: 40877.0). Total num frames: 432979968. Throughput: 0: 41115.0. Samples: 433054820. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 01:00:16,999][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:00:17,948][12883] Updated weights for policy 0, policy_version 26430 (0.0030) +[2024-06-18 01:00:21,982][12883] Updated weights for policy 0, policy_version 26440 (0.0035) +[2024-06-18 01:00:21,994][12645] Fps is (10 sec: 42608.1, 60 sec: 41233.1, 300 sec: 40821.1). Total num frames: 433192960. Throughput: 0: 41246.1. Samples: 433301180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 01:00:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:00:25,762][12883] Updated weights for policy 0, policy_version 26450 (0.0047) +[2024-06-18 01:00:26,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40140.8, 300 sec: 40876.7). Total num frames: 433373184. Throughput: 0: 41029.5. Samples: 433548600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 01:00:26,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:00:29,783][12883] Updated weights for policy 0, policy_version 26460 (0.0039) +[2024-06-18 01:00:31,998][12645] Fps is (10 sec: 39303.0, 60 sec: 41502.8, 300 sec: 40820.5). Total num frames: 433586176. Throughput: 0: 41074.9. Samples: 433670120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 01:00:31,999][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:00:34,038][12883] Updated weights for policy 0, policy_version 26470 (0.0032) +[2024-06-18 01:00:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 433815552. Throughput: 0: 41048.8. Samples: 433915440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:00:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:00:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000026478_433815552.pth... 
+[2024-06-18 01:00:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000025878_423985152.pth +[2024-06-18 01:00:38,290][12883] Updated weights for policy 0, policy_version 26480 (0.0033) +[2024-06-18 01:00:41,829][12883] Updated weights for policy 0, policy_version 26490 (0.0046) +[2024-06-18 01:00:41,994][12645] Fps is (10 sec: 42618.4, 60 sec: 41233.1, 300 sec: 40876.7). Total num frames: 434012160. Throughput: 0: 41028.1. Samples: 434165160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:00:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:00:46,163][12883] Updated weights for policy 0, policy_version 26500 (0.0035) +[2024-06-18 01:00:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 40932.2). Total num frames: 434208768. Throughput: 0: 41301.6. Samples: 434292320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:00:46,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:00:49,627][12883] Updated weights for policy 0, policy_version 26510 (0.0029) +[2024-06-18 01:00:51,994][12645] Fps is (10 sec: 40957.9, 60 sec: 40686.5, 300 sec: 40932.1). Total num frames: 434421760. Throughput: 0: 41019.4. Samples: 434535300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:00:51,995][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:00:54,033][12883] Updated weights for policy 0, policy_version 26520 (0.0037) +[2024-06-18 01:00:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41233.2, 300 sec: 40822.0). Total num frames: 434618368. Throughput: 0: 40987.9. Samples: 434780080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:00:56,994][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 01:00:57,831][12883] Updated weights for policy 0, policy_version 26530 (0.0030) +[2024-06-18 01:01:01,994][12645] Fps is (10 sec: 39323.9, 60 sec: 40687.9, 300 sec: 40877.0). Total num frames: 434814976. Throughput: 0: 41143.7. Samples: 434906280. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:01:01,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:01:02,191][12883] Updated weights for policy 0, policy_version 26540 (0.0034) +[2024-06-18 01:01:05,778][12883] Updated weights for policy 0, policy_version 26550 (0.0030) +[2024-06-18 01:01:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 435044352. Throughput: 0: 41207.5. Samples: 435155520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:01:06,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:01:10,234][12883] Updated weights for policy 0, policy_version 26560 (0.0035) +[2024-06-18 01:01:11,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41507.7, 300 sec: 40987.8). Total num frames: 435257344. Throughput: 0: 41021.2. Samples: 435394560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 01:01:11,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:01:13,721][12883] Updated weights for policy 0, policy_version 26570 (0.0048) +[2024-06-18 01:01:16,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 435437568. Throughput: 0: 41253.9. Samples: 435526360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 01:01:16,995][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:01:18,037][12883] Updated weights for policy 0, policy_version 26580 (0.0034) +[2024-06-18 01:01:21,910][12883] Updated weights for policy 0, policy_version 26590 (0.0031) +[2024-06-18 01:01:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 435650560. Throughput: 0: 41269.7. Samples: 435772580. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 01:01:21,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:01:26,247][12883] Updated weights for policy 0, policy_version 26600 (0.0031) +[2024-06-18 01:01:26,993][12645] Fps is (10 sec: 44238.4, 60 sec: 41779.2, 300 sec: 41043.3). Total num frames: 435879936. Throughput: 0: 41180.2. Samples: 436018260. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) +[2024-06-18 01:01:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:01:29,810][12883] Updated weights for policy 0, policy_version 26610 (0.0035) +[2024-06-18 01:01:32,000][12645] Fps is (10 sec: 42572.1, 60 sec: 41505.1, 300 sec: 41042.4). Total num frames: 436076544. Throughput: 0: 41115.2. Samples: 436142760. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) +[2024-06-18 01:01:32,001][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:01:33,973][12883] Updated weights for policy 0, policy_version 26620 (0.0045) +[2024-06-18 01:01:34,648][12862] Signal inference workers to stop experience collection... (6150 times) +[2024-06-18 01:01:34,700][12883] InferenceWorker_p0-w0: stopping experience collection (6150 times) +[2024-06-18 01:01:34,704][12862] Signal inference workers to resume experience collection... (6150 times) +[2024-06-18 01:01:34,712][12883] InferenceWorker_p0-w0: resuming experience collection (6150 times) +[2024-06-18 01:01:36,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 436273152. Throughput: 0: 41129.8. Samples: 436386120. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) +[2024-06-18 01:01:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:01:37,533][12883] Updated weights for policy 0, policy_version 26630 (0.0035) +[2024-06-18 01:01:41,713][12883] Updated weights for policy 0, policy_version 26640 (0.0038) +[2024-06-18 01:01:41,994][12645] Fps is (10 sec: 39345.4, 60 sec: 40959.9, 300 sec: 40932.2). Total num frames: 436469760. 
Throughput: 0: 41289.5. Samples: 436638120. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) +[2024-06-18 01:01:41,995][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:01:45,821][12883] Updated weights for policy 0, policy_version 26650 (0.0046) +[2024-06-18 01:01:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 436666368. Throughput: 0: 41097.8. Samples: 436755680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:01:46,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:01:49,701][12883] Updated weights for policy 0, policy_version 26660 (0.0041) +[2024-06-18 01:01:51,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41233.4, 300 sec: 40987.8). Total num frames: 436895744. Throughput: 0: 41002.7. Samples: 437000640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:01:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:01:53,562][12883] Updated weights for policy 0, policy_version 26670 (0.0040) +[2024-06-18 01:01:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 437075968. Throughput: 0: 41233.3. Samples: 437250060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:01:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:01:57,586][12883] Updated weights for policy 0, policy_version 26680 (0.0033) +[2024-06-18 01:02:01,872][12883] Updated weights for policy 0, policy_version 26690 (0.0042) +[2024-06-18 01:02:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 437288960. Throughput: 0: 40899.3. Samples: 437366820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:02:01,994][12645] Avg episode reward: [(0, '0.029')] +[2024-06-18 01:02:05,725][12883] Updated weights for policy 0, policy_version 26700 (0.0039) +[2024-06-18 01:02:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 437501952. 
Throughput: 0: 40867.9. Samples: 437611640. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) +[2024-06-18 01:02:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:02:09,890][12883] Updated weights for policy 0, policy_version 26710 (0.0043) +[2024-06-18 01:02:11,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40413.8, 300 sec: 40876.7). Total num frames: 437682176. Throughput: 0: 40916.2. Samples: 437859500. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) +[2024-06-18 01:02:11,995][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:02:13,733][12883] Updated weights for policy 0, policy_version 26720 (0.0047) +[2024-06-18 01:02:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41233.2, 300 sec: 41043.3). Total num frames: 437911552. Throughput: 0: 40829.2. Samples: 437979820. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) +[2024-06-18 01:02:16,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:02:17,977][12883] Updated weights for policy 0, policy_version 26730 (0.0032) +[2024-06-18 01:02:21,953][12883] Updated weights for policy 0, policy_version 26740 (0.0033) +[2024-06-18 01:02:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 438108160. Throughput: 0: 40828.4. Samples: 438223400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 01:02:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:02:25,721][12883] Updated weights for policy 0, policy_version 26750 (0.0029) +[2024-06-18 01:02:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40413.7, 300 sec: 40932.2). Total num frames: 438304768. Throughput: 0: 40732.1. Samples: 438471060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 01:02:26,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 01:02:29,901][12883] Updated weights for policy 0, policy_version 26760 (0.0037) +[2024-06-18 01:02:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40964.3, 300 sec: 40987.8). Total num frames: 438534144. 
Throughput: 0: 40824.4. Samples: 438592780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 01:02:31,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:02:33,433][12883] Updated weights for policy 0, policy_version 26770 (0.0047) +[2024-06-18 01:02:36,996][12645] Fps is (10 sec: 39313.1, 60 sec: 40412.4, 300 sec: 40931.9). Total num frames: 438697984. Throughput: 0: 40981.5. Samples: 438844900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 01:02:36,996][12645] Avg episode reward: [(0, '0.000')] +[2024-06-18 01:02:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000026777_438714368.pth... +[2024-06-18 01:02:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000026176_428867584.pth +[2024-06-18 01:02:37,885][12883] Updated weights for policy 0, policy_version 26780 (0.0046) +[2024-06-18 01:02:41,236][12883] Updated weights for policy 0, policy_version 26790 (0.0030) +[2024-06-18 01:02:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 438943744. Throughput: 0: 40736.4. Samples: 439083200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 01:02:41,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:02:46,090][12883] Updated weights for policy 0, policy_version 26800 (0.0028) +[2024-06-18 01:02:46,994][12645] Fps is (10 sec: 44245.9, 60 sec: 41232.9, 300 sec: 40987.7). Total num frames: 439140352. Throughput: 0: 41084.3. Samples: 439215620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 01:02:46,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 01:02:49,134][12883] Updated weights for policy 0, policy_version 26810 (0.0031) +[2024-06-18 01:02:51,994][12645] Fps is (10 sec: 37683.8, 60 sec: 40413.9, 300 sec: 40933.0). Total num frames: 439320576. Throughput: 0: 41079.7. Samples: 439460220. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 01:02:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:02:54,151][12883] Updated weights for policy 0, policy_version 26820 (0.0034) +[2024-06-18 01:02:56,994][12645] Fps is (10 sec: 42599.3, 60 sec: 41506.2, 300 sec: 41099.2). Total num frames: 439566336. Throughput: 0: 40950.8. Samples: 439702280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 01:02:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:02:57,020][12883] Updated weights for policy 0, policy_version 26830 (0.0051) +[2024-06-18 01:03:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40686.9, 300 sec: 40932.5). Total num frames: 439730176. Throughput: 0: 41202.7. Samples: 439833940. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 01:03:01,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:03:02,163][12883] Updated weights for policy 0, policy_version 26840 (0.0044) +[2024-06-18 01:03:04,843][12883] Updated weights for policy 0, policy_version 26850 (0.0027) +[2024-06-18 01:03:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 439959552. Throughput: 0: 41152.5. Samples: 440075260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 01:03:06,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:03:10,249][12883] Updated weights for policy 0, policy_version 26860 (0.0044) +[2024-06-18 01:03:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 41779.3, 300 sec: 41154.4). Total num frames: 440188928. Throughput: 0: 41188.1. Samples: 440324520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 01:03:11,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:03:12,852][12883] Updated weights for policy 0, policy_version 26870 (0.0039) +[2024-06-18 01:03:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 440369152. Throughput: 0: 41379.2. Samples: 440454840. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 01:03:16,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:03:18,032][12883] Updated weights for policy 0, policy_version 26880 (0.0038) +[2024-06-18 01:03:18,166][12862] Signal inference workers to stop experience collection... (6200 times) +[2024-06-18 01:03:18,166][12862] Signal inference workers to resume experience collection... (6200 times) +[2024-06-18 01:03:18,208][12883] InferenceWorker_p0-w0: stopping experience collection (6200 times) +[2024-06-18 01:03:18,208][12883] InferenceWorker_p0-w0: resuming experience collection (6200 times) +[2024-06-18 01:03:21,110][12883] Updated weights for policy 0, policy_version 26890 (0.0033) +[2024-06-18 01:03:21,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 440582144. Throughput: 0: 41176.8. Samples: 440697760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 01:03:21,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:03:25,773][12883] Updated weights for policy 0, policy_version 26900 (0.0042) +[2024-06-18 01:03:26,994][12645] Fps is (10 sec: 42597.6, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 440795136. Throughput: 0: 41511.5. Samples: 440951220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 01:03:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:03:28,836][12883] Updated weights for policy 0, policy_version 26910 (0.0033) +[2024-06-18 01:03:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 440975360. Throughput: 0: 41311.4. Samples: 441074620. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 01:03:31,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:03:33,652][12883] Updated weights for policy 0, policy_version 26920 (0.0035) +[2024-06-18 01:03:36,650][12883] Updated weights for policy 0, policy_version 26930 (0.0042) +[2024-06-18 01:03:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42053.8, 300 sec: 41209.9). Total num frames: 441221120. Throughput: 0: 41173.6. Samples: 441313040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 01:03:36,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:03:41,444][12883] Updated weights for policy 0, policy_version 26940 (0.0041) +[2024-06-18 01:03:41,994][12645] Fps is (10 sec: 42597.8, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 441401344. Throughput: 0: 41540.8. Samples: 441571620. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 01:03:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:03:44,892][12883] Updated weights for policy 0, policy_version 26950 (0.0032) +[2024-06-18 01:03:46,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41233.2, 300 sec: 41209.9). Total num frames: 441614336. Throughput: 0: 41205.8. Samples: 441688200. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 01:03:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:03:49,758][12883] Updated weights for policy 0, policy_version 26960 (0.0024) +[2024-06-18 01:03:51,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42050.6, 300 sec: 41098.8). Total num frames: 441843712. Throughput: 0: 41382.9. Samples: 441937580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) +[2024-06-18 01:03:51,996][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:03:53,365][12883] Updated weights for policy 0, policy_version 26970 (0.0031) +[2024-06-18 01:03:56,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40413.8, 300 sec: 40987.8). Total num frames: 441991168. Throughput: 0: 41573.7. Samples: 442195340. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) +[2024-06-18 01:03:57,000][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:03:57,451][12883] Updated weights for policy 0, policy_version 26980 (0.0038) +[2024-06-18 01:04:01,224][12883] Updated weights for policy 0, policy_version 26990 (0.0046) +[2024-06-18 01:04:01,994][12645] Fps is (10 sec: 37691.2, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 442220544. Throughput: 0: 41070.5. Samples: 442303020. Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) +[2024-06-18 01:04:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:04:05,697][12883] Updated weights for policy 0, policy_version 27000 (0.0042) +[2024-06-18 01:04:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 442433536. Throughput: 0: 41335.1. Samples: 442557840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) +[2024-06-18 01:04:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:04:08,933][12883] Updated weights for policy 0, policy_version 27010 (0.0049) +[2024-06-18 01:04:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 442630144. Throughput: 0: 40975.7. Samples: 442795120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 01:04:11,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:04:13,615][12883] Updated weights for policy 0, policy_version 27020 (0.0027) +[2024-06-18 01:04:16,603][12883] Updated weights for policy 0, policy_version 27030 (0.0035) +[2024-06-18 01:04:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 442859520. Throughput: 0: 41050.0. Samples: 442921880. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 01:04:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:04:21,531][12883] Updated weights for policy 0, policy_version 27040 (0.0046) +[2024-06-18 01:04:21,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40686.8, 300 sec: 40876.7). Total num frames: 443023360. Throughput: 0: 41209.8. Samples: 443167480. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 01:04:21,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:04:24,458][12883] Updated weights for policy 0, policy_version 27050 (0.0030) +[2024-06-18 01:04:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 443252736. Throughput: 0: 40834.3. Samples: 443409160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:04:26,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:04:29,768][12883] Updated weights for policy 0, policy_version 27060 (0.0044) +[2024-06-18 01:04:31,766][12862] Signal inference workers to stop experience collection... (6250 times) +[2024-06-18 01:04:31,810][12883] InferenceWorker_p0-w0: stopping experience collection (6250 times) +[2024-06-18 01:04:31,819][12862] Signal inference workers to resume experience collection... (6250 times) +[2024-06-18 01:04:31,833][12883] InferenceWorker_p0-w0: resuming experience collection (6250 times) +[2024-06-18 01:04:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 41779.0, 300 sec: 41154.4). Total num frames: 443482112. Throughput: 0: 41132.7. Samples: 443539180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:04:31,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:04:32,343][12883] Updated weights for policy 0, policy_version 27070 (0.0044) +[2024-06-18 01:04:36,994][12645] Fps is (10 sec: 37681.9, 60 sec: 40140.7, 300 sec: 40987.7). Total num frames: 443629568. Throughput: 0: 41000.0. Samples: 443782500. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:04:36,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:04:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027077_443629568.pth... +[2024-06-18 01:04:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000026478_433815552.pth +[2024-06-18 01:04:37,891][12883] Updated weights for policy 0, policy_version 27080 (0.0035) +[2024-06-18 01:04:40,675][12883] Updated weights for policy 0, policy_version 27090 (0.0038) +[2024-06-18 01:04:41,994][12645] Fps is (10 sec: 37683.9, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 443858944. Throughput: 0: 40497.4. Samples: 444017720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:04:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:04:45,756][12883] Updated weights for policy 0, policy_version 27100 (0.0043) +[2024-06-18 01:04:46,994][12645] Fps is (10 sec: 42599.3, 60 sec: 40686.8, 300 sec: 40932.2). Total num frames: 444055552. Throughput: 0: 40913.8. Samples: 444144140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 01:04:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:04:48,867][12883] Updated weights for policy 0, policy_version 27110 (0.0030) +[2024-06-18 01:04:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40142.3, 300 sec: 41043.3). Total num frames: 444252160. Throughput: 0: 40486.6. Samples: 444379740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 01:04:51,994][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 01:04:53,899][12883] Updated weights for policy 0, policy_version 27120 (0.0024) +[2024-06-18 01:04:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41043.5). Total num frames: 444481536. Throughput: 0: 40671.0. Samples: 444625320. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 01:04:56,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 01:04:57,020][12883] Updated weights for policy 0, policy_version 27130 (0.0051) +[2024-06-18 01:05:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40414.0, 300 sec: 40932.2). Total num frames: 444645376. Throughput: 0: 40616.1. Samples: 444749600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 01:05:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:05:02,026][12883] Updated weights for policy 0, policy_version 27140 (0.0039) +[2024-06-18 01:05:05,105][12883] Updated weights for policy 0, policy_version 27150 (0.0033) +[2024-06-18 01:05:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41099.2). Total num frames: 444891136. Throughput: 0: 40525.4. Samples: 444991120. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) +[2024-06-18 01:05:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:05:10,024][12883] Updated weights for policy 0, policy_version 27160 (0.0038) +[2024-06-18 01:05:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 445087744. Throughput: 0: 40591.5. Samples: 445235780. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) +[2024-06-18 01:05:11,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:05:13,087][12883] Updated weights for policy 0, policy_version 27170 (0.0029) +[2024-06-18 01:05:16,994][12645] Fps is (10 sec: 36044.4, 60 sec: 39867.7, 300 sec: 40876.7). Total num frames: 445251584. Throughput: 0: 40359.6. Samples: 445355360. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) +[2024-06-18 01:05:16,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 01:05:17,994][12883] Updated weights for policy 0, policy_version 27180 (0.0043) +[2024-06-18 01:05:21,551][12883] Updated weights for policy 0, policy_version 27190 (0.0033) +[2024-06-18 01:05:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 445497344. Throughput: 0: 40291.4. Samples: 445595600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 01:05:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:05:26,100][12883] Updated weights for policy 0, policy_version 27200 (0.0044) +[2024-06-18 01:05:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40413.8, 300 sec: 40988.4). Total num frames: 445677568. Throughput: 0: 40400.8. Samples: 445835760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 01:05:26,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:05:29,485][12883] Updated weights for policy 0, policy_version 27210 (0.0036) +[2024-06-18 01:05:31,994][12645] Fps is (10 sec: 36044.8, 60 sec: 39594.8, 300 sec: 40821.2). Total num frames: 445857792. Throughput: 0: 40164.1. Samples: 445951520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 01:05:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:05:34,386][12883] Updated weights for policy 0, policy_version 27220 (0.0047) +[2024-06-18 01:05:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40960.3, 300 sec: 40932.2). Total num frames: 446087168. Throughput: 0: 40393.4. Samples: 446197440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 01:05:36,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:05:37,602][12883] Updated weights for policy 0, policy_version 27230 (0.0046) +[2024-06-18 01:05:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40140.7, 300 sec: 40876.7). Total num frames: 446267392. Throughput: 0: 40443.1. Samples: 446445260. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 01:05:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:05:42,432][12883] Updated weights for policy 0, policy_version 27240 (0.0036) +[2024-06-18 01:05:46,274][12883] Updated weights for policy 0, policy_version 27250 (0.0038) +[2024-06-18 01:05:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40414.0, 300 sec: 40876.8). Total num frames: 446480384. Throughput: 0: 40156.0. Samples: 446556620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 01:05:46,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:05:50,449][12883] Updated weights for policy 0, policy_version 27260 (0.0036) +[2024-06-18 01:05:51,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40414.0, 300 sec: 40876.7). Total num frames: 446676992. Throughput: 0: 40266.8. Samples: 446803120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 01:05:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:05:54,168][12883] Updated weights for policy 0, policy_version 27270 (0.0039) +[2024-06-18 01:05:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.9, 300 sec: 40932.2). Total num frames: 446889984. Throughput: 0: 40377.5. Samples: 447052760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 01:05:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:05:58,402][12883] Updated weights for policy 0, policy_version 27280 (0.0046) +[2024-06-18 01:06:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 447102976. Throughput: 0: 40382.3. Samples: 447172560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 01:06:01,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:06:02,346][12883] Updated weights for policy 0, policy_version 27290 (0.0025) +[2024-06-18 01:06:03,891][12862] Signal inference workers to stop experience collection... 
(6300 times) +[2024-06-18 01:06:03,928][12883] InferenceWorker_p0-w0: stopping experience collection (6300 times) +[2024-06-18 01:06:03,941][12862] Signal inference workers to resume experience collection... (6300 times) +[2024-06-18 01:06:03,956][12883] InferenceWorker_p0-w0: resuming experience collection (6300 times) +[2024-06-18 01:06:06,282][12883] Updated weights for policy 0, policy_version 27300 (0.0044) +[2024-06-18 01:06:06,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40140.8, 300 sec: 40821.2). Total num frames: 447299584. Throughput: 0: 40593.7. Samples: 447422320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 01:06:06,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:06:10,279][12883] Updated weights for policy 0, policy_version 27310 (0.0030) +[2024-06-18 01:06:11,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39867.8, 300 sec: 40821.2). Total num frames: 447479808. Throughput: 0: 40735.2. Samples: 447668840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 01:06:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:06:14,205][12883] Updated weights for policy 0, policy_version 27320 (0.0029) +[2024-06-18 01:06:17,000][12645] Fps is (10 sec: 42572.0, 60 sec: 41228.9, 300 sec: 40931.4). Total num frames: 447725568. Throughput: 0: 40738.3. Samples: 447785000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 01:06:17,001][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:06:18,379][12883] Updated weights for policy 0, policy_version 27330 (0.0036) +[2024-06-18 01:06:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 40413.9, 300 sec: 40821.1). Total num frames: 447922176. Throughput: 0: 40884.8. Samples: 448037260. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 01:06:21,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:06:22,047][12883] Updated weights for policy 0, policy_version 27340 (0.0040) +[2024-06-18 01:06:26,139][12883] Updated weights for policy 0, policy_version 27350 (0.0037) +[2024-06-18 01:06:26,994][12645] Fps is (10 sec: 39346.1, 60 sec: 40687.0, 300 sec: 40822.0). Total num frames: 448118784. Throughput: 0: 40840.0. Samples: 448283060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 01:06:26,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:06:29,994][12883] Updated weights for policy 0, policy_version 27360 (0.0028) +[2024-06-18 01:06:31,996][12645] Fps is (10 sec: 42589.2, 60 sec: 41504.6, 300 sec: 40931.9). Total num frames: 448348160. Throughput: 0: 41156.6. Samples: 448408760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 01:06:31,996][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:06:34,717][12883] Updated weights for policy 0, policy_version 27370 (0.0040) +[2024-06-18 01:06:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40687.0, 300 sec: 40876.7). Total num frames: 448528384. Throughput: 0: 41184.9. Samples: 448656440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 01:06:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:06:37,109][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027377_448544768.pth... +[2024-06-18 01:06:37,165][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000026777_438714368.pth +[2024-06-18 01:06:37,788][12883] Updated weights for policy 0, policy_version 27380 (0.0031) +[2024-06-18 01:06:41,994][12645] Fps is (10 sec: 39330.1, 60 sec: 41233.1, 300 sec: 40932.2). Total num frames: 448741376. Throughput: 0: 41146.1. Samples: 448904340. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 01:06:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:06:42,499][12883] Updated weights for policy 0, policy_version 27390 (0.0033) +[2024-06-18 01:06:46,374][12883] Updated weights for policy 0, policy_version 27400 (0.0036) +[2024-06-18 01:06:46,994][12645] Fps is (10 sec: 40959.2, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 448937984. Throughput: 0: 41189.3. Samples: 449026080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 01:06:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:06:50,174][12883] Updated weights for policy 0, policy_version 27410 (0.0036) +[2024-06-18 01:06:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 449150976. Throughput: 0: 41162.3. Samples: 449274620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 01:06:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:06:54,419][12883] Updated weights for policy 0, policy_version 27420 (0.0027) +[2024-06-18 01:06:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41232.9, 300 sec: 40932.2). Total num frames: 449363968. Throughput: 0: 41103.4. Samples: 449518500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 01:06:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:06:57,914][12883] Updated weights for policy 0, policy_version 27430 (0.0046) +[2024-06-18 01:07:01,996][12645] Fps is (10 sec: 40950.6, 60 sec: 40958.5, 300 sec: 40876.4). Total num frames: 449560576. Throughput: 0: 41282.8. Samples: 449642560. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 01:07:01,996][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:07:02,416][12883] Updated weights for policy 0, policy_version 27440 (0.0035) +[2024-06-18 01:07:06,051][12883] Updated weights for policy 0, policy_version 27450 (0.0028) +[2024-06-18 01:07:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 449757184. Throughput: 0: 41077.7. Samples: 449885760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 01:07:06,995][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:07:10,314][12883] Updated weights for policy 0, policy_version 27460 (0.0037) +[2024-06-18 01:07:11,994][12645] Fps is (10 sec: 40969.3, 60 sec: 41506.1, 300 sec: 40876.7). Total num frames: 449970176. Throughput: 0: 41008.9. Samples: 450128460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:07:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:07:14,020][12883] Updated weights for policy 0, policy_version 27470 (0.0036) +[2024-06-18 01:07:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40691.2, 300 sec: 40876.7). Total num frames: 450166784. Throughput: 0: 40891.8. Samples: 450248800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:07:16,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:07:18,336][12883] Updated weights for policy 0, policy_version 27480 (0.0039) +[2024-06-18 01:07:21,947][12883] Updated weights for policy 0, policy_version 27490 (0.0040) +[2024-06-18 01:07:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 450396160. Throughput: 0: 40941.3. Samples: 450498800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:07:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:07:24,062][12862] Signal inference workers to stop experience collection... 
(6350 times) +[2024-06-18 01:07:24,115][12883] InferenceWorker_p0-w0: stopping experience collection (6350 times) +[2024-06-18 01:07:24,118][12862] Signal inference workers to resume experience collection... (6350 times) +[2024-06-18 01:07:24,144][12883] InferenceWorker_p0-w0: resuming experience collection (6350 times) +[2024-06-18 01:07:26,421][12883] Updated weights for policy 0, policy_version 27500 (0.0047) +[2024-06-18 01:07:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 40821.1). Total num frames: 450576384. Throughput: 0: 40891.1. Samples: 450744440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:07:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:07:29,767][12883] Updated weights for policy 0, policy_version 27510 (0.0028) +[2024-06-18 01:07:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40961.5, 300 sec: 41043.6). Total num frames: 450805760. Throughput: 0: 40956.1. Samples: 450869100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 01:07:31,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:07:34,239][12883] Updated weights for policy 0, policy_version 27520 (0.0033) +[2024-06-18 01:07:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 41506.1, 300 sec: 40932.3). Total num frames: 451018752. Throughput: 0: 40905.4. Samples: 451115360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 01:07:36,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:07:37,751][12883] Updated weights for policy 0, policy_version 27530 (0.0032) +[2024-06-18 01:07:41,996][12645] Fps is (10 sec: 39312.9, 60 sec: 40958.5, 300 sec: 40876.4). Total num frames: 451198976. Throughput: 0: 41058.1. Samples: 451366200. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 01:07:41,996][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:07:42,180][12883] Updated weights for policy 0, policy_version 27540 (0.0030) +[2024-06-18 01:07:45,664][12883] Updated weights for policy 0, policy_version 27550 (0.0038) +[2024-06-18 01:07:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41233.2, 300 sec: 40987.8). Total num frames: 451411968. Throughput: 0: 40930.6. Samples: 451484340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 01:07:46,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:07:50,446][12883] Updated weights for policy 0, policy_version 27560 (0.0045) +[2024-06-18 01:07:51,994][12645] Fps is (10 sec: 40968.9, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 451608576. Throughput: 0: 40969.8. Samples: 451729400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:07:52,004][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:07:53,780][12883] Updated weights for policy 0, policy_version 27570 (0.0029) +[2024-06-18 01:07:56,996][12645] Fps is (10 sec: 39312.2, 60 sec: 40685.5, 300 sec: 40931.9). Total num frames: 451805184. Throughput: 0: 40959.3. Samples: 451971720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:07:57,005][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 01:07:58,281][12883] Updated weights for policy 0, policy_version 27580 (0.0040) +[2024-06-18 01:08:01,686][12883] Updated weights for policy 0, policy_version 27590 (0.0044) +[2024-06-18 01:08:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41234.7, 300 sec: 40932.2). Total num frames: 452034560. Throughput: 0: 41114.3. Samples: 452098940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:08:01,994][12645] Avg episode reward: [(0, '0.048')] +[2024-06-18 01:08:01,994][12862] Saving new best policy, reward=0.048! 
+[2024-06-18 01:08:06,149][12883] Updated weights for policy 0, policy_version 27600 (0.0033) +[2024-06-18 01:08:06,994][12645] Fps is (10 sec: 40969.4, 60 sec: 40960.1, 300 sec: 40765.6). Total num frames: 452214784. Throughput: 0: 41104.8. Samples: 452348520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:08:06,994][12645] Avg episode reward: [(0, '0.048')] +[2024-06-18 01:08:09,499][12883] Updated weights for policy 0, policy_version 27610 (0.0033) +[2024-06-18 01:08:11,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 452427776. Throughput: 0: 41051.0. Samples: 452591740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:08:12,000][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:08:14,327][12883] Updated weights for policy 0, policy_version 27620 (0.0031) +[2024-06-18 01:08:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41506.2, 300 sec: 40932.2). Total num frames: 452657152. Throughput: 0: 41190.3. Samples: 452722660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:08:16,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:08:17,355][12883] Updated weights for policy 0, policy_version 27630 (0.0041) +[2024-06-18 01:08:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40686.9, 300 sec: 40821.2). Total num frames: 452837376. Throughput: 0: 41092.9. Samples: 452964540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:08:21,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:08:22,100][12883] Updated weights for policy 0, policy_version 27640 (0.0030) +[2024-06-18 01:08:25,327][12883] Updated weights for policy 0, policy_version 27650 (0.0041) +[2024-06-18 01:08:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 40987.7). Total num frames: 453066752. Throughput: 0: 40918.8. Samples: 453207460. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) +[2024-06-18 01:08:26,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:08:30,202][12883] Updated weights for policy 0, policy_version 27660 (0.0029) +[2024-06-18 01:08:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40960.0, 300 sec: 40821.2). Total num frames: 453263360. Throughput: 0: 41267.0. Samples: 453341360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) +[2024-06-18 01:08:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:08:33,161][12883] Updated weights for policy 0, policy_version 27670 (0.0033) +[2024-06-18 01:08:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 453459968. Throughput: 0: 41281.4. Samples: 453587060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) +[2024-06-18 01:08:36,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:08:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027677_453459968.pth... +[2024-06-18 01:08:37,084][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027077_443629568.pth +[2024-06-18 01:08:37,929][12883] Updated weights for policy 0, policy_version 27680 (0.0040) +[2024-06-18 01:08:41,256][12883] Updated weights for policy 0, policy_version 27690 (0.0039) +[2024-06-18 01:08:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41507.7, 300 sec: 40932.2). Total num frames: 453689344. Throughput: 0: 41287.4. Samples: 453829560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) +[2024-06-18 01:08:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:08:45,852][12883] Updated weights for policy 0, policy_version 27700 (0.0043) +[2024-06-18 01:08:46,996][12645] Fps is (10 sec: 40950.8, 60 sec: 40958.4, 300 sec: 40765.6). Total num frames: 453869568. Throughput: 0: 41361.8. Samples: 453960320. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 01:08:46,997][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:08:49,401][12883] Updated weights for policy 0, policy_version 27710 (0.0038) +[2024-06-18 01:08:49,769][12862] Signal inference workers to stop experience collection... (6400 times) +[2024-06-18 01:08:49,796][12883] InferenceWorker_p0-w0: stopping experience collection (6400 times) +[2024-06-18 01:08:49,822][12862] Signal inference workers to resume experience collection... (6400 times) +[2024-06-18 01:08:49,823][12883] InferenceWorker_p0-w0: resuming experience collection (6400 times) +[2024-06-18 01:08:51,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40960.1, 300 sec: 40932.2). Total num frames: 454066176. Throughput: 0: 41086.2. Samples: 454197400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 01:08:51,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:08:53,578][12883] Updated weights for policy 0, policy_version 27720 (0.0040) +[2024-06-18 01:08:56,994][12645] Fps is (10 sec: 42608.0, 60 sec: 41507.7, 300 sec: 40932.2). Total num frames: 454295552. Throughput: 0: 41232.9. Samples: 454447220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 01:08:56,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:08:57,229][12883] Updated weights for policy 0, policy_version 27730 (0.0039) +[2024-06-18 01:09:01,470][12883] Updated weights for policy 0, policy_version 27740 (0.0037) +[2024-06-18 01:09:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 454492160. Throughput: 0: 41269.4. Samples: 454579780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-18 01:09:01,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:09:05,143][12883] Updated weights for policy 0, policy_version 27750 (0.0037) +[2024-06-18 01:09:06,996][12645] Fps is (10 sec: 40950.8, 60 sec: 41504.6, 300 sec: 40931.9). Total num frames: 454705152. 
Throughput: 0: 41198.3. Samples: 454818560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-18 01:09:06,997][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:09:09,630][12883] Updated weights for policy 0, policy_version 27760 (0.0037) +[2024-06-18 01:09:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 40876.7). Total num frames: 454918144. Throughput: 0: 41399.2. Samples: 455070420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-18 01:09:11,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:09:13,023][12883] Updated weights for policy 0, policy_version 27770 (0.0039) +[2024-06-18 01:09:16,994][12645] Fps is (10 sec: 39330.8, 60 sec: 40687.0, 300 sec: 40932.3). Total num frames: 455098368. Throughput: 0: 41160.5. Samples: 455193580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-18 01:09:16,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:09:17,474][12883] Updated weights for policy 0, policy_version 27780 (0.0034) +[2024-06-18 01:09:21,031][12883] Updated weights for policy 0, policy_version 27790 (0.0044) +[2024-06-18 01:09:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.0, 300 sec: 40932.2). Total num frames: 455327744. Throughput: 0: 41115.1. Samples: 455437240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 01:09:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:09:25,513][12883] Updated weights for policy 0, policy_version 27800 (0.0045) +[2024-06-18 01:09:26,996][12645] Fps is (10 sec: 44226.6, 60 sec: 41231.6, 300 sec: 40876.4). Total num frames: 455540736. Throughput: 0: 41241.9. Samples: 455685540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 01:09:26,996][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:09:29,155][12883] Updated weights for policy 0, policy_version 27810 (0.0036) +[2024-06-18 01:09:31,994][12645] Fps is (10 sec: 39322.3, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 455720960. 
Throughput: 0: 41089.3. Samples: 455809240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 01:09:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:09:33,421][12883] Updated weights for policy 0, policy_version 27820 (0.0040) +[2024-06-18 01:09:36,994][12645] Fps is (10 sec: 40969.3, 60 sec: 41506.2, 300 sec: 40987.8). Total num frames: 455950336. Throughput: 0: 41268.4. Samples: 456054480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 01:09:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:09:37,526][12883] Updated weights for policy 0, policy_version 27830 (0.0028) +[2024-06-18 01:09:41,919][12883] Updated weights for policy 0, policy_version 27840 (0.0036) +[2024-06-18 01:09:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 456130560. Throughput: 0: 41381.8. Samples: 456309400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:09:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:09:45,329][12883] Updated weights for policy 0, policy_version 27850 (0.0047) +[2024-06-18 01:09:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41234.7, 300 sec: 40987.8). Total num frames: 456343552. Throughput: 0: 40991.1. Samples: 456424380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:09:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:09:49,857][12883] Updated weights for policy 0, policy_version 27860 (0.0046) +[2024-06-18 01:09:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 40932.2). Total num frames: 456556544. Throughput: 0: 41057.6. Samples: 456666060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:09:51,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:09:53,688][12883] Updated weights for policy 0, policy_version 27870 (0.0037) +[2024-06-18 01:09:56,994][12645] Fps is (10 sec: 40959.0, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 456753152. 
Throughput: 0: 41016.3. Samples: 456916160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 01:09:56,994][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 01:09:57,827][12883] Updated weights for policy 0, policy_version 27880 (0.0050) +[2024-06-18 01:10:01,447][12883] Updated weights for policy 0, policy_version 27890 (0.0048) +[2024-06-18 01:10:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41232.9, 300 sec: 40932.2). Total num frames: 456966144. Throughput: 0: 41036.7. Samples: 457040240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 01:10:01,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:10:05,574][12883] Updated weights for policy 0, policy_version 27900 (0.0032) +[2024-06-18 01:10:06,994][12645] Fps is (10 sec: 44237.9, 60 sec: 41507.8, 300 sec: 41043.3). Total num frames: 457195520. Throughput: 0: 41148.6. Samples: 457288920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 01:10:06,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:10:09,404][12883] Updated weights for policy 0, policy_version 27910 (0.0040) +[2024-06-18 01:10:11,994][12645] Fps is (10 sec: 40958.4, 60 sec: 40959.6, 300 sec: 41098.8). Total num frames: 457375744. Throughput: 0: 41029.1. Samples: 457531780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 01:10:11,995][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 01:10:13,319][12883] Updated weights for policy 0, policy_version 27920 (0.0045) +[2024-06-18 01:10:13,961][12862] Signal inference workers to stop experience collection... (6450 times) +[2024-06-18 01:10:14,007][12883] InferenceWorker_p0-w0: stopping experience collection (6450 times) +[2024-06-18 01:10:14,012][12862] Signal inference workers to resume experience collection... (6450 times) +[2024-06-18 01:10:14,021][12883] InferenceWorker_p0-w0: resuming experience collection (6450 times) +[2024-06-18 01:10:16,994][12645] Fps is (10 sec: 36044.4, 60 sec: 40959.9, 300 sec: 40876.7). 
Total num frames: 457555968. Throughput: 0: 40809.7. Samples: 457645680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 01:10:16,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 01:10:17,646][12883] Updated weights for policy 0, policy_version 27930 (0.0046) +[2024-06-18 01:10:21,365][12883] Updated weights for policy 0, policy_version 27940 (0.0037) +[2024-06-18 01:10:22,000][12645] Fps is (10 sec: 40936.8, 60 sec: 40955.8, 300 sec: 41042.5). Total num frames: 457785344. Throughput: 0: 40890.8. Samples: 457894820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 01:10:22,000][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:10:25,524][12883] Updated weights for policy 0, policy_version 27950 (0.0041) +[2024-06-18 01:10:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40688.5, 300 sec: 41098.8). Total num frames: 457981952. Throughput: 0: 40775.6. Samples: 458144300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 01:10:26,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:10:29,253][12883] Updated weights for policy 0, policy_version 27960 (0.0036) +[2024-06-18 01:10:31,994][12645] Fps is (10 sec: 39345.7, 60 sec: 40959.9, 300 sec: 40987.7). Total num frames: 458178560. Throughput: 0: 40954.5. Samples: 458267340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 01:10:31,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:10:33,206][12883] Updated weights for policy 0, policy_version 27970 (0.0037) +[2024-06-18 01:10:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40686.8, 300 sec: 41098.8). Total num frames: 458391552. Throughput: 0: 41098.2. Samples: 458515480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) +[2024-06-18 01:10:36,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:10:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027978_458391552.pth... 
+[2024-06-18 01:10:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027377_448544768.pth +[2024-06-18 01:10:37,394][12883] Updated weights for policy 0, policy_version 27980 (0.0032) +[2024-06-18 01:10:41,120][12883] Updated weights for policy 0, policy_version 27990 (0.0036) +[2024-06-18 01:10:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 458604544. Throughput: 0: 40938.7. Samples: 458758400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) +[2024-06-18 01:10:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:10:45,447][12883] Updated weights for policy 0, policy_version 28000 (0.0030) +[2024-06-18 01:10:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 458817536. Throughput: 0: 40899.2. Samples: 458880700. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) +[2024-06-18 01:10:46,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:10:49,063][12883] Updated weights for policy 0, policy_version 28010 (0.0038) +[2024-06-18 01:10:51,994][12645] Fps is (10 sec: 39322.5, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 458997760. Throughput: 0: 40798.7. Samples: 459124860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 01:10:51,994][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 01:10:53,476][12883] Updated weights for policy 0, policy_version 28020 (0.0036) +[2024-06-18 01:10:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.2, 300 sec: 41098.9). Total num frames: 459227136. Throughput: 0: 40867.7. Samples: 459370800. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 01:10:56,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:10:57,399][12883] Updated weights for policy 0, policy_version 28030 (0.0035) +[2024-06-18 01:11:01,751][12883] Updated weights for policy 0, policy_version 28040 (0.0042) +[2024-06-18 01:11:01,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 459423744. Throughput: 0: 41093.3. Samples: 459494880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 01:11:01,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:11:05,395][12883] Updated weights for policy 0, policy_version 28050 (0.0036) +[2024-06-18 01:11:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40413.8, 300 sec: 41154.4). Total num frames: 459620352. Throughput: 0: 40942.5. Samples: 459736980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 01:11:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:11:09,729][12883] Updated weights for policy 0, policy_version 28060 (0.0036) +[2024-06-18 01:11:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40960.4, 300 sec: 41044.2). Total num frames: 459833344. Throughput: 0: 40882.2. Samples: 459984000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:11:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:11:13,418][12883] Updated weights for policy 0, policy_version 28070 (0.0032) +[2024-06-18 01:11:16,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 460013568. Throughput: 0: 40752.0. Samples: 460101180. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:11:16,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:11:17,944][12883] Updated weights for policy 0, policy_version 28080 (0.0033) +[2024-06-18 01:11:21,521][12883] Updated weights for policy 0, policy_version 28090 (0.0041) +[2024-06-18 01:11:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40964.2, 300 sec: 41098.9). Total num frames: 460242944. Throughput: 0: 40721.4. Samples: 460347940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:11:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:11:26,027][12883] Updated weights for policy 0, policy_version 28100 (0.0026) +[2024-06-18 01:11:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40960.0, 300 sec: 40988.1). Total num frames: 460439552. Throughput: 0: 40803.6. Samples: 460594560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:11:26,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:11:29,495][12883] Updated weights for policy 0, policy_version 28110 (0.0035) +[2024-06-18 01:11:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 460652544. Throughput: 0: 40690.3. Samples: 460711760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 01:11:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:11:33,941][12883] Updated weights for policy 0, policy_version 28120 (0.0046) +[2024-06-18 01:11:36,996][12645] Fps is (10 sec: 39313.0, 60 sec: 40685.5, 300 sec: 40987.5). Total num frames: 460832768. Throughput: 0: 40678.4. Samples: 460955480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 01:11:36,997][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:11:37,014][12862] Signal inference workers to stop experience collection... (6500 times) +[2024-06-18 01:11:37,014][12862] Signal inference workers to resume experience collection... 
(6500 times) +[2024-06-18 01:11:37,036][12883] InferenceWorker_p0-w0: stopping experience collection (6500 times) +[2024-06-18 01:11:37,037][12883] InferenceWorker_p0-w0: resuming experience collection (6500 times) +[2024-06-18 01:11:37,848][12883] Updated weights for policy 0, policy_version 28130 (0.0030) +[2024-06-18 01:11:41,994][12645] Fps is (10 sec: 36045.2, 60 sec: 40141.0, 300 sec: 40932.3). Total num frames: 461012992. Throughput: 0: 40798.7. Samples: 461206740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 01:11:41,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:11:42,379][12883] Updated weights for policy 0, policy_version 28140 (0.0052) +[2024-06-18 01:11:45,685][12883] Updated weights for policy 0, policy_version 28150 (0.0038) +[2024-06-18 01:11:46,994][12645] Fps is (10 sec: 42608.1, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 461258752. Throughput: 0: 40669.9. Samples: 461325020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 01:11:46,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:11:50,248][12883] Updated weights for policy 0, policy_version 28160 (0.0039) +[2024-06-18 01:11:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 40959.9, 300 sec: 40987.8). Total num frames: 461455360. Throughput: 0: 40865.8. Samples: 461575940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 01:11:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:11:53,641][12883] Updated weights for policy 0, policy_version 28170 (0.0038) +[2024-06-18 01:11:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40413.8, 300 sec: 40988.1). Total num frames: 461651968. Throughput: 0: 40807.5. Samples: 461820340. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 01:11:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:11:58,093][12883] Updated weights for policy 0, policy_version 28180 (0.0037) +[2024-06-18 01:12:01,687][12883] Updated weights for policy 0, policy_version 28190 (0.0037) +[2024-06-18 01:12:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 461864960. Throughput: 0: 40915.2. Samples: 461942360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 01:12:01,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:12:05,938][12883] Updated weights for policy 0, policy_version 28200 (0.0038) +[2024-06-18 01:12:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 462061568. Throughput: 0: 40919.1. Samples: 462189300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 01:12:06,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:12:09,545][12883] Updated weights for policy 0, policy_version 28210 (0.0035) +[2024-06-18 01:12:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 462274560. Throughput: 0: 40803.2. Samples: 462430700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 01:12:11,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:12:14,370][12883] Updated weights for policy 0, policy_version 28220 (0.0039) +[2024-06-18 01:12:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 462487552. Throughput: 0: 40948.9. Samples: 462554460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 01:12:16,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:12:17,530][12883] Updated weights for policy 0, policy_version 28230 (0.0035) +[2024-06-18 01:12:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.8, 300 sec: 40987.8). Total num frames: 462667776. Throughput: 0: 40865.5. Samples: 462794340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 01:12:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:12:22,539][12883] Updated weights for policy 0, policy_version 28240 (0.0025) +[2024-06-18 01:12:25,714][12883] Updated weights for policy 0, policy_version 28250 (0.0043) +[2024-06-18 01:12:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 462880768. Throughput: 0: 40750.9. Samples: 463040540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 01:12:26,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:12:30,342][12883] Updated weights for policy 0, policy_version 28260 (0.0035) +[2024-06-18 01:12:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 463093760. Throughput: 0: 40860.4. Samples: 463163740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 01:12:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:12:34,154][12883] Updated weights for policy 0, policy_version 28270 (0.0042) +[2024-06-18 01:12:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40961.5, 300 sec: 40988.1). Total num frames: 463290368. Throughput: 0: 40704.4. Samples: 463407640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 01:12:37,000][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:12:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000028277_463290368.pth... +[2024-06-18 01:12:37,090][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027677_453459968.pth +[2024-06-18 01:12:38,359][12883] Updated weights for policy 0, policy_version 28280 (0.0028) +[2024-06-18 01:12:41,994][12645] Fps is (10 sec: 39318.5, 60 sec: 41232.5, 300 sec: 40932.1). Total num frames: 463486976. Throughput: 0: 40622.0. Samples: 463648360. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 01:12:41,995][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:12:42,037][12883] Updated weights for policy 0, policy_version 28290 (0.0039) +[2024-06-18 01:12:46,175][12883] Updated weights for policy 0, policy_version 28300 (0.0032) +[2024-06-18 01:12:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40413.8, 300 sec: 40932.2). Total num frames: 463683584. Throughput: 0: 40723.0. Samples: 463774900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 01:12:46,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:12:49,907][12883] Updated weights for policy 0, policy_version 28310 (0.0034) +[2024-06-18 01:12:51,994][12645] Fps is (10 sec: 39324.6, 60 sec: 40413.9, 300 sec: 40932.5). Total num frames: 463880192. Throughput: 0: 40712.4. Samples: 464021360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 01:12:51,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:12:54,160][12883] Updated weights for policy 0, policy_version 28320 (0.0035) +[2024-06-18 01:12:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 464125952. Throughput: 0: 40614.6. Samples: 464258360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 01:12:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:12:58,324][12883] Updated weights for policy 0, policy_version 28330 (0.0033) +[2024-06-18 01:13:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 464306176. Throughput: 0: 40790.2. Samples: 464390020. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 01:13:02,007][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:13:02,557][12883] Updated weights for policy 0, policy_version 28340 (0.0050) +[2024-06-18 01:13:06,073][12883] Updated weights for policy 0, policy_version 28350 (0.0035) +[2024-06-18 01:13:06,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40687.0, 300 sec: 40932.3). Total num frames: 464502784. Throughput: 0: 40639.7. Samples: 464623120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 01:13:06,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:13:10,375][12883] Updated weights for policy 0, policy_version 28360 (0.0046) +[2024-06-18 01:13:11,577][12862] Signal inference workers to stop experience collection... (6550 times) +[2024-06-18 01:13:11,625][12883] InferenceWorker_p0-w0: stopping experience collection (6550 times) +[2024-06-18 01:13:11,627][12862] Signal inference workers to resume experience collection... (6550 times) +[2024-06-18 01:13:11,638][12883] InferenceWorker_p0-w0: resuming experience collection (6550 times) +[2024-06-18 01:13:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40959.9, 300 sec: 40932.2). Total num frames: 464732160. Throughput: 0: 40824.5. Samples: 464877640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 01:13:11,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:13:13,948][12883] Updated weights for policy 0, policy_version 28370 (0.0039) +[2024-06-18 01:13:16,999][12645] Fps is (10 sec: 40939.8, 60 sec: 40410.6, 300 sec: 40931.5). Total num frames: 464912384. Throughput: 0: 40792.5. Samples: 464999600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 01:13:17,004][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:13:18,633][12883] Updated weights for policy 0, policy_version 28380 (0.0032) +[2024-06-18 01:13:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 465108992. Throughput: 0: 40717.3. 
Samples: 465239920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 01:13:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:13:22,460][12883] Updated weights for policy 0, policy_version 28390 (0.0047) +[2024-06-18 01:13:26,508][12883] Updated weights for policy 0, policy_version 28400 (0.0043) +[2024-06-18 01:13:26,994][12645] Fps is (10 sec: 40979.5, 60 sec: 40687.0, 300 sec: 40876.7). Total num frames: 465321984. Throughput: 0: 40968.6. Samples: 465491920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 01:13:26,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:13:30,181][12883] Updated weights for policy 0, policy_version 28410 (0.0049) +[2024-06-18 01:13:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40686.8, 300 sec: 40932.2). Total num frames: 465534976. Throughput: 0: 40862.7. Samples: 465613720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 01:13:31,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:13:34,277][12883] Updated weights for policy 0, policy_version 28420 (0.0047) +[2024-06-18 01:13:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 465731584. Throughput: 0: 40907.1. Samples: 465862180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 01:13:36,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:13:38,443][12883] Updated weights for policy 0, policy_version 28430 (0.0042) +[2024-06-18 01:13:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40960.4, 300 sec: 40932.5). Total num frames: 465944576. Throughput: 0: 41089.3. Samples: 466107380. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 01:13:41,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:13:42,151][12883] Updated weights for policy 0, policy_version 28440 (0.0037) +[2024-06-18 01:13:46,131][12883] Updated weights for policy 0, policy_version 28450 (0.0034) +[2024-06-18 01:13:46,996][12645] Fps is (10 sec: 42588.7, 60 sec: 41231.6, 300 sec: 40987.5). Total num frames: 466157568. Throughput: 0: 41008.2. Samples: 466235480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 01:13:46,996][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:13:50,000][12883] Updated weights for policy 0, policy_version 28460 (0.0044) +[2024-06-18 01:13:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 40876.7). Total num frames: 466354176. Throughput: 0: 41149.2. Samples: 466474840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 01:13:51,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:13:53,908][12883] Updated weights for policy 0, policy_version 28470 (0.0031) +[2024-06-18 01:13:56,994][12645] Fps is (10 sec: 40969.1, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 466567168. Throughput: 0: 41106.7. Samples: 466727440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 01:13:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:13:57,560][12883] Updated weights for policy 0, policy_version 28480 (0.0040) +[2024-06-18 01:14:01,846][12883] Updated weights for policy 0, policy_version 28490 (0.0043) +[2024-06-18 01:14:01,999][12645] Fps is (10 sec: 42575.3, 60 sec: 41229.3, 300 sec: 40931.8). Total num frames: 466780160. Throughput: 0: 41213.6. Samples: 466854240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:14:02,000][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:14:05,315][12883] Updated weights for policy 0, policy_version 28500 (0.0040) +[2024-06-18 01:14:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.0, 300 sec: 40932.2). Total num frames: 466993152. Throughput: 0: 41202.1. Samples: 467094020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:14:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:14:09,838][12883] Updated weights for policy 0, policy_version 28510 (0.0036) +[2024-06-18 01:14:11,994][12645] Fps is (10 sec: 40982.5, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 467189760. Throughput: 0: 41305.4. Samples: 467350660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:14:11,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:14:12,995][12883] Updated weights for policy 0, policy_version 28520 (0.0036) +[2024-06-18 01:14:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41236.4, 300 sec: 40876.7). Total num frames: 467386368. Throughput: 0: 41234.3. Samples: 467469260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:14:16,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:14:17,961][12883] Updated weights for policy 0, policy_version 28530 (0.0030) +[2024-06-18 01:14:20,359][12862] Signal inference workers to stop experience collection... (6600 times) +[2024-06-18 01:14:20,359][12862] Signal inference workers to resume experience collection... (6600 times) +[2024-06-18 01:14:20,394][12883] InferenceWorker_p0-w0: stopping experience collection (6600 times) +[2024-06-18 01:14:20,394][12883] InferenceWorker_p0-w0: resuming experience collection (6600 times) +[2024-06-18 01:14:20,913][12883] Updated weights for policy 0, policy_version 28540 (0.0027) +[2024-06-18 01:14:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 40932.5). Total num frames: 467615744. 
Throughput: 0: 41163.6. Samples: 467714540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) +[2024-06-18 01:14:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:14:26,426][12883] Updated weights for policy 0, policy_version 28550 (0.0047) +[2024-06-18 01:14:26,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40960.1, 300 sec: 40876.7). Total num frames: 467779584. Throughput: 0: 41357.0. Samples: 467968440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) +[2024-06-18 01:14:26,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:14:28,976][12883] Updated weights for policy 0, policy_version 28560 (0.0031) +[2024-06-18 01:14:31,994][12645] Fps is (10 sec: 39321.0, 60 sec: 41233.1, 300 sec: 40876.7). Total num frames: 468008960. Throughput: 0: 40967.3. Samples: 468078920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) +[2024-06-18 01:14:31,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:14:34,335][12883] Updated weights for policy 0, policy_version 28570 (0.0041) +[2024-06-18 01:14:36,803][12883] Updated weights for policy 0, policy_version 28580 (0.0031) +[2024-06-18 01:14:36,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42052.3, 300 sec: 41098.9). Total num frames: 468254720. Throughput: 0: 41320.1. Samples: 468334240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) +[2024-06-18 01:14:36,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:14:37,025][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000028580_468254720.pth... +[2024-06-18 01:14:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027978_458391552.pth +[2024-06-18 01:14:41,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40687.0, 300 sec: 40821.1). Total num frames: 468385792. Throughput: 0: 41413.3. Samples: 468591040. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) +[2024-06-18 01:14:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:14:42,623][12883] Updated weights for policy 0, policy_version 28590 (0.0030) +[2024-06-18 01:14:45,164][12883] Updated weights for policy 0, policy_version 28600 (0.0042) +[2024-06-18 01:14:46,994][12645] Fps is (10 sec: 37682.6, 60 sec: 41234.5, 300 sec: 40932.2). Total num frames: 468631552. Throughput: 0: 40913.3. Samples: 468695120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) +[2024-06-18 01:14:46,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:14:50,324][12883] Updated weights for policy 0, policy_version 28610 (0.0046) +[2024-06-18 01:14:51,994][12645] Fps is (10 sec: 45875.9, 60 sec: 41506.3, 300 sec: 40987.8). Total num frames: 468844544. Throughput: 0: 41321.1. Samples: 468953460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) +[2024-06-18 01:14:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:14:52,809][12883] Updated weights for policy 0, policy_version 28620 (0.0032) +[2024-06-18 01:14:56,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 469008384. Throughput: 0: 41199.6. Samples: 469204640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 01:14:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:14:58,157][12883] Updated weights for policy 0, policy_version 28630 (0.0041) +[2024-06-18 01:15:00,517][12883] Updated weights for policy 0, policy_version 28640 (0.0037) +[2024-06-18 01:15:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41236.9, 300 sec: 40876.7). Total num frames: 469254144. Throughput: 0: 41191.2. Samples: 469322860. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 01:15:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:15:06,047][12883] Updated weights for policy 0, policy_version 28650 (0.0041) +[2024-06-18 01:15:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 40960.0, 300 sec: 40932.3). Total num frames: 469450752. Throughput: 0: 41367.4. Samples: 469576080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 01:15:06,994][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 01:15:08,623][12883] Updated weights for policy 0, policy_version 28660 (0.0031) +[2024-06-18 01:15:11,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 469647360. Throughput: 0: 41118.6. Samples: 469818780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 01:15:11,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:15:13,871][12883] Updated weights for policy 0, policy_version 28670 (0.0034) +[2024-06-18 01:15:16,607][12883] Updated weights for policy 0, policy_version 28680 (0.0037) +[2024-06-18 01:15:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 41779.1, 300 sec: 41044.2). Total num frames: 469893120. Throughput: 0: 41422.2. Samples: 469942920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 01:15:16,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:15:21,699][12883] Updated weights for policy 0, policy_version 28690 (0.0035) +[2024-06-18 01:15:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40686.8, 300 sec: 40932.2). Total num frames: 470056960. Throughput: 0: 41214.5. Samples: 470188900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 01:15:21,995][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:15:24,945][12883] Updated weights for policy 0, policy_version 28700 (0.0035) +[2024-06-18 01:15:26,994][12645] Fps is (10 sec: 36045.6, 60 sec: 41233.1, 300 sec: 40932.3). Total num frames: 470253568. Throughput: 0: 40922.8. Samples: 470432560. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 01:15:26,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:15:29,722][12883] Updated weights for policy 0, policy_version 28710 (0.0034) +[2024-06-18 01:15:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41506.2, 300 sec: 41043.3). Total num frames: 470499328. Throughput: 0: 41389.4. Samples: 470557640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 01:15:31,996][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:15:33,234][12883] Updated weights for policy 0, policy_version 28720 (0.0035) +[2024-06-18 01:15:36,994][12645] Fps is (10 sec: 42597.5, 60 sec: 40413.7, 300 sec: 40932.2). Total num frames: 470679552. Throughput: 0: 41042.9. Samples: 470800400. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) +[2024-06-18 01:15:36,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:15:37,463][12883] Updated weights for policy 0, policy_version 28730 (0.0031) +[2024-06-18 01:15:41,313][12883] Updated weights for policy 0, policy_version 28740 (0.0037) +[2024-06-18 01:15:41,994][12645] Fps is (10 sec: 37681.8, 60 sec: 41505.9, 300 sec: 40876.6). Total num frames: 470876160. Throughput: 0: 40922.7. Samples: 471046180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) +[2024-06-18 01:15:41,995][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:15:43,356][12862] Signal inference workers to stop experience collection... (6650 times) +[2024-06-18 01:15:43,396][12883] InferenceWorker_p0-w0: stopping experience collection (6650 times) +[2024-06-18 01:15:43,409][12862] Signal inference workers to resume experience collection... (6650 times) +[2024-06-18 01:15:43,413][12883] InferenceWorker_p0-w0: resuming experience collection (6650 times) +[2024-06-18 01:15:45,324][12883] Updated weights for policy 0, policy_version 28750 (0.0044) +[2024-06-18 01:15:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40987.7). Total num frames: 471089152. 
Throughput: 0: 41138.1. Samples: 471174080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) +[2024-06-18 01:15:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:15:49,215][12883] Updated weights for policy 0, policy_version 28760 (0.0035) +[2024-06-18 01:15:51,994][12645] Fps is (10 sec: 40961.3, 60 sec: 40686.8, 300 sec: 40876.7). Total num frames: 471285760. Throughput: 0: 40903.2. Samples: 471416720. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) +[2024-06-18 01:15:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:15:53,345][12883] Updated weights for policy 0, policy_version 28770 (0.0039) +[2024-06-18 01:15:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 40987.8). Total num frames: 471515136. Throughput: 0: 40979.1. Samples: 471662840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 01:15:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:15:57,679][12883] Updated weights for policy 0, policy_version 28780 (0.0049) +[2024-06-18 01:16:01,374][12883] Updated weights for policy 0, policy_version 28790 (0.0040) +[2024-06-18 01:16:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 471695360. Throughput: 0: 40950.0. Samples: 471785660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 01:16:01,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:16:05,554][12883] Updated weights for policy 0, policy_version 28800 (0.0035) +[2024-06-18 01:16:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 471908352. Throughput: 0: 40937.8. Samples: 472031100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 01:16:06,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:16:09,531][12883] Updated weights for policy 0, policy_version 28810 (0.0037) +[2024-06-18 01:16:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41233.2, 300 sec: 41043.3). Total num frames: 472121344. 
Throughput: 0: 40924.4. Samples: 472274160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 01:16:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:16:13,678][12883] Updated weights for policy 0, policy_version 28820 (0.0029) +[2024-06-18 01:16:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40414.0, 300 sec: 40932.2). Total num frames: 472317952. Throughput: 0: 40932.5. Samples: 472399600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 01:16:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:16:17,213][12883] Updated weights for policy 0, policy_version 28830 (0.0030) +[2024-06-18 01:16:21,630][12883] Updated weights for policy 0, policy_version 28840 (0.0039) +[2024-06-18 01:16:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.2, 300 sec: 40987.8). Total num frames: 472530944. Throughput: 0: 41188.6. Samples: 472653880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 01:16:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:16:25,032][12883] Updated weights for policy 0, policy_version 28850 (0.0045) +[2024-06-18 01:16:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 40987.8). Total num frames: 472743936. Throughput: 0: 41158.5. Samples: 472898300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 01:16:26,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:16:29,416][12883] Updated weights for policy 0, policy_version 28860 (0.0048) +[2024-06-18 01:16:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40687.0, 300 sec: 41043.6). Total num frames: 472940544. Throughput: 0: 41047.7. Samples: 473021220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 01:16:31,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:16:33,036][12883] Updated weights for policy 0, policy_version 28870 (0.0037) +[2024-06-18 01:16:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40960.1, 300 sec: 41098.8). Total num frames: 473137152. 
Throughput: 0: 40990.7. Samples: 473261300. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 01:16:36,994][12645] Avg episode reward: [(0, '0.028')] +[2024-06-18 01:16:37,068][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000028879_473153536.pth... +[2024-06-18 01:16:37,123][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000028277_463290368.pth +[2024-06-18 01:16:37,401][12883] Updated weights for policy 0, policy_version 28880 (0.0033) +[2024-06-18 01:16:41,601][12883] Updated weights for policy 0, policy_version 28890 (0.0035) +[2024-06-18 01:16:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.4, 300 sec: 40987.8). Total num frames: 473350144. Throughput: 0: 41039.2. Samples: 473509600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 01:16:41,994][12645] Avg episode reward: [(0, '0.028')] +[2024-06-18 01:16:45,356][12883] Updated weights for policy 0, policy_version 28900 (0.0034) +[2024-06-18 01:16:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 473563136. Throughput: 0: 41064.9. Samples: 473633580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 01:16:46,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:16:49,406][12883] Updated weights for policy 0, policy_version 28910 (0.0034) +[2024-06-18 01:16:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 473743360. Throughput: 0: 40872.2. Samples: 473870340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:16:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:16:53,643][12883] Updated weights for policy 0, policy_version 28920 (0.0026) +[2024-06-18 01:16:56,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 473956352. Throughput: 0: 40817.6. Samples: 474110960. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:16:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:16:57,774][12883] Updated weights for policy 0, policy_version 28930 (0.0034) +[2024-06-18 01:17:01,723][12883] Updated weights for policy 0, policy_version 28940 (0.0037) +[2024-06-18 01:17:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 474169344. Throughput: 0: 40915.1. Samples: 474240780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:17:01,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:17:05,542][12883] Updated weights for policy 0, policy_version 28950 (0.0039) +[2024-06-18 01:17:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 474365952. Throughput: 0: 40610.1. Samples: 474481340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:17:06,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:17:09,706][12883] Updated weights for policy 0, policy_version 28960 (0.0043) +[2024-06-18 01:17:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 474578944. Throughput: 0: 40645.9. Samples: 474727360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:17:11,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:17:12,284][12862] Signal inference workers to stop experience collection... (6700 times) +[2024-06-18 01:17:12,285][12862] Signal inference workers to resume experience collection... (6700 times) +[2024-06-18 01:17:12,314][12883] InferenceWorker_p0-w0: stopping experience collection (6700 times) +[2024-06-18 01:17:12,315][12883] InferenceWorker_p0-w0: resuming experience collection (6700 times) +[2024-06-18 01:17:13,402][12883] Updated weights for policy 0, policy_version 28970 (0.0037) +[2024-06-18 01:17:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 474775552. Throughput: 0: 40530.0. 
Samples: 474845080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:17:16,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 01:17:18,076][12883] Updated weights for policy 0, policy_version 28980 (0.0037) +[2024-06-18 01:17:21,277][12883] Updated weights for policy 0, policy_version 28990 (0.0042) +[2024-06-18 01:17:21,994][12645] Fps is (10 sec: 42597.5, 60 sec: 41232.9, 300 sec: 41098.8). Total num frames: 475004928. Throughput: 0: 40674.1. Samples: 475091640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:17:21,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:17:26,060][12883] Updated weights for policy 0, policy_version 29000 (0.0037) +[2024-06-18 01:17:26,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 475185152. Throughput: 0: 40745.8. Samples: 475343160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:17:26,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:17:29,359][12883] Updated weights for policy 0, policy_version 29010 (0.0027) +[2024-06-18 01:17:31,994][12645] Fps is (10 sec: 37683.6, 60 sec: 40686.8, 300 sec: 40987.8). Total num frames: 475381760. Throughput: 0: 40586.2. Samples: 475459960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 01:17:31,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:17:34,215][12883] Updated weights for policy 0, policy_version 29020 (0.0032) +[2024-06-18 01:17:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 41043.4). Total num frames: 475594752. Throughput: 0: 40874.2. Samples: 475709680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 01:17:36,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:17:37,219][12883] Updated weights for policy 0, policy_version 29030 (0.0030) +[2024-06-18 01:17:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40413.8, 300 sec: 40987.8). Total num frames: 475774976. Throughput: 0: 41015.5. 
Samples: 475956660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 01:17:41,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:17:42,316][12883] Updated weights for policy 0, policy_version 29040 (0.0037) +[2024-06-18 01:17:45,227][12883] Updated weights for policy 0, policy_version 29050 (0.0033) +[2024-06-18 01:17:46,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40413.8, 300 sec: 41043.3). Total num frames: 475987968. Throughput: 0: 40823.5. Samples: 476077840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 01:17:46,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:17:50,135][12883] Updated weights for policy 0, policy_version 29060 (0.0044) +[2024-06-18 01:17:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 476217344. Throughput: 0: 40992.9. Samples: 476326020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:17:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:17:53,190][12883] Updated weights for policy 0, policy_version 29070 (0.0026) +[2024-06-18 01:17:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 476397568. Throughput: 0: 41216.3. Samples: 476582100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:17:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:17:58,036][12883] Updated weights for policy 0, policy_version 29080 (0.0042) +[2024-06-18 01:18:01,081][12883] Updated weights for policy 0, policy_version 29090 (0.0029) +[2024-06-18 01:18:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 476626944. Throughput: 0: 41249.3. Samples: 476701300. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:18:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:18:05,658][12883] Updated weights for policy 0, policy_version 29100 (0.0034) +[2024-06-18 01:18:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 476823552. Throughput: 0: 41357.4. Samples: 476952720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:18:06,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:18:09,036][12883] Updated weights for policy 0, policy_version 29110 (0.0048) +[2024-06-18 01:18:11,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40960.0, 300 sec: 41099.5). Total num frames: 477036544. Throughput: 0: 41200.0. Samples: 477197160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 01:18:11,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:18:14,016][12883] Updated weights for policy 0, policy_version 29120 (0.0032) +[2024-06-18 01:18:16,886][12883] Updated weights for policy 0, policy_version 29130 (0.0035) +[2024-06-18 01:18:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 477265920. Throughput: 0: 41308.8. Samples: 477318860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 01:18:16,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:18:21,994][12645] Fps is (10 sec: 37682.4, 60 sec: 40140.8, 300 sec: 40987.8). Total num frames: 477413376. Throughput: 0: 41409.6. Samples: 477573120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 01:18:21,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:18:22,055][12883] Updated weights for policy 0, policy_version 29140 (0.0040) +[2024-06-18 01:18:24,740][12883] Updated weights for policy 0, policy_version 29150 (0.0031) +[2024-06-18 01:18:26,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 477642752. Throughput: 0: 41051.2. Samples: 477803960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 01:18:26,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:18:27,144][12862] Signal inference workers to stop experience collection... (6750 times) +[2024-06-18 01:18:27,192][12883] InferenceWorker_p0-w0: stopping experience collection (6750 times) +[2024-06-18 01:18:27,214][12862] Signal inference workers to resume experience collection... (6750 times) +[2024-06-18 01:18:27,214][12883] InferenceWorker_p0-w0: resuming experience collection (6750 times) +[2024-06-18 01:18:29,915][12883] Updated weights for policy 0, policy_version 29160 (0.0032) +[2024-06-18 01:18:31,996][12645] Fps is (10 sec: 44227.7, 60 sec: 41231.6, 300 sec: 41098.5). Total num frames: 477855744. Throughput: 0: 41262.9. Samples: 477934760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:18:31,996][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:18:33,138][12883] Updated weights for policy 0, policy_version 29170 (0.0034) +[2024-06-18 01:18:36,996][12645] Fps is (10 sec: 37674.8, 60 sec: 40412.3, 300 sec: 40931.9). Total num frames: 478019584. Throughput: 0: 41104.2. Samples: 478175800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:18:36,997][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:18:37,063][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000029177_478035968.pth... +[2024-06-18 01:18:37,127][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000028580_468254720.pth +[2024-06-18 01:18:37,719][12883] Updated weights for policy 0, policy_version 29180 (0.0048) +[2024-06-18 01:18:41,284][12883] Updated weights for policy 0, policy_version 29190 (0.0030) +[2024-06-18 01:18:41,994][12645] Fps is (10 sec: 40969.1, 60 sec: 41506.2, 300 sec: 41043.6). Total num frames: 478265344. Throughput: 0: 40785.4. Samples: 478417440. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:18:42,000][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:18:45,562][12883] Updated weights for policy 0, policy_version 29200 (0.0028) +[2024-06-18 01:18:46,994][12645] Fps is (10 sec: 45885.7, 60 sec: 41506.2, 300 sec: 41098.9). Total num frames: 478478336. Throughput: 0: 41088.1. Samples: 478550260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 01:18:46,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:18:49,451][12883] Updated weights for policy 0, policy_version 29210 (0.0026) +[2024-06-18 01:18:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 478674944. Throughput: 0: 40865.8. Samples: 478791680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 01:18:51,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:18:53,618][12883] Updated weights for policy 0, policy_version 29220 (0.0043) +[2024-06-18 01:18:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41233.2, 300 sec: 40988.5). Total num frames: 478871552. Throughput: 0: 40935.1. Samples: 479039240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 01:18:56,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:18:57,213][12883] Updated weights for policy 0, policy_version 29230 (0.0036) +[2024-06-18 01:19:01,819][12883] Updated weights for policy 0, policy_version 29240 (0.0029) +[2024-06-18 01:19:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 479068160. Throughput: 0: 40968.1. Samples: 479162420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 01:19:01,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:19:05,386][12883] Updated weights for policy 0, policy_version 29250 (0.0035) +[2024-06-18 01:19:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 479297536. Throughput: 0: 40749.8. Samples: 479406860. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 01:19:06,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:19:09,863][12883] Updated weights for policy 0, policy_version 29260 (0.0030) +[2024-06-18 01:19:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 479494144. Throughput: 0: 40976.5. Samples: 479647900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 01:19:11,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 01:19:13,187][12883] Updated weights for policy 0, policy_version 29270 (0.0035) +[2024-06-18 01:19:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40413.9, 300 sec: 40932.2). Total num frames: 479690752. Throughput: 0: 40863.6. Samples: 479773540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 01:19:16,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:19:17,805][12883] Updated weights for policy 0, policy_version 29280 (0.0039) +[2024-06-18 01:19:21,053][12883] Updated weights for policy 0, policy_version 29290 (0.0030) +[2024-06-18 01:19:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41098.8). Total num frames: 479903744. Throughput: 0: 40991.0. Samples: 480020300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 01:19:21,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:19:25,907][12883] Updated weights for policy 0, policy_version 29300 (0.0030) +[2024-06-18 01:19:26,996][12645] Fps is (10 sec: 40951.5, 60 sec: 40958.5, 300 sec: 40987.5). Total num frames: 480100352. Throughput: 0: 40978.0. Samples: 480261540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 01:19:26,996][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:19:29,045][12883] Updated weights for policy 0, policy_version 29310 (0.0041) +[2024-06-18 01:19:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40688.4, 300 sec: 40821.1). Total num frames: 480296960. Throughput: 0: 40716.8. Samples: 480382520. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 01:19:31,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:19:33,599][12883] Updated weights for policy 0, policy_version 29320 (0.0032) +[2024-06-18 01:19:36,994][12645] Fps is (10 sec: 42607.7, 60 sec: 41780.8, 300 sec: 41154.4). Total num frames: 480526336. Throughput: 0: 40992.9. Samples: 480636360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 01:19:36,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:19:37,003][12883] Updated weights for policy 0, policy_version 29330 (0.0028) +[2024-06-18 01:19:41,599][12883] Updated weights for policy 0, policy_version 29340 (0.0026) +[2024-06-18 01:19:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40959.9, 300 sec: 40987.8). Total num frames: 480722944. Throughput: 0: 40833.2. Samples: 480876740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 01:19:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:19:45,410][12883] Updated weights for policy 0, policy_version 29350 (0.0037) +[2024-06-18 01:19:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 480919552. Throughput: 0: 40760.0. Samples: 480996620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 01:19:46,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:19:49,497][12883] Updated weights for policy 0, policy_version 29360 (0.0036) +[2024-06-18 01:19:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 481132544. Throughput: 0: 40801.8. Samples: 481242940. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 01:19:51,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:19:53,714][12883] Updated weights for policy 0, policy_version 29370 (0.0048) +[2024-06-18 01:19:53,919][12862] Signal inference workers to stop experience collection... 
(6800 times) +[2024-06-18 01:19:53,953][12883] InferenceWorker_p0-w0: stopping experience collection (6800 times) +[2024-06-18 01:19:53,976][12862] Signal inference workers to resume experience collection... (6800 times) +[2024-06-18 01:19:53,986][12883] InferenceWorker_p0-w0: resuming experience collection (6800 times) +[2024-06-18 01:19:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 481329152. Throughput: 0: 41016.5. Samples: 481493640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 01:19:56,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-18 01:19:57,264][12883] Updated weights for policy 0, policy_version 29380 (0.0041) +[2024-06-18 01:20:01,569][12883] Updated weights for policy 0, policy_version 29390 (0.0037) +[2024-06-18 01:20:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 481542144. Throughput: 0: 40834.7. Samples: 481611100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 01:20:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:20:05,069][12883] Updated weights for policy 0, policy_version 29400 (0.0046) +[2024-06-18 01:20:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40687.1, 300 sec: 40987.8). Total num frames: 481738752. Throughput: 0: 40909.3. Samples: 481861220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 01:20:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:20:09,597][12883] Updated weights for policy 0, policy_version 29410 (0.0031) +[2024-06-18 01:20:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40686.9, 300 sec: 40821.2). Total num frames: 481935360. Throughput: 0: 41090.4. Samples: 482110520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 01:20:11,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:20:13,543][12883] Updated weights for policy 0, policy_version 29420 (0.0034) +[2024-06-18 01:20:16,995][12645] Fps is (10 sec: 40953.8, 60 sec: 40959.1, 300 sec: 40987.6). Total num frames: 482148352. Throughput: 0: 41104.0. Samples: 482232260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 01:20:16,996][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 01:20:17,495][12883] Updated weights for policy 0, policy_version 29430 (0.0028) +[2024-06-18 01:20:21,455][12883] Updated weights for policy 0, policy_version 29440 (0.0044) +[2024-06-18 01:20:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 482361344. Throughput: 0: 40950.2. Samples: 482479120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 01:20:21,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:20:25,287][12883] Updated weights for policy 0, policy_version 29450 (0.0023) +[2024-06-18 01:20:26,994][12645] Fps is (10 sec: 40965.8, 60 sec: 40961.5, 300 sec: 40876.7). Total num frames: 482557952. Throughput: 0: 41181.3. Samples: 482729900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 01:20:26,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:20:29,560][12883] Updated weights for policy 0, policy_version 29460 (0.0041) +[2024-06-18 01:20:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 482770944. Throughput: 0: 41215.1. Samples: 482851300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 01:20:31,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:20:33,653][12883] Updated weights for policy 0, policy_version 29470 (0.0035) +[2024-06-18 01:20:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40960.0, 300 sec: 41043.4). Total num frames: 482983936. Throughput: 0: 41109.4. Samples: 483092860. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 01:20:36,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:20:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000029479_482983936.pth... +[2024-06-18 01:20:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000028879_473153536.pth +[2024-06-18 01:20:37,261][12883] Updated weights for policy 0, policy_version 29480 (0.0044) +[2024-06-18 01:20:41,328][12883] Updated weights for policy 0, policy_version 29490 (0.0038) +[2024-06-18 01:20:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 483164160. Throughput: 0: 41097.7. Samples: 483343040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 01:20:41,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:20:44,963][12883] Updated weights for policy 0, policy_version 29500 (0.0028) +[2024-06-18 01:20:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 483393536. Throughput: 0: 41144.1. Samples: 483462580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 01:20:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:20:49,062][12883] Updated weights for policy 0, policy_version 29510 (0.0047) +[2024-06-18 01:20:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 483606528. Throughput: 0: 41245.2. Samples: 483717260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 01:20:51,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:20:52,946][12883] Updated weights for policy 0, policy_version 29520 (0.0032) +[2024-06-18 01:20:56,960][12883] Updated weights for policy 0, policy_version 29530 (0.0038) +[2024-06-18 01:20:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41506.0, 300 sec: 41098.8). Total num frames: 483819520. Throughput: 0: 41113.8. Samples: 483960640. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 01:20:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:21:01,406][12883] Updated weights for policy 0, policy_version 29540 (0.0042) +[2024-06-18 01:21:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 484016128. Throughput: 0: 41010.6. Samples: 484077680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 01:21:01,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:21:05,138][12883] Updated weights for policy 0, policy_version 29550 (0.0026) +[2024-06-18 01:21:06,994][12645] Fps is (10 sec: 37683.6, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 484196352. Throughput: 0: 40895.2. Samples: 484319400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 01:21:06,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:21:09,318][12883] Updated weights for policy 0, policy_version 29560 (0.0036) +[2024-06-18 01:21:10,809][12862] Signal inference workers to stop experience collection... (6850 times) +[2024-06-18 01:21:10,816][12862] Signal inference workers to resume experience collection... (6850 times) +[2024-06-18 01:21:10,840][12883] InferenceWorker_p0-w0: stopping experience collection (6850 times) +[2024-06-18 01:21:10,840][12883] InferenceWorker_p0-w0: resuming experience collection (6850 times) +[2024-06-18 01:21:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 484409344. Throughput: 0: 40736.4. Samples: 484563040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 01:21:11,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:21:13,463][12883] Updated weights for policy 0, policy_version 29570 (0.0044) +[2024-06-18 01:21:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40961.1, 300 sec: 40932.2). Total num frames: 484605952. Throughput: 0: 40818.3. Samples: 484688120. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 01:21:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:21:17,150][12883] Updated weights for policy 0, policy_version 29580 (0.0030) +[2024-06-18 01:21:21,518][12883] Updated weights for policy 0, policy_version 29590 (0.0043) +[2024-06-18 01:21:21,996][12645] Fps is (10 sec: 40951.2, 60 sec: 40958.5, 300 sec: 40931.9). Total num frames: 484818944. Throughput: 0: 41024.2. Samples: 484939040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 01:21:21,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:21:25,505][12883] Updated weights for policy 0, policy_version 29600 (0.0039) +[2024-06-18 01:21:26,994][12645] Fps is (10 sec: 44235.7, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 485048320. Throughput: 0: 40671.4. Samples: 485173260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) +[2024-06-18 01:21:26,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:21:29,934][12883] Updated weights for policy 0, policy_version 29610 (0.0028) +[2024-06-18 01:21:31,994][12645] Fps is (10 sec: 39330.7, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 485212160. Throughput: 0: 40922.3. Samples: 485304080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) +[2024-06-18 01:21:31,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:21:33,582][12883] Updated weights for policy 0, policy_version 29620 (0.0035) +[2024-06-18 01:21:36,994][12645] Fps is (10 sec: 36045.3, 60 sec: 40413.9, 300 sec: 40876.7). Total num frames: 485408768. Throughput: 0: 40499.6. Samples: 485539740. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) +[2024-06-18 01:21:36,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:21:37,795][12883] Updated weights for policy 0, policy_version 29630 (0.0027) +[2024-06-18 01:21:41,515][12883] Updated weights for policy 0, policy_version 29640 (0.0037) +[2024-06-18 01:21:41,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 485638144. Throughput: 0: 40586.2. Samples: 485787020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 01:21:41,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:21:45,601][12883] Updated weights for policy 0, policy_version 29650 (0.0037) +[2024-06-18 01:21:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40686.8, 300 sec: 40987.7). Total num frames: 485834752. Throughput: 0: 40812.4. Samples: 485914240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 01:21:46,995][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:21:49,309][12883] Updated weights for policy 0, policy_version 29660 (0.0039) +[2024-06-18 01:21:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 486047744. Throughput: 0: 40862.1. Samples: 486158200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 01:21:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:21:53,639][12883] Updated weights for policy 0, policy_version 29670 (0.0032) +[2024-06-18 01:21:56,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40140.9, 300 sec: 40876.7). Total num frames: 486227968. Throughput: 0: 40761.1. Samples: 486397280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 01:21:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:21:57,812][12883] Updated weights for policy 0, policy_version 29680 (0.0028) +[2024-06-18 01:22:01,365][12883] Updated weights for policy 0, policy_version 29690 (0.0031) +[2024-06-18 01:22:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40413.9, 300 sec: 40932.2). Total num frames: 486440960. Throughput: 0: 40651.0. Samples: 486517420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) +[2024-06-18 01:22:02,000][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:22:05,769][12883] Updated weights for policy 0, policy_version 29700 (0.0037) +[2024-06-18 01:22:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 486653952. Throughput: 0: 40714.1. Samples: 486771080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) +[2024-06-18 01:22:06,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:22:09,274][12883] Updated weights for policy 0, policy_version 29710 (0.0030) +[2024-06-18 01:22:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 486850560. Throughput: 0: 40922.8. Samples: 487014780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) +[2024-06-18 01:22:11,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:22:13,877][12883] Updated weights for policy 0, policy_version 29720 (0.0041) +[2024-06-18 01:22:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 41232.9, 300 sec: 40932.2). Total num frames: 487079936. Throughput: 0: 40865.7. Samples: 487143040. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) +[2024-06-18 01:22:16,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:22:17,238][12883] Updated weights for policy 0, policy_version 29730 (0.0024) +[2024-06-18 01:22:21,544][12883] Updated weights for policy 0, policy_version 29740 (0.0043) +[2024-06-18 01:22:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40961.6, 300 sec: 40987.8). Total num frames: 487276544. Throughput: 0: 41153.4. Samples: 487391640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 01:22:21,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:22:25,224][12883] Updated weights for policy 0, policy_version 29750 (0.0031) +[2024-06-18 01:22:26,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40687.1, 300 sec: 41043.3). Total num frames: 487489536. Throughput: 0: 40974.0. Samples: 487630840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 01:22:26,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:22:28,511][12862] Signal inference workers to stop experience collection... (6900 times) +[2024-06-18 01:22:28,512][12862] Signal inference workers to resume experience collection... (6900 times) +[2024-06-18 01:22:28,524][12883] InferenceWorker_p0-w0: stopping experience collection (6900 times) +[2024-06-18 01:22:28,524][12883] InferenceWorker_p0-w0: resuming experience collection (6900 times) +[2024-06-18 01:22:29,346][12883] Updated weights for policy 0, policy_version 29760 (0.0031) +[2024-06-18 01:22:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 487686144. Throughput: 0: 40965.1. Samples: 487757660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 01:22:31,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:22:33,160][12883] Updated weights for policy 0, policy_version 29770 (0.0035) +[2024-06-18 01:22:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 487882752. 
Throughput: 0: 41116.6. Samples: 488008440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 01:22:36,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:22:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000029779_487899136.pth... +[2024-06-18 01:22:37,082][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000029177_478035968.pth +[2024-06-18 01:22:37,235][12883] Updated weights for policy 0, policy_version 29780 (0.0039) +[2024-06-18 01:22:41,029][12883] Updated weights for policy 0, policy_version 29790 (0.0029) +[2024-06-18 01:22:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 488095744. Throughput: 0: 41344.9. Samples: 488257800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 01:22:41,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:22:45,111][12883] Updated weights for policy 0, policy_version 29800 (0.0046) +[2024-06-18 01:22:46,994][12645] Fps is (10 sec: 42597.6, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 488308736. Throughput: 0: 41479.1. Samples: 488383980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 01:22:46,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:22:49,035][12883] Updated weights for policy 0, policy_version 29810 (0.0038) +[2024-06-18 01:22:51,996][12645] Fps is (10 sec: 39312.5, 60 sec: 40685.5, 300 sec: 40987.5). Total num frames: 488488960. Throughput: 0: 41350.3. Samples: 488631940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 01:22:51,997][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:22:53,122][12883] Updated weights for policy 0, policy_version 29820 (0.0047) +[2024-06-18 01:22:56,936][12883] Updated weights for policy 0, policy_version 29830 (0.0042) +[2024-06-18 01:22:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.1, 300 sec: 41043.3). Total num frames: 488734720. Throughput: 0: 41479.9. Samples: 488881380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 01:22:56,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:23:00,963][12883] Updated weights for policy 0, policy_version 29840 (0.0051) +[2024-06-18 01:23:01,994][12645] Fps is (10 sec: 44247.1, 60 sec: 41506.2, 300 sec: 41043.3). Total num frames: 488931328. Throughput: 0: 41427.3. Samples: 489007260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 01:23:01,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:23:05,241][12883] Updated weights for policy 0, policy_version 29850 (0.0036) +[2024-06-18 01:23:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41232.9, 300 sec: 40987.7). Total num frames: 489127936. Throughput: 0: 41322.9. Samples: 489251180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 01:23:06,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:23:08,789][12883] Updated weights for policy 0, policy_version 29860 (0.0030) +[2024-06-18 01:23:11,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41506.0, 300 sec: 40932.2). Total num frames: 489340928. Throughput: 0: 41429.6. Samples: 489495180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 01:23:11,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:23:13,231][12883] Updated weights for policy 0, policy_version 29870 (0.0045) +[2024-06-18 01:23:16,953][12883] Updated weights for policy 0, policy_version 29880 (0.0034) +[2024-06-18 01:23:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 489553920. Throughput: 0: 41415.4. Samples: 489621360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 01:23:16,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:23:21,342][12883] Updated weights for policy 0, policy_version 29890 (0.0044) +[2024-06-18 01:23:21,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 489734144. Throughput: 0: 41330.2. Samples: 489868300. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 01:23:21,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:23:24,719][12883] Updated weights for policy 0, policy_version 29900 (0.0039) +[2024-06-18 01:23:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41233.0, 300 sec: 41043.6). Total num frames: 489963520. Throughput: 0: 41139.5. Samples: 490109080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 01:23:26,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 01:23:29,480][12883] Updated weights for policy 0, policy_version 29910 (0.0034) +[2024-06-18 01:23:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41154.7). Total num frames: 490160128. Throughput: 0: 41159.7. Samples: 490236160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 01:23:31,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:23:32,734][12883] Updated weights for policy 0, policy_version 29920 (0.0026) +[2024-06-18 01:23:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 490356736. Throughput: 0: 41111.0. Samples: 490481840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 01:23:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:23:37,123][12883] Updated weights for policy 0, policy_version 29930 (0.0034) +[2024-06-18 01:23:40,670][12883] Updated weights for policy 0, policy_version 29940 (0.0034) +[2024-06-18 01:23:41,994][12645] Fps is (10 sec: 42597.6, 60 sec: 41506.0, 300 sec: 41043.3). Total num frames: 490586112. Throughput: 0: 41055.9. Samples: 490728900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-18 01:23:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:23:44,850][12883] Updated weights for policy 0, policy_version 29950 (0.0037) +[2024-06-18 01:23:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.2, 300 sec: 41043.3). Total num frames: 490782720. Throughput: 0: 41068.0. Samples: 490855320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-18 01:23:46,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:23:48,578][12883] Updated weights for policy 0, policy_version 29960 (0.0040) +[2024-06-18 01:23:51,994][12645] Fps is (10 sec: 37683.4, 60 sec: 41234.6, 300 sec: 40987.8). Total num frames: 490962944. Throughput: 0: 41065.4. Samples: 491099120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-18 01:23:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:23:53,233][12883] Updated weights for policy 0, policy_version 29970 (0.0045) +[2024-06-18 01:23:56,516][12883] Updated weights for policy 0, policy_version 29980 (0.0027) +[2024-06-18 01:23:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 491192320. Throughput: 0: 41151.7. Samples: 491347000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-18 01:23:56,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 01:24:00,999][12883] Updated weights for policy 0, policy_version 29990 (0.0034) +[2024-06-18 01:24:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 41232.9, 300 sec: 41043.3). Total num frames: 491405312. Throughput: 0: 41275.5. Samples: 491478760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 01:24:01,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:24:04,599][12883] Updated weights for policy 0, policy_version 30000 (0.0034) +[2024-06-18 01:24:06,996][12645] Fps is (10 sec: 42588.8, 60 sec: 41504.7, 300 sec: 41098.5). Total num frames: 491618304. Throughput: 0: 41187.6. Samples: 491721840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 01:24:06,997][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:24:08,670][12883] Updated weights for policy 0, policy_version 30010 (0.0043) +[2024-06-18 01:24:11,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 491798528. Throughput: 0: 41467.6. Samples: 491975120. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 01:24:11,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:24:12,485][12883] Updated weights for policy 0, policy_version 30020 (0.0040) +[2024-06-18 01:24:14,157][12862] Signal inference workers to stop experience collection... (6950 times) +[2024-06-18 01:24:14,185][12883] InferenceWorker_p0-w0: stopping experience collection (6950 times) +[2024-06-18 01:24:14,210][12862] Signal inference workers to resume experience collection... (6950 times) +[2024-06-18 01:24:14,211][12883] InferenceWorker_p0-w0: resuming experience collection (6950 times) +[2024-06-18 01:24:16,596][12883] Updated weights for policy 0, policy_version 30030 (0.0036) +[2024-06-18 01:24:16,994][12645] Fps is (10 sec: 40969.6, 60 sec: 41233.2, 300 sec: 41098.9). Total num frames: 492027904. Throughput: 0: 41290.7. Samples: 492094240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 01:24:16,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:24:20,406][12883] Updated weights for policy 0, policy_version 30040 (0.0030) +[2024-06-18 01:24:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41779.1, 300 sec: 41154.7). Total num frames: 492240896. Throughput: 0: 41325.7. Samples: 492341500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 01:24:21,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:24:24,508][12883] Updated weights for policy 0, policy_version 30050 (0.0039) +[2024-06-18 01:24:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 492421120. Throughput: 0: 41289.9. Samples: 492586940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 01:24:26,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:24:28,368][12883] Updated weights for policy 0, policy_version 30060 (0.0046) +[2024-06-18 01:24:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41506.0, 300 sec: 41098.8). Total num frames: 492650496. Throughput: 0: 41257.7. 
Samples: 492711920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 01:24:31,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:24:32,218][12883] Updated weights for policy 0, policy_version 30070 (0.0032) +[2024-06-18 01:24:36,143][12883] Updated weights for policy 0, policy_version 30080 (0.0032) +[2024-06-18 01:24:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 492830720. Throughput: 0: 41328.1. Samples: 492958880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 01:24:36,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:24:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030080_492830720.pth... +[2024-06-18 01:24:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000029479_482983936.pth +[2024-06-18 01:24:40,479][12883] Updated weights for policy 0, policy_version 30090 (0.0030) +[2024-06-18 01:24:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 493043712. Throughput: 0: 41331.9. Samples: 493206940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 01:24:41,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:24:44,067][12883] Updated weights for policy 0, policy_version 30100 (0.0040) +[2024-06-18 01:24:46,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41233.0, 300 sec: 41098.9). Total num frames: 493256704. Throughput: 0: 41004.0. Samples: 493323940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 01:24:46,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:24:48,507][12883] Updated weights for policy 0, policy_version 30110 (0.0034) +[2024-06-18 01:24:51,879][12883] Updated weights for policy 0, policy_version 30120 (0.0037) +[2024-06-18 01:24:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41209.9). Total num frames: 493486080. Throughput: 0: 41090.4. Samples: 493570820. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 01:24:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:24:56,499][12883] Updated weights for policy 0, policy_version 30130 (0.0039) +[2024-06-18 01:24:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41098.9). Total num frames: 493666304. Throughput: 0: 40911.9. Samples: 493816160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 01:24:56,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:25:00,483][12883] Updated weights for policy 0, policy_version 30140 (0.0028) +[2024-06-18 01:25:01,994][12645] Fps is (10 sec: 36045.3, 60 sec: 40687.1, 300 sec: 41043.3). Total num frames: 493846528. Throughput: 0: 40986.7. Samples: 493938640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 01:25:01,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:25:04,476][12883] Updated weights for policy 0, policy_version 30150 (0.0034) +[2024-06-18 01:25:06,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40688.3, 300 sec: 41098.8). Total num frames: 494059520. Throughput: 0: 41121.2. Samples: 494191960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 01:25:06,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:25:08,589][12883] Updated weights for policy 0, policy_version 30160 (0.0039) +[2024-06-18 01:25:11,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41506.1, 300 sec: 41154.6). Total num frames: 494288896. Throughput: 0: 41101.7. Samples: 494436520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 01:25:11,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:25:12,219][12883] Updated weights for policy 0, policy_version 30170 (0.0030) +[2024-06-18 01:25:16,483][12883] Updated weights for policy 0, policy_version 30180 (0.0037) +[2024-06-18 01:25:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40686.8, 300 sec: 41043.3). Total num frames: 494469120. Throughput: 0: 41123.1. Samples: 494562460. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 01:25:16,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 01:25:19,997][12883] Updated weights for policy 0, policy_version 30190 (0.0037) +[2024-06-18 01:25:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 494682112. Throughput: 0: 40902.2. Samples: 494799480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 01:25:21,996][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:25:23,485][12862] Signal inference workers to stop experience collection... (7000 times) +[2024-06-18 01:25:23,536][12862] Signal inference workers to resume experience collection... (7000 times) +[2024-06-18 01:25:23,540][12883] InferenceWorker_p0-w0: stopping experience collection (7000 times) +[2024-06-18 01:25:23,551][12883] InferenceWorker_p0-w0: resuming experience collection (7000 times) +[2024-06-18 01:25:24,621][12883] Updated weights for policy 0, policy_version 30200 (0.0028) +[2024-06-18 01:25:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 494878720. Throughput: 0: 40966.7. Samples: 495050440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 01:25:26,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:25:28,137][12883] Updated weights for policy 0, policy_version 30210 (0.0046) +[2024-06-18 01:25:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 495091712. Throughput: 0: 41037.9. Samples: 495170640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 01:25:31,994][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 01:25:32,471][12883] Updated weights for policy 0, policy_version 30220 (0.0035) +[2024-06-18 01:25:36,224][12883] Updated weights for policy 0, policy_version 30230 (0.0026) +[2024-06-18 01:25:36,994][12645] Fps is (10 sec: 44237.1, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 495321088. 
Throughput: 0: 41071.2. Samples: 495419020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 01:25:36,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:25:40,836][12883] Updated weights for policy 0, policy_version 30240 (0.0033) +[2024-06-18 01:25:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 495517696. Throughput: 0: 41035.1. Samples: 495662740. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) +[2024-06-18 01:25:41,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 01:25:44,385][12883] Updated weights for policy 0, policy_version 30250 (0.0045) +[2024-06-18 01:25:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 495714304. Throughput: 0: 40964.9. Samples: 495782060. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) +[2024-06-18 01:25:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:25:48,828][12883] Updated weights for policy 0, policy_version 30260 (0.0030) +[2024-06-18 01:25:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.8, 300 sec: 40987.8). Total num frames: 495910912. Throughput: 0: 40839.2. Samples: 496029720. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) +[2024-06-18 01:25:51,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:25:52,793][12883] Updated weights for policy 0, policy_version 30270 (0.0030) +[2024-06-18 01:25:56,686][12883] Updated weights for policy 0, policy_version 30280 (0.0035) +[2024-06-18 01:25:57,000][12645] Fps is (10 sec: 40934.2, 60 sec: 40955.8, 300 sec: 41042.5). Total num frames: 496123904. Throughput: 0: 41004.6. Samples: 496281980. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) +[2024-06-18 01:25:57,000][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:26:00,683][12883] Updated weights for policy 0, policy_version 30290 (0.0033) +[2024-06-18 01:26:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 496336896. 
Throughput: 0: 40879.7. Samples: 496402040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 01:26:01,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:26:04,690][12883] Updated weights for policy 0, policy_version 30300 (0.0043) +[2024-06-18 01:26:06,994][12645] Fps is (10 sec: 40985.2, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 496533504. Throughput: 0: 41075.5. Samples: 496647880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 01:26:06,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:26:08,623][12883] Updated weights for policy 0, policy_version 30310 (0.0035) +[2024-06-18 01:26:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 496730112. Throughput: 0: 40946.2. Samples: 496893020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 01:26:11,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:26:12,626][12883] Updated weights for policy 0, policy_version 30320 (0.0038) +[2024-06-18 01:26:16,713][12883] Updated weights for policy 0, policy_version 30330 (0.0041) +[2024-06-18 01:26:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40960.0, 300 sec: 41043.6). Total num frames: 496926720. Throughput: 0: 41104.8. Samples: 497020360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 01:26:16,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:26:20,453][12883] Updated weights for policy 0, policy_version 30340 (0.0030) +[2024-06-18 01:26:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 497156096. Throughput: 0: 41108.4. Samples: 497268900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 01:26:21,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:26:24,702][12883] Updated weights for policy 0, policy_version 30350 (0.0029) +[2024-06-18 01:26:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 497369088. 
Throughput: 0: 41178.3. Samples: 497515760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 01:26:26,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:26:28,277][12883] Updated weights for policy 0, policy_version 30360 (0.0048) +[2024-06-18 01:26:31,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 497549312. Throughput: 0: 41425.2. Samples: 497646200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 01:26:31,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:26:32,478][12883] Updated weights for policy 0, policy_version 30370 (0.0036) +[2024-06-18 01:26:35,890][12883] Updated weights for policy 0, policy_version 30380 (0.0033) +[2024-06-18 01:26:36,994][12645] Fps is (10 sec: 40959.2, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 497778688. Throughput: 0: 41432.9. Samples: 497894200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 01:26:36,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:26:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030382_497778688.pth... +[2024-06-18 01:26:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000029779_487899136.pth +[2024-06-18 01:26:40,281][12883] Updated weights for policy 0, policy_version 30390 (0.0038) +[2024-06-18 01:26:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 497975296. Throughput: 0: 41287.9. Samples: 498139680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 01:26:41,994][12645] Avg episode reward: [(0, '0.000')] +[2024-06-18 01:26:43,901][12883] Updated weights for policy 0, policy_version 30400 (0.0037) +[2024-06-18 01:26:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 498188288. Throughput: 0: 41369.7. Samples: 498263680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 01:26:46,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:26:47,880][12862] Signal inference workers to stop experience collection... (7050 times) +[2024-06-18 01:26:47,880][12862] Signal inference workers to resume experience collection... (7050 times) +[2024-06-18 01:26:47,893][12883] InferenceWorker_p0-w0: stopping experience collection (7050 times) +[2024-06-18 01:26:47,893][12883] InferenceWorker_p0-w0: resuming experience collection (7050 times) +[2024-06-18 01:26:48,018][12883] Updated weights for policy 0, policy_version 30410 (0.0031) +[2024-06-18 01:26:51,978][12883] Updated weights for policy 0, policy_version 30420 (0.0033) +[2024-06-18 01:26:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 498401280. Throughput: 0: 41546.4. Samples: 498517460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 01:26:51,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:26:55,747][12883] Updated weights for policy 0, policy_version 30430 (0.0025) +[2024-06-18 01:26:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41237.4, 300 sec: 41209.9). Total num frames: 498597888. Throughput: 0: 41598.8. Samples: 498764960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 01:26:56,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:26:59,822][12883] Updated weights for policy 0, policy_version 30440 (0.0040) +[2024-06-18 01:27:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 498810880. Throughput: 0: 41489.4. Samples: 498887380. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) +[2024-06-18 01:27:01,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:27:03,654][12883] Updated weights for policy 0, policy_version 30450 (0.0030) +[2024-06-18 01:27:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 499023872. 
Throughput: 0: 41570.7. Samples: 499139580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) +[2024-06-18 01:27:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:27:07,611][12883] Updated weights for policy 0, policy_version 30460 (0.0034) +[2024-06-18 01:27:11,570][12883] Updated weights for policy 0, policy_version 30470 (0.0038) +[2024-06-18 01:27:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 499220480. Throughput: 0: 41564.4. Samples: 499386160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) +[2024-06-18 01:27:12,003][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:27:15,642][12883] Updated weights for policy 0, policy_version 30480 (0.0040) +[2024-06-18 01:27:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 499417088. Throughput: 0: 41315.7. Samples: 499505400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 01:27:16,994][12645] Avg episode reward: [(0, '0.028')] +[2024-06-18 01:27:19,871][12883] Updated weights for policy 0, policy_version 30490 (0.0040) +[2024-06-18 01:27:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41265.4). Total num frames: 499662848. Throughput: 0: 41232.1. Samples: 499749640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 01:27:21,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:27:23,511][12883] Updated weights for policy 0, policy_version 30500 (0.0042) +[2024-06-18 01:27:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 499826688. Throughput: 0: 41352.5. Samples: 500000540. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 01:27:26,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:27:27,733][12883] Updated weights for policy 0, policy_version 30510 (0.0037) +[2024-06-18 01:27:31,329][12883] Updated weights for policy 0, policy_version 30520 (0.0038) +[2024-06-18 01:27:31,994][12645] Fps is (10 sec: 37682.7, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 500039680. Throughput: 0: 41230.1. Samples: 500119040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 01:27:31,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 01:27:35,742][12883] Updated weights for policy 0, policy_version 30530 (0.0042) +[2024-06-18 01:27:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 41506.3, 300 sec: 41265.5). Total num frames: 500269056. Throughput: 0: 41197.8. Samples: 500371360. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) +[2024-06-18 01:27:36,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:27:38,981][12883] Updated weights for policy 0, policy_version 30540 (0.0040) +[2024-06-18 01:27:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 500449280. Throughput: 0: 41248.0. Samples: 500621120. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) +[2024-06-18 01:27:41,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:27:43,858][12883] Updated weights for policy 0, policy_version 30550 (0.0037) +[2024-06-18 01:27:46,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41506.1, 300 sec: 41321.3). Total num frames: 500678656. Throughput: 0: 41187.8. Samples: 500740840. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) +[2024-06-18 01:27:46,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:27:47,062][12883] Updated weights for policy 0, policy_version 30560 (0.0035) +[2024-06-18 01:27:51,546][12883] Updated weights for policy 0, policy_version 30570 (0.0037) +[2024-06-18 01:27:51,994][12645] Fps is (10 sec: 40959.2, 60 sec: 40959.8, 300 sec: 41098.8). Total num frames: 500858880. Throughput: 0: 41255.8. Samples: 500996100. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) +[2024-06-18 01:27:51,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:27:55,165][12883] Updated weights for policy 0, policy_version 30580 (0.0039) +[2024-06-18 01:27:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 501071872. Throughput: 0: 41316.8. Samples: 501245420. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) +[2024-06-18 01:27:56,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:27:59,547][12883] Updated weights for policy 0, policy_version 30590 (0.0046) +[2024-06-18 01:28:01,072][12862] Signal inference workers to stop experience collection... (7100 times) +[2024-06-18 01:28:01,125][12862] Signal inference workers to resume experience collection... (7100 times) +[2024-06-18 01:28:01,127][12883] InferenceWorker_p0-w0: stopping experience collection (7100 times) +[2024-06-18 01:28:01,152][12883] InferenceWorker_p0-w0: resuming experience collection (7100 times) +[2024-06-18 01:28:01,994][12645] Fps is (10 sec: 44237.9, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 501301248. Throughput: 0: 41354.3. Samples: 501366340. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) +[2024-06-18 01:28:01,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:28:03,333][12883] Updated weights for policy 0, policy_version 30600 (0.0042) +[2024-06-18 01:28:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 501481472. 
Throughput: 0: 41484.4. Samples: 501616440. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) +[2024-06-18 01:28:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:28:07,623][12883] Updated weights for policy 0, policy_version 30610 (0.0036) +[2024-06-18 01:28:11,018][12883] Updated weights for policy 0, policy_version 30620 (0.0027) +[2024-06-18 01:28:11,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 501678080. Throughput: 0: 41374.7. Samples: 501862400. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) +[2024-06-18 01:28:11,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:28:15,368][12883] Updated weights for policy 0, policy_version 30630 (0.0033) +[2024-06-18 01:28:16,997][12645] Fps is (10 sec: 44223.3, 60 sec: 41777.0, 300 sec: 41320.6). Total num frames: 501923840. Throughput: 0: 41462.6. Samples: 501984980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:28:16,997][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 01:28:19,039][12883] Updated weights for policy 0, policy_version 30640 (0.0043) +[2024-06-18 01:28:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 502120448. Throughput: 0: 41403.0. Samples: 502234500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:28:21,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:28:23,368][12883] Updated weights for policy 0, policy_version 30650 (0.0041) +[2024-06-18 01:28:26,819][12883] Updated weights for policy 0, policy_version 30660 (0.0032) +[2024-06-18 01:28:26,996][12645] Fps is (10 sec: 40963.4, 60 sec: 41777.6, 300 sec: 41265.1). Total num frames: 502333440. Throughput: 0: 41131.7. Samples: 502472140. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:28:26,997][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:28:31,594][12883] Updated weights for policy 0, policy_version 30670 (0.0030) +[2024-06-18 01:28:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 502513664. Throughput: 0: 41289.8. Samples: 502598880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:28:31,995][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:28:35,479][12883] Updated weights for policy 0, policy_version 30680 (0.0039) +[2024-06-18 01:28:36,994][12645] Fps is (10 sec: 37692.1, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 502710272. Throughput: 0: 41118.9. Samples: 502846440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 01:28:36,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:28:37,047][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030684_502726656.pth... +[2024-06-18 01:28:37,114][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030080_492830720.pth +[2024-06-18 01:28:39,399][12883] Updated weights for policy 0, policy_version 30690 (0.0033) +[2024-06-18 01:28:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 502939648. Throughput: 0: 40947.1. Samples: 503088040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 01:28:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:28:43,654][12883] Updated weights for policy 0, policy_version 30700 (0.0035) +[2024-06-18 01:28:46,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 503136256. Throughput: 0: 41156.3. Samples: 503218380. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 01:28:46,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:28:47,286][12883] Updated weights for policy 0, policy_version 30710 (0.0045) +[2024-06-18 01:28:51,367][12883] Updated weights for policy 0, policy_version 30720 (0.0042) +[2024-06-18 01:28:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 503332864. Throughput: 0: 41042.7. Samples: 503463360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 01:28:51,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:28:55,138][12883] Updated weights for policy 0, policy_version 30730 (0.0032) +[2024-06-18 01:28:57,000][12645] Fps is (10 sec: 42572.1, 60 sec: 41501.8, 300 sec: 41209.1). Total num frames: 503562240. Throughput: 0: 40999.6. Samples: 503707640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 01:28:57,001][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:28:59,154][12883] Updated weights for policy 0, policy_version 30740 (0.0035) +[2024-06-18 01:29:01,996][12645] Fps is (10 sec: 42590.5, 60 sec: 40958.7, 300 sec: 41154.4). Total num frames: 503758848. Throughput: 0: 41111.8. Samples: 503834960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 01:29:01,996][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:29:03,112][12883] Updated weights for policy 0, policy_version 30750 (0.0031) +[2024-06-18 01:29:06,976][12883] Updated weights for policy 0, policy_version 30760 (0.0043) +[2024-06-18 01:29:06,994][12645] Fps is (10 sec: 40985.8, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 503971840. Throughput: 0: 40868.1. Samples: 504073560. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 01:29:06,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:29:11,189][12883] Updated weights for policy 0, policy_version 30770 (0.0035) +[2024-06-18 01:29:11,994][12645] Fps is (10 sec: 40968.1, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 504168448. Throughput: 0: 41182.1. Samples: 504325240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 01:29:11,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:29:14,768][12883] Updated weights for policy 0, policy_version 30780 (0.0030) +[2024-06-18 01:29:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40416.0, 300 sec: 41043.3). Total num frames: 504348672. Throughput: 0: 41117.5. Samples: 504449160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 01:29:16,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:29:17,303][12862] Signal inference workers to stop experience collection... (7150 times) +[2024-06-18 01:29:17,304][12862] Signal inference workers to resume experience collection... (7150 times) +[2024-06-18 01:29:17,344][12883] InferenceWorker_p0-w0: stopping experience collection (7150 times) +[2024-06-18 01:29:17,344][12883] InferenceWorker_p0-w0: resuming experience collection (7150 times) +[2024-06-18 01:29:18,747][12883] Updated weights for policy 0, policy_version 30790 (0.0043) +[2024-06-18 01:29:21,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 504578048. Throughput: 0: 41096.7. Samples: 504695800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 01:29:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:29:22,725][12883] Updated weights for policy 0, policy_version 30800 (0.0037) +[2024-06-18 01:29:26,432][12883] Updated weights for policy 0, policy_version 30810 (0.0023) +[2024-06-18 01:29:26,994][12645] Fps is (10 sec: 45874.2, 60 sec: 41234.5, 300 sec: 41209.9). Total num frames: 504807424. 
Throughput: 0: 41211.0. Samples: 504942540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 01:29:26,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 01:29:31,161][12883] Updated weights for policy 0, policy_version 30820 (0.0041) +[2024-06-18 01:29:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 504971264. Throughput: 0: 40980.1. Samples: 505062480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 01:29:31,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:29:34,438][12883] Updated weights for policy 0, policy_version 30830 (0.0036) +[2024-06-18 01:29:36,996][12645] Fps is (10 sec: 40951.2, 60 sec: 41777.6, 300 sec: 41265.2). Total num frames: 505217024. Throughput: 0: 41086.4. Samples: 505312340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 01:29:36,997][12645] Avg episode reward: [(0, '0.031')] +[2024-06-18 01:29:39,334][12883] Updated weights for policy 0, policy_version 30840 (0.0041) +[2024-06-18 01:29:41,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 505413632. Throughput: 0: 41262.5. Samples: 505564200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:29:41,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:29:42,621][12883] Updated weights for policy 0, policy_version 30850 (0.0039) +[2024-06-18 01:29:46,974][12883] Updated weights for policy 0, policy_version 30860 (0.0026) +[2024-06-18 01:29:46,994][12645] Fps is (10 sec: 39330.6, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 505610240. Throughput: 0: 41250.2. Samples: 505691140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:29:46,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:29:50,338][12883] Updated weights for policy 0, policy_version 30870 (0.0042) +[2024-06-18 01:29:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 505823232. 
Throughput: 0: 41472.3. Samples: 505939820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:29:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:29:54,720][12883] Updated weights for policy 0, policy_version 30880 (0.0028) +[2024-06-18 01:29:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40964.3, 300 sec: 41265.5). Total num frames: 506019840. Throughput: 0: 41534.7. Samples: 506194300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:29:56,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:29:58,211][12883] Updated weights for policy 0, policy_version 30890 (0.0029) +[2024-06-18 01:30:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41234.4, 300 sec: 41265.5). Total num frames: 506232832. Throughput: 0: 41379.9. Samples: 506311260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 01:30:01,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:30:02,457][12883] Updated weights for policy 0, policy_version 30900 (0.0041) +[2024-06-18 01:30:06,075][12883] Updated weights for policy 0, policy_version 30910 (0.0044) +[2024-06-18 01:30:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 506462208. Throughput: 0: 41455.1. Samples: 506561280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 01:30:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:30:10,666][12883] Updated weights for policy 0, policy_version 30920 (0.0031) +[2024-06-18 01:30:11,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 506626048. Throughput: 0: 41658.7. Samples: 506817180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 01:30:11,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:30:13,835][12883] Updated weights for policy 0, policy_version 30930 (0.0026) +[2024-06-18 01:30:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41321.0). Total num frames: 506871808. 
Throughput: 0: 41598.6. Samples: 506934420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 01:30:16,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:30:18,798][12883] Updated weights for policy 0, policy_version 30940 (0.0038) +[2024-06-18 01:30:21,789][12883] Updated weights for policy 0, policy_version 30950 (0.0049) +[2024-06-18 01:30:21,994][12645] Fps is (10 sec: 45876.2, 60 sec: 41779.3, 300 sec: 41376.6). Total num frames: 507084800. Throughput: 0: 41704.4. Samples: 507188940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 01:30:21,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:30:26,529][12883] Updated weights for policy 0, policy_version 30960 (0.0038) +[2024-06-18 01:30:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 41265.4). Total num frames: 507265024. Throughput: 0: 41613.4. Samples: 507436800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 01:30:26,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:30:29,847][12883] Updated weights for policy 0, policy_version 30970 (0.0045) +[2024-06-18 01:30:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 41265.5). Total num frames: 507494400. Throughput: 0: 41534.7. Samples: 507560200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 01:30:31,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:30:34,407][12883] Updated weights for policy 0, policy_version 30980 (0.0043) +[2024-06-18 01:30:36,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41234.7, 300 sec: 41265.5). Total num frames: 507691008. Throughput: 0: 41576.6. Samples: 507810760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 01:30:36,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:30:37,104][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030988_507707392.pth... 
+[2024-06-18 01:30:37,173][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030382_497778688.pth +[2024-06-18 01:30:37,685][12883] Updated weights for policy 0, policy_version 30990 (0.0043) +[2024-06-18 01:30:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41233.2, 300 sec: 41265.5). Total num frames: 507887616. Throughput: 0: 41431.5. Samples: 508058720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 01:30:41,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:30:42,034][12883] Updated weights for policy 0, policy_version 31000 (0.0042) +[2024-06-18 01:30:45,735][12883] Updated weights for policy 0, policy_version 31010 (0.0025) +[2024-06-18 01:30:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41376.6). Total num frames: 508116992. Throughput: 0: 41517.8. Samples: 508179560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 01:30:46,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:30:49,768][12883] Updated weights for policy 0, policy_version 31020 (0.0044) +[2024-06-18 01:30:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.2, 300 sec: 41266.3). Total num frames: 508297216. Throughput: 0: 41460.1. Samples: 508426980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 01:30:51,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 01:30:53,874][12883] Updated weights for policy 0, policy_version 31030 (0.0031) +[2024-06-18 01:30:56,994][12645] Fps is (10 sec: 37682.5, 60 sec: 41232.9, 300 sec: 41209.9). Total num frames: 508493824. Throughput: 0: 41318.6. Samples: 508676520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 01:30:56,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:30:57,966][12883] Updated weights for policy 0, policy_version 31040 (0.0039) +[2024-06-18 01:31:01,745][12883] Updated weights for policy 0, policy_version 31050 (0.0029) +[2024-06-18 01:31:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 508723200. Throughput: 0: 41413.8. Samples: 508798040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:31:01,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:31:05,841][12883] Updated weights for policy 0, policy_version 31060 (0.0038) +[2024-06-18 01:31:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 508936192. Throughput: 0: 41204.7. Samples: 509043160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:31:06,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:31:07,512][12862] Signal inference workers to stop experience collection... (7200 times) +[2024-06-18 01:31:07,540][12883] InferenceWorker_p0-w0: stopping experience collection (7200 times) +[2024-06-18 01:31:07,578][12862] Signal inference workers to resume experience collection... (7200 times) +[2024-06-18 01:31:07,578][12883] InferenceWorker_p0-w0: resuming experience collection (7200 times) +[2024-06-18 01:31:09,719][12883] Updated weights for policy 0, policy_version 31070 (0.0035) +[2024-06-18 01:31:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 509132800. Throughput: 0: 41265.3. Samples: 509293740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:31:11,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:31:13,770][12883] Updated weights for policy 0, policy_version 31080 (0.0040) +[2024-06-18 01:31:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 509329408. 
Throughput: 0: 41133.2. Samples: 509411200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:31:17,000][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:31:17,615][12883] Updated weights for policy 0, policy_version 31090 (0.0041) +[2024-06-18 01:31:21,551][12883] Updated weights for policy 0, policy_version 31100 (0.0034) +[2024-06-18 01:31:22,000][12645] Fps is (10 sec: 42572.1, 60 sec: 41228.7, 300 sec: 41320.1). Total num frames: 509558784. Throughput: 0: 41223.1. Samples: 509666060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:31:22,000][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:31:25,646][12883] Updated weights for policy 0, policy_version 31110 (0.0045) +[2024-06-18 01:31:26,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41506.2, 300 sec: 41376.6). Total num frames: 509755392. Throughput: 0: 41119.6. Samples: 509909100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:31:26,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 01:31:29,552][12883] Updated weights for policy 0, policy_version 31120 (0.0031) +[2024-06-18 01:31:31,994][12645] Fps is (10 sec: 39346.2, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 509952000. Throughput: 0: 41187.1. Samples: 510032980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:31:31,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:31:33,838][12883] Updated weights for policy 0, policy_version 31130 (0.0034) +[2024-06-18 01:31:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 510164992. Throughput: 0: 41147.1. Samples: 510278600. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:31:36,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:31:37,439][12883] Updated weights for policy 0, policy_version 31140 (0.0040) +[2024-06-18 01:31:41,480][12883] Updated weights for policy 0, policy_version 31150 (0.0040) +[2024-06-18 01:31:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 510377984. Throughput: 0: 41069.5. Samples: 510524640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:31:41,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:31:45,801][12883] Updated weights for policy 0, policy_version 31160 (0.0035) +[2024-06-18 01:31:46,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 510558208. Throughput: 0: 41190.5. Samples: 510651620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:31:46,995][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:31:49,217][12883] Updated weights for policy 0, policy_version 31170 (0.0030) +[2024-06-18 01:31:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 41265.4). Total num frames: 510771200. Throughput: 0: 41259.1. Samples: 510899820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:31:51,994][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 01:31:53,507][12883] Updated weights for policy 0, policy_version 31180 (0.0032) +[2024-06-18 01:31:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 41779.3, 300 sec: 41321.0). Total num frames: 511000576. Throughput: 0: 41223.1. Samples: 511148780. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:31:56,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 01:31:57,519][12883] Updated weights for policy 0, policy_version 31190 (0.0042) +[2024-06-18 01:32:01,231][12883] Updated weights for policy 0, policy_version 31200 (0.0030) +[2024-06-18 01:32:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 41265.4). Total num frames: 511197184. Throughput: 0: 41476.0. Samples: 511277620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:32:01,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 01:32:05,310][12883] Updated weights for policy 0, policy_version 31210 (0.0047) +[2024-06-18 01:32:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 511410176. Throughput: 0: 41096.4. Samples: 511515140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:32:06,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:32:09,372][12883] Updated weights for policy 0, policy_version 31220 (0.0036) +[2024-06-18 01:32:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 511606784. Throughput: 0: 41248.8. Samples: 511765300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:32:11,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 01:32:13,044][12883] Updated weights for policy 0, policy_version 31230 (0.0042) +[2024-06-18 01:32:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 511803392. Throughput: 0: 41097.4. Samples: 511882360. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:32:16,994][12645] Avg episode reward: [(0, '0.031')] +[2024-06-18 01:32:17,828][12883] Updated weights for policy 0, policy_version 31240 (0.0052) +[2024-06-18 01:32:21,015][12883] Updated weights for policy 0, policy_version 31250 (0.0023) +[2024-06-18 01:32:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 41510.5, 300 sec: 41432.1). Total num frames: 512049152. Throughput: 0: 41228.5. Samples: 512133880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 01:32:21,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 01:32:25,546][12883] Updated weights for policy 0, policy_version 31260 (0.0031) +[2024-06-18 01:32:26,994][12645] Fps is (10 sec: 40958.9, 60 sec: 40959.8, 300 sec: 41265.5). Total num frames: 512212992. Throughput: 0: 41631.4. Samples: 512398060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 01:32:26,995][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:32:28,882][12883] Updated weights for policy 0, policy_version 31270 (0.0030) +[2024-06-18 01:32:31,994][12645] Fps is (10 sec: 37682.8, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 512425984. Throughput: 0: 41232.1. Samples: 512507060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 01:32:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:32:33,426][12883] Updated weights for policy 0, policy_version 31280 (0.0036) +[2024-06-18 01:32:36,858][12883] Updated weights for policy 0, policy_version 31290 (0.0047) +[2024-06-18 01:32:36,994][12645] Fps is (10 sec: 44237.9, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 512655360. Throughput: 0: 41428.5. Samples: 512764100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 01:32:36,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:32:37,049][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000031291_512671744.pth... 
+[2024-06-18 01:32:37,095][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030684_502726656.pth +[2024-06-18 01:32:41,213][12883] Updated weights for policy 0, policy_version 31300 (0.0036) +[2024-06-18 01:32:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 512835584. Throughput: 0: 41481.3. Samples: 513015440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 01:32:41,995][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:32:44,090][12862] Signal inference workers to stop experience collection... (7250 times) +[2024-06-18 01:32:44,090][12862] Signal inference workers to resume experience collection... (7250 times) +[2024-06-18 01:32:44,106][12883] InferenceWorker_p0-w0: stopping experience collection (7250 times) +[2024-06-18 01:32:44,107][12883] InferenceWorker_p0-w0: resuming experience collection (7250 times) +[2024-06-18 01:32:44,706][12883] Updated weights for policy 0, policy_version 31310 (0.0044) +[2024-06-18 01:32:46,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.3, 300 sec: 41432.1). Total num frames: 513081344. Throughput: 0: 41226.6. Samples: 513132820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 01:32:46,995][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 01:32:49,340][12883] Updated weights for policy 0, policy_version 31320 (0.0033) +[2024-06-18 01:32:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 513245184. Throughput: 0: 41517.3. Samples: 513383420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 01:32:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:32:52,821][12883] Updated weights for policy 0, policy_version 31330 (0.0033) +[2024-06-18 01:32:56,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 513458176. Throughput: 0: 41395.0. Samples: 513628080. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 01:32:56,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:32:57,546][12883] Updated weights for policy 0, policy_version 31340 (0.0031) +[2024-06-18 01:33:00,736][12883] Updated weights for policy 0, policy_version 31350 (0.0034) +[2024-06-18 01:33:01,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 513687552. Throughput: 0: 41585.2. Samples: 513753700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 01:33:01,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:33:05,633][12883] Updated weights for policy 0, policy_version 31360 (0.0035) +[2024-06-18 01:33:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 513867776. Throughput: 0: 41461.2. Samples: 513999640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 01:33:06,997][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:33:08,615][12883] Updated weights for policy 0, policy_version 31370 (0.0043) +[2024-06-18 01:33:11,994][12645] Fps is (10 sec: 37683.6, 60 sec: 40960.0, 300 sec: 41154.8). Total num frames: 514064384. Throughput: 0: 40826.9. Samples: 514235260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 01:33:11,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:33:13,655][12883] Updated weights for policy 0, policy_version 31380 (0.0036) +[2024-06-18 01:33:16,552][12883] Updated weights for policy 0, policy_version 31390 (0.0031) +[2024-06-18 01:33:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.0, 300 sec: 41265.5). Total num frames: 514293760. Throughput: 0: 41195.5. Samples: 514360860. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 01:33:16,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:33:21,741][12883] Updated weights for policy 0, policy_version 31400 (0.0029) +[2024-06-18 01:33:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40413.9, 300 sec: 41154.7). Total num frames: 514473984. Throughput: 0: 41022.7. Samples: 514610120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 01:33:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:33:24,536][12883] Updated weights for policy 0, policy_version 31410 (0.0039) +[2024-06-18 01:33:26,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41506.3, 300 sec: 41321.0). Total num frames: 514703360. Throughput: 0: 40704.5. Samples: 514847140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 23.0) +[2024-06-18 01:33:26,996][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 01:33:29,496][12883] Updated weights for policy 0, policy_version 31420 (0.0034) +[2024-06-18 01:33:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 514899968. Throughput: 0: 40890.4. Samples: 514972880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 23.0) +[2024-06-18 01:33:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:33:32,783][12883] Updated weights for policy 0, policy_version 31430 (0.0032) +[2024-06-18 01:33:36,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 515096576. Throughput: 0: 40694.7. Samples: 515214680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 23.0) +[2024-06-18 01:33:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:33:37,191][12883] Updated weights for policy 0, policy_version 31440 (0.0044) +[2024-06-18 01:33:40,848][12883] Updated weights for policy 0, policy_version 31450 (0.0044) +[2024-06-18 01:33:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 515309568. Throughput: 0: 40577.4. Samples: 515454060. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 23.0) +[2024-06-18 01:33:41,998][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:33:45,280][12883] Updated weights for policy 0, policy_version 31460 (0.0030) +[2024-06-18 01:33:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40140.9, 300 sec: 41209.9). Total num frames: 515489792. Throughput: 0: 40626.7. Samples: 515581900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 01:33:46,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:33:48,675][12883] Updated weights for policy 0, policy_version 31470 (0.0038) +[2024-06-18 01:33:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40686.9, 300 sec: 41099.7). Total num frames: 515686400. Throughput: 0: 40564.4. Samples: 515825040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 01:33:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:33:53,582][12883] Updated weights for policy 0, policy_version 31480 (0.0038) +[2024-06-18 01:33:56,554][12883] Updated weights for policy 0, policy_version 31490 (0.0033) +[2024-06-18 01:33:56,994][12645] Fps is (10 sec: 45875.1, 60 sec: 41506.2, 300 sec: 41321.3). Total num frames: 515948544. Throughput: 0: 40720.0. Samples: 516067660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 01:33:56,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:34:01,437][12883] Updated weights for policy 0, policy_version 31500 (0.0039) +[2024-06-18 01:34:01,995][12645] Fps is (10 sec: 42592.9, 60 sec: 40413.0, 300 sec: 41154.2). Total num frames: 516112384. Throughput: 0: 40909.1. Samples: 516201820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 01:34:01,995][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:34:02,422][12862] Signal inference workers to stop experience collection... (7300 times) +[2024-06-18 01:34:02,422][12862] Signal inference workers to resume experience collection... 
(7300 times) +[2024-06-18 01:34:02,452][12883] InferenceWorker_p0-w0: stopping experience collection (7300 times) +[2024-06-18 01:34:02,452][12883] InferenceWorker_p0-w0: resuming experience collection (7300 times) +[2024-06-18 01:34:04,396][12883] Updated weights for policy 0, policy_version 31510 (0.0047) +[2024-06-18 01:34:06,994][12645] Fps is (10 sec: 36045.1, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 516308992. Throughput: 0: 40690.7. Samples: 516441200. Policy #0 lag: (min: 1.0, avg: 9.4, max: 19.0) +[2024-06-18 01:34:06,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:34:09,395][12883] Updated weights for policy 0, policy_version 31520 (0.0040) +[2024-06-18 01:34:11,994][12645] Fps is (10 sec: 44243.0, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 516554752. Throughput: 0: 41000.9. Samples: 516692180. Policy #0 lag: (min: 1.0, avg: 9.4, max: 19.0) +[2024-06-18 01:34:11,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:34:12,258][12883] Updated weights for policy 0, policy_version 31530 (0.0029) +[2024-06-18 01:34:16,996][12645] Fps is (10 sec: 40950.0, 60 sec: 40412.4, 300 sec: 41154.1). Total num frames: 516718592. Throughput: 0: 41090.3. Samples: 516822040. Policy #0 lag: (min: 1.0, avg: 9.4, max: 19.0) +[2024-06-18 01:34:17,005][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:34:17,395][12883] Updated weights for policy 0, policy_version 31540 (0.0034) +[2024-06-18 01:34:20,431][12883] Updated weights for policy 0, policy_version 31550 (0.0035) +[2024-06-18 01:34:21,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 516947968. Throughput: 0: 41003.9. Samples: 517059860. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 19.0) +[2024-06-18 01:34:21,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:34:25,197][12883] Updated weights for policy 0, policy_version 31560 (0.0041) +[2024-06-18 01:34:26,994][12645] Fps is (10 sec: 44247.2, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 517160960. Throughput: 0: 41380.4. Samples: 517316180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) +[2024-06-18 01:34:26,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:34:28,382][12883] Updated weights for policy 0, policy_version 31570 (0.0039) +[2024-06-18 01:34:31,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40687.0, 300 sec: 41099.2). Total num frames: 517341184. Throughput: 0: 41301.8. Samples: 517440480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) +[2024-06-18 01:34:31,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:34:33,529][12883] Updated weights for policy 0, policy_version 31580 (0.0035) +[2024-06-18 01:34:36,714][12883] Updated weights for policy 0, policy_version 31590 (0.0034) +[2024-06-18 01:34:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 517570560. Throughput: 0: 41086.8. Samples: 517673940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) +[2024-06-18 01:34:36,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:34:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000031590_517570560.pth... +[2024-06-18 01:34:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030988_507707392.pth +[2024-06-18 01:34:41,386][12883] Updated weights for policy 0, policy_version 31600 (0.0028) +[2024-06-18 01:34:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 517767168. Throughput: 0: 41384.4. Samples: 517929960. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) +[2024-06-18 01:34:41,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:34:44,615][12883] Updated weights for policy 0, policy_version 31610 (0.0030) +[2024-06-18 01:34:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 517963776. Throughput: 0: 41071.5. Samples: 518049980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 01:34:46,994][12645] Avg episode reward: [(0, '0.029')] +[2024-06-18 01:34:49,271][12883] Updated weights for policy 0, policy_version 31620 (0.0042) +[2024-06-18 01:34:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 518176768. Throughput: 0: 41143.9. Samples: 518292680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 01:34:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:34:52,832][12883] Updated weights for policy 0, policy_version 31630 (0.0031) +[2024-06-18 01:34:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40413.9, 300 sec: 41154.4). Total num frames: 518373376. Throughput: 0: 41047.9. Samples: 518539340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 01:34:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:34:57,023][12883] Updated weights for policy 0, policy_version 31640 (0.0036) +[2024-06-18 01:35:01,017][12883] Updated weights for policy 0, policy_version 31650 (0.0043) +[2024-06-18 01:35:01,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40961.0, 300 sec: 41043.3). Total num frames: 518569984. Throughput: 0: 40909.4. Samples: 518662860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 01:35:01,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:35:04,748][12883] Updated weights for policy 0, policy_version 31660 (0.0036) +[2024-06-18 01:35:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 518766592. Throughput: 0: 41140.1. Samples: 518911160. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 01:35:06,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:35:08,678][12883] Updated weights for policy 0, policy_version 31670 (0.0046) +[2024-06-18 01:35:11,994][12645] Fps is (10 sec: 44236.1, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 519012352. Throughput: 0: 40972.9. Samples: 519159960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 01:35:11,996][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:35:12,897][12883] Updated weights for policy 0, policy_version 31680 (0.0049) +[2024-06-18 01:35:16,923][12883] Updated weights for policy 0, policy_version 31690 (0.0030) +[2024-06-18 01:35:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41507.8, 300 sec: 41098.8). Total num frames: 519208960. Throughput: 0: 41049.3. Samples: 519287700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 01:35:16,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:35:20,709][12883] Updated weights for policy 0, policy_version 31700 (0.0048) +[2024-06-18 01:35:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 519405568. Throughput: 0: 41237.3. Samples: 519529620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 01:35:21,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:35:24,842][12883] Updated weights for policy 0, policy_version 31710 (0.0032) +[2024-06-18 01:35:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 519634944. Throughput: 0: 41064.5. Samples: 519777860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 01:35:26,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 01:35:28,287][12862] Signal inference workers to stop experience collection... (7350 times) +[2024-06-18 01:35:28,289][12862] Signal inference workers to resume experience collection... 
(7350 times) +[2024-06-18 01:35:28,328][12883] InferenceWorker_p0-w0: stopping experience collection (7350 times) +[2024-06-18 01:35:28,328][12883] InferenceWorker_p0-w0: resuming experience collection (7350 times) +[2024-06-18 01:35:28,427][12883] Updated weights for policy 0, policy_version 31720 (0.0044) +[2024-06-18 01:35:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 519815168. Throughput: 0: 41078.1. Samples: 519898500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 01:35:31,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:35:32,765][12883] Updated weights for policy 0, policy_version 31730 (0.0031) +[2024-06-18 01:35:36,105][12883] Updated weights for policy 0, policy_version 31740 (0.0029) +[2024-06-18 01:35:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 520028160. Throughput: 0: 41224.0. Samples: 520147760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 01:35:36,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:35:40,566][12883] Updated weights for policy 0, policy_version 31750 (0.0028) +[2024-06-18 01:35:41,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 520224768. Throughput: 0: 41471.7. Samples: 520405560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 01:35:41,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:35:43,796][12883] Updated weights for policy 0, policy_version 31760 (0.0047) +[2024-06-18 01:35:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 520437760. Throughput: 0: 41316.3. Samples: 520522100. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 01:35:46,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:35:48,536][12883] Updated weights for policy 0, policy_version 31770 (0.0035) +[2024-06-18 01:35:51,446][12883] Updated weights for policy 0, policy_version 31780 (0.0029) +[2024-06-18 01:35:51,994][12645] Fps is (10 sec: 45874.0, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 520683520. Throughput: 0: 41376.3. Samples: 520773100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-18 01:35:51,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:35:56,304][12883] Updated weights for policy 0, policy_version 31790 (0.0036) +[2024-06-18 01:35:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 520863744. Throughput: 0: 41380.1. Samples: 521022060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-18 01:35:56,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:35:59,351][12883] Updated weights for policy 0, policy_version 31800 (0.0026) +[2024-06-18 01:36:01,996][12645] Fps is (10 sec: 37675.3, 60 sec: 41504.5, 300 sec: 41098.5). Total num frames: 521060352. Throughput: 0: 41077.5. Samples: 521136280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-18 01:36:01,996][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 01:36:04,738][12883] Updated weights for policy 0, policy_version 31810 (0.0030) +[2024-06-18 01:36:06,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 41265.5). Total num frames: 521306112. Throughput: 0: 41417.8. Samples: 521393420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) +[2024-06-18 01:36:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:36:07,383][12883] Updated weights for policy 0, policy_version 31820 (0.0038) +[2024-06-18 01:36:11,994][12645] Fps is (10 sec: 39330.1, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 521453568. Throughput: 0: 41450.6. Samples: 521643140. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) +[2024-06-18 01:36:11,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:36:12,825][12883] Updated weights for policy 0, policy_version 31830 (0.0042) +[2024-06-18 01:36:15,245][12883] Updated weights for policy 0, policy_version 31840 (0.0033) +[2024-06-18 01:36:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 41155.3). Total num frames: 521699328. Throughput: 0: 41332.5. Samples: 521758460. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) +[2024-06-18 01:36:16,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:36:20,562][12883] Updated weights for policy 0, policy_version 31850 (0.0037) +[2024-06-18 01:36:21,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42052.2, 300 sec: 41265.4). Total num frames: 521928704. Throughput: 0: 41457.7. Samples: 522013360. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) +[2024-06-18 01:36:21,998][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:36:23,432][12883] Updated weights for policy 0, policy_version 31860 (0.0040) +[2024-06-18 01:36:26,994][12645] Fps is (10 sec: 36045.1, 60 sec: 40414.0, 300 sec: 41043.3). Total num frames: 522059776. Throughput: 0: 41243.5. Samples: 522261520. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) +[2024-06-18 01:36:26,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:36:28,611][12883] Updated weights for policy 0, policy_version 31870 (0.0033) +[2024-06-18 01:36:31,636][12883] Updated weights for policy 0, policy_version 31880 (0.0041) +[2024-06-18 01:36:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 41265.5). Total num frames: 522338304. Throughput: 0: 41212.2. Samples: 522376640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) +[2024-06-18 01:36:31,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:36:36,486][12862] Signal inference workers to stop experience collection... 
(7400 times) +[2024-06-18 01:36:36,516][12883] InferenceWorker_p0-w0: stopping experience collection (7400 times) +[2024-06-18 01:36:36,549][12862] Signal inference workers to resume experience collection... (7400 times) +[2024-06-18 01:36:36,552][12883] InferenceWorker_p0-w0: resuming experience collection (7400 times) +[2024-06-18 01:36:36,554][12883] Updated weights for policy 0, policy_version 31890 (0.0043) +[2024-06-18 01:36:36,994][12645] Fps is (10 sec: 45873.5, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 522518528. Throughput: 0: 41256.3. Samples: 522629640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) +[2024-06-18 01:36:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:36:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000031893_522534912.pth... +[2024-06-18 01:36:37,051][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000031291_512671744.pth +[2024-06-18 01:36:39,518][12883] Updated weights for policy 0, policy_version 31900 (0.0029) +[2024-06-18 01:36:41,994][12645] Fps is (10 sec: 32767.8, 60 sec: 40686.8, 300 sec: 41043.3). Total num frames: 522665984. Throughput: 0: 41308.9. Samples: 522880960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) +[2024-06-18 01:36:41,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:36:44,171][12883] Updated weights for policy 0, policy_version 31910 (0.0041) +[2024-06-18 01:36:46,994][12645] Fps is (10 sec: 40961.4, 60 sec: 41506.3, 300 sec: 41209.9). Total num frames: 522928128. Throughput: 0: 41272.3. Samples: 522993440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) +[2024-06-18 01:36:46,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:36:47,577][12883] Updated weights for policy 0, policy_version 31920 (0.0043) +[2024-06-18 01:36:51,836][12883] Updated weights for policy 0, policy_version 31930 (0.0042) +[2024-06-18 01:36:51,994][12645] Fps is (10 sec: 47513.1, 60 sec: 40960.0, 300 sec: 41154.4). 
Total num frames: 523141120. Throughput: 0: 41270.2. Samples: 523250580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 01:36:51,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:36:55,459][12883] Updated weights for policy 0, policy_version 31940 (0.0037) +[2024-06-18 01:36:56,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40959.9, 300 sec: 41098.9). Total num frames: 523321344. Throughput: 0: 41139.1. Samples: 523494400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 01:36:56,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:36:59,857][12883] Updated weights for policy 0, policy_version 31950 (0.0031) +[2024-06-18 01:37:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41507.6, 300 sec: 41154.4). Total num frames: 523550720. Throughput: 0: 41371.0. Samples: 523620160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 01:37:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:37:03,373][12883] Updated weights for policy 0, policy_version 31960 (0.0035) +[2024-06-18 01:37:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 523763712. Throughput: 0: 41417.4. Samples: 523877140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 01:37:06,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:37:07,777][12883] Updated weights for policy 0, policy_version 31970 (0.0029) +[2024-06-18 01:37:11,053][12883] Updated weights for policy 0, policy_version 31980 (0.0045) +[2024-06-18 01:37:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 41265.5). Total num frames: 523976704. Throughput: 0: 41174.6. Samples: 524114380. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 01:37:11,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:37:15,719][12883] Updated weights for policy 0, policy_version 31990 (0.0045) +[2024-06-18 01:37:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 41098.8). 
Total num frames: 524173312. Throughput: 0: 41611.9. Samples: 524249180. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 01:37:16,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:37:18,985][12883] Updated weights for policy 0, policy_version 32000 (0.0042) +[2024-06-18 01:37:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 524386304. Throughput: 0: 41609.5. Samples: 524502060. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 01:37:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:37:23,439][12883] Updated weights for policy 0, policy_version 32010 (0.0035) +[2024-06-18 01:37:26,773][12883] Updated weights for policy 0, policy_version 32020 (0.0034) +[2024-06-18 01:37:27,000][12645] Fps is (10 sec: 44209.7, 60 sec: 42594.0, 300 sec: 41320.2). Total num frames: 524615680. Throughput: 0: 41565.5. Samples: 524751660. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 01:37:27,009][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:37:31,088][12883] Updated weights for policy 0, policy_version 32030 (0.0045) +[2024-06-18 01:37:31,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 524795904. Throughput: 0: 41965.3. Samples: 524881880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:37:31,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:37:35,187][12883] Updated weights for policy 0, policy_version 32040 (0.0034) +[2024-06-18 01:37:36,994][12645] Fps is (10 sec: 39346.2, 60 sec: 41506.4, 300 sec: 41265.5). Total num frames: 525008896. Throughput: 0: 41740.2. Samples: 525128880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:37:36,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:37:38,896][12883] Updated weights for policy 0, policy_version 32050 (0.0033) +[2024-06-18 01:37:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 41098.9). 
Total num frames: 525205504. Throughput: 0: 41903.5. Samples: 525380060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:37:41,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:37:42,976][12883] Updated weights for policy 0, policy_version 32060 (0.0031) +[2024-06-18 01:37:44,732][12862] Signal inference workers to stop experience collection... (7450 times) +[2024-06-18 01:37:44,732][12862] Signal inference workers to resume experience collection... (7450 times) +[2024-06-18 01:37:44,751][12883] InferenceWorker_p0-w0: stopping experience collection (7450 times) +[2024-06-18 01:37:44,751][12883] InferenceWorker_p0-w0: resuming experience collection (7450 times) +[2024-06-18 01:37:46,656][12883] Updated weights for policy 0, policy_version 32070 (0.0042) +[2024-06-18 01:37:46,994][12645] Fps is (10 sec: 42597.4, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 525434880. Throughput: 0: 41852.4. Samples: 525503520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:37:46,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:37:51,028][12883] Updated weights for policy 0, policy_version 32080 (0.0036) +[2024-06-18 01:37:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 525615104. Throughput: 0: 41585.7. Samples: 525748500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:37:51,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:37:54,700][12883] Updated weights for policy 0, policy_version 32090 (0.0037) +[2024-06-18 01:37:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41209.9). Total num frames: 525844480. Throughput: 0: 41833.7. Samples: 525996900. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 01:37:56,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 01:37:59,141][12883] Updated weights for policy 0, policy_version 32100 (0.0050) +[2024-06-18 01:38:01,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 526041088. Throughput: 0: 41626.7. Samples: 526122380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 01:38:01,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:38:02,394][12883] Updated weights for policy 0, policy_version 32110 (0.0039) +[2024-06-18 01:38:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 526237696. Throughput: 0: 41525.4. Samples: 526370700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 01:38:06,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:38:07,363][12883] Updated weights for policy 0, policy_version 32120 (0.0041) +[2024-06-18 01:38:10,390][12883] Updated weights for policy 0, policy_version 32130 (0.0032) +[2024-06-18 01:38:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 526467072. Throughput: 0: 41466.1. Samples: 526617380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 01:38:11,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:38:15,109][12883] Updated weights for policy 0, policy_version 32140 (0.0029) +[2024-06-18 01:38:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 526663680. Throughput: 0: 41532.0. Samples: 526750820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:38:16,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:38:18,045][12883] Updated weights for policy 0, policy_version 32150 (0.0036) +[2024-06-18 01:38:21,996][12645] Fps is (10 sec: 39313.1, 60 sec: 41231.6, 300 sec: 41209.6). Total num frames: 526860288. Throughput: 0: 41509.8. Samples: 526996920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:38:21,996][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 01:38:23,043][12883] Updated weights for policy 0, policy_version 32160 (0.0031) +[2024-06-18 01:38:25,875][12883] Updated weights for policy 0, policy_version 32170 (0.0029) +[2024-06-18 01:38:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41237.3, 300 sec: 41321.0). Total num frames: 527089664. Throughput: 0: 41253.0. Samples: 527236440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:38:26,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:38:31,129][12883] Updated weights for policy 0, policy_version 32180 (0.0030) +[2024-06-18 01:38:31,994][12645] Fps is (10 sec: 40969.1, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 527269888. Throughput: 0: 41328.1. Samples: 527363280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 01:38:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:38:33,819][12883] Updated weights for policy 0, policy_version 32190 (0.0033) +[2024-06-18 01:38:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 527482880. Throughput: 0: 41483.3. Samples: 527615240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 01:38:36,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 01:38:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000032196_527499264.pth... +[2024-06-18 01:38:37,055][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000031590_517570560.pth +[2024-06-18 01:38:38,888][12883] Updated weights for policy 0, policy_version 32200 (0.0046) +[2024-06-18 01:38:41,762][12883] Updated weights for policy 0, policy_version 32210 (0.0042) +[2024-06-18 01:38:41,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42052.4, 300 sec: 41487.6). Total num frames: 527728640. Throughput: 0: 41150.3. Samples: 527848660. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 01:38:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:38:46,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40687.0, 300 sec: 41321.0). Total num frames: 527876096. Throughput: 0: 41180.3. Samples: 527975500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 01:38:46,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:38:47,097][12883] Updated weights for policy 0, policy_version 32220 (0.0052) +[2024-06-18 01:38:49,768][12883] Updated weights for policy 0, policy_version 32230 (0.0030) +[2024-06-18 01:38:51,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 528105472. Throughput: 0: 41147.1. Samples: 528222320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 01:38:51,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:38:54,876][12883] Updated weights for policy 0, policy_version 32240 (0.0041) +[2024-06-18 01:38:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41233.1, 300 sec: 41376.7). Total num frames: 528318464. Throughput: 0: 41272.9. Samples: 528474660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:38:56,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:38:57,983][12883] Updated weights for policy 0, policy_version 32250 (0.0045) +[2024-06-18 01:39:01,996][12645] Fps is (10 sec: 39312.6, 60 sec: 40958.4, 300 sec: 41320.7). Total num frames: 528498688. Throughput: 0: 40929.9. Samples: 528592760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:39:01,997][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:39:02,618][12883] Updated weights for policy 0, policy_version 32260 (0.0037) +[2024-06-18 01:39:06,083][12883] Updated weights for policy 0, policy_version 32270 (0.0040) +[2024-06-18 01:39:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 528728064. Throughput: 0: 40907.4. Samples: 528837660. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:39:06,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:39:11,015][12883] Updated weights for policy 0, policy_version 32280 (0.0041) +[2024-06-18 01:39:11,996][12645] Fps is (10 sec: 44236.8, 60 sec: 41231.5, 300 sec: 41432.1). Total num frames: 528941056. Throughput: 0: 41102.4. Samples: 529086140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:39:11,997][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:39:14,099][12883] Updated weights for policy 0, policy_version 32290 (0.0029) +[2024-06-18 01:39:16,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40959.9, 300 sec: 41265.5). Total num frames: 529121280. Throughput: 0: 40909.3. Samples: 529204200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:39:16,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:39:18,906][12883] Updated weights for policy 0, policy_version 32300 (0.0043) +[2024-06-18 01:39:21,994][12645] Fps is (10 sec: 40969.6, 60 sec: 41507.7, 300 sec: 41321.0). Total num frames: 529350656. Throughput: 0: 40872.0. Samples: 529454480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 01:39:21,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:39:22,003][12883] Updated weights for policy 0, policy_version 32310 (0.0035) +[2024-06-18 01:39:26,386][12862] Signal inference workers to stop experience collection... (7500 times) +[2024-06-18 01:39:26,387][12862] Signal inference workers to resume experience collection... (7500 times) +[2024-06-18 01:39:26,443][12883] InferenceWorker_p0-w0: stopping experience collection (7500 times) +[2024-06-18 01:39:26,444][12883] InferenceWorker_p0-w0: resuming experience collection (7500 times) +[2024-06-18 01:39:26,786][12883] Updated weights for policy 0, policy_version 32320 (0.0049) +[2024-06-18 01:39:26,996][12645] Fps is (10 sec: 40951.1, 60 sec: 40685.4, 300 sec: 41320.7). Total num frames: 529530880. 
Throughput: 0: 41190.7. Samples: 529702340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 01:39:26,997][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:39:29,866][12883] Updated weights for policy 0, policy_version 32330 (0.0040) +[2024-06-18 01:39:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 529760256. Throughput: 0: 41051.1. Samples: 529822800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 01:39:31,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:39:34,866][12883] Updated weights for policy 0, policy_version 32340 (0.0037) +[2024-06-18 01:39:36,994][12645] Fps is (10 sec: 42607.9, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 529956864. Throughput: 0: 41187.1. Samples: 530075740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 01:39:36,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:39:37,752][12883] Updated weights for policy 0, policy_version 32350 (0.0037) +[2024-06-18 01:39:41,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40140.7, 300 sec: 41265.5). Total num frames: 530137088. Throughput: 0: 41129.7. Samples: 530325500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 01:39:41,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:39:42,506][12883] Updated weights for policy 0, policy_version 32360 (0.0030) +[2024-06-18 01:39:45,799][12883] Updated weights for policy 0, policy_version 32370 (0.0047) +[2024-06-18 01:39:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 530366464. Throughput: 0: 41156.8. Samples: 530444720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 01:39:46,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:39:50,240][12883] Updated weights for policy 0, policy_version 32380 (0.0037) +[2024-06-18 01:39:51,994][12645] Fps is (10 sec: 44237.6, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 530579456. 
Throughput: 0: 41261.9. Samples: 530694440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 01:39:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:39:54,115][12883] Updated weights for policy 0, policy_version 32390 (0.0035) +[2024-06-18 01:39:56,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40686.9, 300 sec: 41321.0). Total num frames: 530759680. Throughput: 0: 41195.8. Samples: 530939860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 01:39:56,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:39:58,181][12883] Updated weights for policy 0, policy_version 32400 (0.0038) +[2024-06-18 01:40:01,827][12883] Updated weights for policy 0, policy_version 32410 (0.0051) +[2024-06-18 01:40:01,994][12645] Fps is (10 sec: 42597.6, 60 sec: 41780.7, 300 sec: 41487.6). Total num frames: 531005440. Throughput: 0: 41544.9. Samples: 531073720. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) +[2024-06-18 01:40:01,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:40:05,892][12883] Updated weights for policy 0, policy_version 32420 (0.0039) +[2024-06-18 01:40:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 531185664. Throughput: 0: 41398.5. Samples: 531317420. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) +[2024-06-18 01:40:06,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 01:40:09,869][12883] Updated weights for policy 0, policy_version 32430 (0.0035) +[2024-06-18 01:40:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41234.6, 300 sec: 41376.5). Total num frames: 531415040. Throughput: 0: 41321.2. Samples: 531561700. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) +[2024-06-18 01:40:11,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 01:40:13,901][12883] Updated weights for policy 0, policy_version 32440 (0.0027) +[2024-06-18 01:40:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 531611648. 
Throughput: 0: 41419.9. Samples: 531686700. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) +[2024-06-18 01:40:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:40:17,866][12883] Updated weights for policy 0, policy_version 32450 (0.0027) +[2024-06-18 01:40:21,625][12883] Updated weights for policy 0, policy_version 32460 (0.0044) +[2024-06-18 01:40:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41232.9, 300 sec: 41321.0). Total num frames: 531824640. Throughput: 0: 41273.7. Samples: 531933060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 01:40:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:40:26,080][12883] Updated weights for policy 0, policy_version 32470 (0.0044) +[2024-06-18 01:40:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41507.6, 300 sec: 41376.5). Total num frames: 532021248. Throughput: 0: 41288.0. Samples: 532183460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 01:40:26,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:40:29,738][12883] Updated weights for policy 0, policy_version 32480 (0.0033) +[2024-06-18 01:40:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 532234240. Throughput: 0: 41364.8. Samples: 532306140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 01:40:31,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:40:34,079][12883] Updated weights for policy 0, policy_version 32490 (0.0032) +[2024-06-18 01:40:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 532430848. Throughput: 0: 41312.7. Samples: 532553520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 01:40:36,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:40:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000032498_532447232.pth... 
+[2024-06-18 01:40:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000031893_522534912.pth +[2024-06-18 01:40:37,639][12883] Updated weights for policy 0, policy_version 32500 (0.0032) +[2024-06-18 01:40:41,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41376.6). Total num frames: 532643840. Throughput: 0: 41353.5. Samples: 532800760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 01:40:41,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:40:42,004][12883] Updated weights for policy 0, policy_version 32510 (0.0043) +[2024-06-18 01:40:45,389][12883] Updated weights for policy 0, policy_version 32520 (0.0047) +[2024-06-18 01:40:46,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 532840448. Throughput: 0: 41074.3. Samples: 532922060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:40:46,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 01:40:49,877][12883] Updated weights for policy 0, policy_version 32530 (0.0032) +[2024-06-18 01:40:51,996][12645] Fps is (10 sec: 40950.5, 60 sec: 41231.4, 300 sec: 41320.7). Total num frames: 533053440. Throughput: 0: 41244.2. Samples: 533173500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:40:51,996][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 01:40:53,136][12883] Updated weights for policy 0, policy_version 32540 (0.0042) +[2024-06-18 01:40:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41265.8). Total num frames: 533233664. Throughput: 0: 41435.6. Samples: 533426300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:40:56,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:40:57,690][12883] Updated weights for policy 0, policy_version 32550 (0.0039) +[2024-06-18 01:40:59,544][12862] Signal inference workers to stop experience collection... 
(7550 times) +[2024-06-18 01:40:59,544][12862] Signal inference workers to resume experience collection... (7550 times) +[2024-06-18 01:40:59,563][12883] InferenceWorker_p0-w0: stopping experience collection (7550 times) +[2024-06-18 01:40:59,592][12883] InferenceWorker_p0-w0: resuming experience collection (7550 times) +[2024-06-18 01:41:00,977][12883] Updated weights for policy 0, policy_version 32560 (0.0041) +[2024-06-18 01:41:01,994][12645] Fps is (10 sec: 42608.1, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 533479424. Throughput: 0: 41349.0. Samples: 533547400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 01:41:01,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:41:05,777][12883] Updated weights for policy 0, policy_version 32570 (0.0044) +[2024-06-18 01:41:06,996][12645] Fps is (10 sec: 44227.0, 60 sec: 41504.6, 300 sec: 41431.8). Total num frames: 533676032. Throughput: 0: 41341.6. Samples: 533793520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 01:41:06,996][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 01:41:08,820][12883] Updated weights for policy 0, policy_version 32580 (0.0047) +[2024-06-18 01:41:11,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 533856256. Throughput: 0: 41301.9. Samples: 534042040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 01:41:11,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:41:14,192][12883] Updated weights for policy 0, policy_version 32590 (0.0036) +[2024-06-18 01:41:16,644][12883] Updated weights for policy 0, policy_version 32600 (0.0036) +[2024-06-18 01:41:16,994][12645] Fps is (10 sec: 45885.2, 60 sec: 42052.3, 300 sec: 41376.5). Total num frames: 534134784. Throughput: 0: 41265.8. Samples: 534163100. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 01:41:16,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 01:41:21,768][12883] Updated weights for policy 0, policy_version 32610 (0.0027) +[2024-06-18 01:41:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 534282240. Throughput: 0: 41503.1. Samples: 534421160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 01:41:21,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:41:24,716][12883] Updated weights for policy 0, policy_version 32621 (0.0040) +[2024-06-18 01:41:26,994][12645] Fps is (10 sec: 36045.0, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 534495232. Throughput: 0: 41447.0. Samples: 534665880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 01:41:26,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:41:30,218][12883] Updated weights for policy 0, policy_version 32631 (0.0043) +[2024-06-18 01:41:31,994][12645] Fps is (10 sec: 47513.8, 60 sec: 42052.3, 300 sec: 41487.7). Total num frames: 534757376. Throughput: 0: 41611.5. Samples: 534794580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 01:41:31,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:41:32,483][12883] Updated weights for policy 0, policy_version 32641 (0.0034) +[2024-06-18 01:41:36,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40960.1, 300 sec: 41432.1). Total num frames: 534888448. Throughput: 0: 41609.3. Samples: 535045820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 01:41:36,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:41:38,095][12883] Updated weights for policy 0, policy_version 32651 (0.0035) +[2024-06-18 01:41:41,326][12883] Updated weights for policy 0, policy_version 32661 (0.0033) +[2024-06-18 01:41:41,994][12645] Fps is (10 sec: 37683.2, 60 sec: 41506.0, 300 sec: 41376.5). Total num frames: 535134208. Throughput: 0: 41246.6. Samples: 535282400. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 01:41:41,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:41:45,783][12883] Updated weights for policy 0, policy_version 32671 (0.0029) +[2024-06-18 01:41:46,994][12645] Fps is (10 sec: 45875.0, 60 sec: 41779.2, 300 sec: 41376.6). Total num frames: 535347200. Throughput: 0: 41574.7. Samples: 535418260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 01:41:46,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 01:41:49,067][12883] Updated weights for policy 0, policy_version 32681 (0.0033) +[2024-06-18 01:41:51,994][12645] Fps is (10 sec: 37683.6, 60 sec: 40961.6, 300 sec: 41321.0). Total num frames: 535511040. Throughput: 0: 41639.0. Samples: 535667180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 01:41:51,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:41:53,705][12883] Updated weights for policy 0, policy_version 32691 (0.0045) +[2024-06-18 01:41:56,864][12883] Updated weights for policy 0, policy_version 32701 (0.0050) +[2024-06-18 01:41:56,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 41432.1). Total num frames: 535773184. Throughput: 0: 41416.8. Samples: 535905800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 01:41:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:42:01,935][12883] Updated weights for policy 0, policy_version 32711 (0.0042) +[2024-06-18 01:42:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 535937024. Throughput: 0: 41659.6. Samples: 536037780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 01:42:01,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:42:04,620][12883] Updated weights for policy 0, policy_version 32721 (0.0039) +[2024-06-18 01:42:06,996][12645] Fps is (10 sec: 37674.9, 60 sec: 41233.0, 300 sec: 41265.1). Total num frames: 536150016. Throughput: 0: 41293.5. Samples: 536279460. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 01:42:06,996][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:42:09,903][12883] Updated weights for policy 0, policy_version 32731 (0.0037) +[2024-06-18 01:42:10,774][12862] Signal inference workers to stop experience collection... (7600 times) +[2024-06-18 01:42:10,812][12883] InferenceWorker_p0-w0: stopping experience collection (7600 times) +[2024-06-18 01:42:10,830][12862] Signal inference workers to resume experience collection... (7600 times) +[2024-06-18 01:42:10,831][12883] InferenceWorker_p0-w0: resuming experience collection (7600 times) +[2024-06-18 01:42:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42325.2, 300 sec: 41432.1). Total num frames: 536395776. Throughput: 0: 41319.5. Samples: 536525260. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-18 01:42:11,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:42:12,387][12883] Updated weights for policy 0, policy_version 32741 (0.0034) +[2024-06-18 01:42:16,994][12645] Fps is (10 sec: 40968.9, 60 sec: 40413.8, 300 sec: 41265.5). Total num frames: 536559616. Throughput: 0: 41287.5. Samples: 536652520. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-18 01:42:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:42:17,720][12883] Updated weights for policy 0, policy_version 32751 (0.0042) +[2024-06-18 01:42:20,841][12883] Updated weights for policy 0, policy_version 32761 (0.0023) +[2024-06-18 01:42:21,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41779.3, 300 sec: 41266.3). Total num frames: 536788992. Throughput: 0: 40973.7. Samples: 536889640. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-18 01:42:21,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:42:25,753][12883] Updated weights for policy 0, policy_version 32771 (0.0026) +[2024-06-18 01:42:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 537001984. 
Throughput: 0: 41433.8. Samples: 537146920. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-18 01:42:26,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 01:42:28,607][12883] Updated weights for policy 0, policy_version 32781 (0.0043) +[2024-06-18 01:42:31,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40140.8, 300 sec: 41209.9). Total num frames: 537165824. Throughput: 0: 40979.1. Samples: 537262320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 01:42:31,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:42:33,499][12883] Updated weights for policy 0, policy_version 32791 (0.0037) +[2024-06-18 01:42:36,506][12883] Updated weights for policy 0, policy_version 32801 (0.0034) +[2024-06-18 01:42:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 41432.1). Total num frames: 537427968. Throughput: 0: 41013.2. Samples: 537512780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 01:42:36,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:42:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000032802_537427968.pth... +[2024-06-18 01:42:37,058][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000032196_527499264.pth +[2024-06-18 01:42:41,346][12883] Updated weights for policy 0, policy_version 32811 (0.0025) +[2024-06-18 01:42:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 537608192. Throughput: 0: 41248.9. Samples: 537762000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 01:42:41,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:42:44,740][12883] Updated weights for policy 0, policy_version 32821 (0.0039) +[2024-06-18 01:42:46,994][12645] Fps is (10 sec: 36045.4, 60 sec: 40687.0, 300 sec: 41265.5). Total num frames: 537788416. Throughput: 0: 40948.1. Samples: 537880440. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 01:42:46,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:42:49,104][12883] Updated weights for policy 0, policy_version 32831 (0.0033) +[2024-06-18 01:42:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 41265.5). Total num frames: 538017792. Throughput: 0: 41103.8. Samples: 538129040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 01:42:51,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:42:52,782][12883] Updated weights for policy 0, policy_version 32841 (0.0040) +[2024-06-18 01:42:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40687.0, 300 sec: 41265.5). Total num frames: 538214400. Throughput: 0: 41220.6. Samples: 538380180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 01:42:56,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:42:57,171][12883] Updated weights for policy 0, policy_version 32851 (0.0027) +[2024-06-18 01:43:01,286][12883] Updated weights for policy 0, policy_version 32861 (0.0040) +[2024-06-18 01:43:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 538411008. Throughput: 0: 40939.2. Samples: 538494780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 01:43:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:43:04,911][12883] Updated weights for policy 0, policy_version 32871 (0.0035) +[2024-06-18 01:43:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41507.7, 300 sec: 41265.5). Total num frames: 538640384. Throughput: 0: 41269.3. Samples: 538746760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 01:43:06,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:43:09,136][12883] Updated weights for policy 0, policy_version 32881 (0.0046) +[2024-06-18 01:43:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40413.9, 300 sec: 41209.9). Total num frames: 538820608. Throughput: 0: 41225.8. Samples: 539002080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 01:43:11,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:43:12,834][12883] Updated weights for policy 0, policy_version 32891 (0.0045) +[2024-06-18 01:43:16,872][12883] Updated weights for policy 0, policy_version 32901 (0.0041) +[2024-06-18 01:43:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41321.3). Total num frames: 539049984. Throughput: 0: 41208.9. Samples: 539116720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 01:43:16,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:43:21,192][12883] Updated weights for policy 0, policy_version 32911 (0.0039) +[2024-06-18 01:43:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 539262976. Throughput: 0: 41125.4. Samples: 539363420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 01:43:21,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:43:25,181][12883] Updated weights for policy 0, policy_version 32921 (0.0030) +[2024-06-18 01:43:26,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40686.8, 300 sec: 41265.4). Total num frames: 539443200. Throughput: 0: 41063.4. Samples: 539609860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 01:43:26,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:43:28,937][12883] Updated weights for policy 0, policy_version 32931 (0.0036) +[2024-06-18 01:43:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 539672576. Throughput: 0: 41151.4. Samples: 539732260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 01:43:31,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:43:33,186][12883] Updated weights for policy 0, policy_version 32941 (0.0039) +[2024-06-18 01:43:34,267][12862] Signal inference workers to stop experience collection... 
(7650 times) +[2024-06-18 01:43:34,268][12862] Signal inference workers to resume experience collection... (7650 times) +[2024-06-18 01:43:34,304][12883] InferenceWorker_p0-w0: stopping experience collection (7650 times) +[2024-06-18 01:43:34,305][12883] InferenceWorker_p0-w0: resuming experience collection (7650 times) +[2024-06-18 01:43:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40413.9, 300 sec: 41098.8). Total num frames: 539852800. Throughput: 0: 41175.6. Samples: 539981940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:43:36,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:43:37,049][12883] Updated weights for policy 0, policy_version 32951 (0.0047) +[2024-06-18 01:43:41,037][12883] Updated weights for policy 0, policy_version 32961 (0.0042) +[2024-06-18 01:43:41,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 540065792. Throughput: 0: 41040.4. Samples: 540227000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:43:41,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:43:45,097][12883] Updated weights for policy 0, policy_version 32971 (0.0043) +[2024-06-18 01:43:46,994][12645] Fps is (10 sec: 44235.8, 60 sec: 41779.0, 300 sec: 41321.0). Total num frames: 540295168. Throughput: 0: 41192.3. Samples: 540348440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:43:46,995][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:43:48,805][12883] Updated weights for policy 0, policy_version 32981 (0.0046) +[2024-06-18 01:43:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 540475392. Throughput: 0: 41196.5. Samples: 540600600. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:43:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:43:52,667][12883] Updated weights for policy 0, policy_version 32991 (0.0038) +[2024-06-18 01:43:56,770][12883] Updated weights for policy 0, policy_version 33001 (0.0033) +[2024-06-18 01:43:56,994][12645] Fps is (10 sec: 39322.3, 60 sec: 41233.0, 300 sec: 41321.3). Total num frames: 540688384. Throughput: 0: 40922.6. Samples: 540843600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 01:43:56,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 01:44:00,935][12883] Updated weights for policy 0, policy_version 33011 (0.0048) +[2024-06-18 01:44:01,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 540884992. Throughput: 0: 41055.0. Samples: 540964200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 01:44:01,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:44:04,729][12883] Updated weights for policy 0, policy_version 33021 (0.0039) +[2024-06-18 01:44:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40686.9, 300 sec: 41154.7). Total num frames: 541081600. Throughput: 0: 41108.7. Samples: 541213320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 01:44:06,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:44:08,623][12883] Updated weights for policy 0, policy_version 33031 (0.0034) +[2024-06-18 01:44:11,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 541294592. Throughput: 0: 41080.2. Samples: 541458460. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 01:44:12,000][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:44:12,719][12883] Updated weights for policy 0, policy_version 33041 (0.0032) +[2024-06-18 01:44:16,279][12883] Updated weights for policy 0, policy_version 33051 (0.0032) +[2024-06-18 01:44:16,994][12645] Fps is (10 sec: 44235.6, 60 sec: 41232.8, 300 sec: 41265.4). Total num frames: 541523968. Throughput: 0: 41185.5. Samples: 541585620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) +[2024-06-18 01:44:16,995][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:44:20,955][12883] Updated weights for policy 0, policy_version 33061 (0.0033) +[2024-06-18 01:44:21,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40413.8, 300 sec: 41210.2). Total num frames: 541687808. Throughput: 0: 41012.8. Samples: 541827520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) +[2024-06-18 01:44:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:44:24,370][12883] Updated weights for policy 0, policy_version 33071 (0.0023) +[2024-06-18 01:44:26,994][12645] Fps is (10 sec: 37684.4, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 541900800. Throughput: 0: 40941.3. Samples: 542069360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) +[2024-06-18 01:44:26,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:44:29,028][12883] Updated weights for policy 0, policy_version 33081 (0.0041) +[2024-06-18 01:44:31,994][12645] Fps is (10 sec: 44237.6, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 542130176. Throughput: 0: 40985.2. Samples: 542192760. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) +[2024-06-18 01:44:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:44:32,169][12883] Updated weights for policy 0, policy_version 33091 (0.0027) +[2024-06-18 01:44:36,839][12883] Updated weights for policy 0, policy_version 33101 (0.0034) +[2024-06-18 01:44:36,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 542326784. Throughput: 0: 41102.3. Samples: 542450200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) +[2024-06-18 01:44:36,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:44:37,059][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000033102_542343168.pth... +[2024-06-18 01:44:37,114][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000032498_532447232.pth +[2024-06-18 01:44:39,973][12883] Updated weights for policy 0, policy_version 33111 (0.0048) +[2024-06-18 01:44:41,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 542523392. Throughput: 0: 40918.2. Samples: 542684920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 01:44:41,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:44:45,127][12883] Updated weights for policy 0, policy_version 33121 (0.0035) +[2024-06-18 01:44:46,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40960.1, 300 sec: 41265.4). Total num frames: 542752768. Throughput: 0: 40977.4. Samples: 542808180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 01:44:46,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:44:48,594][12883] Updated weights for policy 0, policy_version 33131 (0.0034) +[2024-06-18 01:44:51,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40413.8, 300 sec: 41154.4). Total num frames: 542900224. Throughput: 0: 40888.0. Samples: 543053280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 01:44:51,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:44:53,388][12883] Updated weights for policy 0, policy_version 33141 (0.0039) +[2024-06-18 01:44:56,399][12883] Updated weights for policy 0, policy_version 33151 (0.0046) +[2024-06-18 01:44:56,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 543145984. Throughput: 0: 40751.1. Samples: 543292260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 01:44:56,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:45:01,140][12883] Updated weights for policy 0, policy_version 33161 (0.0024) +[2024-06-18 01:45:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40687.1, 300 sec: 41154.4). Total num frames: 543326208. Throughput: 0: 40896.4. Samples: 543425940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 01:45:01,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:45:02,384][12862] Signal inference workers to stop experience collection... (7700 times) +[2024-06-18 01:45:02,433][12862] Signal inference workers to resume experience collection... (7700 times) +[2024-06-18 01:45:02,434][12883] InferenceWorker_p0-w0: stopping experience collection (7700 times) +[2024-06-18 01:45:02,468][12883] InferenceWorker_p0-w0: resuming experience collection (7700 times) +[2024-06-18 01:45:04,086][12883] Updated weights for policy 0, policy_version 33171 (0.0036) +[2024-06-18 01:45:06,994][12645] Fps is (10 sec: 36044.5, 60 sec: 40413.9, 300 sec: 40987.8). Total num frames: 543506432. Throughput: 0: 40976.5. Samples: 543671460. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 01:45:06,995][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:45:08,781][12883] Updated weights for policy 0, policy_version 33181 (0.0035) +[2024-06-18 01:45:11,775][12883] Updated weights for policy 0, policy_version 33191 (0.0037) +[2024-06-18 01:45:11,994][12645] Fps is (10 sec: 47512.8, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 543801344. Throughput: 0: 40922.2. Samples: 543910860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 01:45:11,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:45:16,818][12883] Updated weights for policy 0, policy_version 33201 (0.0051) +[2024-06-18 01:45:16,994][12645] Fps is (10 sec: 45875.0, 60 sec: 40687.1, 300 sec: 41154.4). Total num frames: 543965184. Throughput: 0: 41243.4. Samples: 544048720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 01:45:16,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:45:19,949][12883] Updated weights for policy 0, policy_version 33211 (0.0041) +[2024-06-18 01:45:21,994][12645] Fps is (10 sec: 36044.7, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 544161792. Throughput: 0: 40732.2. Samples: 544283160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 01:45:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:45:24,833][12883] Updated weights for policy 0, policy_version 33221 (0.0026) +[2024-06-18 01:45:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 41265.5). Total num frames: 544407552. Throughput: 0: 41238.4. Samples: 544540640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 01:45:26,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:45:27,529][12883] Updated weights for policy 0, policy_version 33231 (0.0041) +[2024-06-18 01:45:32,000][12645] Fps is (10 sec: 40934.9, 60 sec: 40682.6, 300 sec: 41153.5). Total num frames: 544571392. Throughput: 0: 41292.6. Samples: 544666600. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 01:45:32,000][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:45:33,161][12883] Updated weights for policy 0, policy_version 33241 (0.0037) +[2024-06-18 01:45:35,862][12883] Updated weights for policy 0, policy_version 33251 (0.0041) +[2024-06-18 01:45:36,994][12645] Fps is (10 sec: 39320.8, 60 sec: 41232.9, 300 sec: 41209.9). Total num frames: 544800768. Throughput: 0: 41115.9. Samples: 544903500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 01:45:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:45:40,905][12883] Updated weights for policy 0, policy_version 33261 (0.0044) +[2024-06-18 01:45:41,994][12645] Fps is (10 sec: 44263.9, 60 sec: 41506.1, 300 sec: 41265.4). Total num frames: 545013760. Throughput: 0: 41285.6. Samples: 545150120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 01:45:41,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:45:43,725][12883] Updated weights for policy 0, policy_version 33271 (0.0046) +[2024-06-18 01:45:46,995][12645] Fps is (10 sec: 39318.6, 60 sec: 40686.4, 300 sec: 41154.6). Total num frames: 545193984. Throughput: 0: 41087.1. Samples: 545274900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) +[2024-06-18 01:45:46,995][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:45:48,952][12883] Updated weights for policy 0, policy_version 33281 (0.0040) +[2024-06-18 01:45:51,735][12883] Updated weights for policy 0, policy_version 33291 (0.0028) +[2024-06-18 01:45:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 41376.5). Total num frames: 545439744. Throughput: 0: 41116.1. Samples: 545521680. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) +[2024-06-18 01:45:51,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:45:56,847][12883] Updated weights for policy 0, policy_version 33301 (0.0029) +[2024-06-18 01:45:56,994][12645] Fps is (10 sec: 42602.1, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 545619968. Throughput: 0: 41526.3. Samples: 545779540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) +[2024-06-18 01:45:56,996][12645] Avg episode reward: [(0, '0.029')] +[2024-06-18 01:45:59,051][12862] Signal inference workers to stop experience collection... (7750 times) +[2024-06-18 01:45:59,052][12862] Signal inference workers to resume experience collection... (7750 times) +[2024-06-18 01:45:59,069][12883] InferenceWorker_p0-w0: stopping experience collection (7750 times) +[2024-06-18 01:45:59,103][12883] InferenceWorker_p0-w0: resuming experience collection (7750 times) +[2024-06-18 01:45:59,386][12883] Updated weights for policy 0, policy_version 33311 (0.0040) +[2024-06-18 01:46:01,994][12645] Fps is (10 sec: 36044.5, 60 sec: 41233.0, 300 sec: 41099.2). Total num frames: 545800192. Throughput: 0: 40869.4. Samples: 545887840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) +[2024-06-18 01:46:01,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:46:04,670][12883] Updated weights for policy 0, policy_version 33321 (0.0036) +[2024-06-18 01:46:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41321.0). Total num frames: 546045952. Throughput: 0: 41320.0. Samples: 546142560. Policy #0 lag: (min: 1.0, avg: 7.9, max: 22.0) +[2024-06-18 01:46:06,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:46:07,571][12883] Updated weights for policy 0, policy_version 33331 (0.0032) +[2024-06-18 01:46:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40140.9, 300 sec: 40932.2). Total num frames: 546209792. Throughput: 0: 41330.2. Samples: 546400500. 
Policy #0 lag: (min: 1.0, avg: 7.9, max: 22.0) +[2024-06-18 01:46:11,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:46:12,680][12883] Updated weights for policy 0, policy_version 33341 (0.0042) +[2024-06-18 01:46:15,825][12883] Updated weights for policy 0, policy_version 33351 (0.0043) +[2024-06-18 01:46:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 546455552. Throughput: 0: 41129.2. Samples: 546517160. Policy #0 lag: (min: 1.0, avg: 7.9, max: 22.0) +[2024-06-18 01:46:16,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:46:20,519][12883] Updated weights for policy 0, policy_version 33361 (0.0035) +[2024-06-18 01:46:21,994][12645] Fps is (10 sec: 44236.1, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 546652160. Throughput: 0: 41455.6. Samples: 546769000. Policy #0 lag: (min: 1.0, avg: 7.9, max: 22.0) +[2024-06-18 01:46:21,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:46:24,041][12883] Updated weights for policy 0, policy_version 33371 (0.0035) +[2024-06-18 01:46:26,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 546848768. Throughput: 0: 41592.2. Samples: 547021760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 01:46:26,994][12645] Avg episode reward: [(0, '0.032')] +[2024-06-18 01:46:28,518][12883] Updated weights for policy 0, policy_version 33381 (0.0034) +[2024-06-18 01:46:31,913][12883] Updated weights for policy 0, policy_version 33391 (0.0030) +[2024-06-18 01:46:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41783.4, 300 sec: 41321.0). Total num frames: 547078144. Throughput: 0: 41514.9. Samples: 547143040. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 01:46:31,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 01:46:36,173][12883] Updated weights for policy 0, policy_version 33401 (0.0041) +[2024-06-18 01:46:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 547274752. Throughput: 0: 41629.7. Samples: 547395020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 01:46:36,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 01:46:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000033403_547274752.pth... +[2024-06-18 01:46:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000032802_537427968.pth +[2024-06-18 01:46:40,179][12883] Updated weights for policy 0, policy_version 33411 (0.0044) +[2024-06-18 01:46:41,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 547487744. Throughput: 0: 41344.0. Samples: 547640020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 01:46:41,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:46:43,887][12883] Updated weights for policy 0, policy_version 33421 (0.0034) +[2024-06-18 01:46:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.7, 300 sec: 41321.0). Total num frames: 547700736. Throughput: 0: 41711.5. Samples: 547764860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 01:46:46,995][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 01:46:47,969][12883] Updated weights for policy 0, policy_version 33431 (0.0033) +[2024-06-18 01:46:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 547880960. Throughput: 0: 41570.4. Samples: 548013220. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:46:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:46:52,166][12883] Updated weights for policy 0, policy_version 33441 (0.0037) +[2024-06-18 01:46:55,495][12883] Updated weights for policy 0, policy_version 33451 (0.0045) +[2024-06-18 01:46:56,994][12645] Fps is (10 sec: 37683.9, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 548077568. Throughput: 0: 41353.7. Samples: 548261420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:46:56,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:46:59,864][12883] Updated weights for policy 0, policy_version 33461 (0.0038) +[2024-06-18 01:47:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41210.2). Total num frames: 548306944. Throughput: 0: 41389.8. Samples: 548379700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:47:01,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:47:03,920][12883] Updated weights for policy 0, policy_version 33471 (0.0031) +[2024-06-18 01:47:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 548503552. Throughput: 0: 41301.3. Samples: 548627560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:47:06,995][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:47:07,774][12883] Updated weights for policy 0, policy_version 33481 (0.0037) +[2024-06-18 01:47:11,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 548700160. Throughput: 0: 41040.0. Samples: 548868560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 01:47:11,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:47:12,078][12883] Updated weights for policy 0, policy_version 33491 (0.0045) +[2024-06-18 01:47:15,788][12883] Updated weights for policy 0, policy_version 33501 (0.0032) +[2024-06-18 01:47:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 548929536. Throughput: 0: 41009.5. Samples: 548988460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 01:47:16,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:47:19,962][12883] Updated weights for policy 0, policy_version 33511 (0.0036) +[2024-06-18 01:47:21,994][12645] Fps is (10 sec: 42597.7, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 549126144. Throughput: 0: 41112.9. Samples: 549245100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 01:47:21,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:47:23,866][12883] Updated weights for policy 0, policy_version 33521 (0.0031) +[2024-06-18 01:47:26,387][12862] Signal inference workers to stop experience collection... (7800 times) +[2024-06-18 01:47:26,388][12862] Signal inference workers to resume experience collection... (7800 times) +[2024-06-18 01:47:26,413][12883] InferenceWorker_p0-w0: stopping experience collection (7800 times) +[2024-06-18 01:47:26,413][12883] InferenceWorker_p0-w0: resuming experience collection (7800 times) +[2024-06-18 01:47:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 549339136. Throughput: 0: 41069.0. Samples: 549488120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 01:47:26,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:47:27,859][12883] Updated weights for policy 0, policy_version 33531 (0.0034) +[2024-06-18 01:47:31,843][12883] Updated weights for policy 0, policy_version 33541 (0.0034) +[2024-06-18 01:47:31,996][12645] Fps is (10 sec: 40951.2, 60 sec: 40958.6, 300 sec: 41043.0). Total num frames: 549535744. Throughput: 0: 40977.2. Samples: 549608920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 01:47:31,996][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:47:35,680][12883] Updated weights for policy 0, policy_version 33551 (0.0035) +[2024-06-18 01:47:36,996][12645] Fps is (10 sec: 40950.7, 60 sec: 41231.6, 300 sec: 41154.1). Total num frames: 549748736. Throughput: 0: 41117.9. Samples: 549863620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 01:47:36,997][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:47:39,533][12883] Updated weights for policy 0, policy_version 33561 (0.0034) +[2024-06-18 01:47:41,994][12645] Fps is (10 sec: 40968.9, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 549945344. Throughput: 0: 41204.4. Samples: 550115620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 01:47:41,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:47:43,651][12883] Updated weights for policy 0, policy_version 33571 (0.0033) +[2024-06-18 01:47:46,994][12645] Fps is (10 sec: 42607.0, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 550174720. Throughput: 0: 41260.8. Samples: 550236440. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 01:47:46,995][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:47:47,142][12883] Updated weights for policy 0, policy_version 33581 (0.0042) +[2024-06-18 01:47:51,823][12883] Updated weights for policy 0, policy_version 33591 (0.0030) +[2024-06-18 01:47:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 550371328. Throughput: 0: 41360.9. Samples: 550488800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 01:47:51,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:47:54,870][12883] Updated weights for policy 0, policy_version 33601 (0.0035) +[2024-06-18 01:47:56,994][12645] Fps is (10 sec: 39322.6, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 550567936. Throughput: 0: 41425.3. Samples: 550732700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 01:47:56,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:47:59,591][12883] Updated weights for policy 0, policy_version 33611 (0.0029) +[2024-06-18 01:48:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 550797312. Throughput: 0: 41558.7. Samples: 550858600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 01:48:01,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:48:02,745][12883] Updated weights for policy 0, policy_version 33621 (0.0032) +[2024-06-18 01:48:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 550977536. Throughput: 0: 41395.1. Samples: 551107880. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 01:48:06,999][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 01:48:07,527][12883] Updated weights for policy 0, policy_version 33631 (0.0035) +[2024-06-18 01:48:10,995][12883] Updated weights for policy 0, policy_version 33641 (0.0035) +[2024-06-18 01:48:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 41209.9). Total num frames: 551206912. Throughput: 0: 41309.3. Samples: 551347040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 01:48:11,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:48:15,520][12883] Updated weights for policy 0, policy_version 33651 (0.0040) +[2024-06-18 01:48:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 551387136. Throughput: 0: 41587.8. Samples: 551480280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 01:48:16,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:48:18,713][12883] Updated weights for policy 0, policy_version 33661 (0.0034) +[2024-06-18 01:48:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 551600128. Throughput: 0: 41310.9. Samples: 551722520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 01:48:21,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 01:48:23,468][12883] Updated weights for policy 0, policy_version 33671 (0.0043) +[2024-06-18 01:48:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 551829504. Throughput: 0: 41141.4. Samples: 551966980. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 01:48:26,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:48:26,998][12883] Updated weights for policy 0, policy_version 33681 (0.0032) +[2024-06-18 01:48:31,392][12883] Updated weights for policy 0, policy_version 33691 (0.0029) +[2024-06-18 01:48:31,994][12645] Fps is (10 sec: 40960.7, 60 sec: 41234.6, 300 sec: 41209.9). Total num frames: 552009728. Throughput: 0: 41294.9. Samples: 552094700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 01:48:31,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:48:34,871][12883] Updated weights for policy 0, policy_version 33701 (0.0045) +[2024-06-18 01:48:36,994][12645] Fps is (10 sec: 37682.5, 60 sec: 40961.4, 300 sec: 41154.4). Total num frames: 552206336. Throughput: 0: 41023.5. Samples: 552334860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 01:48:36,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:48:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000033704_552206336.pth... +[2024-06-18 01:48:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000033102_542343168.pth +[2024-06-18 01:48:39,456][12883] Updated weights for policy 0, policy_version 33711 (0.0047) +[2024-06-18 01:48:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 552435712. Throughput: 0: 41128.5. Samples: 552583480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 01:48:41,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:48:43,029][12883] Updated weights for policy 0, policy_version 33721 (0.0032) +[2024-06-18 01:48:46,994][12645] Fps is (10 sec: 42599.1, 60 sec: 40960.2, 300 sec: 41209.9). Total num frames: 552632320. Throughput: 0: 41047.5. Samples: 552705740. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 01:48:46,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:48:47,196][12883] Updated weights for policy 0, policy_version 33731 (0.0034) +[2024-06-18 01:48:51,458][12883] Updated weights for policy 0, policy_version 33741 (0.0040) +[2024-06-18 01:48:51,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 552845312. Throughput: 0: 41005.3. Samples: 552953120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 01:48:51,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:48:55,096][12883] Updated weights for policy 0, policy_version 33751 (0.0039) +[2024-06-18 01:48:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 553041920. Throughput: 0: 41158.3. Samples: 553199160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 01:48:56,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:48:59,725][12883] Updated weights for policy 0, policy_version 33761 (0.0031) +[2024-06-18 01:49:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40959.9, 300 sec: 41265.5). Total num frames: 553254912. Throughput: 0: 41101.8. Samples: 553329860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 01:49:01,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:49:02,898][12883] Updated weights for policy 0, policy_version 33771 (0.0039) +[2024-06-18 01:49:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 553451520. Throughput: 0: 41152.9. Samples: 553574400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 01:49:06,998][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:49:07,371][12883] Updated weights for policy 0, policy_version 33781 (0.0046) +[2024-06-18 01:49:10,320][12862] Signal inference workers to stop experience collection... 
(7850 times) +[2024-06-18 01:49:10,320][12862] Signal inference workers to resume experience collection... (7850 times) +[2024-06-18 01:49:10,356][12883] InferenceWorker_p0-w0: stopping experience collection (7850 times) +[2024-06-18 01:49:10,356][12883] InferenceWorker_p0-w0: resuming experience collection (7850 times) +[2024-06-18 01:49:11,117][12883] Updated weights for policy 0, policy_version 33791 (0.0037) +[2024-06-18 01:49:11,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41233.1, 300 sec: 41210.0). Total num frames: 553680896. Throughput: 0: 41098.2. Samples: 553816400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 01:49:11,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:49:15,260][12883] Updated weights for policy 0, policy_version 33801 (0.0042) +[2024-06-18 01:49:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 553861120. Throughput: 0: 40905.2. Samples: 553935440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 01:49:16,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:49:19,200][12883] Updated weights for policy 0, policy_version 33811 (0.0037) +[2024-06-18 01:49:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 554074112. Throughput: 0: 41116.1. Samples: 554185080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 01:49:21,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:49:23,586][12883] Updated weights for policy 0, policy_version 33821 (0.0031) +[2024-06-18 01:49:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 554287104. Throughput: 0: 41170.6. Samples: 554436160. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 01:49:26,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:49:26,999][12883] Updated weights for policy 0, policy_version 33831 (0.0036) +[2024-06-18 01:49:31,436][12883] Updated weights for policy 0, policy_version 33841 (0.0035) +[2024-06-18 01:49:31,993][12645] Fps is (10 sec: 37683.8, 60 sec: 40687.0, 300 sec: 41098.8). Total num frames: 554450944. Throughput: 0: 41141.0. Samples: 554557080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 01:49:31,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:49:35,048][12883] Updated weights for policy 0, policy_version 33851 (0.0040) +[2024-06-18 01:49:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.3, 300 sec: 41321.0). Total num frames: 554713088. Throughput: 0: 41196.4. Samples: 554806960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 01:49:36,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:49:39,217][12883] Updated weights for policy 0, policy_version 33861 (0.0040) +[2024-06-18 01:49:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 554893312. Throughput: 0: 41396.1. Samples: 555061980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 01:49:41,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:49:43,034][12883] Updated weights for policy 0, policy_version 33871 (0.0035) +[2024-06-18 01:49:46,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 555089920. Throughput: 0: 41077.0. Samples: 555178320. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 01:49:46,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:49:47,058][12883] Updated weights for policy 0, policy_version 33881 (0.0039) +[2024-06-18 01:49:50,759][12883] Updated weights for policy 0, policy_version 33891 (0.0036) +[2024-06-18 01:49:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 555319296. Throughput: 0: 41257.9. Samples: 555431000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 01:49:51,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 01:49:55,064][12883] Updated weights for policy 0, policy_version 33901 (0.0039) +[2024-06-18 01:49:56,994][12645] Fps is (10 sec: 42597.7, 60 sec: 41232.9, 300 sec: 41321.0). Total num frames: 555515904. Throughput: 0: 41449.1. Samples: 555681620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 01:49:56,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:49:58,522][12883] Updated weights for policy 0, policy_version 33911 (0.0027) +[2024-06-18 01:50:01,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 555712512. Throughput: 0: 41431.1. Samples: 555799840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 01:50:01,995][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:50:02,850][12883] Updated weights for policy 0, policy_version 33921 (0.0040) +[2024-06-18 01:50:06,242][12883] Updated weights for policy 0, policy_version 33931 (0.0040) +[2024-06-18 01:50:06,994][12645] Fps is (10 sec: 44237.8, 60 sec: 41779.3, 300 sec: 41209.9). Total num frames: 555958272. Throughput: 0: 41547.2. Samples: 556054700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:50:07,000][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 01:50:10,705][12883] Updated weights for policy 0, policy_version 33941 (0.0026) +[2024-06-18 01:50:11,994][12645] Fps is (10 sec: 42599.1, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 556138496. Throughput: 0: 41404.0. Samples: 556299340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:50:11,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 01:50:14,399][12883] Updated weights for policy 0, policy_version 33951 (0.0047) +[2024-06-18 01:50:16,994][12645] Fps is (10 sec: 37683.2, 60 sec: 41233.2, 300 sec: 41265.5). Total num frames: 556335104. Throughput: 0: 41536.4. Samples: 556426220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:50:16,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 01:50:18,348][12883] Updated weights for policy 0, policy_version 33961 (0.0042) +[2024-06-18 01:50:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 556548096. Throughput: 0: 41609.4. Samples: 556679380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:50:21,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 01:50:22,427][12883] Updated weights for policy 0, policy_version 33971 (0.0037) +[2024-06-18 01:50:26,158][12883] Updated weights for policy 0, policy_version 33981 (0.0034) +[2024-06-18 01:50:26,994][12645] Fps is (10 sec: 45874.6, 60 sec: 41779.1, 300 sec: 41432.9). Total num frames: 556793856. Throughput: 0: 41440.3. Samples: 556926800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:50:26,994][12645] Avg episode reward: [(0, '0.028')] +[2024-06-18 01:50:30,429][12883] Updated weights for policy 0, policy_version 33991 (0.0048) +[2024-06-18 01:50:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41209.9). Total num frames: 556957696. Throughput: 0: 41735.5. Samples: 557056420. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:50:31,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:50:33,795][12883] Updated weights for policy 0, policy_version 34001 (0.0044) +[2024-06-18 01:50:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 557187072. Throughput: 0: 41446.5. Samples: 557296100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:50:36,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:50:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034008_557187072.pth... +[2024-06-18 01:50:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000033403_547274752.pth +[2024-06-18 01:50:38,607][12883] Updated weights for policy 0, policy_version 34011 (0.0036) +[2024-06-18 01:50:41,084][12862] Signal inference workers to stop experience collection... (7900 times) +[2024-06-18 01:50:41,084][12862] Signal inference workers to resume experience collection... (7900 times) +[2024-06-18 01:50:41,111][12883] InferenceWorker_p0-w0: stopping experience collection (7900 times) +[2024-06-18 01:50:41,111][12883] InferenceWorker_p0-w0: resuming experience collection (7900 times) +[2024-06-18 01:50:41,583][12883] Updated weights for policy 0, policy_version 34021 (0.0034) +[2024-06-18 01:50:41,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42052.1, 300 sec: 41432.2). Total num frames: 557416448. Throughput: 0: 41586.7. Samples: 557553020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:50:41,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:50:46,634][12883] Updated weights for policy 0, policy_version 34031 (0.0043) +[2024-06-18 01:50:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 557580288. Throughput: 0: 41865.9. Samples: 557683800. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 01:50:46,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:50:49,176][12883] Updated weights for policy 0, policy_version 34041 (0.0038) +[2024-06-18 01:50:51,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41376.6). Total num frames: 557826048. Throughput: 0: 41567.5. Samples: 557925240. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) +[2024-06-18 01:50:51,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:50:54,613][12883] Updated weights for policy 0, policy_version 34051 (0.0042) +[2024-06-18 01:50:56,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42052.4, 300 sec: 41487.6). Total num frames: 558039040. Throughput: 0: 41653.7. Samples: 558173760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) +[2024-06-18 01:50:56,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 01:50:57,013][12883] Updated weights for policy 0, policy_version 34061 (0.0051) +[2024-06-18 01:51:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41506.3, 300 sec: 41210.0). Total num frames: 558202880. Throughput: 0: 41534.7. Samples: 558295280. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) +[2024-06-18 01:51:01,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:51:02,384][12883] Updated weights for policy 0, policy_version 34071 (0.0042) +[2024-06-18 01:51:04,983][12883] Updated weights for policy 0, policy_version 34081 (0.0040) +[2024-06-18 01:51:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.1, 300 sec: 41543.1). Total num frames: 558465024. Throughput: 0: 41275.1. Samples: 558536760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) +[2024-06-18 01:51:06,994][12645] Avg episode reward: [(0, '0.029')] +[2024-06-18 01:51:10,438][12883] Updated weights for policy 0, policy_version 34091 (0.0041) +[2024-06-18 01:51:11,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 558612480. Throughput: 0: 41679.1. Samples: 558802360. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) +[2024-06-18 01:51:11,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 01:51:13,085][12883] Updated weights for policy 0, policy_version 34101 (0.0044) +[2024-06-18 01:51:16,994][12645] Fps is (10 sec: 36044.9, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 558825472. Throughput: 0: 41189.8. Samples: 558909960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) +[2024-06-18 01:51:16,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 01:51:18,146][12883] Updated weights for policy 0, policy_version 34111 (0.0031) +[2024-06-18 01:51:20,860][12883] Updated weights for policy 0, policy_version 34121 (0.0045) +[2024-06-18 01:51:21,994][12645] Fps is (10 sec: 47514.2, 60 sec: 42325.4, 300 sec: 41487.6). Total num frames: 559087616. Throughput: 0: 41523.3. Samples: 559164640. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) +[2024-06-18 01:51:21,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:51:25,991][12883] Updated weights for policy 0, policy_version 34131 (0.0036) +[2024-06-18 01:51:27,000][12645] Fps is (10 sec: 39297.0, 60 sec: 40409.7, 300 sec: 41153.5). Total num frames: 559218688. Throughput: 0: 41592.6. Samples: 559424940. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) +[2024-06-18 01:51:27,000][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:51:29,105][12883] Updated weights for policy 0, policy_version 34141 (0.0040) +[2024-06-18 01:51:31,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 559464448. Throughput: 0: 41204.0. Samples: 559537980. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) +[2024-06-18 01:51:31,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:51:33,910][12883] Updated weights for policy 0, policy_version 34151 (0.0033) +[2024-06-18 01:51:36,994][12645] Fps is (10 sec: 45903.9, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 559677440. Throughput: 0: 41358.6. Samples: 559786380. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-18 01:51:36,994][12645] Avg episode reward: [(0, '0.039')] +[2024-06-18 01:51:37,410][12883] Updated weights for policy 0, policy_version 34161 (0.0030) +[2024-06-18 01:51:41,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40414.0, 300 sec: 41154.4). Total num frames: 559841280. Throughput: 0: 41347.6. Samples: 560034400. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-18 01:51:41,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:51:42,073][12883] Updated weights for policy 0, policy_version 34171 (0.0034) +[2024-06-18 01:51:44,157][12862] Signal inference workers to stop experience collection... (7950 times) +[2024-06-18 01:51:44,157][12862] Signal inference workers to resume experience collection... (7950 times) +[2024-06-18 01:51:44,188][12883] InferenceWorker_p0-w0: stopping experience collection (7950 times) +[2024-06-18 01:51:44,188][12883] InferenceWorker_p0-w0: resuming experience collection (7950 times) +[2024-06-18 01:51:45,245][12883] Updated weights for policy 0, policy_version 34181 (0.0043) +[2024-06-18 01:51:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41432.1). Total num frames: 560103424. Throughput: 0: 41310.1. Samples: 560154240. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-18 01:51:46,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 01:51:49,851][12883] Updated weights for policy 0, policy_version 34191 (0.0038) +[2024-06-18 01:51:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 560300032. Throughput: 0: 41700.5. Samples: 560413280. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-18 01:51:51,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 01:51:52,915][12883] Updated weights for policy 0, policy_version 34201 (0.0027) +[2024-06-18 01:51:56,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40687.0, 300 sec: 41265.5). Total num frames: 560480256. Throughput: 0: 41338.8. 
Samples: 560662600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 01:51:56,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:51:57,797][12883] Updated weights for policy 0, policy_version 34211 (0.0032) +[2024-06-18 01:52:00,598][12883] Updated weights for policy 0, policy_version 34221 (0.0042) +[2024-06-18 01:52:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.1, 300 sec: 41432.1). Total num frames: 560726016. Throughput: 0: 41510.6. Samples: 560777940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 01:52:01,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:52:05,594][12883] Updated weights for policy 0, policy_version 34231 (0.0042) +[2024-06-18 01:52:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40413.9, 300 sec: 41321.0). Total num frames: 560889856. Throughput: 0: 41514.2. Samples: 561032780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 01:52:06,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:52:08,427][12883] Updated weights for policy 0, policy_version 34241 (0.0035) +[2024-06-18 01:52:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 561119232. Throughput: 0: 41211.5. Samples: 561279200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 01:52:11,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 01:52:13,572][12883] Updated weights for policy 0, policy_version 34251 (0.0047) +[2024-06-18 01:52:16,295][12883] Updated weights for policy 0, policy_version 34261 (0.0043) +[2024-06-18 01:52:16,996][12645] Fps is (10 sec: 44226.4, 60 sec: 41777.6, 300 sec: 41376.2). Total num frames: 561332224. Throughput: 0: 41421.4. Samples: 561402040. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 01:52:16,997][12645] Avg episode reward: [(0, '0.037')] +[2024-06-18 01:52:21,591][12883] Updated weights for policy 0, policy_version 34271 (0.0037) +[2024-06-18 01:52:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40140.8, 300 sec: 41209.9). Total num frames: 561496064. Throughput: 0: 41470.7. Samples: 561652560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 01:52:21,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 01:52:24,257][12883] Updated weights for policy 0, policy_version 34281 (0.0041) +[2024-06-18 01:52:27,000][12645] Fps is (10 sec: 40944.0, 60 sec: 42052.3, 300 sec: 41376.0). Total num frames: 561741824. Throughput: 0: 41363.2. Samples: 561896000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 01:52:27,000][12645] Avg episode reward: [(0, '0.040')] +[2024-06-18 01:52:29,283][12883] Updated weights for policy 0, policy_version 34291 (0.0034) +[2024-06-18 01:52:31,994][12645] Fps is (10 sec: 47513.0, 60 sec: 41779.1, 300 sec: 41432.4). Total num frames: 561971200. Throughput: 0: 41559.5. Samples: 562024420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 01:52:31,995][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:52:32,173][12883] Updated weights for policy 0, policy_version 34301 (0.0049) +[2024-06-18 01:52:36,994][12645] Fps is (10 sec: 39346.0, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 562135040. Throughput: 0: 41265.3. Samples: 562270220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 01:52:36,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:52:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034310_562135040.pth... 
+[2024-06-18 01:52:37,085][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000033704_552206336.pth +[2024-06-18 01:52:37,237][12883] Updated weights for policy 0, policy_version 34311 (0.0030) +[2024-06-18 01:52:40,275][12883] Updated weights for policy 0, policy_version 34321 (0.0026) +[2024-06-18 01:52:41,996][12645] Fps is (10 sec: 40951.1, 60 sec: 42323.7, 300 sec: 41376.3). Total num frames: 562380800. Throughput: 0: 41188.5. Samples: 562516180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 01:52:41,997][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 01:52:45,153][12883] Updated weights for policy 0, policy_version 34331 (0.0039) +[2024-06-18 01:52:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40959.9, 300 sec: 41321.0). Total num frames: 562561024. Throughput: 0: 41370.2. Samples: 562639600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 01:52:46,994][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 01:52:48,542][12883] Updated weights for policy 0, policy_version 34341 (0.0047) +[2024-06-18 01:52:51,994][12645] Fps is (10 sec: 37691.6, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 562757632. Throughput: 0: 41123.9. Samples: 562883360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 01:52:51,999][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 01:52:53,154][12883] Updated weights for policy 0, policy_version 34351 (0.0042) +[2024-06-18 01:52:56,551][12883] Updated weights for policy 0, policy_version 34361 (0.0033) +[2024-06-18 01:52:56,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42052.3, 300 sec: 41376.5). Total num frames: 563003392. Throughput: 0: 41037.0. Samples: 563125860. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 01:52:56,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:53:00,836][12883] Updated weights for policy 0, policy_version 34371 (0.0045) +[2024-06-18 01:53:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40960.1, 300 sec: 41376.6). Total num frames: 563183616. Throughput: 0: 41234.6. Samples: 563257500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 01:53:01,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:53:04,517][12883] Updated weights for policy 0, policy_version 34381 (0.0026) +[2024-06-18 01:53:06,994][12645] Fps is (10 sec: 36044.5, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 563363840. Throughput: 0: 41151.0. Samples: 563504360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:53:06,994][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 01:53:08,397][12883] Updated weights for policy 0, policy_version 34391 (0.0026) +[2024-06-18 01:53:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 563593216. Throughput: 0: 41136.3. Samples: 563746880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:53:11,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:53:12,205][12862] Signal inference workers to stop experience collection... (8000 times) +[2024-06-18 01:53:12,211][12862] Signal inference workers to resume experience collection... (8000 times) +[2024-06-18 01:53:12,228][12883] InferenceWorker_p0-w0: stopping experience collection (8000 times) +[2024-06-18 01:53:12,229][12883] InferenceWorker_p0-w0: resuming experience collection (8000 times) +[2024-06-18 01:53:12,359][12883] Updated weights for policy 0, policy_version 34401 (0.0045) +[2024-06-18 01:53:16,518][12883] Updated weights for policy 0, policy_version 34411 (0.0037) +[2024-06-18 01:53:16,996][12645] Fps is (10 sec: 44226.8, 60 sec: 41233.1, 300 sec: 41376.2). Total num frames: 563806208. 
Throughput: 0: 41151.3. Samples: 563876320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:53:16,997][12645] Avg episode reward: [(0, '0.032')] +[2024-06-18 01:53:20,682][12883] Updated weights for policy 0, policy_version 34421 (0.0034) +[2024-06-18 01:53:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 563986432. Throughput: 0: 41071.5. Samples: 564118440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 01:53:21,996][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:53:24,744][12883] Updated weights for policy 0, policy_version 34431 (0.0031) +[2024-06-18 01:53:26,994][12645] Fps is (10 sec: 40969.5, 60 sec: 41237.3, 300 sec: 41376.5). Total num frames: 564215808. Throughput: 0: 40987.0. Samples: 564360500. Policy #0 lag: (min: 2.0, avg: 12.2, max: 22.0) +[2024-06-18 01:53:26,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 01:53:28,568][12883] Updated weights for policy 0, policy_version 34441 (0.0046) +[2024-06-18 01:53:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40687.0, 300 sec: 41376.6). Total num frames: 564412416. Throughput: 0: 41058.3. Samples: 564487220. Policy #0 lag: (min: 2.0, avg: 12.2, max: 22.0) +[2024-06-18 01:53:31,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:53:32,562][12883] Updated weights for policy 0, policy_version 34451 (0.0041) +[2024-06-18 01:53:36,532][12883] Updated weights for policy 0, policy_version 34461 (0.0047) +[2024-06-18 01:53:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 41265.4). Total num frames: 564609024. Throughput: 0: 41085.3. Samples: 564732200. Policy #0 lag: (min: 2.0, avg: 12.2, max: 22.0) +[2024-06-18 01:53:36,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 01:53:40,510][12883] Updated weights for policy 0, policy_version 34471 (0.0033) +[2024-06-18 01:53:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40961.6, 300 sec: 41376.5). Total num frames: 564838400. 
Throughput: 0: 40962.2. Samples: 564969160. Policy #0 lag: (min: 2.0, avg: 12.2, max: 22.0) +[2024-06-18 01:53:41,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 01:53:44,946][12883] Updated weights for policy 0, policy_version 34481 (0.0031) +[2024-06-18 01:53:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 565002240. Throughput: 0: 40927.5. Samples: 565099240. Policy #0 lag: (min: 2.0, avg: 12.2, max: 22.0) +[2024-06-18 01:53:46,994][12645] Avg episode reward: [(0, '0.047')] +[2024-06-18 01:53:48,398][12883] Updated weights for policy 0, policy_version 34491 (0.0037) +[2024-06-18 01:53:51,996][12645] Fps is (10 sec: 37675.9, 60 sec: 40958.7, 300 sec: 41265.2). Total num frames: 565215232. Throughput: 0: 40868.1. Samples: 565343500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 01:53:51,996][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:53:53,195][12883] Updated weights for policy 0, policy_version 34501 (0.0031) +[2024-06-18 01:53:56,762][12883] Updated weights for policy 0, policy_version 34511 (0.0028) +[2024-06-18 01:53:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 40687.0, 300 sec: 41321.0). Total num frames: 565444608. Throughput: 0: 40883.6. Samples: 565586640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 01:53:56,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:54:01,531][12883] Updated weights for policy 0, policy_version 34521 (0.0043) +[2024-06-18 01:54:01,994][12645] Fps is (10 sec: 39329.2, 60 sec: 40413.8, 300 sec: 41209.9). Total num frames: 565608448. Throughput: 0: 40741.2. Samples: 565709580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 01:54:01,994][12645] Avg episode reward: [(0, '0.043')] +[2024-06-18 01:54:04,556][12883] Updated weights for policy 0, policy_version 34531 (0.0031) +[2024-06-18 01:54:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 565837824. 
Throughput: 0: 40560.1. Samples: 565943640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 01:54:06,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:54:09,691][12883] Updated weights for policy 0, policy_version 34541 (0.0035) +[2024-06-18 01:54:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40140.8, 300 sec: 41154.4). Total num frames: 566001664. Throughput: 0: 40786.7. Samples: 566195900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 01:54:11,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:54:12,619][12883] Updated weights for policy 0, policy_version 34551 (0.0044) +[2024-06-18 01:54:16,994][12645] Fps is (10 sec: 37683.0, 60 sec: 40142.3, 300 sec: 41154.4). Total num frames: 566214656. Throughput: 0: 40456.4. Samples: 566307760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 01:54:16,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:54:17,641][12883] Updated weights for policy 0, policy_version 34561 (0.0040) +[2024-06-18 01:54:20,661][12883] Updated weights for policy 0, policy_version 34571 (0.0029) +[2024-06-18 01:54:21,994][12645] Fps is (10 sec: 45874.5, 60 sec: 41233.0, 300 sec: 41265.4). Total num frames: 566460416. Throughput: 0: 40527.5. Samples: 566555940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 01:54:21,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:54:25,929][12883] Updated weights for policy 0, policy_version 34581 (0.0041) +[2024-06-18 01:54:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40140.7, 300 sec: 41265.4). Total num frames: 566624256. Throughput: 0: 40903.0. Samples: 566809800. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 01:54:26,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:54:28,705][12883] Updated weights for policy 0, policy_version 34591 (0.0036) +[2024-06-18 01:54:31,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40413.9, 300 sec: 41098.9). Total num frames: 566837248. 
Throughput: 0: 40448.9. Samples: 566919440. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 01:54:31,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 01:54:33,799][12883] Updated weights for policy 0, policy_version 34601 (0.0032) +[2024-06-18 01:54:35,577][12862] Signal inference workers to stop experience collection... (8050 times) +[2024-06-18 01:54:35,616][12883] InferenceWorker_p0-w0: stopping experience collection (8050 times) +[2024-06-18 01:54:35,637][12862] Signal inference workers to resume experience collection... (8050 times) +[2024-06-18 01:54:35,639][12883] InferenceWorker_p0-w0: resuming experience collection (8050 times) +[2024-06-18 01:54:36,841][12883] Updated weights for policy 0, policy_version 34611 (0.0032) +[2024-06-18 01:54:36,994][12645] Fps is (10 sec: 44237.1, 60 sec: 40960.0, 300 sec: 41265.4). Total num frames: 567066624. Throughput: 0: 40795.4. Samples: 567179220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 01:54:36,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 01:54:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034611_567066624.pth... +[2024-06-18 01:54:37,080][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034008_557187072.pth +[2024-06-18 01:54:41,571][12883] Updated weights for policy 0, policy_version 34621 (0.0037) +[2024-06-18 01:54:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39867.7, 300 sec: 41154.4). Total num frames: 567230464. Throughput: 0: 40934.6. Samples: 567428700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 01:54:41,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:54:44,826][12883] Updated weights for policy 0, policy_version 34631 (0.0036) +[2024-06-18 01:54:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 567459840. Throughput: 0: 40752.0. Samples: 567543420. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 01:54:46,994][12645] Avg episode reward: [(0, '0.028')] +[2024-06-18 01:54:49,422][12883] Updated weights for policy 0, policy_version 34641 (0.0038) +[2024-06-18 01:54:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 40961.3, 300 sec: 41209.9). Total num frames: 567672832. Throughput: 0: 41014.1. Samples: 567789280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 01:54:51,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 01:54:52,922][12883] Updated weights for policy 0, policy_version 34651 (0.0047) +[2024-06-18 01:54:56,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39867.7, 300 sec: 41098.9). Total num frames: 567836672. Throughput: 0: 41045.8. Samples: 568042960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:54:56,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:54:57,310][12883] Updated weights for policy 0, policy_version 34661 (0.0037) +[2024-06-18 01:55:00,954][12883] Updated weights for policy 0, policy_version 34671 (0.0039) +[2024-06-18 01:55:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 568098816. Throughput: 0: 41019.5. Samples: 568153640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:55:01,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:55:05,360][12883] Updated weights for policy 0, policy_version 34681 (0.0047) +[2024-06-18 01:55:06,994][12645] Fps is (10 sec: 45875.1, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 568295424. Throughput: 0: 41123.7. Samples: 568406500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:55:06,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:55:09,295][12883] Updated weights for policy 0, policy_version 34691 (0.0033) +[2024-06-18 01:55:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 568492032. Throughput: 0: 40757.9. Samples: 568643900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:55:11,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:55:13,395][12883] Updated weights for policy 0, policy_version 34701 (0.0053) +[2024-06-18 01:55:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 568688640. Throughput: 0: 41188.0. Samples: 568772900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 01:55:16,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:55:17,273][12883] Updated weights for policy 0, policy_version 34711 (0.0042) +[2024-06-18 01:55:21,160][12883] Updated weights for policy 0, policy_version 34721 (0.0036) +[2024-06-18 01:55:21,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40140.8, 300 sec: 40932.2). Total num frames: 568868864. Throughput: 0: 40727.6. Samples: 569011960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:55:21,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:55:25,369][12883] Updated weights for policy 0, policy_version 34731 (0.0037) +[2024-06-18 01:55:27,000][12645] Fps is (10 sec: 40934.5, 60 sec: 41228.9, 300 sec: 41153.5). Total num frames: 569098240. Throughput: 0: 40644.7. Samples: 569257960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:55:27,000][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 01:55:29,572][12883] Updated weights for policy 0, policy_version 34741 (0.0035) +[2024-06-18 01:55:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 569327616. Throughput: 0: 40813.7. Samples: 569380040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:55:31,994][12645] Avg episode reward: [(0, '0.029')] +[2024-06-18 01:55:33,559][12883] Updated weights for policy 0, policy_version 34751 (0.0046) +[2024-06-18 01:55:36,994][12645] Fps is (10 sec: 40985.2, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 569507840. Throughput: 0: 40736.9. Samples: 569622440. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:55:36,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:55:37,759][12883] Updated weights for policy 0, policy_version 34761 (0.0036) +[2024-06-18 01:55:41,561][12883] Updated weights for policy 0, policy_version 34771 (0.0052) +[2024-06-18 01:55:41,994][12645] Fps is (10 sec: 36045.3, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 569688064. Throughput: 0: 40474.3. Samples: 569864300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 01:55:41,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:55:45,651][12883] Updated weights for policy 0, policy_version 34781 (0.0037) +[2024-06-18 01:55:46,996][12645] Fps is (10 sec: 40951.0, 60 sec: 40958.5, 300 sec: 40987.5). Total num frames: 569917440. Throughput: 0: 40608.3. Samples: 569981100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 01:55:46,996][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:55:49,702][12883] Updated weights for policy 0, policy_version 34791 (0.0036) +[2024-06-18 01:55:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40414.0, 300 sec: 40876.7). Total num frames: 570097664. Throughput: 0: 40552.1. Samples: 570231340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 01:55:51,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:55:53,404][12883] Updated weights for policy 0, policy_version 34801 (0.0030) +[2024-06-18 01:55:56,994][12645] Fps is (10 sec: 40968.6, 60 sec: 41506.0, 300 sec: 41098.8). Total num frames: 570327040. Throughput: 0: 40590.2. Samples: 570470460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 01:55:56,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 01:55:57,367][12883] Updated weights for policy 0, policy_version 34811 (0.0030) +[2024-06-18 01:55:59,269][12862] Signal inference workers to stop experience collection... 
(8100 times) +[2024-06-18 01:55:59,270][12862] Signal inference workers to resume experience collection... (8100 times) +[2024-06-18 01:55:59,310][12883] InferenceWorker_p0-w0: stopping experience collection (8100 times) +[2024-06-18 01:55:59,310][12883] InferenceWorker_p0-w0: resuming experience collection (8100 times) +[2024-06-18 01:56:01,271][12883] Updated weights for policy 0, policy_version 34821 (0.0031) +[2024-06-18 01:56:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40140.9, 300 sec: 40821.2). Total num frames: 570507264. Throughput: 0: 40434.7. Samples: 570592460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 01:56:01,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:56:05,398][12883] Updated weights for policy 0, policy_version 34831 (0.0048) +[2024-06-18 01:56:06,994][12645] Fps is (10 sec: 39322.5, 60 sec: 40413.9, 300 sec: 41043.3). Total num frames: 570720256. Throughput: 0: 40512.1. Samples: 570835000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 01:56:06,994][12645] Avg episode reward: [(0, '0.002')] +[2024-06-18 01:56:09,748][12883] Updated weights for policy 0, policy_version 34841 (0.0039) +[2024-06-18 01:56:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40140.9, 300 sec: 40932.2). Total num frames: 570900480. Throughput: 0: 40660.4. Samples: 571087420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 01:56:11,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:56:13,030][12883] Updated weights for policy 0, policy_version 34851 (0.0038) +[2024-06-18 01:56:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 571146240. Throughput: 0: 40722.2. Samples: 571212540. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 01:56:16,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 01:56:17,723][12883] Updated weights for policy 0, policy_version 34861 (0.0040) +[2024-06-18 01:56:21,575][12883] Updated weights for policy 0, policy_version 34871 (0.0031) +[2024-06-18 01:56:21,994][12645] Fps is (10 sec: 44236.1, 60 sec: 41233.1, 300 sec: 41099.7). Total num frames: 571342848. Throughput: 0: 40771.1. Samples: 571457140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 01:56:21,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:56:25,484][12883] Updated weights for policy 0, policy_version 34881 (0.0035) +[2024-06-18 01:56:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40691.1, 300 sec: 40932.2). Total num frames: 571539456. Throughput: 0: 41092.7. Samples: 571713480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 01:56:27,003][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 01:56:29,542][12883] Updated weights for policy 0, policy_version 34891 (0.0040) +[2024-06-18 01:56:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 571768832. Throughput: 0: 41215.3. Samples: 571835700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 01:56:31,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:56:33,158][12883] Updated weights for policy 0, policy_version 34901 (0.0037) +[2024-06-18 01:56:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 571949056. Throughput: 0: 41034.1. Samples: 572077880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 01:56:36,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:56:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034909_571949056.pth... 
+[2024-06-18 01:56:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034310_562135040.pth +[2024-06-18 01:56:37,370][12883] Updated weights for policy 0, policy_version 34911 (0.0037) +[2024-06-18 01:56:41,111][12883] Updated weights for policy 0, policy_version 34921 (0.0040) +[2024-06-18 01:56:41,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41233.0, 300 sec: 40876.7). Total num frames: 572162048. Throughput: 0: 41194.7. Samples: 572324220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 01:56:41,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:56:45,321][12883] Updated weights for policy 0, policy_version 34931 (0.0033) +[2024-06-18 01:56:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40961.5, 300 sec: 40932.2). Total num frames: 572375040. Throughput: 0: 41293.2. Samples: 572450660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 01:56:46,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 01:56:48,974][12883] Updated weights for policy 0, policy_version 34941 (0.0024) +[2024-06-18 01:56:51,996][12645] Fps is (10 sec: 42589.2, 60 sec: 41504.5, 300 sec: 41043.0). Total num frames: 572588032. Throughput: 0: 41334.8. Samples: 572695160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 01:56:51,996][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:56:53,351][12883] Updated weights for policy 0, policy_version 34951 (0.0035) +[2024-06-18 01:56:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40960.1, 300 sec: 40876.7). Total num frames: 572784640. Throughput: 0: 40988.8. Samples: 572931920. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 01:56:56,994][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 01:56:57,571][12883] Updated weights for policy 0, policy_version 34961 (0.0057) +[2024-06-18 01:57:01,566][12883] Updated weights for policy 0, policy_version 34971 (0.0046) +[2024-06-18 01:57:01,994][12645] Fps is (10 sec: 37692.0, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 572964864. Throughput: 0: 41095.7. Samples: 573061840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 01:57:01,994][12645] Avg episode reward: [(0, '0.028')] +[2024-06-18 01:57:05,596][12883] Updated weights for policy 0, policy_version 34981 (0.0035) +[2024-06-18 01:57:06,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41233.1, 300 sec: 40932.3). Total num frames: 573194240. Throughput: 0: 41025.5. Samples: 573303280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 01:57:06,994][12645] Avg episode reward: [(0, '0.054')] +[2024-06-18 01:57:07,081][12862] Saving new best policy, reward=0.054! +[2024-06-18 01:57:09,297][12883] Updated weights for policy 0, policy_version 34991 (0.0044) +[2024-06-18 01:57:11,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41779.1, 300 sec: 40932.5). Total num frames: 573407232. Throughput: 0: 40681.0. Samples: 573544120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 01:57:11,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 01:57:13,458][12883] Updated weights for policy 0, policy_version 35001 (0.0040) +[2024-06-18 01:57:16,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40414.0, 300 sec: 40932.2). Total num frames: 573571072. Throughput: 0: 40611.7. Samples: 573663220. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 01:57:16,994][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 01:57:18,144][12883] Updated weights for policy 0, policy_version 35011 (0.0037) +[2024-06-18 01:57:21,127][12883] Updated weights for policy 0, policy_version 35021 (0.0025) +[2024-06-18 01:57:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 40933.1). Total num frames: 573816832. Throughput: 0: 40805.3. Samples: 573914120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 01:57:21,994][12645] Avg episode reward: [(0, '0.027')] +[2024-06-18 01:57:26,035][12883] Updated weights for policy 0, policy_version 35031 (0.0047) +[2024-06-18 01:57:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41233.2, 300 sec: 40821.2). Total num frames: 574013440. Throughput: 0: 40741.0. Samples: 574157560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 01:57:26,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 01:57:28,523][12862] Signal inference workers to stop experience collection... (8150 times) +[2024-06-18 01:57:28,524][12862] Signal inference workers to resume experience collection... (8150 times) +[2024-06-18 01:57:28,563][12883] InferenceWorker_p0-w0: stopping experience collection (8150 times) +[2024-06-18 01:57:28,563][12883] InferenceWorker_p0-w0: resuming experience collection (8150 times) +[2024-06-18 01:57:29,367][12883] Updated weights for policy 0, policy_version 35041 (0.0033) +[2024-06-18 01:57:31,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40414.0, 300 sec: 40876.7). Total num frames: 574193664. Throughput: 0: 40611.6. Samples: 574278180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 01:57:31,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 01:57:33,867][12883] Updated weights for policy 0, policy_version 35051 (0.0036) +[2024-06-18 01:57:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 40821.5). Total num frames: 574423040. 
Throughput: 0: 40675.3. Samples: 574525460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:57:36,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 01:57:37,365][12883] Updated weights for policy 0, policy_version 35061 (0.0030) +[2024-06-18 01:57:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 574603264. Throughput: 0: 40976.9. Samples: 574775880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:57:41,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 01:57:41,996][12883] Updated weights for policy 0, policy_version 35071 (0.0038) +[2024-06-18 01:57:45,249][12883] Updated weights for policy 0, policy_version 35081 (0.0029) +[2024-06-18 01:57:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 574832640. Throughput: 0: 40765.6. Samples: 574896300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:57:46,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 01:57:49,820][12883] Updated weights for policy 0, policy_version 35091 (0.0037) +[2024-06-18 01:57:51,994][12645] Fps is (10 sec: 44236.1, 60 sec: 40961.4, 300 sec: 40821.1). Total num frames: 575045632. Throughput: 0: 40987.3. Samples: 575147720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 01:57:51,994][12645] Avg episode reward: [(0, '0.004')] +[2024-06-18 01:57:53,125][12883] Updated weights for policy 0, policy_version 35101 (0.0034) +[2024-06-18 01:57:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 40821.1). Total num frames: 575225856. Throughput: 0: 41099.6. Samples: 575393600. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 01:57:56,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 01:57:58,013][12883] Updated weights for policy 0, policy_version 35111 (0.0036) +[2024-06-18 01:58:01,180][12883] Updated weights for policy 0, policy_version 35121 (0.0032) +[2024-06-18 01:58:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41506.1, 300 sec: 40987.8). Total num frames: 575455232. Throughput: 0: 41100.8. Samples: 575512760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 01:58:01,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 01:58:05,765][12883] Updated weights for policy 0, policy_version 35131 (0.0039) +[2024-06-18 01:58:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 575651840. Throughput: 0: 41088.0. Samples: 575763080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 01:58:06,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:58:08,906][12883] Updated weights for policy 0, policy_version 35141 (0.0034) +[2024-06-18 01:58:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 40877.0). Total num frames: 575864832. Throughput: 0: 41175.9. Samples: 576010480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 01:58:11,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:58:13,375][12883] Updated weights for policy 0, policy_version 35151 (0.0032) +[2024-06-18 01:58:16,776][12883] Updated weights for policy 0, policy_version 35161 (0.0035) +[2024-06-18 01:58:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 40987.8). Total num frames: 576077824. Throughput: 0: 41409.3. Samples: 576141600. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 01:58:16,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 01:58:21,402][12883] Updated weights for policy 0, policy_version 35171 (0.0042) +[2024-06-18 01:58:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40686.9, 300 sec: 40821.1). Total num frames: 576258048. Throughput: 0: 41360.0. Samples: 576386660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:58:21,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 01:58:25,081][12883] Updated weights for policy 0, policy_version 35181 (0.0034) +[2024-06-18 01:58:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 576487424. Throughput: 0: 41293.7. Samples: 576634100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:58:26,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 01:58:29,124][12883] Updated weights for policy 0, policy_version 35191 (0.0043) +[2024-06-18 01:58:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 40932.2). Total num frames: 576684032. Throughput: 0: 41369.8. Samples: 576757940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:58:31,994][12645] Avg episode reward: [(0, '0.041')] +[2024-06-18 01:58:32,898][12883] Updated weights for policy 0, policy_version 35201 (0.0048) +[2024-06-18 01:58:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 40821.1). Total num frames: 576880640. Throughput: 0: 41269.9. Samples: 577004860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 01:58:36,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:58:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000035210_576880640.pth... 
+[2024-06-18 01:58:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034611_567066624.pth +[2024-06-18 01:58:37,374][12883] Updated weights for policy 0, policy_version 35211 (0.0039) +[2024-06-18 01:58:40,855][12883] Updated weights for policy 0, policy_version 35221 (0.0038) +[2024-06-18 01:58:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 40987.8). Total num frames: 577093632. Throughput: 0: 41220.4. Samples: 577248520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 01:58:41,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 01:58:45,353][12883] Updated weights for policy 0, policy_version 35231 (0.0047) +[2024-06-18 01:58:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 40988.0). Total num frames: 577306624. Throughput: 0: 41390.1. Samples: 577375320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 01:58:46,994][12645] Avg episode reward: [(0, '0.038')] +[2024-06-18 01:58:49,070][12883] Updated weights for policy 0, policy_version 35241 (0.0035) +[2024-06-18 01:58:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 577503232. Throughput: 0: 41204.3. Samples: 577617280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 01:58:51,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 01:58:53,164][12883] Updated weights for policy 0, policy_version 35251 (0.0039) +[2024-06-18 01:58:56,911][12883] Updated weights for policy 0, policy_version 35261 (0.0033) +[2024-06-18 01:58:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 577716224. Throughput: 0: 41287.6. Samples: 577868420. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 01:58:56,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 01:59:00,863][12883] Updated weights for policy 0, policy_version 35271 (0.0044) +[2024-06-18 01:59:01,994][12645] Fps is (10 sec: 40960.9, 60 sec: 40960.1, 300 sec: 40932.2). Total num frames: 577912832. Throughput: 0: 41271.6. Samples: 577998820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 01:59:01,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 01:59:04,843][12883] Updated weights for policy 0, policy_version 35281 (0.0027) +[2024-06-18 01:59:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 578125824. Throughput: 0: 41067.6. Samples: 578234700. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 01:59:07,003][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:59:09,202][12883] Updated weights for policy 0, policy_version 35291 (0.0048) +[2024-06-18 01:59:09,483][12862] Signal inference workers to stop experience collection... (8200 times) +[2024-06-18 01:59:09,483][12862] Signal inference workers to resume experience collection... (8200 times) +[2024-06-18 01:59:09,505][12883] InferenceWorker_p0-w0: stopping experience collection (8200 times) +[2024-06-18 01:59:09,505][12883] InferenceWorker_p0-w0: resuming experience collection (8200 times) +[2024-06-18 01:59:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 578322432. Throughput: 0: 41162.3. Samples: 578486400. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 01:59:12,003][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 01:59:12,651][12883] Updated weights for policy 0, policy_version 35301 (0.0038) +[2024-06-18 01:59:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 578519040. Throughput: 0: 41095.6. Samples: 578607240. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 01:59:16,994][12645] Avg episode reward: [(0, '0.033')] +[2024-06-18 01:59:17,066][12883] Updated weights for policy 0, policy_version 35311 (0.0039) +[2024-06-18 01:59:20,601][12883] Updated weights for policy 0, policy_version 35321 (0.0025) +[2024-06-18 01:59:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 578732032. Throughput: 0: 40944.1. Samples: 578847340. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 01:59:21,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:59:24,895][12883] Updated weights for policy 0, policy_version 35331 (0.0037) +[2024-06-18 01:59:26,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40686.9, 300 sec: 40987.7). Total num frames: 578928640. Throughput: 0: 41146.1. Samples: 579100100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 01:59:26,994][12645] Avg episode reward: [(0, '0.007')] +[2024-06-18 01:59:28,638][12883] Updated weights for policy 0, policy_version 35341 (0.0045) +[2024-06-18 01:59:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 579141632. Throughput: 0: 41043.7. Samples: 579222280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 01:59:31,994][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 01:59:32,788][12883] Updated weights for policy 0, policy_version 35351 (0.0033) +[2024-06-18 01:59:36,715][12883] Updated weights for policy 0, policy_version 35361 (0.0042) +[2024-06-18 01:59:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 579354624. Throughput: 0: 41106.3. Samples: 579467060. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 01:59:36,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 01:59:40,729][12883] Updated weights for policy 0, policy_version 35371 (0.0042) +[2024-06-18 01:59:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 579567616. Throughput: 0: 41104.4. Samples: 579718120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 01:59:41,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 01:59:44,285][12883] Updated weights for policy 0, policy_version 35381 (0.0031) +[2024-06-18 01:59:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 579747840. Throughput: 0: 40864.3. Samples: 579837720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 01:59:46,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 01:59:48,834][12883] Updated weights for policy 0, policy_version 35391 (0.0040) +[2024-06-18 01:59:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.3, 300 sec: 41209.9). Total num frames: 579993600. Throughput: 0: 41190.3. Samples: 580088260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 01:59:51,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 01:59:52,184][12883] Updated weights for policy 0, policy_version 35401 (0.0030) +[2024-06-18 01:59:56,917][12883] Updated weights for policy 0, policy_version 35411 (0.0041) +[2024-06-18 01:59:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40960.1, 300 sec: 40932.2). Total num frames: 580173824. Throughput: 0: 41193.8. Samples: 580340120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 01:59:56,994][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 02:00:00,284][12883] Updated weights for policy 0, policy_version 35421 (0.0030) +[2024-06-18 02:00:01,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 580370432. Throughput: 0: 41026.3. Samples: 580453420. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 02:00:01,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:00:04,954][12883] Updated weights for policy 0, policy_version 35431 (0.0038) +[2024-06-18 02:00:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 580599808. Throughput: 0: 41308.9. Samples: 580706240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 02:00:06,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 02:00:08,160][12883] Updated weights for policy 0, policy_version 35441 (0.0039) +[2024-06-18 02:00:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 580796416. Throughput: 0: 41139.7. Samples: 580951380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 02:00:11,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 02:00:12,786][12883] Updated weights for policy 0, policy_version 35451 (0.0034) +[2024-06-18 02:00:16,806][12883] Updated weights for policy 0, policy_version 35461 (0.0035) +[2024-06-18 02:00:16,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 580993024. Throughput: 0: 41236.3. Samples: 581077920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 02:00:16,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 02:00:20,653][12883] Updated weights for policy 0, policy_version 35471 (0.0041) +[2024-06-18 02:00:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41044.2). Total num frames: 581206016. Throughput: 0: 41286.7. Samples: 581324960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 02:00:21,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 02:00:24,582][12883] Updated weights for policy 0, policy_version 35481 (0.0033) +[2024-06-18 02:00:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 40932.2). Total num frames: 581402624. Throughput: 0: 41231.0. Samples: 581573520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 02:00:26,995][12645] Avg episode reward: [(0, '0.027')] +[2024-06-18 02:00:28,558][12883] Updated weights for policy 0, policy_version 35491 (0.0029) +[2024-06-18 02:00:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41098.8). Total num frames: 581632000. Throughput: 0: 41363.1. Samples: 581699060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 02:00:31,994][12645] Avg episode reward: [(0, '0.049')] +[2024-06-18 02:00:32,535][12883] Updated weights for policy 0, policy_version 35501 (0.0035) +[2024-06-18 02:00:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 581795840. Throughput: 0: 41247.0. Samples: 581944380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:00:36,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 02:00:37,036][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000035511_581812224.pth... +[2024-06-18 02:00:37,041][12883] Updated weights for policy 0, policy_version 35511 (0.0027) +[2024-06-18 02:00:37,095][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034909_571949056.pth +[2024-06-18 02:00:37,324][12862] Signal inference workers to stop experience collection... (8250 times) +[2024-06-18 02:00:37,376][12862] Signal inference workers to resume experience collection... (8250 times) +[2024-06-18 02:00:37,377][12883] InferenceWorker_p0-w0: stopping experience collection (8250 times) +[2024-06-18 02:00:37,396][12883] InferenceWorker_p0-w0: resuming experience collection (8250 times) +[2024-06-18 02:00:40,400][12883] Updated weights for policy 0, policy_version 35521 (0.0028) +[2024-06-18 02:00:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 41099.2). Total num frames: 582041600. Throughput: 0: 41135.0. Samples: 582191200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:00:41,994][12645] Avg episode reward: [(0, '0.045')] +[2024-06-18 02:00:44,884][12883] Updated weights for policy 0, policy_version 35531 (0.0034) +[2024-06-18 02:00:46,994][12645] Fps is (10 sec: 45875.7, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 582254592. Throughput: 0: 41423.5. Samples: 582317480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:00:46,994][12645] Avg episode reward: [(0, '0.049')] +[2024-06-18 02:00:48,224][12883] Updated weights for policy 0, policy_version 35541 (0.0030) +[2024-06-18 02:00:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 582434816. Throughput: 0: 41223.2. Samples: 582561280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:00:51,994][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 02:00:52,659][12883] Updated weights for policy 0, policy_version 35551 (0.0023) +[2024-06-18 02:00:56,155][12883] Updated weights for policy 0, policy_version 35561 (0.0039) +[2024-06-18 02:00:56,996][12645] Fps is (10 sec: 40951.0, 60 sec: 41504.6, 300 sec: 41209.6). Total num frames: 582664192. Throughput: 0: 41370.4. Samples: 582813140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:00:56,996][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 02:01:01,030][12883] Updated weights for policy 0, policy_version 35571 (0.0037) +[2024-06-18 02:01:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 582877184. Throughput: 0: 41392.6. Samples: 582940580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 02:01:01,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 02:01:03,819][12883] Updated weights for policy 0, policy_version 35581 (0.0035) +[2024-06-18 02:01:06,996][12645] Fps is (10 sec: 39321.3, 60 sec: 40958.5, 300 sec: 41209.6). Total num frames: 583057408. Throughput: 0: 41344.5. Samples: 583185560. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 02:01:06,996][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 02:01:08,650][12883] Updated weights for policy 0, policy_version 35591 (0.0039) +[2024-06-18 02:01:11,727][12883] Updated weights for policy 0, policy_version 35601 (0.0037) +[2024-06-18 02:01:11,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 583286784. Throughput: 0: 41240.5. Samples: 583429340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 02:01:11,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 02:01:16,400][12883] Updated weights for policy 0, policy_version 35611 (0.0039) +[2024-06-18 02:01:16,994][12645] Fps is (10 sec: 42608.0, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 583483392. Throughput: 0: 41268.9. Samples: 583556160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 02:01:16,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 02:01:19,671][12883] Updated weights for policy 0, policy_version 35621 (0.0043) +[2024-06-18 02:01:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 583680000. Throughput: 0: 41312.0. Samples: 583803420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:01:21,994][12645] Avg episode reward: [(0, '0.052')] +[2024-06-18 02:01:24,164][12883] Updated weights for policy 0, policy_version 35631 (0.0025) +[2024-06-18 02:01:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41098.8). Total num frames: 583892992. Throughput: 0: 41371.1. Samples: 584052900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:01:26,998][12645] Avg episode reward: [(0, '0.033')] +[2024-06-18 02:01:27,744][12883] Updated weights for policy 0, policy_version 35641 (0.0032) +[2024-06-18 02:01:31,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 584089600. Throughput: 0: 41342.9. Samples: 584177920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:01:31,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:01:32,059][12883] Updated weights for policy 0, policy_version 35651 (0.0044) +[2024-06-18 02:01:35,509][12883] Updated weights for policy 0, policy_version 35661 (0.0032) +[2024-06-18 02:01:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41154.4). Total num frames: 584302592. Throughput: 0: 41266.0. Samples: 584418260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:01:36,994][12645] Avg episode reward: [(0, '0.036')] +[2024-06-18 02:01:39,899][12883] Updated weights for policy 0, policy_version 35671 (0.0032) +[2024-06-18 02:01:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 584499200. Throughput: 0: 41214.8. Samples: 584667720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:01:41,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 02:01:43,719][12883] Updated weights for policy 0, policy_version 35681 (0.0035) +[2024-06-18 02:01:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 41099.1). Total num frames: 584712192. Throughput: 0: 41094.0. Samples: 584789820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 02:01:46,994][12645] Avg episode reward: [(0, '0.001')] +[2024-06-18 02:01:48,001][12883] Updated weights for policy 0, policy_version 35691 (0.0045) +[2024-06-18 02:01:51,391][12883] Updated weights for policy 0, policy_version 35701 (0.0035) +[2024-06-18 02:01:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 584925184. Throughput: 0: 41065.1. Samples: 585033400. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 02:01:51,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:01:55,926][12883] Updated weights for policy 0, policy_version 35711 (0.0055) +[2024-06-18 02:01:56,996][12645] Fps is (10 sec: 39313.1, 60 sec: 40686.9, 300 sec: 41154.1). Total num frames: 585105408. Throughput: 0: 41270.0. Samples: 585286580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 02:01:56,997][12645] Avg episode reward: [(0, '0.033')] +[2024-06-18 02:01:59,163][12883] Updated weights for policy 0, policy_version 35721 (0.0037) +[2024-06-18 02:02:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 585318400. Throughput: 0: 41106.7. Samples: 585405960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 02:02:01,994][12645] Avg episode reward: [(0, '0.047')] +[2024-06-18 02:02:04,055][12883] Updated weights for policy 0, policy_version 35731 (0.0038) +[2024-06-18 02:02:06,994][12645] Fps is (10 sec: 45885.8, 60 sec: 41780.8, 300 sec: 41209.9). Total num frames: 585564160. Throughput: 0: 41100.0. Samples: 585652920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 02:02:06,994][12645] Avg episode reward: [(0, '0.041')] +[2024-06-18 02:02:07,015][12883] Updated weights for policy 0, policy_version 35741 (0.0046) +[2024-06-18 02:02:10,262][12862] Signal inference workers to stop experience collection... (8300 times) +[2024-06-18 02:02:10,263][12862] Signal inference workers to resume experience collection... (8300 times) +[2024-06-18 02:02:10,299][12883] InferenceWorker_p0-w0: stopping experience collection (8300 times) +[2024-06-18 02:02:10,299][12883] InferenceWorker_p0-w0: resuming experience collection (8300 times) +[2024-06-18 02:02:11,948][12883] Updated weights for policy 0, policy_version 35751 (0.0039) +[2024-06-18 02:02:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40960.0, 300 sec: 41265.4). Total num frames: 585744384. 
Throughput: 0: 41189.8. Samples: 585906440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 02:02:11,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 02:02:15,414][12883] Updated weights for policy 0, policy_version 35761 (0.0038) +[2024-06-18 02:02:16,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 585940992. Throughput: 0: 41051.8. Samples: 586025240. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 02:02:16,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 02:02:20,010][12883] Updated weights for policy 0, policy_version 35771 (0.0041) +[2024-06-18 02:02:21,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 586170368. Throughput: 0: 41389.5. Samples: 586280780. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 02:02:21,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:02:23,419][12883] Updated weights for policy 0, policy_version 35781 (0.0034) +[2024-06-18 02:02:26,994][12645] Fps is (10 sec: 40959.1, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 586350592. Throughput: 0: 41360.8. Samples: 586528960. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 02:02:26,994][12645] Avg episode reward: [(0, '0.029')] +[2024-06-18 02:02:28,003][12883] Updated weights for policy 0, policy_version 35791 (0.0033) +[2024-06-18 02:02:31,135][12883] Updated weights for policy 0, policy_version 35801 (0.0038) +[2024-06-18 02:02:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 586579968. Throughput: 0: 41259.6. Samples: 586646500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 02:02:31,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 02:02:36,083][12883] Updated weights for policy 0, policy_version 35811 (0.0038) +[2024-06-18 02:02:36,994][12645] Fps is (10 sec: 44237.7, 60 sec: 41506.3, 300 sec: 41321.0). Total num frames: 586792960. 
Throughput: 0: 41401.0. Samples: 586896440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 02:02:36,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 02:02:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000035815_586792960.pth... +[2024-06-18 02:02:37,080][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000035210_576880640.pth +[2024-06-18 02:02:39,733][12883] Updated weights for policy 0, policy_version 35821 (0.0039) +[2024-06-18 02:02:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 586973184. Throughput: 0: 41233.6. Samples: 587142000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 02:02:41,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 02:02:43,686][12883] Updated weights for policy 0, policy_version 35831 (0.0040) +[2024-06-18 02:02:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 587202560. Throughput: 0: 41332.3. Samples: 587265920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 02:02:46,994][12645] Avg episode reward: [(0, '0.037')] +[2024-06-18 02:02:47,606][12883] Updated weights for policy 0, policy_version 35841 (0.0036) +[2024-06-18 02:02:51,604][12883] Updated weights for policy 0, policy_version 35851 (0.0040) +[2024-06-18 02:02:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 587399168. Throughput: 0: 41282.2. Samples: 587510620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 02:02:51,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 02:02:55,571][12883] Updated weights for policy 0, policy_version 35861 (0.0032) +[2024-06-18 02:02:56,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41234.7, 300 sec: 41098.8). Total num frames: 587579392. Throughput: 0: 41134.7. Samples: 587757500. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) +[2024-06-18 02:02:56,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 02:02:59,385][12883] Updated weights for policy 0, policy_version 35871 (0.0033) +[2024-06-18 02:03:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41232.9, 300 sec: 41154.4). Total num frames: 587792384. Throughput: 0: 41114.5. Samples: 587875400. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) +[2024-06-18 02:03:01,995][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 02:03:03,755][12883] Updated weights for policy 0, policy_version 35881 (0.0036) +[2024-06-18 02:03:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 588005376. Throughput: 0: 41060.9. Samples: 588128520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) +[2024-06-18 02:03:06,994][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 02:03:07,327][12883] Updated weights for policy 0, policy_version 35891 (0.0042) +[2024-06-18 02:03:11,646][12883] Updated weights for policy 0, policy_version 35901 (0.0049) +[2024-06-18 02:03:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 588201984. Throughput: 0: 40916.1. Samples: 588370180. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) +[2024-06-18 02:03:11,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 02:03:15,350][12883] Updated weights for policy 0, policy_version 35911 (0.0031) +[2024-06-18 02:03:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 588431360. Throughput: 0: 41024.1. Samples: 588492580. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) +[2024-06-18 02:03:16,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 02:03:19,674][12883] Updated weights for policy 0, policy_version 35921 (0.0045) +[2024-06-18 02:03:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 588611584. Throughput: 0: 41043.6. Samples: 588743400. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 02:03:21,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 02:03:23,375][12883] Updated weights for policy 0, policy_version 35931 (0.0025) +[2024-06-18 02:03:26,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40960.1, 300 sec: 41098.8). Total num frames: 588808192. Throughput: 0: 41008.4. Samples: 588987380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 02:03:26,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 02:03:27,884][12883] Updated weights for policy 0, policy_version 35941 (0.0032) +[2024-06-18 02:03:31,154][12883] Updated weights for policy 0, policy_version 35951 (0.0038) +[2024-06-18 02:03:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 589053952. Throughput: 0: 40961.8. Samples: 589109200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 02:03:31,994][12645] Avg episode reward: [(0, '0.027')] +[2024-06-18 02:03:35,681][12883] Updated weights for policy 0, policy_version 35961 (0.0039) +[2024-06-18 02:03:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.8, 300 sec: 41098.9). Total num frames: 589217792. Throughput: 0: 40964.5. Samples: 589354020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 02:03:36,994][12645] Avg episode reward: [(0, '0.031')] +[2024-06-18 02:03:39,439][12883] Updated weights for policy 0, policy_version 35971 (0.0035) +[2024-06-18 02:03:40,608][12862] Signal inference workers to stop experience collection... (8350 times) +[2024-06-18 02:03:40,609][12862] Signal inference workers to resume experience collection... (8350 times) +[2024-06-18 02:03:40,630][12883] InferenceWorker_p0-w0: stopping experience collection (8350 times) +[2024-06-18 02:03:40,630][12883] InferenceWorker_p0-w0: resuming experience collection (8350 times) +[2024-06-18 02:03:41,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 589430784. 
Throughput: 0: 40925.0. Samples: 589599120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 02:03:41,994][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 02:03:43,777][12883] Updated weights for policy 0, policy_version 35981 (0.0038) +[2024-06-18 02:03:46,994][12645] Fps is (10 sec: 44235.7, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 589660160. Throughput: 0: 41158.6. Samples: 589727540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 02:03:46,995][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:03:47,228][12883] Updated weights for policy 0, policy_version 35991 (0.0035) +[2024-06-18 02:03:51,858][12883] Updated weights for policy 0, policy_version 36001 (0.0043) +[2024-06-18 02:03:51,994][12645] Fps is (10 sec: 40959.2, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 589840384. Throughput: 0: 40763.4. Samples: 589962880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 02:03:51,994][12645] Avg episode reward: [(0, '0.027')] +[2024-06-18 02:03:55,443][12883] Updated weights for policy 0, policy_version 36011 (0.0035) +[2024-06-18 02:03:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 590069760. Throughput: 0: 40854.6. Samples: 590208640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 02:03:56,994][12645] Avg episode reward: [(0, '0.029')] +[2024-06-18 02:03:59,743][12883] Updated weights for policy 0, policy_version 36021 (0.0033) +[2024-06-18 02:04:01,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 590249984. Throughput: 0: 40948.0. Samples: 590335240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 02:04:01,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:04:03,364][12883] Updated weights for policy 0, policy_version 36031 (0.0023) +[2024-06-18 02:04:06,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 590446592. 
Throughput: 0: 40781.8. Samples: 590578580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 02:04:06,994][12645] Avg episode reward: [(0, '0.033')] +[2024-06-18 02:04:07,843][12883] Updated weights for policy 0, policy_version 36041 (0.0045) +[2024-06-18 02:04:11,605][12883] Updated weights for policy 0, policy_version 36051 (0.0033) +[2024-06-18 02:04:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 590675968. Throughput: 0: 40797.8. Samples: 590823280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 02:04:11,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 02:04:15,696][12883] Updated weights for policy 0, policy_version 36061 (0.0050) +[2024-06-18 02:04:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40413.8, 300 sec: 41098.8). Total num frames: 590856192. Throughput: 0: 40831.6. Samples: 590946620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 02:04:16,994][12645] Avg episode reward: [(0, '0.040')] +[2024-06-18 02:04:19,711][12883] Updated weights for policy 0, policy_version 36071 (0.0040) +[2024-06-18 02:04:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 591069184. Throughput: 0: 40860.5. Samples: 591192740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 02:04:21,994][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 02:04:24,037][12883] Updated weights for policy 0, policy_version 36081 (0.0041) +[2024-06-18 02:04:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 591298560. Throughput: 0: 40729.6. Samples: 591431960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 02:04:27,000][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 02:04:27,654][12883] Updated weights for policy 0, policy_version 36091 (0.0044) +[2024-06-18 02:04:31,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40140.8, 300 sec: 41043.3). Total num frames: 591462400. 
Throughput: 0: 40634.8. Samples: 591556100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 02:04:31,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:04:32,275][12883] Updated weights for policy 0, policy_version 36101 (0.0034) +[2024-06-18 02:04:35,675][12883] Updated weights for policy 0, policy_version 36111 (0.0036) +[2024-06-18 02:04:36,994][12645] Fps is (10 sec: 36044.7, 60 sec: 40686.8, 300 sec: 40987.8). Total num frames: 591659008. Throughput: 0: 40800.0. Samples: 591798880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 02:04:36,994][12645] Avg episode reward: [(0, '0.009')] +[2024-06-18 02:04:37,058][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000036113_591675392.pth... +[2024-06-18 02:04:37,114][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000035511_581812224.pth +[2024-06-18 02:04:40,274][12883] Updated weights for policy 0, policy_version 36121 (0.0035) +[2024-06-18 02:04:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 591872000. Throughput: 0: 40696.5. Samples: 592039980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 02:04:41,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 02:04:43,678][12883] Updated weights for policy 0, policy_version 36131 (0.0048) +[2024-06-18 02:04:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 40687.1, 300 sec: 41043.3). Total num frames: 592101376. Throughput: 0: 40640.8. Samples: 592164080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 02:04:46,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 02:04:48,026][12883] Updated weights for policy 0, policy_version 36141 (0.0032) +[2024-06-18 02:04:51,650][12883] Updated weights for policy 0, policy_version 36151 (0.0028) +[2024-06-18 02:04:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 592314368. Throughput: 0: 40702.6. 
Samples: 592410200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 02:04:51,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:04:55,986][12883] Updated weights for policy 0, policy_version 36161 (0.0030) +[2024-06-18 02:04:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40686.9, 300 sec: 41154.4). Total num frames: 592510976. Throughput: 0: 40670.5. Samples: 592653460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 02:04:56,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 02:04:59,693][12883] Updated weights for policy 0, policy_version 36171 (0.0043) +[2024-06-18 02:05:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 592707584. Throughput: 0: 40688.1. Samples: 592777580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 02:05:01,994][12645] Avg episode reward: [(0, '0.033')] +[2024-06-18 02:05:04,145][12883] Updated weights for policy 0, policy_version 36181 (0.0027) +[2024-06-18 02:05:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 592904192. Throughput: 0: 40768.7. Samples: 593027340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 02:05:06,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 02:05:08,051][12883] Updated weights for policy 0, policy_version 36191 (0.0030) +[2024-06-18 02:05:11,987][12883] Updated weights for policy 0, policy_version 36201 (0.0038) +[2024-06-18 02:05:11,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40686.8, 300 sec: 41098.8). Total num frames: 593117184. Throughput: 0: 41055.5. Samples: 593279460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 02:05:11,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 02:05:15,971][12883] Updated weights for policy 0, policy_version 36211 (0.0039) +[2024-06-18 02:05:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 593330176. Throughput: 0: 41017.9. 
Samples: 593401900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 02:05:16,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 02:05:19,719][12883] Updated weights for policy 0, policy_version 36221 (0.0046) +[2024-06-18 02:05:21,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 593543168. Throughput: 0: 41080.2. Samples: 593647480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 02:05:21,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 02:05:23,469][12862] Signal inference workers to stop experience collection... (8400 times) +[2024-06-18 02:05:23,469][12862] Signal inference workers to resume experience collection... (8400 times) +[2024-06-18 02:05:23,511][12883] InferenceWorker_p0-w0: stopping experience collection (8400 times) +[2024-06-18 02:05:23,511][12883] InferenceWorker_p0-w0: resuming experience collection (8400 times) +[2024-06-18 02:05:24,159][12883] Updated weights for policy 0, policy_version 36231 (0.0027) +[2024-06-18 02:05:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40414.0, 300 sec: 40987.8). Total num frames: 593723392. Throughput: 0: 41384.9. Samples: 593902300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 02:05:26,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:05:27,438][12883] Updated weights for policy 0, policy_version 36241 (0.0044) +[2024-06-18 02:05:31,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 593920000. Throughput: 0: 41309.4. Samples: 594023000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 02:05:31,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:05:32,069][12883] Updated weights for policy 0, policy_version 36251 (0.0043) +[2024-06-18 02:05:35,104][12883] Updated weights for policy 0, policy_version 36261 (0.0032) +[2024-06-18 02:05:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42052.4, 300 sec: 41154.4). Total num frames: 594182144. 
Throughput: 0: 41476.1. Samples: 594276620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 02:05:36,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 02:05:39,681][12883] Updated weights for policy 0, policy_version 36271 (0.0038) +[2024-06-18 02:05:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41506.2, 300 sec: 41043.3). Total num frames: 594362368. Throughput: 0: 41618.0. Samples: 594526260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 02:05:41,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:05:43,504][12883] Updated weights for policy 0, policy_version 36281 (0.0035) +[2024-06-18 02:05:46,994][12645] Fps is (10 sec: 37682.5, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 594558976. Throughput: 0: 41492.7. Samples: 594644760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 02:05:46,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 02:05:47,447][12883] Updated weights for policy 0, policy_version 36291 (0.0034) +[2024-06-18 02:05:51,781][12883] Updated weights for policy 0, policy_version 36301 (0.0040) +[2024-06-18 02:05:51,997][12645] Fps is (10 sec: 40945.5, 60 sec: 40957.7, 300 sec: 41043.1). Total num frames: 594771968. Throughput: 0: 41497.3. Samples: 594894860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 02:05:51,998][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 02:05:55,566][12883] Updated weights for policy 0, policy_version 36311 (0.0027) +[2024-06-18 02:05:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41233.2, 300 sec: 41043.3). Total num frames: 594984960. Throughput: 0: 41456.1. Samples: 595144980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 02:05:56,994][12645] Avg episode reward: [(0, '0.034')] +[2024-06-18 02:05:59,397][12883] Updated weights for policy 0, policy_version 36321 (0.0029) +[2024-06-18 02:06:01,994][12645] Fps is (10 sec: 42612.9, 60 sec: 41506.1, 300 sec: 41154.7). Total num frames: 595197952. 
Throughput: 0: 41506.1. Samples: 595269680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) +[2024-06-18 02:06:01,994][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 02:06:01,994][12862] Saving new best policy, reward=0.065! +[2024-06-18 02:06:03,250][12883] Updated weights for policy 0, policy_version 36331 (0.0049) +[2024-06-18 02:06:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.2, 300 sec: 41043.3). Total num frames: 595394560. Throughput: 0: 41576.3. Samples: 595518420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) +[2024-06-18 02:06:06,994][12645] Avg episode reward: [(0, '0.032')] +[2024-06-18 02:06:07,157][12883] Updated weights for policy 0, policy_version 36341 (0.0045) +[2024-06-18 02:06:10,899][12883] Updated weights for policy 0, policy_version 36351 (0.0029) +[2024-06-18 02:06:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41098.8). Total num frames: 595607552. Throughput: 0: 41573.3. Samples: 595773100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) +[2024-06-18 02:06:11,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 02:06:15,012][12883] Updated weights for policy 0, policy_version 36361 (0.0041) +[2024-06-18 02:06:16,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 595820544. Throughput: 0: 41597.3. Samples: 595894880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) +[2024-06-18 02:06:16,994][12645] Avg episode reward: [(0, '0.042')] +[2024-06-18 02:06:18,472][12883] Updated weights for policy 0, policy_version 36371 (0.0041) +[2024-06-18 02:06:21,996][12645] Fps is (10 sec: 42589.0, 60 sec: 41504.6, 300 sec: 41154.1). Total num frames: 596033536. Throughput: 0: 41509.0. Samples: 596144620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) +[2024-06-18 02:06:21,996][12645] Avg episode reward: [(0, '0.091')] +[2024-06-18 02:06:21,997][12862] Saving new best policy, reward=0.091! 
+[2024-06-18 02:06:22,881][12883] Updated weights for policy 0, policy_version 36381 (0.0039) +[2024-06-18 02:06:26,537][12883] Updated weights for policy 0, policy_version 36391 (0.0034) +[2024-06-18 02:06:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41154.4). Total num frames: 596230144. Throughput: 0: 41689.2. Samples: 596402280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 02:06:26,994][12645] Avg episode reward: [(0, '0.031')] +[2024-06-18 02:06:30,428][12883] Updated weights for policy 0, policy_version 36401 (0.0029) +[2024-06-18 02:06:31,996][12645] Fps is (10 sec: 40959.9, 60 sec: 42050.6, 300 sec: 41154.1). Total num frames: 596443136. Throughput: 0: 41790.9. Samples: 596525440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 02:06:31,996][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 02:06:34,152][12883] Updated weights for policy 0, policy_version 36411 (0.0038) +[2024-06-18 02:06:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41506.0, 300 sec: 41265.5). Total num frames: 596672512. Throughput: 0: 41773.8. Samples: 596774540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 02:06:36,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 02:06:37,001][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000036418_596672512.pth... +[2024-06-18 02:06:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000035815_586792960.pth +[2024-06-18 02:06:38,387][12883] Updated weights for policy 0, policy_version 36421 (0.0034) +[2024-06-18 02:06:40,590][12862] Signal inference workers to stop experience collection... (8450 times) +[2024-06-18 02:06:40,590][12862] Signal inference workers to resume experience collection... 
(8450 times) +[2024-06-18 02:06:40,635][12883] InferenceWorker_p0-w0: stopping experience collection (8450 times) +[2024-06-18 02:06:40,635][12883] InferenceWorker_p0-w0: resuming experience collection (8450 times) +[2024-06-18 02:06:41,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42052.2, 300 sec: 41265.5). Total num frames: 596885504. Throughput: 0: 41788.9. Samples: 597025480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 02:06:41,994][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 02:06:41,999][12883] Updated weights for policy 0, policy_version 36431 (0.0037) +[2024-06-18 02:06:46,137][12883] Updated weights for policy 0, policy_version 36441 (0.0043) +[2024-06-18 02:06:46,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 41209.9). Total num frames: 597082112. Throughput: 0: 41923.2. Samples: 597156220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 02:06:46,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 02:06:49,873][12883] Updated weights for policy 0, policy_version 36451 (0.0050) +[2024-06-18 02:06:51,996][12645] Fps is (10 sec: 39311.7, 60 sec: 41779.9, 300 sec: 41265.4). Total num frames: 597278720. Throughput: 0: 41956.9. Samples: 597406580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 02:06:51,997][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 02:06:53,775][12883] Updated weights for policy 0, policy_version 36461 (0.0029) +[2024-06-18 02:06:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 597491712. Throughput: 0: 41794.7. Samples: 597653860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 02:06:56,994][12645] Avg episode reward: [(0, '0.034')] +[2024-06-18 02:06:58,059][12883] Updated weights for policy 0, policy_version 36471 (0.0045) +[2024-06-18 02:07:01,418][12883] Updated weights for policy 0, policy_version 36481 (0.0046) +[2024-06-18 02:07:01,994][12645] Fps is (10 sec: 42609.0, 60 sec: 41779.2, 300 sec: 41154.4). Total num frames: 597704704. Throughput: 0: 41937.3. Samples: 597782060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 02:07:01,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 02:07:05,821][12883] Updated weights for policy 0, policy_version 36491 (0.0035) +[2024-06-18 02:07:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 597901312. Throughput: 0: 41995.4. Samples: 598034320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 02:07:06,994][12645] Avg episode reward: [(0, '0.045')] +[2024-06-18 02:07:09,080][12883] Updated weights for policy 0, policy_version 36501 (0.0029) +[2024-06-18 02:07:11,996][12645] Fps is (10 sec: 40951.0, 60 sec: 41777.7, 300 sec: 41265.1). Total num frames: 598114304. Throughput: 0: 41763.3. Samples: 598281720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 02:07:11,997][12645] Avg episode reward: [(0, '0.043')] +[2024-06-18 02:07:13,762][12883] Updated weights for policy 0, policy_version 36511 (0.0046) +[2024-06-18 02:07:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41265.5). Total num frames: 598343680. Throughput: 0: 41871.0. Samples: 598409540. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 02:07:16,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 02:07:17,106][12883] Updated weights for policy 0, policy_version 36521 (0.0040) +[2024-06-18 02:07:21,763][12883] Updated weights for policy 0, policy_version 36531 (0.0028) +[2024-06-18 02:07:21,996][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41320.7). Total num frames: 598540288. Throughput: 0: 41874.5. Samples: 598658980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 02:07:21,997][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 02:07:24,961][12883] Updated weights for policy 0, policy_version 36541 (0.0036) +[2024-06-18 02:07:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 41265.5). Total num frames: 598753280. Throughput: 0: 41825.7. Samples: 598907640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 02:07:26,994][12645] Avg episode reward: [(0, '0.039')] +[2024-06-18 02:07:29,811][12883] Updated weights for policy 0, policy_version 36551 (0.0039) +[2024-06-18 02:07:31,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42053.8, 300 sec: 41265.5). Total num frames: 598966272. Throughput: 0: 41772.4. Samples: 599035980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 02:07:32,000][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:07:32,787][12883] Updated weights for policy 0, policy_version 36561 (0.0040) +[2024-06-18 02:07:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 599162880. Throughput: 0: 41821.9. Samples: 599288460. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 02:07:36,994][12645] Avg episode reward: [(0, '0.042')] +[2024-06-18 02:07:37,289][12883] Updated weights for policy 0, policy_version 36571 (0.0044) +[2024-06-18 02:07:40,886][12883] Updated weights for policy 0, policy_version 36581 (0.0035) +[2024-06-18 02:07:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 599392256. Throughput: 0: 41695.0. Samples: 599530140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 02:07:41,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 02:07:45,081][12883] Updated weights for policy 0, policy_version 36591 (0.0043) +[2024-06-18 02:07:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 599572480. Throughput: 0: 41729.4. Samples: 599659880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 02:07:46,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 02:07:48,806][12883] Updated weights for policy 0, policy_version 36601 (0.0031) +[2024-06-18 02:07:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41780.9, 300 sec: 41376.5). Total num frames: 599785472. Throughput: 0: 41622.6. Samples: 599907340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 02:07:51,994][12645] Avg episode reward: [(0, '0.027')] +[2024-06-18 02:07:53,055][12883] Updated weights for policy 0, policy_version 36611 (0.0035) +[2024-06-18 02:07:56,680][12883] Updated weights for policy 0, policy_version 36621 (0.0033) +[2024-06-18 02:07:56,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42325.2, 300 sec: 41487.6). Total num frames: 600031232. Throughput: 0: 41796.7. Samples: 600162480. 
Policy #0 lag: (min: 2.0, avg: 12.1, max: 25.0) +[2024-06-18 02:07:56,994][12645] Avg episode reward: [(0, '0.006')] +[2024-06-18 02:08:00,773][12883] Updated weights for policy 0, policy_version 36631 (0.0040) +[2024-06-18 02:08:01,999][12645] Fps is (10 sec: 42575.5, 60 sec: 41775.4, 300 sec: 41375.8). Total num frames: 600211456. Throughput: 0: 41707.4. Samples: 600286600. Policy #0 lag: (min: 2.0, avg: 12.1, max: 25.0) +[2024-06-18 02:08:02,000][12645] Avg episode reward: [(0, '0.040')] +[2024-06-18 02:08:04,478][12883] Updated weights for policy 0, policy_version 36641 (0.0034) +[2024-06-18 02:08:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 41432.1). Total num frames: 600424448. Throughput: 0: 41707.8. Samples: 600535740. Policy #0 lag: (min: 2.0, avg: 12.1, max: 25.0) +[2024-06-18 02:08:06,994][12645] Avg episode reward: [(0, '0.064')] +[2024-06-18 02:08:08,624][12883] Updated weights for policy 0, policy_version 36651 (0.0050) +[2024-06-18 02:08:11,994][12645] Fps is (10 sec: 39342.9, 60 sec: 41507.7, 300 sec: 41265.5). Total num frames: 600604672. Throughput: 0: 41856.9. Samples: 600791200. Policy #0 lag: (min: 2.0, avg: 12.1, max: 25.0) +[2024-06-18 02:08:11,994][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 02:08:12,013][12862] Signal inference workers to stop experience collection... (8500 times) +[2024-06-18 02:08:12,060][12883] InferenceWorker_p0-w0: stopping experience collection (8500 times) +[2024-06-18 02:08:12,068][12862] Signal inference workers to resume experience collection... (8500 times) +[2024-06-18 02:08:12,079][12883] InferenceWorker_p0-w0: resuming experience collection (8500 times) +[2024-06-18 02:08:12,487][12883] Updated weights for policy 0, policy_version 36661 (0.0035) +[2024-06-18 02:08:16,453][12883] Updated weights for policy 0, policy_version 36671 (0.0043) +[2024-06-18 02:08:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 600817664. 
Throughput: 0: 41503.6. Samples: 600903640. Policy #0 lag: (min: 2.0, avg: 12.1, max: 25.0) +[2024-06-18 02:08:16,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 02:08:20,410][12883] Updated weights for policy 0, policy_version 36681 (0.0043) +[2024-06-18 02:08:21,996][12645] Fps is (10 sec: 44226.7, 60 sec: 41779.2, 300 sec: 41487.3). Total num frames: 601047040. Throughput: 0: 41424.6. Samples: 601152660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 02:08:21,996][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 02:08:24,567][12883] Updated weights for policy 0, policy_version 36691 (0.0051) +[2024-06-18 02:08:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 601210880. Throughput: 0: 41641.8. Samples: 601404020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 02:08:26,994][12645] Avg episode reward: [(0, '0.060')] +[2024-06-18 02:08:28,501][12883] Updated weights for policy 0, policy_version 36701 (0.0033) +[2024-06-18 02:08:31,994][12645] Fps is (10 sec: 40968.9, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 601456640. Throughput: 0: 41449.6. Samples: 601525120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 02:08:31,994][12645] Avg episode reward: [(0, '0.047')] +[2024-06-18 02:08:32,660][12883] Updated weights for policy 0, policy_version 36711 (0.0043) +[2024-06-18 02:08:36,383][12883] Updated weights for policy 0, policy_version 36721 (0.0034) +[2024-06-18 02:08:36,994][12645] Fps is (10 sec: 45874.5, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 601669632. Throughput: 0: 41467.4. Samples: 601773380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 02:08:36,996][12645] Avg episode reward: [(0, '0.050')] +[2024-06-18 02:08:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000036723_601669632.pth... 
+[2024-06-18 02:08:37,084][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000036113_591675392.pth +[2024-06-18 02:08:40,752][12883] Updated weights for policy 0, policy_version 36731 (0.0045) +[2024-06-18 02:08:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 601849856. Throughput: 0: 41318.3. Samples: 602021800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 02:08:41,994][12645] Avg episode reward: [(0, '0.047')] +[2024-06-18 02:08:44,193][12883] Updated weights for policy 0, policy_version 36741 (0.0022) +[2024-06-18 02:08:46,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 602079232. Throughput: 0: 41163.2. Samples: 602138720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 02:08:46,994][12645] Avg episode reward: [(0, '0.051')] +[2024-06-18 02:08:48,507][12883] Updated weights for policy 0, policy_version 36751 (0.0040) +[2024-06-18 02:08:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 41376.6). Total num frames: 602275840. Throughput: 0: 41360.1. Samples: 602396940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 02:08:51,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 02:08:51,997][12883] Updated weights for policy 0, policy_version 36761 (0.0029) +[2024-06-18 02:08:56,167][12883] Updated weights for policy 0, policy_version 36771 (0.0033) +[2024-06-18 02:08:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40687.0, 300 sec: 41432.1). Total num frames: 602472448. Throughput: 0: 41202.7. Samples: 602645320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 02:08:56,994][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 02:08:59,879][12883] Updated weights for policy 0, policy_version 36781 (0.0035) +[2024-06-18 02:09:01,996][12645] Fps is (10 sec: 44224.5, 60 sec: 41781.1, 300 sec: 41598.3). Total num frames: 602718208. Throughput: 0: 41392.1. Samples: 602766400. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 02:09:01,997][12645] Avg episode reward: [(0, '0.028')] +[2024-06-18 02:09:04,286][12883] Updated weights for policy 0, policy_version 36791 (0.0041) +[2024-06-18 02:09:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 602898432. Throughput: 0: 41542.6. Samples: 603021980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 02:09:06,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:09:08,014][12883] Updated weights for policy 0, policy_version 36801 (0.0040) +[2024-06-18 02:09:11,994][12645] Fps is (10 sec: 37693.4, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 603095040. Throughput: 0: 41187.1. Samples: 603257440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 02:09:11,994][12645] Avg episode reward: [(0, '0.061')] +[2024-06-18 02:09:12,064][12883] Updated weights for policy 0, policy_version 36811 (0.0028) +[2024-06-18 02:09:16,207][12883] Updated weights for policy 0, policy_version 36821 (0.0038) +[2024-06-18 02:09:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 603308032. Throughput: 0: 41333.9. Samples: 603385140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 02:09:16,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:09:19,892][12883] Updated weights for policy 0, policy_version 36831 (0.0034) +[2024-06-18 02:09:21,996][12645] Fps is (10 sec: 40950.6, 60 sec: 40960.0, 300 sec: 41376.2). Total num frames: 603504640. Throughput: 0: 41396.3. Samples: 603636300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 02:09:21,996][12645] Avg episode reward: [(0, '0.054')] +[2024-06-18 02:09:24,052][12883] Updated weights for policy 0, policy_version 36841 (0.0037) +[2024-06-18 02:09:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 41654.3). Total num frames: 603750400. Throughput: 0: 41275.6. Samples: 603879200. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 02:09:26,994][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 02:09:28,168][12883] Updated weights for policy 0, policy_version 36851 (0.0048) +[2024-06-18 02:09:31,994][12645] Fps is (10 sec: 40969.5, 60 sec: 40960.1, 300 sec: 41543.2). Total num frames: 603914240. Throughput: 0: 41533.4. Samples: 604007720. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) +[2024-06-18 02:09:31,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 02:09:32,012][12883] Updated weights for policy 0, policy_version 36861 (0.0038) +[2024-06-18 02:09:35,795][12883] Updated weights for policy 0, policy_version 36871 (0.0034) +[2024-06-18 02:09:36,996][12645] Fps is (10 sec: 37674.8, 60 sec: 40958.6, 300 sec: 41542.8). Total num frames: 604127232. Throughput: 0: 41304.1. Samples: 604255720. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) +[2024-06-18 02:09:36,996][12645] Avg episode reward: [(0, '0.055')] +[2024-06-18 02:09:40,014][12883] Updated weights for policy 0, policy_version 36881 (0.0028) +[2024-06-18 02:09:40,435][12862] Signal inference workers to stop experience collection... (8550 times) +[2024-06-18 02:09:40,463][12883] InferenceWorker_p0-w0: stopping experience collection (8550 times) +[2024-06-18 02:09:40,493][12862] Signal inference workers to resume experience collection... (8550 times) +[2024-06-18 02:09:40,493][12883] InferenceWorker_p0-w0: resuming experience collection (8550 times) +[2024-06-18 02:09:41,996][12645] Fps is (10 sec: 45864.6, 60 sec: 42050.7, 300 sec: 41598.4). Total num frames: 604372992. Throughput: 0: 41173.0. Samples: 604498200. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) +[2024-06-18 02:09:41,997][12645] Avg episode reward: [(0, '0.034')] +[2024-06-18 02:09:43,680][12883] Updated weights for policy 0, policy_version 36891 (0.0027) +[2024-06-18 02:09:46,994][12645] Fps is (10 sec: 39330.6, 60 sec: 40687.0, 300 sec: 41376.6). Total num frames: 604520448. 
Throughput: 0: 41328.7. Samples: 604626080. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) +[2024-06-18 02:09:46,994][12645] Avg episode reward: [(0, '0.040')] +[2024-06-18 02:09:47,951][12883] Updated weights for policy 0, policy_version 36901 (0.0033) +[2024-06-18 02:09:51,566][12883] Updated weights for policy 0, policy_version 36911 (0.0041) +[2024-06-18 02:09:51,994][12645] Fps is (10 sec: 37691.9, 60 sec: 41233.0, 300 sec: 41487.7). Total num frames: 604749824. Throughput: 0: 41008.9. Samples: 604867380. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) +[2024-06-18 02:09:51,994][12645] Avg episode reward: [(0, '0.005')] +[2024-06-18 02:09:55,764][12883] Updated weights for policy 0, policy_version 36921 (0.0022) +[2024-06-18 02:09:56,994][12645] Fps is (10 sec: 45874.1, 60 sec: 41779.0, 300 sec: 41598.7). Total num frames: 604979200. Throughput: 0: 41345.1. Samples: 605117980. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 02:09:56,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 02:09:59,649][12883] Updated weights for policy 0, policy_version 36931 (0.0037) +[2024-06-18 02:10:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40688.7, 300 sec: 41543.2). Total num frames: 605159424. Throughput: 0: 41264.4. Samples: 605242040. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 02:10:01,994][12645] Avg episode reward: [(0, '0.073')] +[2024-06-18 02:10:03,572][12883] Updated weights for policy 0, policy_version 36941 (0.0035) +[2024-06-18 02:10:06,994][12645] Fps is (10 sec: 39322.5, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 605372416. Throughput: 0: 41089.7. Samples: 605485240. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 02:10:06,994][12645] Avg episode reward: [(0, '0.042')] +[2024-06-18 02:10:07,388][12883] Updated weights for policy 0, policy_version 36951 (0.0039) +[2024-06-18 02:10:11,445][12883] Updated weights for policy 0, policy_version 36961 (0.0034) +[2024-06-18 02:10:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 605601792. Throughput: 0: 41264.8. Samples: 605736120. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 02:10:11,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 02:10:15,427][12883] Updated weights for policy 0, policy_version 36971 (0.0032) +[2024-06-18 02:10:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41543.1). Total num frames: 605798400. Throughput: 0: 41246.6. Samples: 605863820. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 02:10:16,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:10:19,359][12883] Updated weights for policy 0, policy_version 36981 (0.0052) +[2024-06-18 02:10:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41507.7, 300 sec: 41598.7). Total num frames: 605995008. Throughput: 0: 41298.5. Samples: 606114060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 02:10:21,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 02:10:23,328][12883] Updated weights for policy 0, policy_version 36991 (0.0037) +[2024-06-18 02:10:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 41654.2). Total num frames: 606208000. Throughput: 0: 41489.2. Samples: 606365120. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 02:10:26,994][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 02:10:27,011][12883] Updated weights for policy 0, policy_version 37001 (0.0033) +[2024-06-18 02:10:31,210][12883] Updated weights for policy 0, policy_version 37011 (0.0034) +[2024-06-18 02:10:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 606420992. Throughput: 0: 41545.2. Samples: 606495620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 02:10:31,994][12645] Avg episode reward: [(0, '0.063')] +[2024-06-18 02:10:34,991][12883] Updated weights for policy 0, policy_version 37021 (0.0041) +[2024-06-18 02:10:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41507.7, 300 sec: 41543.1). Total num frames: 606617600. Throughput: 0: 41695.5. Samples: 606743680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 02:10:36,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 02:10:37,030][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037026_606633984.pth... +[2024-06-18 02:10:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000036418_596672512.pth +[2024-06-18 02:10:39,159][12883] Updated weights for policy 0, policy_version 37031 (0.0044) +[2024-06-18 02:10:41,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40961.6, 300 sec: 41598.7). Total num frames: 606830592. Throughput: 0: 41622.4. Samples: 606990980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 02:10:41,994][12645] Avg episode reward: [(0, '0.055')] +[2024-06-18 02:10:42,744][12883] Updated weights for policy 0, policy_version 37041 (0.0031) +[2024-06-18 02:10:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41543.6). Total num frames: 607027200. Throughput: 0: 41608.4. Samples: 607114420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 02:10:46,994][12645] Avg episode reward: [(0, '0.051')] +[2024-06-18 02:10:47,136][12883] Updated weights for policy 0, policy_version 37051 (0.0041) +[2024-06-18 02:10:50,741][12883] Updated weights for policy 0, policy_version 37061 (0.0033) +[2024-06-18 02:10:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 607240192. Throughput: 0: 41808.9. Samples: 607366640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 02:10:51,994][12645] Avg episode reward: [(0, '0.072')] +[2024-06-18 02:10:54,912][12883] Updated weights for policy 0, policy_version 37071 (0.0051) +[2024-06-18 02:10:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 607469568. Throughput: 0: 41737.4. Samples: 607614300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 02:10:56,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 02:10:58,570][12883] Updated weights for policy 0, policy_version 37081 (0.0054) +[2024-06-18 02:11:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 607666176. Throughput: 0: 41734.7. Samples: 607741880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 02:11:01,994][12645] Avg episode reward: [(0, '0.042')] +[2024-06-18 02:11:02,740][12883] Updated weights for policy 0, policy_version 37091 (0.0058) +[2024-06-18 02:11:06,573][12883] Updated weights for policy 0, policy_version 37101 (0.0036) +[2024-06-18 02:11:06,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42050.6, 300 sec: 41653.9). Total num frames: 607895552. Throughput: 0: 41714.7. Samples: 607991320. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 02:11:06,997][12645] Avg episode reward: [(0, '0.040')] +[2024-06-18 02:11:10,786][12883] Updated weights for policy 0, policy_version 37111 (0.0036) +[2024-06-18 02:11:11,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 608092160. Throughput: 0: 41684.7. Samples: 608240940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 02:11:11,994][12645] Avg episode reward: [(0, '0.043')] +[2024-06-18 02:11:14,121][12883] Updated weights for policy 0, policy_version 37121 (0.0035) +[2024-06-18 02:11:16,994][12645] Fps is (10 sec: 37692.0, 60 sec: 41233.1, 300 sec: 41487.9). Total num frames: 608272384. Throughput: 0: 41566.4. Samples: 608366100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 02:11:16,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 02:11:18,963][12883] Updated weights for policy 0, policy_version 37131 (0.0037) +[2024-06-18 02:11:21,823][12883] Updated weights for policy 0, policy_version 37141 (0.0037) +[2024-06-18 02:11:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 608518144. Throughput: 0: 41404.9. Samples: 608606900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 02:11:21,994][12645] Avg episode reward: [(0, '0.041')] +[2024-06-18 02:11:26,233][12862] Signal inference workers to stop experience collection... (8600 times) +[2024-06-18 02:11:26,233][12862] Signal inference workers to resume experience collection... (8600 times) +[2024-06-18 02:11:26,258][12883] InferenceWorker_p0-w0: stopping experience collection (8600 times) +[2024-06-18 02:11:26,258][12883] InferenceWorker_p0-w0: resuming experience collection (8600 times) +[2024-06-18 02:11:26,961][12883] Updated weights for policy 0, policy_version 37151 (0.0045) +[2024-06-18 02:11:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41487.9). Total num frames: 608681984. 
Throughput: 0: 41789.8. Samples: 608871520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 02:11:26,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:11:29,993][12883] Updated weights for policy 0, policy_version 37161 (0.0041) +[2024-06-18 02:11:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 608911360. Throughput: 0: 41675.2. Samples: 608989800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 02:11:31,994][12645] Avg episode reward: [(0, '0.027')] +[2024-06-18 02:11:34,699][12883] Updated weights for policy 0, policy_version 37171 (0.0046) +[2024-06-18 02:11:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 609124352. Throughput: 0: 41581.6. Samples: 609237820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 02:11:36,995][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 02:11:37,842][12883] Updated weights for policy 0, policy_version 37181 (0.0031) +[2024-06-18 02:11:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 609304576. Throughput: 0: 41777.4. Samples: 609494280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 02:11:41,994][12645] Avg episode reward: [(0, '0.056')] +[2024-06-18 02:11:42,384][12883] Updated weights for policy 0, policy_version 37191 (0.0036) +[2024-06-18 02:11:45,601][12883] Updated weights for policy 0, policy_version 37201 (0.0026) +[2024-06-18 02:11:46,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41779.3, 300 sec: 41543.5). Total num frames: 609533952. Throughput: 0: 41480.5. Samples: 609608500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 02:11:46,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:11:50,225][12883] Updated weights for policy 0, policy_version 37211 (0.0042) +[2024-06-18 02:11:51,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 609763328. 
Throughput: 0: 41662.6. Samples: 609866040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 02:11:51,994][12645] Avg episode reward: [(0, '0.068')] +[2024-06-18 02:11:53,981][12883] Updated weights for policy 0, policy_version 37221 (0.0034) +[2024-06-18 02:11:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 609927168. Throughput: 0: 41725.4. Samples: 610118580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:11:56,994][12645] Avg episode reward: [(0, '0.110')] +[2024-06-18 02:11:57,005][12862] Saving new best policy, reward=0.110! +[2024-06-18 02:11:58,210][12883] Updated weights for policy 0, policy_version 37231 (0.0035) +[2024-06-18 02:12:01,668][12883] Updated weights for policy 0, policy_version 37241 (0.0029) +[2024-06-18 02:12:01,996][12645] Fps is (10 sec: 39312.8, 60 sec: 41504.6, 300 sec: 41542.8). Total num frames: 610156544. Throughput: 0: 41559.7. Samples: 610236380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:12:01,996][12645] Avg episode reward: [(0, '0.119')] +[2024-06-18 02:12:02,048][12862] Saving new best policy, reward=0.119! +[2024-06-18 02:12:05,742][12883] Updated weights for policy 0, policy_version 37251 (0.0045) +[2024-06-18 02:12:06,994][12645] Fps is (10 sec: 47513.6, 60 sec: 41780.8, 300 sec: 41654.5). Total num frames: 610402304. Throughput: 0: 42003.6. Samples: 610497060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:12:06,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 02:12:09,357][12883] Updated weights for policy 0, policy_version 37261 (0.0033) +[2024-06-18 02:12:11,994][12645] Fps is (10 sec: 40969.0, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 610566144. Throughput: 0: 41534.2. Samples: 610740560. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:12:11,994][12645] Avg episode reward: [(0, '0.053')] +[2024-06-18 02:12:13,567][12883] Updated weights for policy 0, policy_version 37271 (0.0034) +[2024-06-18 02:12:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 41543.5). Total num frames: 610795520. Throughput: 0: 41560.0. Samples: 610860000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:12:16,994][12645] Avg episode reward: [(0, '0.039')] +[2024-06-18 02:12:17,061][12883] Updated weights for policy 0, policy_version 37281 (0.0039) +[2024-06-18 02:12:21,503][12883] Updated weights for policy 0, policy_version 37291 (0.0042) +[2024-06-18 02:12:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 610992128. Throughput: 0: 41736.6. Samples: 611115960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 02:12:21,994][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 02:12:25,307][12883] Updated weights for policy 0, policy_version 37301 (0.0040) +[2024-06-18 02:12:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 611188736. Throughput: 0: 41536.8. Samples: 611363440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 02:12:26,994][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 02:12:29,466][12883] Updated weights for policy 0, policy_version 37311 (0.0040) +[2024-06-18 02:12:31,995][12645] Fps is (10 sec: 42592.4, 60 sec: 41778.2, 300 sec: 41543.0). Total num frames: 611418112. Throughput: 0: 41689.8. Samples: 611484600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 02:12:31,995][12645] Avg episode reward: [(0, '0.038')] +[2024-06-18 02:12:32,968][12883] Updated weights for policy 0, policy_version 37321 (0.0029) +[2024-06-18 02:12:36,355][12862] Signal inference workers to stop experience collection... 
(8650 times) +[2024-06-18 02:12:36,356][12862] Signal inference workers to resume experience collection... (8650 times) +[2024-06-18 02:12:36,373][12883] InferenceWorker_p0-w0: stopping experience collection (8650 times) +[2024-06-18 02:12:36,373][12883] InferenceWorker_p0-w0: resuming experience collection (8650 times) +[2024-06-18 02:12:36,964][12883] Updated weights for policy 0, policy_version 37331 (0.0045) +[2024-06-18 02:12:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 41487.6). Total num frames: 611631104. Throughput: 0: 41520.9. Samples: 611734480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 02:12:36,994][12645] Avg episode reward: [(0, '0.074')] +[2024-06-18 02:12:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037331_611631104.pth... +[2024-06-18 02:12:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000036723_601669632.pth +[2024-06-18 02:12:41,506][12883] Updated weights for policy 0, policy_version 37341 (0.0037) +[2024-06-18 02:12:41,994][12645] Fps is (10 sec: 39327.3, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 611811328. Throughput: 0: 41436.5. Samples: 611983220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:12:41,994][12645] Avg episode reward: [(0, '0.103')] +[2024-06-18 02:12:44,618][12883] Updated weights for policy 0, policy_version 37351 (0.0026) +[2024-06-18 02:12:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 612040704. Throughput: 0: 41485.6. Samples: 612103140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:12:46,994][12645] Avg episode reward: [(0, '0.027')] +[2024-06-18 02:12:49,072][12883] Updated weights for policy 0, policy_version 37361 (0.0041) +[2024-06-18 02:12:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 612237312. Throughput: 0: 41313.5. Samples: 612356160. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:12:51,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 02:12:52,390][12883] Updated weights for policy 0, policy_version 37371 (0.0030) +[2024-06-18 02:12:56,989][12883] Updated weights for policy 0, policy_version 37381 (0.0030) +[2024-06-18 02:12:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 41488.4). Total num frames: 612450304. Throughput: 0: 41294.7. Samples: 612598820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:12:56,994][12645] Avg episode reward: [(0, '0.028')] +[2024-06-18 02:13:00,383][12883] Updated weights for policy 0, policy_version 37391 (0.0030) +[2024-06-18 02:13:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41507.7, 300 sec: 41432.1). Total num frames: 612646912. Throughput: 0: 41517.8. Samples: 612728300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:13:01,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:13:04,676][12883] Updated weights for policy 0, policy_version 37401 (0.0039) +[2024-06-18 02:13:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40687.0, 300 sec: 41487.6). Total num frames: 612843520. Throughput: 0: 41301.8. Samples: 612974540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) +[2024-06-18 02:13:06,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 02:13:08,361][12883] Updated weights for policy 0, policy_version 37411 (0.0044) +[2024-06-18 02:13:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.0, 300 sec: 41487.6). Total num frames: 613056512. Throughput: 0: 41239.0. Samples: 613219200. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) +[2024-06-18 02:13:11,994][12645] Avg episode reward: [(0, '0.059')] +[2024-06-18 02:13:12,995][12883] Updated weights for policy 0, policy_version 37421 (0.0037) +[2024-06-18 02:13:16,639][12883] Updated weights for policy 0, policy_version 37431 (0.0026) +[2024-06-18 02:13:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.1, 300 sec: 41432.4). Total num frames: 613269504. Throughput: 0: 41384.8. Samples: 613346860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) +[2024-06-18 02:13:16,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 02:13:20,914][12883] Updated weights for policy 0, policy_version 37441 (0.0045) +[2024-06-18 02:13:21,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 613482496. Throughput: 0: 41370.3. Samples: 613596140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) +[2024-06-18 02:13:21,994][12645] Avg episode reward: [(0, '0.072')] +[2024-06-18 02:13:24,687][12883] Updated weights for policy 0, policy_version 37451 (0.0038) +[2024-06-18 02:13:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 613695488. Throughput: 0: 41322.6. Samples: 613842740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) +[2024-06-18 02:13:26,994][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 02:13:28,624][12883] Updated weights for policy 0, policy_version 37461 (0.0044) +[2024-06-18 02:13:31,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40960.9, 300 sec: 41376.5). Total num frames: 613875712. Throughput: 0: 41426.1. Samples: 613967320. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 02:13:31,994][12645] Avg episode reward: [(0, '0.051')] +[2024-06-18 02:13:32,766][12883] Updated weights for policy 0, policy_version 37471 (0.0029) +[2024-06-18 02:13:36,455][12883] Updated weights for policy 0, policy_version 37481 (0.0036) +[2024-06-18 02:13:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 614105088. Throughput: 0: 41232.3. Samples: 614211620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 02:13:36,994][12645] Avg episode reward: [(0, '0.074')] +[2024-06-18 02:13:40,736][12883] Updated weights for policy 0, policy_version 37491 (0.0034) +[2024-06-18 02:13:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 614318080. Throughput: 0: 41510.5. Samples: 614466800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 02:13:41,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 02:13:44,143][12883] Updated weights for policy 0, policy_version 37501 (0.0028) +[2024-06-18 02:13:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 614514688. Throughput: 0: 41431.5. Samples: 614592720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 02:13:46,994][12645] Avg episode reward: [(0, '0.043')] +[2024-06-18 02:13:48,337][12883] Updated weights for policy 0, policy_version 37511 (0.0039) +[2024-06-18 02:13:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.0, 300 sec: 41543.1). Total num frames: 614727680. Throughput: 0: 41507.4. Samples: 614842380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 02:13:51,994][12645] Avg episode reward: [(0, '0.055')] +[2024-06-18 02:13:52,337][12862] Signal inference workers to stop experience collection... (8700 times) +[2024-06-18 02:13:52,388][12862] Signal inference workers to resume experience collection... 
(8700 times) +[2024-06-18 02:13:52,390][12883] InferenceWorker_p0-w0: stopping experience collection (8700 times) +[2024-06-18 02:13:52,396][12883] Updated weights for policy 0, policy_version 37521 (0.0045) +[2024-06-18 02:13:52,407][12883] InferenceWorker_p0-w0: resuming experience collection (8700 times) +[2024-06-18 02:13:56,812][12883] Updated weights for policy 0, policy_version 37531 (0.0033) +[2024-06-18 02:13:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40959.9, 300 sec: 41321.4). Total num frames: 614907904. Throughput: 0: 41740.1. Samples: 615097500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:13:56,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 02:14:00,062][12883] Updated weights for policy 0, policy_version 37541 (0.0046) +[2024-06-18 02:14:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 615153664. Throughput: 0: 41540.5. Samples: 615216180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:14:01,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 02:14:04,874][12883] Updated weights for policy 0, policy_version 37551 (0.0048) +[2024-06-18 02:14:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41779.1, 300 sec: 41543.1). Total num frames: 615350272. Throughput: 0: 41525.2. Samples: 615464780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:14:06,994][12645] Avg episode reward: [(0, '0.036')] +[2024-06-18 02:14:07,705][12883] Updated weights for policy 0, policy_version 37561 (0.0031) +[2024-06-18 02:14:11,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41233.2, 300 sec: 41432.1). Total num frames: 615530496. Throughput: 0: 41640.0. Samples: 615716540. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:14:11,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 02:14:12,802][12883] Updated weights for policy 0, policy_version 37571 (0.0032) +[2024-06-18 02:14:15,547][12883] Updated weights for policy 0, policy_version 37581 (0.0039) +[2024-06-18 02:14:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41599.0). Total num frames: 615776256. Throughput: 0: 41647.7. Samples: 615841460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:14:16,994][12645] Avg episode reward: [(0, '0.066')] +[2024-06-18 02:14:20,551][12883] Updated weights for policy 0, policy_version 37591 (0.0033) +[2024-06-18 02:14:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41506.0, 300 sec: 41432.1). Total num frames: 615972864. Throughput: 0: 41952.8. Samples: 616099500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:14:21,994][12645] Avg episode reward: [(0, '0.062')] +[2024-06-18 02:14:23,422][12883] Updated weights for policy 0, policy_version 37601 (0.0048) +[2024-06-18 02:14:26,994][12645] Fps is (10 sec: 37682.3, 60 sec: 40959.9, 300 sec: 41487.6). Total num frames: 616153088. Throughput: 0: 41766.6. Samples: 616346300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:14:26,995][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 02:14:28,487][12883] Updated weights for policy 0, policy_version 37611 (0.0037) +[2024-06-18 02:14:31,060][12883] Updated weights for policy 0, policy_version 37621 (0.0035) +[2024-06-18 02:14:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 41654.5). Total num frames: 616415232. Throughput: 0: 41741.3. Samples: 616471080. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:14:31,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 02:14:36,109][12883] Updated weights for policy 0, policy_version 37631 (0.0049) +[2024-06-18 02:14:36,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41506.1, 300 sec: 41432.4). Total num frames: 616595456. Throughput: 0: 41972.9. Samples: 616731160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:14:36,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 02:14:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037635_616611840.pth... +[2024-06-18 02:14:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037026_606633984.pth +[2024-06-18 02:14:39,116][12883] Updated weights for policy 0, policy_version 37641 (0.0032) +[2024-06-18 02:14:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 616808448. Throughput: 0: 41705.0. Samples: 616974220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:14:41,994][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 02:14:43,838][12883] Updated weights for policy 0, policy_version 37651 (0.0036) +[2024-06-18 02:14:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 617021440. Throughput: 0: 41915.5. Samples: 617102380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 02:14:46,994][12645] Avg episode reward: [(0, '0.097')] +[2024-06-18 02:14:47,172][12883] Updated weights for policy 0, policy_version 37661 (0.0031) +[2024-06-18 02:14:51,679][12883] Updated weights for policy 0, policy_version 37671 (0.0041) +[2024-06-18 02:14:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 617201664. Throughput: 0: 42016.5. Samples: 617355520. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 02:14:51,994][12645] Avg episode reward: [(0, '0.067')] +[2024-06-18 02:14:54,805][12883] Updated weights for policy 0, policy_version 37681 (0.0044) +[2024-06-18 02:14:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 41709.8). Total num frames: 617463808. Throughput: 0: 41888.0. Samples: 617601500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 02:14:56,994][12645] Avg episode reward: [(0, '0.052')] +[2024-06-18 02:14:59,310][12883] Updated weights for policy 0, policy_version 37691 (0.0042) +[2024-06-18 02:15:01,994][12645] Fps is (10 sec: 45874.8, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 617660416. Throughput: 0: 42186.1. Samples: 617739840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 02:15:01,994][12645] Avg episode reward: [(0, '0.054')] +[2024-06-18 02:15:02,359][12883] Updated weights for policy 0, policy_version 37701 (0.0045) +[2024-06-18 02:15:06,994][12645] Fps is (10 sec: 37683.2, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 617840640. Throughput: 0: 41985.9. Samples: 617988860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 02:15:06,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 02:15:07,267][12883] Updated weights for policy 0, policy_version 37711 (0.0030) +[2024-06-18 02:15:07,938][12862] Signal inference workers to stop experience collection... (8750 times) +[2024-06-18 02:15:07,941][12862] Signal inference workers to resume experience collection... (8750 times) +[2024-06-18 02:15:07,956][12883] InferenceWorker_p0-w0: stopping experience collection (8750 times) +[2024-06-18 02:15:07,956][12883] InferenceWorker_p0-w0: resuming experience collection (8750 times) +[2024-06-18 02:15:10,623][12883] Updated weights for policy 0, policy_version 37721 (0.0042) +[2024-06-18 02:15:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 41709.8). Total num frames: 618102784. Throughput: 0: 41826.7. 
Samples: 618228500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 02:15:11,994][12645] Avg episode reward: [(0, '0.034')] +[2024-06-18 02:15:14,901][12883] Updated weights for policy 0, policy_version 37731 (0.0039) +[2024-06-18 02:15:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 618283008. Throughput: 0: 42175.6. Samples: 618368980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 02:15:16,994][12645] Avg episode reward: [(0, '0.053')] +[2024-06-18 02:15:18,249][12883] Updated weights for policy 0, policy_version 37741 (0.0044) +[2024-06-18 02:15:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 618479616. Throughput: 0: 41810.6. Samples: 618612640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 02:15:21,994][12645] Avg episode reward: [(0, '0.058')] +[2024-06-18 02:15:22,477][12883] Updated weights for policy 0, policy_version 37751 (0.0036) +[2024-06-18 02:15:25,877][12883] Updated weights for policy 0, policy_version 37761 (0.0039) +[2024-06-18 02:15:26,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43144.6, 300 sec: 41765.3). Total num frames: 618741760. Throughput: 0: 41878.0. Samples: 618858740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 02:15:26,994][12645] Avg episode reward: [(0, '0.048')] +[2024-06-18 02:15:30,298][12883] Updated weights for policy 0, policy_version 37771 (0.0034) +[2024-06-18 02:15:31,994][12645] Fps is (10 sec: 42597.6, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 618905600. Throughput: 0: 42050.0. Samples: 618994640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 02:15:31,995][12645] Avg episode reward: [(0, '0.069')] +[2024-06-18 02:15:33,722][12883] Updated weights for policy 0, policy_version 37781 (0.0032) +[2024-06-18 02:15:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 619118592. Throughput: 0: 41864.3. 
Samples: 619239420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 02:15:36,994][12645] Avg episode reward: [(0, '0.066')] +[2024-06-18 02:15:38,073][12883] Updated weights for policy 0, policy_version 37791 (0.0043) +[2024-06-18 02:15:41,473][12883] Updated weights for policy 0, policy_version 37801 (0.0034) +[2024-06-18 02:15:41,994][12645] Fps is (10 sec: 44238.0, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 619347968. Throughput: 0: 41988.0. Samples: 619490960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 02:15:41,994][12645] Avg episode reward: [(0, '0.071')] +[2024-06-18 02:15:45,624][12883] Updated weights for policy 0, policy_version 37811 (0.0041) +[2024-06-18 02:15:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 619528192. Throughput: 0: 41793.3. Samples: 619620540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 02:15:46,995][12645] Avg episode reward: [(0, '0.073')] +[2024-06-18 02:15:49,249][12883] Updated weights for policy 0, policy_version 37821 (0.0039) +[2024-06-18 02:15:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 41709.8). Total num frames: 619773952. Throughput: 0: 41815.5. Samples: 619870560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 02:15:51,994][12645] Avg episode reward: [(0, '0.055')] +[2024-06-18 02:15:53,211][12883] Updated weights for policy 0, policy_version 37831 (0.0034) +[2024-06-18 02:15:57,000][12645] Fps is (10 sec: 44209.7, 60 sec: 41774.9, 300 sec: 41708.9). Total num frames: 619970560. Throughput: 0: 42310.7. Samples: 620132740. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 02:15:57,000][12645] Avg episode reward: [(0, '0.074')] +[2024-06-18 02:15:57,120][12883] Updated weights for policy 0, policy_version 37841 (0.0032) +[2024-06-18 02:16:01,302][12883] Updated weights for policy 0, policy_version 37851 (0.0038) +[2024-06-18 02:16:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 41599.0). Total num frames: 620167168. Throughput: 0: 41746.7. Samples: 620247580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 02:16:01,994][12645] Avg episode reward: [(0, '0.103')] +[2024-06-18 02:16:05,076][12883] Updated weights for policy 0, policy_version 37861 (0.0040) +[2024-06-18 02:16:06,994][12645] Fps is (10 sec: 42625.3, 60 sec: 42598.5, 300 sec: 41709.8). Total num frames: 620396544. Throughput: 0: 41847.2. Samples: 620495760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 02:16:06,994][12645] Avg episode reward: [(0, '0.049')] +[2024-06-18 02:16:09,181][12883] Updated weights for policy 0, policy_version 37871 (0.0039) +[2024-06-18 02:16:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 620576768. Throughput: 0: 42189.4. Samples: 620757260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 02:16:11,994][12645] Avg episode reward: [(0, '0.037')] +[2024-06-18 02:16:13,100][12883] Updated weights for policy 0, policy_version 37881 (0.0032) +[2024-06-18 02:16:16,994][12645] Fps is (10 sec: 39320.9, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 620789760. Throughput: 0: 41879.7. Samples: 620879220. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 02:16:16,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 02:16:17,031][12883] Updated weights for policy 0, policy_version 37891 (0.0042) +[2024-06-18 02:16:20,677][12883] Updated weights for policy 0, policy_version 37901 (0.0028) +[2024-06-18 02:16:21,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 621002752. Throughput: 0: 42068.1. Samples: 621132480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) +[2024-06-18 02:16:21,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 02:16:23,709][12862] Signal inference workers to stop experience collection... (8800 times) +[2024-06-18 02:16:23,740][12883] InferenceWorker_p0-w0: stopping experience collection (8800 times) +[2024-06-18 02:16:23,762][12862] Signal inference workers to resume experience collection... (8800 times) +[2024-06-18 02:16:23,768][12883] InferenceWorker_p0-w0: resuming experience collection (8800 times) +[2024-06-18 02:16:24,829][12883] Updated weights for policy 0, policy_version 37911 (0.0032) +[2024-06-18 02:16:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 621215744. Throughput: 0: 42037.3. Samples: 621382640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) +[2024-06-18 02:16:26,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 02:16:28,845][12883] Updated weights for policy 0, policy_version 37921 (0.0037) +[2024-06-18 02:16:31,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.5, 300 sec: 41765.3). Total num frames: 621445120. Throughput: 0: 41944.1. Samples: 621508020. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) +[2024-06-18 02:16:31,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 02:16:32,357][12883] Updated weights for policy 0, policy_version 37931 (0.0038) +[2024-06-18 02:16:36,575][12883] Updated weights for policy 0, policy_version 37941 (0.0038) +[2024-06-18 02:16:36,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 621625344. Throughput: 0: 41958.9. Samples: 621758720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) +[2024-06-18 02:16:36,995][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 02:16:37,021][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037941_621625344.pth... +[2024-06-18 02:16:37,098][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037331_611631104.pth +[2024-06-18 02:16:40,330][12883] Updated weights for policy 0, policy_version 37951 (0.0028) +[2024-06-18 02:16:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 621854720. Throughput: 0: 41596.4. Samples: 622004320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) +[2024-06-18 02:16:41,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:16:44,283][12883] Updated weights for policy 0, policy_version 37961 (0.0031) +[2024-06-18 02:16:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 622067712. Throughput: 0: 41936.8. Samples: 622134740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 02:16:46,994][12645] Avg episode reward: [(0, '0.044')] +[2024-06-18 02:16:48,084][12883] Updated weights for policy 0, policy_version 37971 (0.0040) +[2024-06-18 02:16:51,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 622264320. Throughput: 0: 42091.1. Samples: 622389860. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 02:16:51,994][12645] Avg episode reward: [(0, '0.049')] +[2024-06-18 02:16:52,672][12883] Updated weights for policy 0, policy_version 37981 (0.0037) +[2024-06-18 02:16:55,954][12883] Updated weights for policy 0, policy_version 37991 (0.0028) +[2024-06-18 02:16:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41783.5, 300 sec: 41765.6). Total num frames: 622477312. Throughput: 0: 41723.5. Samples: 622634820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 02:16:56,994][12645] Avg episode reward: [(0, '0.111')] +[2024-06-18 02:17:00,343][12883] Updated weights for policy 0, policy_version 38001 (0.0034) +[2024-06-18 02:17:01,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42050.7, 300 sec: 41653.9). Total num frames: 622690304. Throughput: 0: 41778.9. Samples: 622759360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 02:17:01,996][12645] Avg episode reward: [(0, '0.069')] +[2024-06-18 02:17:03,519][12883] Updated weights for policy 0, policy_version 38011 (0.0042) +[2024-06-18 02:17:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 622919680. Throughput: 0: 41827.0. Samples: 623014700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 02:17:06,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:17:08,046][12883] Updated weights for policy 0, policy_version 38021 (0.0034) +[2024-06-18 02:17:11,747][12883] Updated weights for policy 0, policy_version 38031 (0.0035) +[2024-06-18 02:17:11,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 623099904. Throughput: 0: 41841.4. Samples: 623265500. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) +[2024-06-18 02:17:11,997][12645] Avg episode reward: [(0, '0.036')] +[2024-06-18 02:17:15,975][12883] Updated weights for policy 0, policy_version 38041 (0.0036) +[2024-06-18 02:17:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 623312896. Throughput: 0: 41652.0. Samples: 623382360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) +[2024-06-18 02:17:16,994][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 02:17:19,714][12883] Updated weights for policy 0, policy_version 38051 (0.0044) +[2024-06-18 02:17:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 623509504. Throughput: 0: 41737.1. Samples: 623636880. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) +[2024-06-18 02:17:21,994][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 02:17:23,643][12883] Updated weights for policy 0, policy_version 38061 (0.0051) +[2024-06-18 02:17:26,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42050.7, 300 sec: 41765.2). Total num frames: 623738880. Throughput: 0: 41841.1. Samples: 623887260. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) +[2024-06-18 02:17:26,996][12645] Avg episode reward: [(0, '0.051')] +[2024-06-18 02:17:27,327][12883] Updated weights for policy 0, policy_version 38071 (0.0033) +[2024-06-18 02:17:31,459][12883] Updated weights for policy 0, policy_version 38081 (0.0027) +[2024-06-18 02:17:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 623919104. Throughput: 0: 41908.1. Samples: 624020600. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) +[2024-06-18 02:17:31,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 02:17:35,455][12883] Updated weights for policy 0, policy_version 38091 (0.0045) +[2024-06-18 02:17:36,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42052.4, 300 sec: 41820.8). Total num frames: 624148480. Throughput: 0: 41710.6. Samples: 624266840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:17:36,994][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 02:17:39,242][12883] Updated weights for policy 0, policy_version 38101 (0.0035) +[2024-06-18 02:17:41,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 624377856. Throughput: 0: 41823.5. Samples: 624516880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:17:41,994][12645] Avg episode reward: [(0, '0.066')] +[2024-06-18 02:17:43,255][12883] Updated weights for policy 0, policy_version 38111 (0.0032) +[2024-06-18 02:17:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 624558080. Throughput: 0: 41843.4. Samples: 624642220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:17:46,994][12645] Avg episode reward: [(0, '0.040')] +[2024-06-18 02:17:47,100][12883] Updated weights for policy 0, policy_version 38121 (0.0031) +[2024-06-18 02:17:49,828][12862] Signal inference workers to stop experience collection... (8850 times) +[2024-06-18 02:17:49,853][12883] InferenceWorker_p0-w0: stopping experience collection (8850 times) +[2024-06-18 02:17:49,941][12862] Signal inference workers to resume experience collection... (8850 times) +[2024-06-18 02:17:49,941][12883] InferenceWorker_p0-w0: resuming experience collection (8850 times) +[2024-06-18 02:17:50,919][12883] Updated weights for policy 0, policy_version 38131 (0.0039) +[2024-06-18 02:17:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 624771072. Throughput: 0: 41803.1. Samples: 624895840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:17:51,994][12645] Avg episode reward: [(0, '0.045')] +[2024-06-18 02:17:54,891][12883] Updated weights for policy 0, policy_version 38141 (0.0037) +[2024-06-18 02:17:56,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 625016832. 
Throughput: 0: 41828.4. Samples: 625147780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:17:56,994][12645] Avg episode reward: [(0, '0.048')] +[2024-06-18 02:17:58,463][12883] Updated weights for policy 0, policy_version 38151 (0.0039) +[2024-06-18 02:18:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41780.7, 300 sec: 41876.4). Total num frames: 625197056. Throughput: 0: 42200.8. Samples: 625281400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 02:18:01,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:18:02,550][12883] Updated weights for policy 0, policy_version 38161 (0.0032) +[2024-06-18 02:18:05,977][12883] Updated weights for policy 0, policy_version 38171 (0.0047) +[2024-06-18 02:18:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 625410048. Throughput: 0: 42043.0. Samples: 625528820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 02:18:06,994][12645] Avg episode reward: [(0, '0.039')] +[2024-06-18 02:18:10,411][12883] Updated weights for policy 0, policy_version 38181 (0.0043) +[2024-06-18 02:18:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 625639424. Throughput: 0: 42036.3. Samples: 625778800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 02:18:11,994][12645] Avg episode reward: [(0, '0.066')] +[2024-06-18 02:18:13,665][12883] Updated weights for policy 0, policy_version 38191 (0.0035) +[2024-06-18 02:18:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 625819648. Throughput: 0: 42029.8. Samples: 625911940. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 02:18:16,994][12645] Avg episode reward: [(0, '0.049')] +[2024-06-18 02:18:17,985][12883] Updated weights for policy 0, policy_version 38201 (0.0031) +[2024-06-18 02:18:21,507][12883] Updated weights for policy 0, policy_version 38211 (0.0032) +[2024-06-18 02:18:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 626049024. Throughput: 0: 42092.0. Samples: 626160980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 02:18:21,994][12645] Avg episode reward: [(0, '0.072')] +[2024-06-18 02:18:25,930][12883] Updated weights for policy 0, policy_version 38221 (0.0043) +[2024-06-18 02:18:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42053.8, 300 sec: 41987.5). Total num frames: 626262016. Throughput: 0: 42238.3. Samples: 626417600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-18 02:18:26,994][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 02:18:29,642][12883] Updated weights for policy 0, policy_version 38231 (0.0026) +[2024-06-18 02:18:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 626442240. Throughput: 0: 42317.8. Samples: 626546520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-18 02:18:31,994][12645] Avg episode reward: [(0, '0.067')] +[2024-06-18 02:18:33,498][12883] Updated weights for policy 0, policy_version 38241 (0.0031) +[2024-06-18 02:18:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 626688000. Throughput: 0: 42168.0. Samples: 626793400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-18 02:18:36,994][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 02:18:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000038250_626688000.pth... 
+[2024-06-18 02:18:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037635_616611840.pth +[2024-06-18 02:18:37,453][12883] Updated weights for policy 0, policy_version 38251 (0.0032) +[2024-06-18 02:18:41,206][12883] Updated weights for policy 0, policy_version 38261 (0.0037) +[2024-06-18 02:18:41,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 626900992. Throughput: 0: 42210.7. Samples: 627047260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-18 02:18:41,994][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 02:18:45,191][12883] Updated weights for policy 0, policy_version 38271 (0.0033) +[2024-06-18 02:18:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 627097600. Throughput: 0: 41942.7. Samples: 627168820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-18 02:18:46,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 02:18:49,419][12883] Updated weights for policy 0, policy_version 38281 (0.0046) +[2024-06-18 02:18:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 627343360. Throughput: 0: 42180.5. Samples: 627426940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:18:51,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 02:18:53,232][12883] Updated weights for policy 0, policy_version 38291 (0.0034) +[2024-06-18 02:18:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 627507200. Throughput: 0: 42288.8. Samples: 627681800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:18:56,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 02:18:57,045][12883] Updated weights for policy 0, policy_version 38301 (0.0037) +[2024-06-18 02:19:00,897][12883] Updated weights for policy 0, policy_version 38311 (0.0038) +[2024-06-18 02:19:01,994][12645] Fps is (10 sec: 39319.5, 60 sec: 42325.0, 300 sec: 41987.4). Total num frames: 627736576. Throughput: 0: 42080.3. Samples: 627805580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:19:01,995][12645] Avg episode reward: [(0, '0.082')] +[2024-06-18 02:19:04,630][12883] Updated weights for policy 0, policy_version 38321 (0.0041) +[2024-06-18 02:19:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 627949568. Throughput: 0: 42228.4. Samples: 628061260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:19:06,994][12645] Avg episode reward: [(0, '0.076')] +[2024-06-18 02:19:07,464][12862] Signal inference workers to stop experience collection... (8900 times) +[2024-06-18 02:19:07,464][12862] Signal inference workers to resume experience collection... (8900 times) +[2024-06-18 02:19:07,480][12883] InferenceWorker_p0-w0: stopping experience collection (8900 times) +[2024-06-18 02:19:07,480][12883] InferenceWorker_p0-w0: resuming experience collection (8900 times) +[2024-06-18 02:19:08,436][12883] Updated weights for policy 0, policy_version 38331 (0.0036) +[2024-06-18 02:19:12,000][12645] Fps is (10 sec: 40936.9, 60 sec: 41774.9, 300 sec: 41931.0). Total num frames: 628146176. Throughput: 0: 42223.6. Samples: 628317920. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:19:12,000][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 02:19:12,449][12883] Updated weights for policy 0, policy_version 38341 (0.0038) +[2024-06-18 02:19:16,098][12883] Updated weights for policy 0, policy_version 38351 (0.0027) +[2024-06-18 02:19:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 628375552. Throughput: 0: 42076.4. Samples: 628439960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:19:16,994][12645] Avg episode reward: [(0, '0.051')] +[2024-06-18 02:19:20,241][12883] Updated weights for policy 0, policy_version 38361 (0.0034) +[2024-06-18 02:19:21,994][12645] Fps is (10 sec: 42624.7, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 628572160. Throughput: 0: 42219.1. Samples: 628693260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:19:21,994][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 02:19:23,769][12883] Updated weights for policy 0, policy_version 38371 (0.0026) +[2024-06-18 02:19:26,996][12645] Fps is (10 sec: 39313.0, 60 sec: 41777.7, 300 sec: 41876.1). Total num frames: 628768768. Throughput: 0: 42182.4. Samples: 628945560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:19:26,996][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 02:19:28,170][12883] Updated weights for policy 0, policy_version 38381 (0.0044) +[2024-06-18 02:19:31,555][12883] Updated weights for policy 0, policy_version 38391 (0.0031) +[2024-06-18 02:19:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 628998144. Throughput: 0: 42312.0. Samples: 629072860. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:19:31,994][12645] Avg episode reward: [(0, '0.047')] +[2024-06-18 02:19:35,695][12883] Updated weights for policy 0, policy_version 38401 (0.0033) +[2024-06-18 02:19:36,994][12645] Fps is (10 sec: 42607.5, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 629194752. Throughput: 0: 42156.4. Samples: 629323980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:19:36,994][12645] Avg episode reward: [(0, '0.036')] +[2024-06-18 02:19:39,597][12883] Updated weights for policy 0, policy_version 38411 (0.0042) +[2024-06-18 02:19:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 629407744. Throughput: 0: 41980.4. Samples: 629570920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:19:41,994][12645] Avg episode reward: [(0, '0.041')] +[2024-06-18 02:19:43,547][12883] Updated weights for policy 0, policy_version 38421 (0.0040) +[2024-06-18 02:19:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 629620736. Throughput: 0: 42025.4. Samples: 629696700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:19:47,000][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 02:19:47,642][12883] Updated weights for policy 0, policy_version 38431 (0.0047) +[2024-06-18 02:19:51,687][12883] Updated weights for policy 0, policy_version 38441 (0.0029) +[2024-06-18 02:19:51,996][12645] Fps is (10 sec: 40950.9, 60 sec: 41231.5, 300 sec: 41876.1). Total num frames: 629817344. Throughput: 0: 41835.3. Samples: 629943940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:19:51,997][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 02:19:55,492][12883] Updated weights for policy 0, policy_version 38451 (0.0040) +[2024-06-18 02:19:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 630046720. Throughput: 0: 41644.0. Samples: 630191640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:19:56,994][12645] Avg episode reward: [(0, '0.047')] +[2024-06-18 02:19:59,442][12883] Updated weights for policy 0, policy_version 38461 (0.0030) +[2024-06-18 02:20:01,994][12645] Fps is (10 sec: 40969.0, 60 sec: 41506.5, 300 sec: 41987.5). Total num frames: 630226944. Throughput: 0: 41737.3. Samples: 630318140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 02:20:01,995][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 02:20:03,548][12883] Updated weights for policy 0, policy_version 38471 (0.0039) +[2024-06-18 02:20:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 630439936. Throughput: 0: 41625.8. Samples: 630566420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 02:20:06,994][12645] Avg episode reward: [(0, '0.044')] +[2024-06-18 02:20:07,536][12883] Updated weights for policy 0, policy_version 38481 (0.0042) +[2024-06-18 02:20:11,801][12883] Updated weights for policy 0, policy_version 38491 (0.0032) +[2024-06-18 02:20:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41783.5, 300 sec: 41931.9). Total num frames: 630652928. Throughput: 0: 41638.0. Samples: 630819180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 02:20:11,994][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 02:20:15,301][12883] Updated weights for policy 0, policy_version 38501 (0.0039) +[2024-06-18 02:20:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 630882304. Throughput: 0: 41475.1. Samples: 630939240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 02:20:16,994][12645] Avg episode reward: [(0, '0.105')] +[2024-06-18 02:20:19,481][12883] Updated weights for policy 0, policy_version 38511 (0.0033) +[2024-06-18 02:20:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 631062528. Throughput: 0: 41433.9. Samples: 631188500. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 02:20:21,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:20:23,098][12883] Updated weights for policy 0, policy_version 38521 (0.0028) +[2024-06-18 02:20:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41780.6, 300 sec: 41932.0). Total num frames: 631275520. Throughput: 0: 41554.1. Samples: 631440860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 02:20:26,994][12645] Avg episode reward: [(0, '0.045')] +[2024-06-18 02:20:27,249][12883] Updated weights for policy 0, policy_version 38531 (0.0038) +[2024-06-18 02:20:30,781][12883] Updated weights for policy 0, policy_version 38541 (0.0039) +[2024-06-18 02:20:31,994][12645] Fps is (10 sec: 44235.9, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 631504896. Throughput: 0: 41694.6. Samples: 631572960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 02:20:31,994][12645] Avg episode reward: [(0, '0.130')] +[2024-06-18 02:20:31,995][12862] Saving new best policy, reward=0.130! +[2024-06-18 02:20:35,282][12883] Updated weights for policy 0, policy_version 38551 (0.0026) +[2024-06-18 02:20:36,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 631685120. Throughput: 0: 41768.8. Samples: 631823440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 02:20:36,994][12645] Avg episode reward: [(0, '0.130')] +[2024-06-18 02:20:37,083][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000038556_631701504.pth... +[2024-06-18 02:20:37,145][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037941_621625344.pth +[2024-06-18 02:20:37,339][12862] Signal inference workers to stop experience collection... (8950 times) +[2024-06-18 02:20:37,339][12862] Signal inference workers to resume experience collection... 
(8950 times) +[2024-06-18 02:20:37,387][12883] InferenceWorker_p0-w0: stopping experience collection (8950 times) +[2024-06-18 02:20:37,387][12883] InferenceWorker_p0-w0: resuming experience collection (8950 times) +[2024-06-18 02:20:38,547][12883] Updated weights for policy 0, policy_version 38561 (0.0030) +[2024-06-18 02:20:42,000][12645] Fps is (10 sec: 40934.8, 60 sec: 41774.9, 300 sec: 41986.6). Total num frames: 631914496. Throughput: 0: 41688.8. Samples: 632067900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 02:20:42,000][12645] Avg episode reward: [(0, '0.048')] +[2024-06-18 02:20:43,103][12883] Updated weights for policy 0, policy_version 38571 (0.0042) +[2024-06-18 02:20:46,324][12883] Updated weights for policy 0, policy_version 38581 (0.0031) +[2024-06-18 02:20:46,994][12645] Fps is (10 sec: 44236.1, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 632127488. Throughput: 0: 41856.0. Samples: 632201660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 02:20:46,994][12645] Avg episode reward: [(0, '0.031')] +[2024-06-18 02:20:50,893][12883] Updated weights for policy 0, policy_version 38591 (0.0039) +[2024-06-18 02:20:51,994][12645] Fps is (10 sec: 40985.9, 60 sec: 41780.8, 300 sec: 41877.3). Total num frames: 632324096. Throughput: 0: 42024.0. Samples: 632457500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:20:51,994][12645] Avg episode reward: [(0, '0.039')] +[2024-06-18 02:20:53,957][12883] Updated weights for policy 0, policy_version 38601 (0.0028) +[2024-06-18 02:20:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 632553472. Throughput: 0: 41804.8. Samples: 632700400. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:20:56,994][12645] Avg episode reward: [(0, '0.061')] +[2024-06-18 02:20:58,781][12883] Updated weights for policy 0, policy_version 38611 (0.0041) +[2024-06-18 02:21:01,942][12883] Updated weights for policy 0, policy_version 38621 (0.0032) +[2024-06-18 02:21:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 632766464. Throughput: 0: 42077.4. Samples: 632832720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:21:01,994][12645] Avg episode reward: [(0, '0.037')] +[2024-06-18 02:21:06,451][12883] Updated weights for policy 0, policy_version 38631 (0.0030) +[2024-06-18 02:21:06,994][12645] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 41932.0). Total num frames: 632946688. Throughput: 0: 42093.8. Samples: 633082720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:21:06,994][12645] Avg episode reward: [(0, '0.111')] +[2024-06-18 02:21:09,968][12883] Updated weights for policy 0, policy_version 38641 (0.0036) +[2024-06-18 02:21:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 633192448. Throughput: 0: 42040.6. Samples: 633332680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:21:11,994][12645] Avg episode reward: [(0, '0.027')] +[2024-06-18 02:21:14,154][12883] Updated weights for policy 0, policy_version 38651 (0.0026) +[2024-06-18 02:21:16,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 633405440. Throughput: 0: 42020.0. Samples: 633463860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:21:16,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 02:21:17,713][12883] Updated weights for policy 0, policy_version 38661 (0.0037) +[2024-06-18 02:21:21,994][12645] Fps is (10 sec: 37682.7, 60 sec: 41779.0, 300 sec: 41876.4). Total num frames: 633569280. Throughput: 0: 41953.6. Samples: 633711360. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:21:21,994][12645] Avg episode reward: [(0, '0.014')] +[2024-06-18 02:21:22,281][12883] Updated weights for policy 0, policy_version 38671 (0.0037) +[2024-06-18 02:21:25,663][12883] Updated weights for policy 0, policy_version 38681 (0.0031) +[2024-06-18 02:21:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 633798656. Throughput: 0: 42047.1. Samples: 633959760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:21:26,994][12645] Avg episode reward: [(0, '0.043')] +[2024-06-18 02:21:30,250][12883] Updated weights for policy 0, policy_version 38691 (0.0036) +[2024-06-18 02:21:31,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42052.4, 300 sec: 42043.1). Total num frames: 634028032. Throughput: 0: 41936.2. Samples: 634088780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:21:31,994][12645] Avg episode reward: [(0, '0.034')] +[2024-06-18 02:21:33,323][12883] Updated weights for policy 0, policy_version 38701 (0.0042) +[2024-06-18 02:21:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 634208256. Throughput: 0: 41707.9. Samples: 634334360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:21:36,994][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 02:21:38,053][12883] Updated weights for policy 0, policy_version 38711 (0.0040) +[2024-06-18 02:21:40,933][12883] Updated weights for policy 0, policy_version 38721 (0.0029) +[2024-06-18 02:21:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41783.6, 300 sec: 41876.4). Total num frames: 634421248. Throughput: 0: 41980.5. Samples: 634589520. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 02:21:41,994][12645] Avg episode reward: [(0, '0.033')] +[2024-06-18 02:21:45,761][12883] Updated weights for policy 0, policy_version 38731 (0.0045) +[2024-06-18 02:21:46,705][12862] Signal inference workers to stop experience collection... (9000 times) +[2024-06-18 02:21:46,705][12862] Signal inference workers to resume experience collection... (9000 times) +[2024-06-18 02:21:46,751][12883] InferenceWorker_p0-w0: stopping experience collection (9000 times) +[2024-06-18 02:21:46,752][12883] InferenceWorker_p0-w0: resuming experience collection (9000 times) +[2024-06-18 02:21:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 634650624. Throughput: 0: 41905.4. Samples: 634718460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 02:21:46,994][12645] Avg episode reward: [(0, '0.050')] +[2024-06-18 02:21:48,899][12883] Updated weights for policy 0, policy_version 38741 (0.0046) +[2024-06-18 02:21:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 634830848. Throughput: 0: 41805.2. Samples: 634963960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 02:21:51,994][12645] Avg episode reward: [(0, '0.081')] +[2024-06-18 02:21:53,508][12883] Updated weights for policy 0, policy_version 38751 (0.0046) +[2024-06-18 02:21:56,586][12883] Updated weights for policy 0, policy_version 38761 (0.0030) +[2024-06-18 02:21:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 41987.8). Total num frames: 635076608. Throughput: 0: 41780.1. Samples: 635212780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 02:21:56,994][12645] Avg episode reward: [(0, '0.048')] +[2024-06-18 02:22:01,231][12883] Updated weights for policy 0, policy_version 38771 (0.0036) +[2024-06-18 02:22:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 635240448. 
Throughput: 0: 41741.8. Samples: 635342240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 02:22:01,994][12645] Avg episode reward: [(0, '0.063')] +[2024-06-18 02:22:04,242][12883] Updated weights for policy 0, policy_version 38781 (0.0034) +[2024-06-18 02:22:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 635469824. Throughput: 0: 41856.2. Samples: 635594880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:22:06,994][12645] Avg episode reward: [(0, '0.042')] +[2024-06-18 02:22:09,153][12883] Updated weights for policy 0, policy_version 38791 (0.0027) +[2024-06-18 02:22:11,843][12883] Updated weights for policy 0, policy_version 38801 (0.0038) +[2024-06-18 02:22:11,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 635715584. Throughput: 0: 41726.6. Samples: 635837460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:22:11,994][12645] Avg episode reward: [(0, '0.048')] +[2024-06-18 02:22:16,963][12883] Updated weights for policy 0, policy_version 38811 (0.0045) +[2024-06-18 02:22:16,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 635879424. Throughput: 0: 41750.1. Samples: 635967540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:22:16,994][12645] Avg episode reward: [(0, '0.036')] +[2024-06-18 02:22:20,163][12883] Updated weights for policy 0, policy_version 38821 (0.0030) +[2024-06-18 02:22:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 41987.8). Total num frames: 636125184. Throughput: 0: 41961.0. Samples: 636222600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:22:21,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 02:22:24,584][12883] Updated weights for policy 0, policy_version 38831 (0.0042) +[2024-06-18 02:22:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 636321792. 
Throughput: 0: 41829.7. Samples: 636471860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:22:26,994][12645] Avg episode reward: [(0, '0.043')] +[2024-06-18 02:22:27,982][12883] Updated weights for policy 0, policy_version 38841 (0.0036) +[2024-06-18 02:22:31,994][12645] Fps is (10 sec: 36044.9, 60 sec: 40960.0, 300 sec: 41820.9). Total num frames: 636485632. Throughput: 0: 41586.6. Samples: 636589860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 02:22:31,994][12645] Avg episode reward: [(0, '0.058')] +[2024-06-18 02:22:32,904][12883] Updated weights for policy 0, policy_version 38851 (0.0050) +[2024-06-18 02:22:35,893][12883] Updated weights for policy 0, policy_version 38861 (0.0033) +[2024-06-18 02:22:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 636731392. Throughput: 0: 41645.8. Samples: 636838020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 02:22:36,994][12645] Avg episode reward: [(0, '0.050')] +[2024-06-18 02:22:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000038863_636731392.pth... +[2024-06-18 02:22:37,107][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000038250_626688000.pth +[2024-06-18 02:22:40,641][12883] Updated weights for policy 0, policy_version 38871 (0.0036) +[2024-06-18 02:22:41,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 636944384. Throughput: 0: 41640.7. Samples: 637086620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 02:22:41,994][12645] Avg episode reward: [(0, '0.037')] +[2024-06-18 02:22:44,261][12883] Updated weights for policy 0, policy_version 38881 (0.0031) +[2024-06-18 02:22:46,996][12645] Fps is (10 sec: 39312.6, 60 sec: 41231.5, 300 sec: 41876.1). Total num frames: 637124608. Throughput: 0: 41691.8. Samples: 637218460. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 02:22:46,997][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 02:22:48,373][12883] Updated weights for policy 0, policy_version 38891 (0.0030) +[2024-06-18 02:22:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 637337600. Throughput: 0: 41503.9. Samples: 637462560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 02:22:51,994][12645] Avg episode reward: [(0, '0.041')] +[2024-06-18 02:22:52,250][12883] Updated weights for policy 0, policy_version 38901 (0.0028) +[2024-06-18 02:22:56,200][12883] Updated weights for policy 0, policy_version 38911 (0.0036) +[2024-06-18 02:22:56,994][12645] Fps is (10 sec: 42608.4, 60 sec: 41233.0, 300 sec: 41876.4). Total num frames: 637550592. Throughput: 0: 41672.6. Samples: 637712720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 02:22:56,994][12645] Avg episode reward: [(0, '0.041')] +[2024-06-18 02:22:59,959][12883] Updated weights for policy 0, policy_version 38921 (0.0028) +[2024-06-18 02:23:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 637763584. Throughput: 0: 41635.2. Samples: 637841120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 02:23:01,994][12645] Avg episode reward: [(0, '0.033')] +[2024-06-18 02:23:03,484][12862] Signal inference workers to stop experience collection... (9050 times) +[2024-06-18 02:23:03,544][12883] InferenceWorker_p0-w0: stopping experience collection (9050 times) +[2024-06-18 02:23:03,601][12862] Signal inference workers to resume experience collection... (9050 times) +[2024-06-18 02:23:03,601][12883] InferenceWorker_p0-w0: resuming experience collection (9050 times) +[2024-06-18 02:23:04,219][12883] Updated weights for policy 0, policy_version 38931 (0.0030) +[2024-06-18 02:23:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 637960192. 
Throughput: 0: 41422.2. Samples: 638086600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 02:23:06,994][12645] Avg episode reward: [(0, '0.055')] +[2024-06-18 02:23:07,605][12883] Updated weights for policy 0, policy_version 38941 (0.0035) +[2024-06-18 02:23:11,889][12883] Updated weights for policy 0, policy_version 38951 (0.0035) +[2024-06-18 02:23:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40960.1, 300 sec: 41876.4). Total num frames: 638173184. Throughput: 0: 41603.6. Samples: 638344020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 02:23:11,994][12645] Avg episode reward: [(0, '0.034')] +[2024-06-18 02:23:15,290][12883] Updated weights for policy 0, policy_version 38961 (0.0057) +[2024-06-18 02:23:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 638386176. Throughput: 0: 41673.4. Samples: 638465160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 02:23:16,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:23:19,505][12883] Updated weights for policy 0, policy_version 38971 (0.0054) +[2024-06-18 02:23:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 638582784. Throughput: 0: 41712.9. Samples: 638715100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:23:21,994][12645] Avg episode reward: [(0, '0.073')] +[2024-06-18 02:23:22,807][12883] Updated weights for policy 0, policy_version 38981 (0.0038) +[2024-06-18 02:23:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 638812160. Throughput: 0: 41814.4. Samples: 638968260. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:23:26,994][12645] Avg episode reward: [(0, '0.102')] +[2024-06-18 02:23:27,391][12883] Updated weights for policy 0, policy_version 38991 (0.0036) +[2024-06-18 02:23:31,010][12883] Updated weights for policy 0, policy_version 39001 (0.0028) +[2024-06-18 02:23:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 639008768. Throughput: 0: 41683.8. Samples: 639094140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:23:31,994][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 02:23:35,048][12883] Updated weights for policy 0, policy_version 39011 (0.0023) +[2024-06-18 02:23:36,996][12645] Fps is (10 sec: 40950.3, 60 sec: 41504.6, 300 sec: 41765.0). Total num frames: 639221760. Throughput: 0: 41762.0. Samples: 639341940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:23:36,996][12645] Avg episode reward: [(0, '0.047')] +[2024-06-18 02:23:38,677][12883] Updated weights for policy 0, policy_version 39021 (0.0028) +[2024-06-18 02:23:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40960.0, 300 sec: 41709.8). Total num frames: 639401984. Throughput: 0: 41891.4. Samples: 639597840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:23:41,994][12645] Avg episode reward: [(0, '0.123')] +[2024-06-18 02:23:42,795][12883] Updated weights for policy 0, policy_version 39031 (0.0039) +[2024-06-18 02:23:46,927][12883] Updated weights for policy 0, policy_version 39041 (0.0033) +[2024-06-18 02:23:46,994][12645] Fps is (10 sec: 42607.5, 60 sec: 42053.8, 300 sec: 41709.8). Total num frames: 639647744. Throughput: 0: 41677.2. Samples: 639716600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 02:23:46,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 02:23:50,741][12883] Updated weights for policy 0, policy_version 39051 (0.0035) +[2024-06-18 02:23:52,000][12645] Fps is (10 sec: 44209.9, 60 sec: 41774.9, 300 sec: 41820.0). Total num frames: 639844352. Throughput: 0: 41760.0. Samples: 639966060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 02:23:52,000][12645] Avg episode reward: [(0, '0.059')] +[2024-06-18 02:23:55,021][12883] Updated weights for policy 0, policy_version 39061 (0.0034) +[2024-06-18 02:23:56,994][12645] Fps is (10 sec: 39322.1, 60 sec: 41506.1, 300 sec: 41709.9). Total num frames: 640040960. Throughput: 0: 41659.1. Samples: 640218680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 02:23:56,994][12645] Avg episode reward: [(0, '0.084')] +[2024-06-18 02:23:58,586][12883] Updated weights for policy 0, policy_version 39071 (0.0038) +[2024-06-18 02:24:01,994][12645] Fps is (10 sec: 42625.1, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 640270336. Throughput: 0: 41647.1. Samples: 640339280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 02:24:01,994][12645] Avg episode reward: [(0, '0.051')] +[2024-06-18 02:24:02,586][12883] Updated weights for policy 0, policy_version 39081 (0.0029) +[2024-06-18 02:24:05,941][12862] Signal inference workers to stop experience collection... (9100 times) +[2024-06-18 02:24:05,942][12862] Signal inference workers to resume experience collection... (9100 times) +[2024-06-18 02:24:05,988][12883] InferenceWorker_p0-w0: stopping experience collection (9100 times) +[2024-06-18 02:24:05,988][12883] InferenceWorker_p0-w0: resuming experience collection (9100 times) +[2024-06-18 02:24:06,651][12883] Updated weights for policy 0, policy_version 39091 (0.0041) +[2024-06-18 02:24:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41766.2). Total num frames: 640466944. Throughput: 0: 41692.9. 
Samples: 640591280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 02:24:06,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 02:24:10,412][12883] Updated weights for policy 0, policy_version 39101 (0.0044) +[2024-06-18 02:24:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 640679936. Throughput: 0: 41595.0. Samples: 640840040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 02:24:11,994][12645] Avg episode reward: [(0, '0.050')] +[2024-06-18 02:24:14,610][12883] Updated weights for policy 0, policy_version 39111 (0.0033) +[2024-06-18 02:24:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 640892928. Throughput: 0: 41592.1. Samples: 640965780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 02:24:16,994][12645] Avg episode reward: [(0, '0.061')] +[2024-06-18 02:24:18,296][12883] Updated weights for policy 0, policy_version 39121 (0.0037) +[2024-06-18 02:24:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 41765.6). Total num frames: 641089536. Throughput: 0: 41597.1. Samples: 641213720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 02:24:21,994][12645] Avg episode reward: [(0, '0.029')] +[2024-06-18 02:24:22,729][12883] Updated weights for policy 0, policy_version 39131 (0.0043) +[2024-06-18 02:24:26,072][12883] Updated weights for policy 0, policy_version 39141 (0.0032) +[2024-06-18 02:24:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 641302528. Throughput: 0: 41470.3. Samples: 641464000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 02:24:26,994][12645] Avg episode reward: [(0, '0.060')] +[2024-06-18 02:24:30,511][12883] Updated weights for policy 0, policy_version 39151 (0.0033) +[2024-06-18 02:24:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 641515520. Throughput: 0: 41616.1. 
Samples: 641589320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 02:24:31,994][12645] Avg episode reward: [(0, '0.018')] +[2024-06-18 02:24:33,958][12883] Updated weights for policy 0, policy_version 39161 (0.0022) +[2024-06-18 02:24:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41507.7, 300 sec: 41709.8). Total num frames: 641712128. Throughput: 0: 41759.6. Samples: 641844980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) +[2024-06-18 02:24:36,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 02:24:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000039167_641712128.pth... +[2024-06-18 02:24:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000038556_631701504.pth +[2024-06-18 02:24:38,305][12883] Updated weights for policy 0, policy_version 39171 (0.0038) +[2024-06-18 02:24:41,818][12883] Updated weights for policy 0, policy_version 39181 (0.0031) +[2024-06-18 02:24:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 641941504. Throughput: 0: 41728.9. Samples: 642096480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) +[2024-06-18 02:24:41,994][12645] Avg episode reward: [(0, '0.067')] +[2024-06-18 02:24:46,059][12883] Updated weights for policy 0, policy_version 39191 (0.0031) +[2024-06-18 02:24:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.2, 300 sec: 41765.6). Total num frames: 642138112. Throughput: 0: 41935.4. Samples: 642226380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) +[2024-06-18 02:24:46,994][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 02:24:49,803][12883] Updated weights for policy 0, policy_version 39201 (0.0029) +[2024-06-18 02:24:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41783.6, 300 sec: 41709.8). Total num frames: 642351104. Throughput: 0: 41804.0. Samples: 642472460. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) +[2024-06-18 02:24:51,994][12645] Avg episode reward: [(0, '0.033')] +[2024-06-18 02:24:53,837][12883] Updated weights for policy 0, policy_version 39211 (0.0044) +[2024-06-18 02:24:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 642564096. Throughput: 0: 41890.2. Samples: 642725100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) +[2024-06-18 02:24:56,994][12645] Avg episode reward: [(0, '0.054')] +[2024-06-18 02:24:57,426][12883] Updated weights for policy 0, policy_version 39221 (0.0043) +[2024-06-18 02:25:01,515][12883] Updated weights for policy 0, policy_version 39231 (0.0033) +[2024-06-18 02:25:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 642760704. Throughput: 0: 41905.7. Samples: 642851540. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-18 02:25:02,000][12645] Avg episode reward: [(0, '0.041')] +[2024-06-18 02:25:05,380][12883] Updated weights for policy 0, policy_version 39241 (0.0039) +[2024-06-18 02:25:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 642990080. Throughput: 0: 42048.1. Samples: 643105880. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-18 02:25:06,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 02:25:09,350][12883] Updated weights for policy 0, policy_version 39251 (0.0032) +[2024-06-18 02:25:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 643203072. Throughput: 0: 42128.0. Samples: 643359760. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-18 02:25:11,994][12645] Avg episode reward: [(0, '0.091')] +[2024-06-18 02:25:13,054][12883] Updated weights for policy 0, policy_version 39261 (0.0025) +[2024-06-18 02:25:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 643399680. Throughput: 0: 42044.9. Samples: 643481340. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-18 02:25:16,994][12645] Avg episode reward: [(0, '0.091')] +[2024-06-18 02:25:17,056][12883] Updated weights for policy 0, policy_version 39271 (0.0040) +[2024-06-18 02:25:17,960][12862] Signal inference workers to stop experience collection... (9150 times) +[2024-06-18 02:25:17,964][12862] Signal inference workers to resume experience collection... (9150 times) +[2024-06-18 02:25:17,993][12883] InferenceWorker_p0-w0: stopping experience collection (9150 times) +[2024-06-18 02:25:17,993][12883] InferenceWorker_p0-w0: resuming experience collection (9150 times) +[2024-06-18 02:25:20,891][12883] Updated weights for policy 0, policy_version 39281 (0.0039) +[2024-06-18 02:25:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 41931.9). Total num frames: 643645440. Throughput: 0: 42010.6. Samples: 643735460. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) +[2024-06-18 02:25:21,994][12645] Avg episode reward: [(0, '0.041')] +[2024-06-18 02:25:24,967][12883] Updated weights for policy 0, policy_version 39291 (0.0038) +[2024-06-18 02:25:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 643825664. Throughput: 0: 42068.5. Samples: 643989560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 02:25:26,994][12645] Avg episode reward: [(0, '0.038')] +[2024-06-18 02:25:28,471][12883] Updated weights for policy 0, policy_version 39301 (0.0030) +[2024-06-18 02:25:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 644038656. Throughput: 0: 41779.6. Samples: 644106460. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 02:25:31,994][12645] Avg episode reward: [(0, '0.060')] +[2024-06-18 02:25:33,083][12883] Updated weights for policy 0, policy_version 39311 (0.0035) +[2024-06-18 02:25:36,288][12883] Updated weights for policy 0, policy_version 39321 (0.0041) +[2024-06-18 02:25:36,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 41932.8). Total num frames: 644284416. Throughput: 0: 42181.2. Samples: 644370620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 02:25:36,994][12645] Avg episode reward: [(0, '0.062')] +[2024-06-18 02:25:40,726][12883] Updated weights for policy 0, policy_version 39331 (0.0028) +[2024-06-18 02:25:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 644448256. Throughput: 0: 42101.4. Samples: 644619660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 02:25:41,994][12645] Avg episode reward: [(0, '0.086')] +[2024-06-18 02:25:44,194][12883] Updated weights for policy 0, policy_version 39341 (0.0037) +[2024-06-18 02:25:46,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 644661248. Throughput: 0: 42012.5. Samples: 644742100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 02:25:46,996][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 02:25:48,303][12883] Updated weights for policy 0, policy_version 39351 (0.0046) +[2024-06-18 02:25:51,854][12883] Updated weights for policy 0, policy_version 39361 (0.0039) +[2024-06-18 02:25:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 644890624. Throughput: 0: 42130.2. Samples: 645001740. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 02:25:51,994][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 02:25:55,839][12883] Updated weights for policy 0, policy_version 39371 (0.0039) +[2024-06-18 02:25:56,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 645054464. Throughput: 0: 42125.2. Samples: 645255400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 02:25:56,995][12645] Avg episode reward: [(0, '0.058')] +[2024-06-18 02:25:59,479][12883] Updated weights for policy 0, policy_version 39381 (0.0040) +[2024-06-18 02:26:01,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42323.8, 300 sec: 41876.1). Total num frames: 645300224. Throughput: 0: 42050.8. Samples: 645373720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 02:26:01,996][12645] Avg episode reward: [(0, '0.063')] +[2024-06-18 02:26:04,092][12883] Updated weights for policy 0, policy_version 39391 (0.0030) +[2024-06-18 02:26:06,994][12645] Fps is (10 sec: 47514.0, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 645529600. Throughput: 0: 42110.2. Samples: 645630420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 02:26:06,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:26:07,150][12883] Updated weights for policy 0, policy_version 39401 (0.0037) +[2024-06-18 02:26:11,708][12883] Updated weights for policy 0, policy_version 39411 (0.0038) +[2024-06-18 02:26:11,994][12645] Fps is (10 sec: 40969.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 645709824. Throughput: 0: 42080.0. Samples: 645883160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 02:26:11,994][12645] Avg episode reward: [(0, '0.101')] +[2024-06-18 02:26:14,986][12883] Updated weights for policy 0, policy_version 39421 (0.0027) +[2024-06-18 02:26:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 645955584. Throughput: 0: 42204.4. Samples: 646005660. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) +[2024-06-18 02:26:16,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 02:26:19,504][12883] Updated weights for policy 0, policy_version 39431 (0.0034) +[2024-06-18 02:26:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 646135808. Throughput: 0: 41997.1. Samples: 646260480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) +[2024-06-18 02:26:21,994][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 02:26:22,089][12862] Saving new best policy, reward=0.134! +[2024-06-18 02:26:23,188][12883] Updated weights for policy 0, policy_version 39441 (0.0035) +[2024-06-18 02:26:26,994][12645] Fps is (10 sec: 37683.4, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 646332416. Throughput: 0: 42103.1. Samples: 646514300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) +[2024-06-18 02:26:26,994][12645] Avg episode reward: [(0, '0.095')] +[2024-06-18 02:26:27,711][12883] Updated weights for policy 0, policy_version 39451 (0.0031) +[2024-06-18 02:26:28,344][12862] Signal inference workers to stop experience collection... (9200 times) +[2024-06-18 02:26:28,344][12862] Signal inference workers to resume experience collection... (9200 times) +[2024-06-18 02:26:28,368][12883] InferenceWorker_p0-w0: stopping experience collection (9200 times) +[2024-06-18 02:26:28,368][12883] InferenceWorker_p0-w0: resuming experience collection (9200 times) +[2024-06-18 02:26:30,948][12883] Updated weights for policy 0, policy_version 39461 (0.0038) +[2024-06-18 02:26:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 646578176. Throughput: 0: 42045.5. Samples: 646634140. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) +[2024-06-18 02:26:31,994][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 02:26:35,554][12883] Updated weights for policy 0, policy_version 39471 (0.0023) +[2024-06-18 02:26:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41506.3, 300 sec: 41876.4). Total num frames: 646774784. Throughput: 0: 42027.6. Samples: 646892980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) +[2024-06-18 02:26:36,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:26:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000039476_646774784.pth... +[2024-06-18 02:26:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000038863_636731392.pth +[2024-06-18 02:26:39,039][12883] Updated weights for policy 0, policy_version 39481 (0.0037) +[2024-06-18 02:26:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 646971392. Throughput: 0: 41835.7. Samples: 647138000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 02:26:41,994][12645] Avg episode reward: [(0, '0.054')] +[2024-06-18 02:26:42,995][12883] Updated weights for policy 0, policy_version 39491 (0.0028) +[2024-06-18 02:26:46,705][12883] Updated weights for policy 0, policy_version 39501 (0.0032) +[2024-06-18 02:26:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 647184384. Throughput: 0: 42009.7. Samples: 647264060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 02:26:46,994][12645] Avg episode reward: [(0, '0.109')] +[2024-06-18 02:26:50,771][12883] Updated weights for policy 0, policy_version 39511 (0.0026) +[2024-06-18 02:26:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41709.7). Total num frames: 647380992. Throughput: 0: 41919.5. Samples: 647516800. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 02:26:51,994][12645] Avg episode reward: [(0, '0.058')] +[2024-06-18 02:26:54,222][12883] Updated weights for policy 0, policy_version 39521 (0.0041) +[2024-06-18 02:26:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 647593984. Throughput: 0: 41907.1. Samples: 647768980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 02:26:56,994][12645] Avg episode reward: [(0, '0.071')] +[2024-06-18 02:26:58,590][12883] Updated weights for policy 0, policy_version 39531 (0.0027) +[2024-06-18 02:27:01,745][12883] Updated weights for policy 0, policy_version 39541 (0.0036) +[2024-06-18 02:27:01,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42326.9, 300 sec: 41931.9). Total num frames: 647839744. Throughput: 0: 42022.7. Samples: 647896680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 02:27:01,994][12645] Avg episode reward: [(0, '0.092')] +[2024-06-18 02:27:06,636][12883] Updated weights for policy 0, policy_version 39551 (0.0039) +[2024-06-18 02:27:06,997][12645] Fps is (10 sec: 42584.9, 60 sec: 41504.0, 300 sec: 41709.3). Total num frames: 648019968. Throughput: 0: 41877.9. Samples: 648145120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:27:06,997][12645] Avg episode reward: [(0, '0.003')] +[2024-06-18 02:27:09,855][12883] Updated weights for policy 0, policy_version 39561 (0.0033) +[2024-06-18 02:27:11,994][12645] Fps is (10 sec: 37682.9, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 648216576. Throughput: 0: 41822.1. Samples: 648396300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:27:11,994][12645] Avg episode reward: [(0, '0.036')] +[2024-06-18 02:27:14,420][12883] Updated weights for policy 0, policy_version 39571 (0.0047) +[2024-06-18 02:27:16,994][12645] Fps is (10 sec: 44250.4, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 648462336. Throughput: 0: 41933.6. Samples: 648521160. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:27:16,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 02:27:17,733][12883] Updated weights for policy 0, policy_version 39581 (0.0029) +[2024-06-18 02:27:21,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 648642560. Throughput: 0: 41932.8. Samples: 648779960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:27:21,994][12645] Avg episode reward: [(0, '0.012')] +[2024-06-18 02:27:22,062][12883] Updated weights for policy 0, policy_version 39591 (0.0033) +[2024-06-18 02:27:25,254][12883] Updated weights for policy 0, policy_version 39601 (0.0030) +[2024-06-18 02:27:26,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 648855552. Throughput: 0: 41973.7. Samples: 649026820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:27:26,994][12645] Avg episode reward: [(0, '0.069')] +[2024-06-18 02:27:29,749][12883] Updated weights for policy 0, policy_version 39611 (0.0030) +[2024-06-18 02:27:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 649101312. Throughput: 0: 42048.8. Samples: 649156260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 02:27:31,994][12645] Avg episode reward: [(0, '0.034')] +[2024-06-18 02:27:32,958][12883] Updated weights for policy 0, policy_version 39621 (0.0033) +[2024-06-18 02:27:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 649265152. Throughput: 0: 42093.8. Samples: 649411020. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 02:27:36,994][12645] Avg episode reward: [(0, '0.075')] +[2024-06-18 02:27:37,690][12883] Updated weights for policy 0, policy_version 39631 (0.0036) +[2024-06-18 02:27:40,647][12883] Updated weights for policy 0, policy_version 39641 (0.0032) +[2024-06-18 02:27:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41987.8). Total num frames: 649510912. Throughput: 0: 41804.4. Samples: 649650180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 02:27:41,994][12645] Avg episode reward: [(0, '0.171')] +[2024-06-18 02:27:41,999][12862] Saving new best policy, reward=0.171! +[2024-06-18 02:27:45,612][12883] Updated weights for policy 0, policy_version 39651 (0.0028) +[2024-06-18 02:27:46,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 649723904. Throughput: 0: 41978.6. Samples: 649785720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 02:27:46,994][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 02:27:47,729][12862] Signal inference workers to stop experience collection... (9250 times) +[2024-06-18 02:27:47,730][12862] Signal inference workers to resume experience collection... (9250 times) +[2024-06-18 02:27:47,775][12883] InferenceWorker_p0-w0: stopping experience collection (9250 times) +[2024-06-18 02:27:47,775][12883] InferenceWorker_p0-w0: resuming experience collection (9250 times) +[2024-06-18 02:27:48,441][12883] Updated weights for policy 0, policy_version 39661 (0.0051) +[2024-06-18 02:27:51,994][12645] Fps is (10 sec: 37683.6, 60 sec: 41779.3, 300 sec: 41820.8). Total num frames: 649887744. Throughput: 0: 41990.5. Samples: 650034560. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 02:27:51,994][12645] Avg episode reward: [(0, '0.146')] +[2024-06-18 02:27:53,478][12883] Updated weights for policy 0, policy_version 39671 (0.0035) +[2024-06-18 02:27:56,147][12883] Updated weights for policy 0, policy_version 39681 (0.0030) +[2024-06-18 02:27:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 650133504. Throughput: 0: 41913.9. Samples: 650282420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 02:27:56,994][12645] Avg episode reward: [(0, '0.155')] +[2024-06-18 02:28:01,108][12883] Updated weights for policy 0, policy_version 39691 (0.0039) +[2024-06-18 02:28:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 650330112. Throughput: 0: 42240.2. Samples: 650421960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 02:28:01,994][12645] Avg episode reward: [(0, '0.083')] +[2024-06-18 02:28:03,998][12883] Updated weights for policy 0, policy_version 39701 (0.0037) +[2024-06-18 02:28:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41781.4, 300 sec: 41876.4). Total num frames: 650526720. Throughput: 0: 41915.5. Samples: 650666160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 02:28:06,994][12645] Avg episode reward: [(0, '0.068')] +[2024-06-18 02:28:08,980][12883] Updated weights for policy 0, policy_version 39711 (0.0036) +[2024-06-18 02:28:11,699][12883] Updated weights for policy 0, policy_version 39721 (0.0028) +[2024-06-18 02:28:11,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42871.5, 300 sec: 42043.0). Total num frames: 650788864. Throughput: 0: 41851.9. Samples: 650910160. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 02:28:11,994][12645] Avg episode reward: [(0, '0.074')] +[2024-06-18 02:28:16,936][12883] Updated weights for policy 0, policy_version 39731 (0.0042) +[2024-06-18 02:28:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 650952704. Throughput: 0: 41936.5. Samples: 651043400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 02:28:16,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 02:28:19,822][12883] Updated weights for policy 0, policy_version 39741 (0.0035) +[2024-06-18 02:28:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 651165696. Throughput: 0: 41809.0. Samples: 651292420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 02:28:21,994][12645] Avg episode reward: [(0, '0.062')] +[2024-06-18 02:28:24,805][12883] Updated weights for policy 0, policy_version 39751 (0.0028) +[2024-06-18 02:28:26,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42098.6). Total num frames: 651427840. Throughput: 0: 42040.5. Samples: 651542000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 02:28:26,994][12645] Avg episode reward: [(0, '0.083')] +[2024-06-18 02:28:27,452][12883] Updated weights for policy 0, policy_version 39761 (0.0037) +[2024-06-18 02:28:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41233.2, 300 sec: 41876.7). Total num frames: 651575296. Throughput: 0: 41949.5. Samples: 651673440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 02:28:31,994][12645] Avg episode reward: [(0, '0.064')] +[2024-06-18 02:28:32,553][12883] Updated weights for policy 0, policy_version 39771 (0.0033) +[2024-06-18 02:28:35,673][12883] Updated weights for policy 0, policy_version 39781 (0.0038) +[2024-06-18 02:28:36,996][12645] Fps is (10 sec: 37674.8, 60 sec: 42323.8, 300 sec: 42042.7). Total num frames: 651804672. Throughput: 0: 41956.1. Samples: 651922680. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 02:28:36,996][12645] Avg episode reward: [(0, '0.028')] +[2024-06-18 02:28:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000039783_651804672.pth... +[2024-06-18 02:28:37,088][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000039167_641712128.pth +[2024-06-18 02:28:40,456][12883] Updated weights for policy 0, policy_version 39791 (0.0039) +[2024-06-18 02:28:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 652034048. Throughput: 0: 41931.6. Samples: 652169340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 02:28:41,994][12645] Avg episode reward: [(0, '0.059')] +[2024-06-18 02:28:43,609][12883] Updated weights for policy 0, policy_version 39801 (0.0034) +[2024-06-18 02:28:46,994][12645] Fps is (10 sec: 39330.5, 60 sec: 41233.1, 300 sec: 41877.3). Total num frames: 652197888. Throughput: 0: 41702.2. Samples: 652298560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 02:28:46,994][12645] Avg episode reward: [(0, '0.050')] +[2024-06-18 02:28:48,288][12883] Updated weights for policy 0, policy_version 39811 (0.0047) +[2024-06-18 02:28:49,642][12862] Signal inference workers to stop experience collection... (9300 times) +[2024-06-18 02:28:49,680][12883] InferenceWorker_p0-w0: stopping experience collection (9300 times) +[2024-06-18 02:28:49,753][12862] Signal inference workers to resume experience collection... (9300 times) +[2024-06-18 02:28:49,753][12883] InferenceWorker_p0-w0: resuming experience collection (9300 times) +[2024-06-18 02:28:51,278][12883] Updated weights for policy 0, policy_version 39821 (0.0037) +[2024-06-18 02:28:51,994][12645] Fps is (10 sec: 39320.7, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 652427264. Throughput: 0: 41694.6. Samples: 652542420. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 02:28:51,994][12645] Avg episode reward: [(0, '0.056')] +[2024-06-18 02:28:55,948][12883] Updated weights for policy 0, policy_version 39831 (0.0027) +[2024-06-18 02:28:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 652623872. Throughput: 0: 42116.6. Samples: 652805400. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 02:28:56,994][12645] Avg episode reward: [(0, '0.070')] +[2024-06-18 02:28:58,948][12883] Updated weights for policy 0, policy_version 39841 (0.0029) +[2024-06-18 02:29:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 652836864. Throughput: 0: 41796.5. Samples: 652924240. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 02:29:01,994][12645] Avg episode reward: [(0, '0.084')] +[2024-06-18 02:29:03,806][12883] Updated weights for policy 0, policy_version 39851 (0.0027) +[2024-06-18 02:29:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 653066240. Throughput: 0: 41834.6. Samples: 653174980. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 02:29:07,007][12645] Avg episode reward: [(0, '0.061')] +[2024-06-18 02:29:07,464][12883] Updated weights for policy 0, policy_version 39861 (0.0049) +[2024-06-18 02:29:11,595][12883] Updated weights for policy 0, policy_version 39871 (0.0033) +[2024-06-18 02:29:11,994][12645] Fps is (10 sec: 40959.1, 60 sec: 40959.9, 300 sec: 41876.4). Total num frames: 653246464. Throughput: 0: 41889.2. Samples: 653427020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 02:29:12,003][12645] Avg episode reward: [(0, '0.060')] +[2024-06-18 02:29:15,256][12883] Updated weights for policy 0, policy_version 39881 (0.0048) +[2024-06-18 02:29:16,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42323.8, 300 sec: 42042.7). Total num frames: 653492224. Throughput: 0: 41616.1. Samples: 653546260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 02:29:16,996][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 02:29:19,622][12883] Updated weights for policy 0, policy_version 39891 (0.0028) +[2024-06-18 02:29:21,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 653688832. Throughput: 0: 41847.0. Samples: 653805700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 02:29:21,994][12645] Avg episode reward: [(0, '0.123')] +[2024-06-18 02:29:22,950][12883] Updated weights for policy 0, policy_version 39901 (0.0034) +[2024-06-18 02:29:26,993][12645] Fps is (10 sec: 39330.8, 60 sec: 40960.1, 300 sec: 41932.0). Total num frames: 653885440. Throughput: 0: 41877.3. Samples: 654053820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 02:29:26,994][12645] Avg episode reward: [(0, '0.045')] +[2024-06-18 02:29:27,093][12883] Updated weights for policy 0, policy_version 39911 (0.0041) +[2024-06-18 02:29:30,857][12883] Updated weights for policy 0, policy_version 39921 (0.0036) +[2024-06-18 02:29:31,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 654147584. Throughput: 0: 41877.3. Samples: 654183040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 02:29:31,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:29:35,346][12883] Updated weights for policy 0, policy_version 39931 (0.0038) +[2024-06-18 02:29:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41780.8, 300 sec: 41931.9). Total num frames: 654311424. Throughput: 0: 42101.0. Samples: 654436960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 02:29:37,000][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 02:29:38,454][12883] Updated weights for policy 0, policy_version 39941 (0.0033) +[2024-06-18 02:29:41,994][12645] Fps is (10 sec: 36045.0, 60 sec: 41233.0, 300 sec: 41931.9). Total num frames: 654508032. Throughput: 0: 41739.5. Samples: 654683680. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 02:29:41,994][12645] Avg episode reward: [(0, '0.074')] +[2024-06-18 02:29:42,844][12883] Updated weights for policy 0, policy_version 39951 (0.0023) +[2024-06-18 02:29:46,556][12883] Updated weights for policy 0, policy_version 39961 (0.0036) +[2024-06-18 02:29:46,999][12645] Fps is (10 sec: 42577.1, 60 sec: 42321.8, 300 sec: 41986.8). Total num frames: 654737408. Throughput: 0: 41806.0. Samples: 654805720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 02:29:46,999][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 02:29:50,982][12883] Updated weights for policy 0, policy_version 39971 (0.0037) +[2024-06-18 02:29:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 654950400. Throughput: 0: 41850.6. Samples: 655058260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 02:29:51,994][12645] Avg episode reward: [(0, '0.152')] +[2024-06-18 02:29:54,346][12883] Updated weights for policy 0, policy_version 39981 (0.0034) +[2024-06-18 02:29:56,994][12645] Fps is (10 sec: 42619.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 655163392. Throughput: 0: 41809.9. Samples: 655308460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 02:29:56,994][12645] Avg episode reward: [(0, '0.043')] +[2024-06-18 02:29:58,541][12883] Updated weights for policy 0, policy_version 39991 (0.0037) +[2024-06-18 02:30:01,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 655360000. Throughput: 0: 41993.2. Samples: 655435860. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 02:30:01,994][12645] Avg episode reward: [(0, '0.032')] +[2024-06-18 02:30:02,210][12883] Updated weights for policy 0, policy_version 40001 (0.0033) +[2024-06-18 02:30:06,130][12883] Updated weights for policy 0, policy_version 40011 (0.0030) +[2024-06-18 02:30:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 655556608. Throughput: 0: 41922.1. Samples: 655692200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:30:06,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:30:10,036][12883] Updated weights for policy 0, policy_version 40021 (0.0049) +[2024-06-18 02:30:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 655769600. Throughput: 0: 41794.0. Samples: 655934560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:30:11,994][12645] Avg episode reward: [(0, '0.047')] +[2024-06-18 02:30:14,018][12883] Updated weights for policy 0, policy_version 40031 (0.0034) +[2024-06-18 02:30:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41507.6, 300 sec: 41820.8). Total num frames: 655982592. Throughput: 0: 41770.2. Samples: 656062700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:30:16,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 02:30:17,755][12883] Updated weights for policy 0, policy_version 40041 (0.0050) +[2024-06-18 02:30:18,227][12862] Signal inference workers to stop experience collection... (9350 times) +[2024-06-18 02:30:18,275][12883] InferenceWorker_p0-w0: stopping experience collection (9350 times) +[2024-06-18 02:30:18,278][12862] Signal inference workers to resume experience collection... 
(9350 times) +[2024-06-18 02:30:18,292][12883] InferenceWorker_p0-w0: resuming experience collection (9350 times) +[2024-06-18 02:30:21,525][12883] Updated weights for policy 0, policy_version 40051 (0.0037) +[2024-06-18 02:30:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 656195584. Throughput: 0: 41762.7. Samples: 656316280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:30:21,994][12645] Avg episode reward: [(0, '0.095')] +[2024-06-18 02:30:25,307][12883] Updated weights for policy 0, policy_version 40061 (0.0032) +[2024-06-18 02:30:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 656424960. Throughput: 0: 41871.0. Samples: 656567880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:30:26,996][12645] Avg episode reward: [(0, '0.076')] +[2024-06-18 02:30:29,869][12883] Updated weights for policy 0, policy_version 40071 (0.0032) +[2024-06-18 02:30:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 656605184. Throughput: 0: 41951.3. Samples: 656693320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) +[2024-06-18 02:30:31,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 02:30:33,293][12883] Updated weights for policy 0, policy_version 40081 (0.0033) +[2024-06-18 02:30:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 656834560. Throughput: 0: 41844.6. Samples: 656941260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) +[2024-06-18 02:30:36,994][12645] Avg episode reward: [(0, '0.008')] +[2024-06-18 02:30:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000040090_656834560.pth... 
+[2024-06-18 02:30:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000039476_646774784.pth +[2024-06-18 02:30:37,600][12883] Updated weights for policy 0, policy_version 40091 (0.0034) +[2024-06-18 02:30:41,223][12883] Updated weights for policy 0, policy_version 40101 (0.0035) +[2024-06-18 02:30:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 657031168. Throughput: 0: 41860.4. Samples: 657192180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) +[2024-06-18 02:30:41,994][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 02:30:45,262][12883] Updated weights for policy 0, policy_version 40111 (0.0037) +[2024-06-18 02:30:46,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41782.6, 300 sec: 41876.4). Total num frames: 657244160. Throughput: 0: 41709.7. Samples: 657312800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) +[2024-06-18 02:30:46,994][12645] Avg episode reward: [(0, '0.076')] +[2024-06-18 02:30:48,922][12883] Updated weights for policy 0, policy_version 40121 (0.0030) +[2024-06-18 02:30:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 657457152. Throughput: 0: 41678.7. Samples: 657567740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) +[2024-06-18 02:30:51,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 02:30:53,312][12883] Updated weights for policy 0, policy_version 40131 (0.0040) +[2024-06-18 02:30:56,837][12883] Updated weights for policy 0, policy_version 40141 (0.0046) +[2024-06-18 02:30:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41932.3). Total num frames: 657670144. Throughput: 0: 41687.6. Samples: 657810500. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:30:56,994][12645] Avg episode reward: [(0, '0.083')] +[2024-06-18 02:31:00,878][12883] Updated weights for policy 0, policy_version 40151 (0.0043) +[2024-06-18 02:31:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 657850368. Throughput: 0: 41729.9. Samples: 657940540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:31:01,994][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 02:31:04,634][12883] Updated weights for policy 0, policy_version 40161 (0.0028) +[2024-06-18 02:31:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 658063360. Throughput: 0: 41661.2. Samples: 658191040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:31:06,994][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 02:31:08,807][12883] Updated weights for policy 0, policy_version 40171 (0.0028) +[2024-06-18 02:31:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 658276352. Throughput: 0: 41741.0. Samples: 658446220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:31:11,994][12645] Avg episode reward: [(0, '0.126')] +[2024-06-18 02:31:12,553][12883] Updated weights for policy 0, policy_version 40181 (0.0032) +[2024-06-18 02:31:16,407][12883] Updated weights for policy 0, policy_version 40191 (0.0035) +[2024-06-18 02:31:16,996][12645] Fps is (10 sec: 42588.9, 60 sec: 41777.7, 300 sec: 41876.1). Total num frames: 658489344. Throughput: 0: 41689.1. Samples: 658569420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:31:16,997][12645] Avg episode reward: [(0, '0.189')] +[2024-06-18 02:31:17,013][12862] Saving new best policy, reward=0.189! +[2024-06-18 02:31:20,391][12883] Updated weights for policy 0, policy_version 40201 (0.0036) +[2024-06-18 02:31:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41876.4). 
Total num frames: 658685952. Throughput: 0: 41755.0. Samples: 658820240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:31:21,994][12645] Avg episode reward: [(0, '0.049')] +[2024-06-18 02:31:24,257][12883] Updated weights for policy 0, policy_version 40211 (0.0047) +[2024-06-18 02:31:26,994][12645] Fps is (10 sec: 40969.2, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 658898944. Throughput: 0: 41821.8. Samples: 659074160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:31:26,994][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 02:31:28,369][12883] Updated weights for policy 0, policy_version 40221 (0.0028) +[2024-06-18 02:31:31,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42050.7, 300 sec: 41876.1). Total num frames: 659128320. Throughput: 0: 42022.4. Samples: 659203900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:31:31,997][12645] Avg episode reward: [(0, '0.112')] +[2024-06-18 02:31:32,341][12883] Updated weights for policy 0, policy_version 40231 (0.0032) +[2024-06-18 02:31:36,178][12883] Updated weights for policy 0, policy_version 40241 (0.0043) +[2024-06-18 02:31:36,996][12645] Fps is (10 sec: 44226.8, 60 sec: 41777.6, 300 sec: 41931.6). Total num frames: 659341312. Throughput: 0: 41941.0. Samples: 659455180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:31:36,997][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 02:31:40,038][12883] Updated weights for policy 0, policy_version 40251 (0.0045) +[2024-06-18 02:31:41,994][12645] Fps is (10 sec: 40969.4, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 659537920. Throughput: 0: 42048.9. Samples: 659702700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:31:41,994][12645] Avg episode reward: [(0, '0.109')] +[2024-06-18 02:31:44,074][12883] Updated weights for policy 0, policy_version 40261 (0.0037) +[2024-06-18 02:31:46,994][12645] Fps is (10 sec: 40969.3, 60 sec: 41779.2, 300 sec: 41931.9). 
Total num frames: 659750912. Throughput: 0: 42002.2. Samples: 659830640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:31:46,994][12645] Avg episode reward: [(0, '0.110')] +[2024-06-18 02:31:47,746][12883] Updated weights for policy 0, policy_version 40271 (0.0036) +[2024-06-18 02:31:51,967][12883] Updated weights for policy 0, policy_version 40281 (0.0029) +[2024-06-18 02:31:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 659963904. Throughput: 0: 42136.5. Samples: 660087180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:31:51,994][12645] Avg episode reward: [(0, '0.051')] +[2024-06-18 02:31:52,127][12862] Signal inference workers to stop experience collection... (9400 times) +[2024-06-18 02:31:52,128][12862] Signal inference workers to resume experience collection... (9400 times) +[2024-06-18 02:31:52,173][12883] InferenceWorker_p0-w0: stopping experience collection (9400 times) +[2024-06-18 02:31:52,173][12883] InferenceWorker_p0-w0: resuming experience collection (9400 times) +[2024-06-18 02:31:55,753][12883] Updated weights for policy 0, policy_version 40291 (0.0031) +[2024-06-18 02:31:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 660160512. Throughput: 0: 42085.2. Samples: 660340060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:31:56,994][12645] Avg episode reward: [(0, '0.084')] +[2024-06-18 02:31:59,487][12883] Updated weights for policy 0, policy_version 40301 (0.0040) +[2024-06-18 02:32:01,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42596.8, 300 sec: 41987.6). Total num frames: 660406272. Throughput: 0: 42123.6. Samples: 660464980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:32:01,996][12645] Avg episode reward: [(0, '0.142')] +[2024-06-18 02:32:03,451][12883] Updated weights for policy 0, policy_version 40311 (0.0033) +[2024-06-18 02:32:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 660602880. Throughput: 0: 42261.3. Samples: 660722000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:32:06,994][12645] Avg episode reward: [(0, '0.151')] +[2024-06-18 02:32:07,067][12883] Updated weights for policy 0, policy_version 40321 (0.0039) +[2024-06-18 02:32:11,158][12883] Updated weights for policy 0, policy_version 40331 (0.0031) +[2024-06-18 02:32:11,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 660815872. Throughput: 0: 42140.9. Samples: 660970500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:32:11,994][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 02:32:15,061][12883] Updated weights for policy 0, policy_version 40341 (0.0036) +[2024-06-18 02:32:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42326.8, 300 sec: 41987.5). Total num frames: 661028864. Throughput: 0: 42070.4. Samples: 661096980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 02:32:16,995][12645] Avg episode reward: [(0, '0.052')] +[2024-06-18 02:32:18,784][12883] Updated weights for policy 0, policy_version 40351 (0.0041) +[2024-06-18 02:32:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 661209088. Throughput: 0: 42214.1. Samples: 661354720. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 02:32:21,994][12645] Avg episode reward: [(0, '0.112')] +[2024-06-18 02:32:22,747][12883] Updated weights for policy 0, policy_version 40361 (0.0027) +[2024-06-18 02:32:26,779][12883] Updated weights for policy 0, policy_version 40371 (0.0032) +[2024-06-18 02:32:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 661438464. Throughput: 0: 42171.1. Samples: 661600400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 02:32:26,994][12645] Avg episode reward: [(0, '0.095')] +[2024-06-18 02:32:30,482][12883] Updated weights for policy 0, policy_version 40381 (0.0033) +[2024-06-18 02:32:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42326.9, 300 sec: 42043.0). Total num frames: 661667840. Throughput: 0: 42167.1. Samples: 661728160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 02:32:31,994][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 02:32:34,710][12883] Updated weights for policy 0, policy_version 40391 (0.0036) +[2024-06-18 02:32:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41780.8, 300 sec: 41820.9). Total num frames: 661848064. Throughput: 0: 42147.1. Samples: 661983800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 02:32:36,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 02:32:37,141][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000040397_661864448.pth... +[2024-06-18 02:32:37,202][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000039783_651804672.pth +[2024-06-18 02:32:38,092][12883] Updated weights for policy 0, policy_version 40401 (0.0026) +[2024-06-18 02:32:41,995][12645] Fps is (10 sec: 40956.3, 60 sec: 42324.7, 300 sec: 41876.3). Total num frames: 662077440. Throughput: 0: 42163.6. Samples: 662237460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 02:32:41,995][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 02:32:42,457][12883] Updated weights for policy 0, policy_version 40411 (0.0037) +[2024-06-18 02:32:46,059][12883] Updated weights for policy 0, policy_version 40421 (0.0025) +[2024-06-18 02:32:46,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 662323200. Throughput: 0: 42312.8. Samples: 662368960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 02:32:46,994][12645] Avg episode reward: [(0, '0.045')] +[2024-06-18 02:32:50,163][12883] Updated weights for policy 0, policy_version 40431 (0.0024) +[2024-06-18 02:32:51,994][12645] Fps is (10 sec: 39325.0, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 662470656. Throughput: 0: 42072.9. Samples: 662615280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 02:32:51,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 02:32:53,640][12883] Updated weights for policy 0, policy_version 40441 (0.0044) +[2024-06-18 02:32:56,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 662700032. Throughput: 0: 42103.6. Samples: 662865160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 02:32:56,994][12645] Avg episode reward: [(0, '0.055')] +[2024-06-18 02:32:57,947][12883] Updated weights for policy 0, policy_version 40451 (0.0039) +[2024-06-18 02:33:01,503][12883] Updated weights for policy 0, policy_version 40461 (0.0034) +[2024-06-18 02:33:01,994][12645] Fps is (10 sec: 47514.6, 60 sec: 42327.0, 300 sec: 42098.6). Total num frames: 662945792. Throughput: 0: 42272.7. Samples: 662999240. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 02:33:01,994][12645] Avg episode reward: [(0, '0.090')] +[2024-06-18 02:33:05,644][12883] Updated weights for policy 0, policy_version 40471 (0.0037) +[2024-06-18 02:33:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 663126016. Throughput: 0: 42094.8. Samples: 663248980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 02:33:06,994][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 02:33:09,358][12883] Updated weights for policy 0, policy_version 40481 (0.0028) +[2024-06-18 02:33:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 663339008. Throughput: 0: 42197.9. Samples: 663499300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 02:33:11,994][12645] Avg episode reward: [(0, '0.043')] +[2024-06-18 02:33:13,491][12883] Updated weights for policy 0, policy_version 40491 (0.0038) +[2024-06-18 02:33:16,209][12862] Signal inference workers to stop experience collection... (9450 times) +[2024-06-18 02:33:16,215][12862] Signal inference workers to resume experience collection... (9450 times) +[2024-06-18 02:33:16,242][12883] InferenceWorker_p0-w0: stopping experience collection (9450 times) +[2024-06-18 02:33:16,242][12883] InferenceWorker_p0-w0: resuming experience collection (9450 times) +[2024-06-18 02:33:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 663552000. Throughput: 0: 42174.2. Samples: 663626000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 02:33:16,994][12645] Avg episode reward: [(0, '0.101')] +[2024-06-18 02:33:17,023][12883] Updated weights for policy 0, policy_version 40501 (0.0030) +[2024-06-18 02:33:21,195][12883] Updated weights for policy 0, policy_version 40511 (0.0034) +[2024-06-18 02:33:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 41765.3). Total num frames: 663748608. 
Throughput: 0: 42132.5. Samples: 663879760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 02:33:21,994][12645] Avg episode reward: [(0, '0.049')] +[2024-06-18 02:33:24,688][12883] Updated weights for policy 0, policy_version 40521 (0.0037) +[2024-06-18 02:33:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 663977984. Throughput: 0: 42128.4. Samples: 664133200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 02:33:26,994][12645] Avg episode reward: [(0, '0.144')] +[2024-06-18 02:33:29,080][12883] Updated weights for policy 0, policy_version 40531 (0.0034) +[2024-06-18 02:33:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41987.8). Total num frames: 664190976. Throughput: 0: 42089.4. Samples: 664262980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:33:31,994][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 02:33:32,334][12883] Updated weights for policy 0, policy_version 40541 (0.0037) +[2024-06-18 02:33:36,850][12883] Updated weights for policy 0, policy_version 40551 (0.0049) +[2024-06-18 02:33:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 664387584. Throughput: 0: 41998.7. Samples: 664505220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:33:36,994][12645] Avg episode reward: [(0, '0.074')] +[2024-06-18 02:33:40,341][12883] Updated weights for policy 0, policy_version 40561 (0.0041) +[2024-06-18 02:33:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42053.0, 300 sec: 42043.0). Total num frames: 664600576. Throughput: 0: 42139.6. Samples: 664761440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:33:41,994][12645] Avg episode reward: [(0, '0.088')] +[2024-06-18 02:33:44,494][12883] Updated weights for policy 0, policy_version 40571 (0.0034) +[2024-06-18 02:33:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41932.0). Total num frames: 664797184. 
Throughput: 0: 41926.6. Samples: 664885940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:33:46,994][12645] Avg episode reward: [(0, '0.037')] +[2024-06-18 02:33:47,853][12883] Updated weights for policy 0, policy_version 40581 (0.0031) +[2024-06-18 02:33:51,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42596.9, 300 sec: 42042.7). Total num frames: 665026560. Throughput: 0: 42126.3. Samples: 665144760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:33:51,996][12645] Avg episode reward: [(0, '0.108')] +[2024-06-18 02:33:52,186][12883] Updated weights for policy 0, policy_version 40591 (0.0036) +[2024-06-18 02:33:55,736][12883] Updated weights for policy 0, policy_version 40601 (0.0034) +[2024-06-18 02:33:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 665239552. Throughput: 0: 42036.0. Samples: 665390920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:33:56,994][12645] Avg episode reward: [(0, '0.101')] +[2024-06-18 02:33:59,983][12883] Updated weights for policy 0, policy_version 40611 (0.0030) +[2024-06-18 02:34:01,996][12645] Fps is (10 sec: 42598.3, 60 sec: 41777.5, 300 sec: 41987.2). Total num frames: 665452544. Throughput: 0: 42109.0. Samples: 665521000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 02:34:01,996][12645] Avg episode reward: [(0, '0.031')] +[2024-06-18 02:34:03,533][12883] Updated weights for policy 0, policy_version 40621 (0.0027) +[2024-06-18 02:34:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 665649152. Throughput: 0: 42101.8. Samples: 665774340. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 02:34:06,994][12645] Avg episode reward: [(0, '0.064')] +[2024-06-18 02:34:07,812][12883] Updated weights for policy 0, policy_version 40631 (0.0039) +[2024-06-18 02:34:11,166][12883] Updated weights for policy 0, policy_version 40641 (0.0040) +[2024-06-18 02:34:11,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42325.3, 300 sec: 41987.8). Total num frames: 665878528. Throughput: 0: 42037.8. Samples: 666024900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 02:34:11,994][12645] Avg episode reward: [(0, '0.066')] +[2024-06-18 02:34:15,559][12883] Updated weights for policy 0, policy_version 40651 (0.0027) +[2024-06-18 02:34:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 666075136. Throughput: 0: 42088.0. Samples: 666156940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 02:34:16,994][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 02:34:19,276][12883] Updated weights for policy 0, policy_version 40661 (0.0030) +[2024-06-18 02:34:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.1, 300 sec: 41987.4). Total num frames: 666271744. Throughput: 0: 42344.4. Samples: 666410720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 02:34:21,994][12645] Avg episode reward: [(0, '0.081')] +[2024-06-18 02:34:23,732][12883] Updated weights for policy 0, policy_version 40671 (0.0038) +[2024-06-18 02:34:26,985][12883] Updated weights for policy 0, policy_version 40681 (0.0036) +[2024-06-18 02:34:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 666517504. Throughput: 0: 42295.5. Samples: 666664740. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 02:34:26,994][12645] Avg episode reward: [(0, '0.042')] +[2024-06-18 02:34:31,317][12883] Updated weights for policy 0, policy_version 40691 (0.0049) +[2024-06-18 02:34:31,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 666714112. Throughput: 0: 42414.3. Samples: 666794580. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 02:34:31,994][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 02:34:32,100][12862] Signal inference workers to stop experience collection... (9500 times) +[2024-06-18 02:34:32,101][12862] Signal inference workers to resume experience collection... (9500 times) +[2024-06-18 02:34:32,118][12883] InferenceWorker_p0-w0: stopping experience collection (9500 times) +[2024-06-18 02:34:32,119][12883] InferenceWorker_p0-w0: resuming experience collection (9500 times) +[2024-06-18 02:34:34,642][12883] Updated weights for policy 0, policy_version 40701 (0.0035) +[2024-06-18 02:34:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 666927104. Throughput: 0: 42235.0. Samples: 667045240. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 02:34:36,994][12645] Avg episode reward: [(0, '0.092')] +[2024-06-18 02:34:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000040706_666927104.pth... +[2024-06-18 02:34:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000040090_656834560.pth +[2024-06-18 02:34:38,854][12883] Updated weights for policy 0, policy_version 40711 (0.0031) +[2024-06-18 02:34:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42043.7). Total num frames: 667140096. Throughput: 0: 42346.6. Samples: 667296520. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 02:34:41,994][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 02:34:42,581][12883] Updated weights for policy 0, policy_version 40721 (0.0028) +[2024-06-18 02:34:46,712][12883] Updated weights for policy 0, policy_version 40731 (0.0036) +[2024-06-18 02:34:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 667336704. Throughput: 0: 42356.7. Samples: 667426960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 02:34:46,994][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 02:34:50,318][12883] Updated weights for policy 0, policy_version 40741 (0.0034) +[2024-06-18 02:34:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42053.9, 300 sec: 41987.5). Total num frames: 667549696. Throughput: 0: 42282.6. Samples: 667677060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:34:51,994][12645] Avg episode reward: [(0, '0.074')] +[2024-06-18 02:34:54,430][12883] Updated weights for policy 0, policy_version 40751 (0.0025) +[2024-06-18 02:34:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 667779072. Throughput: 0: 42299.0. Samples: 667928360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:34:56,994][12645] Avg episode reward: [(0, '0.073')] +[2024-06-18 02:34:57,991][12883] Updated weights for policy 0, policy_version 40761 (0.0039) +[2024-06-18 02:35:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42327.0, 300 sec: 42154.1). Total num frames: 667992064. Throughput: 0: 42309.8. Samples: 668060880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:35:01,994][12645] Avg episode reward: [(0, '0.041')] +[2024-06-18 02:35:01,997][12883] Updated weights for policy 0, policy_version 40771 (0.0033) +[2024-06-18 02:35:05,488][12883] Updated weights for policy 0, policy_version 40781 (0.0040) +[2024-06-18 02:35:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 668188672. Throughput: 0: 42228.5. Samples: 668311000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:35:06,994][12645] Avg episode reward: [(0, '0.040')] +[2024-06-18 02:35:09,544][12883] Updated weights for policy 0, policy_version 40791 (0.0039) +[2024-06-18 02:35:11,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 668418048. Throughput: 0: 42271.5. Samples: 668566960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 02:35:11,999][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 02:35:13,557][12883] Updated weights for policy 0, policy_version 40801 (0.0036) +[2024-06-18 02:35:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 668631040. Throughput: 0: 42236.9. Samples: 668695240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 02:35:16,994][12645] Avg episode reward: [(0, '0.013')] +[2024-06-18 02:35:17,246][12883] Updated weights for policy 0, policy_version 40811 (0.0040) +[2024-06-18 02:35:21,339][12883] Updated weights for policy 0, policy_version 40821 (0.0028) +[2024-06-18 02:35:21,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42869.9, 300 sec: 42098.2). Total num frames: 668844032. Throughput: 0: 42325.5. Samples: 668949980. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 02:35:21,997][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 02:35:25,043][12883] Updated weights for policy 0, policy_version 40831 (0.0029) +[2024-06-18 02:35:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 669057024. Throughput: 0: 42286.7. Samples: 669199420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 02:35:26,994][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 02:35:29,057][12883] Updated weights for policy 0, policy_version 40841 (0.0041) +[2024-06-18 02:35:31,994][12645] Fps is (10 sec: 39330.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 669237248. Throughput: 0: 42250.3. Samples: 669328220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 02:35:31,994][12645] Avg episode reward: [(0, '0.126')] +[2024-06-18 02:35:32,597][12883] Updated weights for policy 0, policy_version 40851 (0.0035) +[2024-06-18 02:35:36,836][12883] Updated weights for policy 0, policy_version 40861 (0.0044) +[2024-06-18 02:35:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 669466624. Throughput: 0: 42426.2. Samples: 669586240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 02:35:36,994][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 02:35:40,489][12883] Updated weights for policy 0, policy_version 40871 (0.0033) +[2024-06-18 02:35:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 669679616. Throughput: 0: 42303.7. Samples: 669832020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 02:35:41,994][12645] Avg episode reward: [(0, '0.071')] +[2024-06-18 02:35:44,622][12883] Updated weights for policy 0, policy_version 40881 (0.0028) +[2024-06-18 02:35:46,996][12645] Fps is (10 sec: 40951.1, 60 sec: 42323.8, 300 sec: 42098.2). Total num frames: 669876224. Throughput: 0: 42337.4. Samples: 669966160. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 02:35:46,996][12645] Avg episode reward: [(0, '0.055')] +[2024-06-18 02:35:48,370][12883] Updated weights for policy 0, policy_version 40891 (0.0033) +[2024-06-18 02:35:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 670089216. Throughput: 0: 42424.1. Samples: 670220080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 02:35:51,994][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 02:35:52,503][12883] Updated weights for policy 0, policy_version 40901 (0.0038) +[2024-06-18 02:35:56,174][12883] Updated weights for policy 0, policy_version 40911 (0.0034) +[2024-06-18 02:35:56,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 670318592. Throughput: 0: 42240.9. Samples: 670467800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 02:35:56,994][12645] Avg episode reward: [(0, '0.049')] +[2024-06-18 02:36:00,250][12883] Updated weights for policy 0, policy_version 40921 (0.0041) +[2024-06-18 02:36:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 670531584. Throughput: 0: 42393.3. Samples: 670602940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 02:36:01,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 02:36:03,728][12883] Updated weights for policy 0, policy_version 40931 (0.0029) +[2024-06-18 02:36:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 670711808. Throughput: 0: 42172.3. Samples: 670847640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 02:36:06,994][12645] Avg episode reward: [(0, '0.124')] +[2024-06-18 02:36:08,359][12883] Updated weights for policy 0, policy_version 40941 (0.0033) +[2024-06-18 02:36:10,224][12862] Signal inference workers to stop experience collection... 
(9550 times) +[2024-06-18 02:36:10,264][12883] InferenceWorker_p0-w0: stopping experience collection (9550 times) +[2024-06-18 02:36:10,271][12862] Signal inference workers to resume experience collection... (9550 times) +[2024-06-18 02:36:10,281][12883] InferenceWorker_p0-w0: resuming experience collection (9550 times) +[2024-06-18 02:36:11,856][12883] Updated weights for policy 0, policy_version 40951 (0.0039) +[2024-06-18 02:36:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42209.9). Total num frames: 670941184. Throughput: 0: 42227.5. Samples: 671099660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 02:36:11,994][12645] Avg episode reward: [(0, '0.081')] +[2024-06-18 02:36:16,015][12883] Updated weights for policy 0, policy_version 40961 (0.0041) +[2024-06-18 02:36:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 671170560. Throughput: 0: 42233.3. Samples: 671228720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 02:36:16,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 02:36:19,765][12883] Updated weights for policy 0, policy_version 40971 (0.0037) +[2024-06-18 02:36:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42053.7, 300 sec: 42265.1). Total num frames: 671367168. Throughput: 0: 42015.0. Samples: 671476920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 02:36:21,994][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 02:36:23,809][12883] Updated weights for policy 0, policy_version 40981 (0.0032) +[2024-06-18 02:36:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42210.0). Total num frames: 671580160. Throughput: 0: 42224.4. Samples: 671732120. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 02:36:26,994][12645] Avg episode reward: [(0, '0.151')] +[2024-06-18 02:36:27,564][12883] Updated weights for policy 0, policy_version 40991 (0.0037) +[2024-06-18 02:36:31,423][12883] Updated weights for policy 0, policy_version 41001 (0.0042) +[2024-06-18 02:36:31,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42154.4). Total num frames: 671776768. Throughput: 0: 42087.4. Samples: 671860000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 02:36:31,994][12645] Avg episode reward: [(0, '0.074')] +[2024-06-18 02:36:35,240][12883] Updated weights for policy 0, policy_version 41011 (0.0047) +[2024-06-18 02:36:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 671989760. Throughput: 0: 42134.7. Samples: 672116140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 02:36:36,994][12645] Avg episode reward: [(0, '0.103')] +[2024-06-18 02:36:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041016_672006144.pth... +[2024-06-18 02:36:37,070][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000040397_661864448.pth +[2024-06-18 02:36:39,422][12883] Updated weights for policy 0, policy_version 41021 (0.0041) +[2024-06-18 02:36:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 672219136. Throughput: 0: 42123.6. Samples: 672363360. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 02:36:41,994][12645] Avg episode reward: [(0, '0.095')] +[2024-06-18 02:36:42,809][12883] Updated weights for policy 0, policy_version 41031 (0.0032) +[2024-06-18 02:36:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42053.8, 300 sec: 42154.1). Total num frames: 672399360. Throughput: 0: 42026.2. Samples: 672494120. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 02:36:46,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 02:36:47,010][12883] Updated weights for policy 0, policy_version 41041 (0.0023) +[2024-06-18 02:36:50,938][12883] Updated weights for policy 0, policy_version 41051 (0.0038) +[2024-06-18 02:36:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 672628736. Throughput: 0: 42059.2. Samples: 672740300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 02:36:51,994][12645] Avg episode reward: [(0, '0.082')] +[2024-06-18 02:36:54,627][12883] Updated weights for policy 0, policy_version 41061 (0.0040) +[2024-06-18 02:36:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42154.4). Total num frames: 672841728. Throughput: 0: 42105.9. Samples: 672994420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 02:36:56,994][12645] Avg episode reward: [(0, '0.084')] +[2024-06-18 02:36:58,639][12883] Updated weights for policy 0, policy_version 41071 (0.0035) +[2024-06-18 02:37:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 673038336. Throughput: 0: 42071.0. Samples: 673121920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:37:01,994][12645] Avg episode reward: [(0, '0.084')] +[2024-06-18 02:37:02,438][12883] Updated weights for policy 0, policy_version 41081 (0.0033) +[2024-06-18 02:37:06,389][12883] Updated weights for policy 0, policy_version 41091 (0.0034) +[2024-06-18 02:37:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 673267712. Throughput: 0: 42109.9. Samples: 673371860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:37:06,994][12645] Avg episode reward: [(0, '0.098')] +[2024-06-18 02:37:10,351][12883] Updated weights for policy 0, policy_version 41101 (0.0046) +[2024-06-18 02:37:11,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 673480704. Throughput: 0: 41867.9. Samples: 673616180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:37:11,994][12645] Avg episode reward: [(0, '0.104')] +[2024-06-18 02:37:14,127][12883] Updated weights for policy 0, policy_version 41111 (0.0025) +[2024-06-18 02:37:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 42209.7). Total num frames: 673660928. Throughput: 0: 41934.3. Samples: 673747040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:37:16,994][12645] Avg episode reward: [(0, '0.090')] +[2024-06-18 02:37:18,180][12883] Updated weights for policy 0, policy_version 41121 (0.0036) +[2024-06-18 02:37:21,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 673873920. Throughput: 0: 41863.5. Samples: 674000000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 02:37:21,994][12645] Avg episode reward: [(0, '0.083')] +[2024-06-18 02:37:22,434][12883] Updated weights for policy 0, policy_version 41131 (0.0037) +[2024-06-18 02:37:25,881][12883] Updated weights for policy 0, policy_version 41141 (0.0039) +[2024-06-18 02:37:26,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 674119680. Throughput: 0: 41915.4. Samples: 674249560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 02:37:26,994][12645] Avg episode reward: [(0, '0.146')] +[2024-06-18 02:37:30,173][12883] Updated weights for policy 0, policy_version 41151 (0.0033) +[2024-06-18 02:37:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 674299904. Throughput: 0: 42017.3. Samples: 674384900. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 02:37:31,994][12645] Avg episode reward: [(0, '0.095')] +[2024-06-18 02:37:33,751][12883] Updated weights for policy 0, policy_version 41161 (0.0034) +[2024-06-18 02:37:34,479][12862] Signal inference workers to stop experience collection... (9600 times) +[2024-06-18 02:37:34,480][12862] Signal inference workers to resume experience collection... (9600 times) +[2024-06-18 02:37:34,504][12883] InferenceWorker_p0-w0: stopping experience collection (9600 times) +[2024-06-18 02:37:34,505][12883] InferenceWorker_p0-w0: resuming experience collection (9600 times) +[2024-06-18 02:37:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42209.8). Total num frames: 674529280. Throughput: 0: 41992.0. Samples: 674629940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 02:37:36,994][12645] Avg episode reward: [(0, '0.059')] +[2024-06-18 02:37:37,732][12883] Updated weights for policy 0, policy_version 41171 (0.0048) +[2024-06-18 02:37:41,489][12883] Updated weights for policy 0, policy_version 41181 (0.0022) +[2024-06-18 02:37:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 674725888. Throughput: 0: 42074.7. Samples: 674887780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 02:37:41,994][12645] Avg episode reward: [(0, '0.048')] +[2024-06-18 02:37:45,748][12883] Updated weights for policy 0, policy_version 41191 (0.0042) +[2024-06-18 02:37:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 674922496. Throughput: 0: 41896.5. Samples: 675007260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 02:37:46,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 02:37:49,444][12883] Updated weights for policy 0, policy_version 41201 (0.0030) +[2024-06-18 02:37:51,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 675184640. 
Throughput: 0: 41918.1. Samples: 675258180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 02:37:51,994][12645] Avg episode reward: [(0, '0.071')] +[2024-06-18 02:37:53,456][12883] Updated weights for policy 0, policy_version 41211 (0.0039) +[2024-06-18 02:37:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.0, 300 sec: 41987.4). Total num frames: 675332096. Throughput: 0: 42375.1. Samples: 675523060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 02:37:56,994][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 02:37:57,436][12883] Updated weights for policy 0, policy_version 41221 (0.0034) +[2024-06-18 02:38:00,996][12883] Updated weights for policy 0, policy_version 41231 (0.0042) +[2024-06-18 02:38:01,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 675561472. Throughput: 0: 41891.4. Samples: 675632160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 02:38:02,000][12645] Avg episode reward: [(0, '0.067')] +[2024-06-18 02:38:05,147][12883] Updated weights for policy 0, policy_version 41241 (0.0033) +[2024-06-18 02:38:06,994][12645] Fps is (10 sec: 49152.6, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 675823616. Throughput: 0: 42217.3. Samples: 675899780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 02:38:06,994][12645] Avg episode reward: [(0, '0.073')] +[2024-06-18 02:38:08,693][12883] Updated weights for policy 0, policy_version 41251 (0.0026) +[2024-06-18 02:38:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 675954688. Throughput: 0: 42495.2. Samples: 676161840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 02:38:11,994][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 02:38:12,828][12883] Updated weights for policy 0, policy_version 41261 (0.0033) +[2024-06-18 02:38:16,736][12883] Updated weights for policy 0, policy_version 41271 (0.0047) +[2024-06-18 02:38:16,994][12645] Fps is (10 sec: 36044.3, 60 sec: 42052.1, 300 sec: 42154.1). Total num frames: 676184064. Throughput: 0: 41972.7. Samples: 676273680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 02:38:16,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 02:38:20,517][12883] Updated weights for policy 0, policy_version 41281 (0.0035) +[2024-06-18 02:38:21,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 676429824. Throughput: 0: 42452.4. Samples: 676540300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 02:38:21,994][12645] Avg episode reward: [(0, '0.081')] +[2024-06-18 02:38:24,229][12883] Updated weights for policy 0, policy_version 41291 (0.0040) +[2024-06-18 02:38:26,994][12645] Fps is (10 sec: 42599.3, 60 sec: 41506.2, 300 sec: 42098.5). Total num frames: 676610048. Throughput: 0: 42447.5. Samples: 676797920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 02:38:26,994][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 02:38:28,087][12883] Updated weights for policy 0, policy_version 41301 (0.0043) +[2024-06-18 02:38:31,617][12883] Updated weights for policy 0, policy_version 41311 (0.0026) +[2024-06-18 02:38:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 676839424. Throughput: 0: 42436.1. Samples: 676916880. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 02:38:31,994][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 02:38:35,673][12883] Updated weights for policy 0, policy_version 41321 (0.0029) +[2024-06-18 02:38:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 677052416. Throughput: 0: 42498.7. Samples: 677170620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 02:38:36,994][12645] Avg episode reward: [(0, '0.076')] +[2024-06-18 02:38:37,124][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041325_677068800.pth... +[2024-06-18 02:38:37,176][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000040706_666927104.pth +[2024-06-18 02:38:38,074][12862] Signal inference workers to stop experience collection... (9650 times) +[2024-06-18 02:38:38,074][12862] Signal inference workers to resume experience collection... (9650 times) +[2024-06-18 02:38:38,118][12883] InferenceWorker_p0-w0: stopping experience collection (9650 times) +[2024-06-18 02:38:38,118][12883] InferenceWorker_p0-w0: resuming experience collection (9650 times) +[2024-06-18 02:38:39,355][12883] Updated weights for policy 0, policy_version 41331 (0.0035) +[2024-06-18 02:38:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 677249024. Throughput: 0: 42256.2. Samples: 677424580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 02:38:41,994][12645] Avg episode reward: [(0, '0.155')] +[2024-06-18 02:38:43,395][12883] Updated weights for policy 0, policy_version 41341 (0.0030) +[2024-06-18 02:38:46,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42154.4). Total num frames: 677462016. Throughput: 0: 42540.1. Samples: 677546460. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 02:38:46,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:38:47,547][12883] Updated weights for policy 0, policy_version 41351 (0.0037) +[2024-06-18 02:38:50,992][12883] Updated weights for policy 0, policy_version 41361 (0.0042) +[2024-06-18 02:38:51,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 677707776. Throughput: 0: 42469.8. Samples: 677810920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 02:38:51,994][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 02:38:55,007][12883] Updated weights for policy 0, policy_version 41371 (0.0030) +[2024-06-18 02:38:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42210.0). Total num frames: 677904384. Throughput: 0: 42178.3. Samples: 678059860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 02:38:56,994][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 02:38:58,584][12883] Updated weights for policy 0, policy_version 41381 (0.0030) +[2024-06-18 02:39:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 678100992. Throughput: 0: 42482.3. Samples: 678185380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 02:39:01,994][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 02:39:02,936][12883] Updated weights for policy 0, policy_version 41391 (0.0038) +[2024-06-18 02:39:06,281][12883] Updated weights for policy 0, policy_version 41401 (0.0032) +[2024-06-18 02:39:06,994][12645] Fps is (10 sec: 42597.5, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 678330368. Throughput: 0: 42280.8. Samples: 678442940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 02:39:06,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 02:39:07,004][12862] Saving new best policy, reward=0.193! 
+[2024-06-18 02:39:10,841][12883] Updated weights for policy 0, policy_version 41411 (0.0031) +[2024-06-18 02:39:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42265.1). Total num frames: 678543360. Throughput: 0: 42078.5. Samples: 678691460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 02:39:11,994][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 02:39:13,903][12883] Updated weights for policy 0, policy_version 41421 (0.0031) +[2024-06-18 02:39:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 678739968. Throughput: 0: 42339.9. Samples: 678822180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 02:39:16,994][12645] Avg episode reward: [(0, '0.031')] +[2024-06-18 02:39:18,519][12883] Updated weights for policy 0, policy_version 41431 (0.0041) +[2024-06-18 02:39:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 678952960. Throughput: 0: 42229.8. Samples: 679070960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 02:39:21,996][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 02:39:22,115][12883] Updated weights for policy 0, policy_version 41441 (0.0040) +[2024-06-18 02:39:26,313][12883] Updated weights for policy 0, policy_version 41451 (0.0046) +[2024-06-18 02:39:26,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 679165952. Throughput: 0: 42324.5. Samples: 679329180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 02:39:26,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 02:39:29,636][12883] Updated weights for policy 0, policy_version 41461 (0.0032) +[2024-06-18 02:39:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 679362560. Throughput: 0: 42357.6. Samples: 679452560. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 02:39:31,994][12645] Avg episode reward: [(0, '0.068')] +[2024-06-18 02:39:33,907][12883] Updated weights for policy 0, policy_version 41471 (0.0034) +[2024-06-18 02:39:36,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42323.8, 300 sec: 42209.3). Total num frames: 679591936. Throughput: 0: 42162.3. Samples: 679708320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 02:39:36,996][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 02:39:37,363][12883] Updated weights for policy 0, policy_version 41481 (0.0037) +[2024-06-18 02:39:41,631][12883] Updated weights for policy 0, policy_version 41491 (0.0044) +[2024-06-18 02:39:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 679788544. Throughput: 0: 42252.7. Samples: 679961240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:39:41,994][12645] Avg episode reward: [(0, '0.083')] +[2024-06-18 02:39:45,365][12883] Updated weights for policy 0, policy_version 41501 (0.0032) +[2024-06-18 02:39:46,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 679985152. Throughput: 0: 42256.5. Samples: 680086920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:39:46,994][12645] Avg episode reward: [(0, '0.028')] +[2024-06-18 02:39:49,279][12883] Updated weights for policy 0, policy_version 41511 (0.0044) +[2024-06-18 02:39:51,998][12645] Fps is (10 sec: 44219.9, 60 sec: 42049.5, 300 sec: 42209.1). Total num frames: 680230912. Throughput: 0: 42197.7. Samples: 680342000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:39:51,998][12645] Avg episode reward: [(0, '0.125')] +[2024-06-18 02:39:52,961][12883] Updated weights for policy 0, policy_version 41521 (0.0036) +[2024-06-18 02:39:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 680427520. Throughput: 0: 42247.2. Samples: 680592580. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:39:56,995][12645] Avg episode reward: [(0, '0.120')] +[2024-06-18 02:39:57,127][12883] Updated weights for policy 0, policy_version 41531 (0.0040) +[2024-06-18 02:40:01,280][12883] Updated weights for policy 0, policy_version 41541 (0.0036) +[2024-06-18 02:40:01,996][12645] Fps is (10 sec: 40967.1, 60 sec: 42323.8, 300 sec: 42209.3). Total num frames: 680640512. Throughput: 0: 42157.6. Samples: 680719360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 02:40:01,996][12645] Avg episode reward: [(0, '0.101')] +[2024-06-18 02:40:04,886][12883] Updated weights for policy 0, policy_version 41551 (0.0036) +[2024-06-18 02:40:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 680853504. Throughput: 0: 42291.2. Samples: 680974060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) +[2024-06-18 02:40:06,994][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 02:40:09,140][12883] Updated weights for policy 0, policy_version 41561 (0.0024) +[2024-06-18 02:40:11,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 681066496. Throughput: 0: 42135.5. Samples: 681225280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) +[2024-06-18 02:40:11,994][12645] Avg episode reward: [(0, '0.072')] +[2024-06-18 02:40:12,694][12883] Updated weights for policy 0, policy_version 41571 (0.0038) +[2024-06-18 02:40:15,970][12862] Signal inference workers to stop experience collection... (9700 times) +[2024-06-18 02:40:15,970][12862] Signal inference workers to resume experience collection... 
(9700 times) +[2024-06-18 02:40:16,010][12883] InferenceWorker_p0-w0: stopping experience collection (9700 times) +[2024-06-18 02:40:16,010][12883] InferenceWorker_p0-w0: resuming experience collection (9700 times) +[2024-06-18 02:40:16,912][12883] Updated weights for policy 0, policy_version 41581 (0.0045) +[2024-06-18 02:40:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.4, 300 sec: 42098.9). Total num frames: 681263104. Throughput: 0: 42093.0. Samples: 681346740. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) +[2024-06-18 02:40:16,994][12645] Avg episode reward: [(0, '0.093')] +[2024-06-18 02:40:20,499][12883] Updated weights for policy 0, policy_version 41591 (0.0038) +[2024-06-18 02:40:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 681492480. Throughput: 0: 42029.3. Samples: 681599540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) +[2024-06-18 02:40:21,994][12645] Avg episode reward: [(0, '0.063')] +[2024-06-18 02:40:24,567][12883] Updated weights for policy 0, policy_version 41601 (0.0033) +[2024-06-18 02:40:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 681705472. Throughput: 0: 42141.5. Samples: 681857600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) +[2024-06-18 02:40:26,994][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 02:40:28,366][12883] Updated weights for policy 0, policy_version 41611 (0.0034) +[2024-06-18 02:40:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.5, 300 sec: 42154.1). Total num frames: 681902080. Throughput: 0: 42017.8. Samples: 681977720. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:40:31,994][12645] Avg episode reward: [(0, '0.123')] +[2024-06-18 02:40:32,292][12883] Updated weights for policy 0, policy_version 41621 (0.0031) +[2024-06-18 02:40:36,147][12883] Updated weights for policy 0, policy_version 41631 (0.0034) +[2024-06-18 02:40:36,997][12645] Fps is (10 sec: 40945.7, 60 sec: 42051.4, 300 sec: 42153.6). Total num frames: 682115072. Throughput: 0: 42078.3. Samples: 682235500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:40:36,998][12645] Avg episode reward: [(0, '0.028')] +[2024-06-18 02:40:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041633_682115072.pth... +[2024-06-18 02:40:37,090][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041016_672006144.pth +[2024-06-18 02:40:39,744][12883] Updated weights for policy 0, policy_version 41641 (0.0033) +[2024-06-18 02:40:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.5, 300 sec: 42210.0). Total num frames: 682328064. Throughput: 0: 42113.8. Samples: 682487700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:40:41,994][12645] Avg episode reward: [(0, '0.084')] +[2024-06-18 02:40:43,923][12883] Updated weights for policy 0, policy_version 41651 (0.0031) +[2024-06-18 02:40:46,994][12645] Fps is (10 sec: 44252.0, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 682557440. Throughput: 0: 42133.2. Samples: 682615260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:40:46,994][12645] Avg episode reward: [(0, '0.050')] +[2024-06-18 02:40:47,490][12883] Updated weights for policy 0, policy_version 41661 (0.0027) +[2024-06-18 02:40:51,885][12883] Updated weights for policy 0, policy_version 41671 (0.0030) +[2024-06-18 02:40:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41782.0, 300 sec: 42098.6). Total num frames: 682737664. Throughput: 0: 41993.8. Samples: 682863780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:40:51,994][12645] Avg episode reward: [(0, '0.106')] +[2024-06-18 02:40:55,405][12883] Updated weights for policy 0, policy_version 41681 (0.0029) +[2024-06-18 02:40:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 682950656. Throughput: 0: 42007.2. Samples: 683115600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:40:56,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 02:40:59,570][12883] Updated weights for policy 0, policy_version 41691 (0.0049) +[2024-06-18 02:41:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42326.9, 300 sec: 42265.2). Total num frames: 683180032. Throughput: 0: 42177.3. Samples: 683244720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:41:01,994][12645] Avg episode reward: [(0, '0.133')] +[2024-06-18 02:41:03,238][12883] Updated weights for policy 0, policy_version 41701 (0.0028) +[2024-06-18 02:41:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 683360256. Throughput: 0: 42180.4. Samples: 683497660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:41:06,994][12645] Avg episode reward: [(0, '0.209')] +[2024-06-18 02:41:07,064][12862] Saving new best policy, reward=0.209! +[2024-06-18 02:41:07,390][12883] Updated weights for policy 0, policy_version 41711 (0.0024) +[2024-06-18 02:41:11,038][12883] Updated weights for policy 0, policy_version 41721 (0.0036) +[2024-06-18 02:41:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 683573248. Throughput: 0: 41980.8. Samples: 683746740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:41:11,994][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 02:41:15,339][12883] Updated weights for policy 0, policy_version 41731 (0.0034) +[2024-06-18 02:41:16,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42209.6). 
Total num frames: 683819008. Throughput: 0: 42151.9. Samples: 683874560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:41:16,994][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 02:41:18,723][12883] Updated weights for policy 0, policy_version 41741 (0.0034) +[2024-06-18 02:41:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 684015616. Throughput: 0: 41999.2. Samples: 684125320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 02:41:21,994][12645] Avg episode reward: [(0, '0.053')] +[2024-06-18 02:41:23,257][12883] Updated weights for policy 0, policy_version 41751 (0.0039) +[2024-06-18 02:41:26,377][12883] Updated weights for policy 0, policy_version 41761 (0.0044) +[2024-06-18 02:41:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 684228608. Throughput: 0: 41946.5. Samples: 684375300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 02:41:26,994][12645] Avg episode reward: [(0, '0.060')] +[2024-06-18 02:41:31,065][12883] Updated weights for policy 0, policy_version 41771 (0.0033) +[2024-06-18 02:41:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 684425216. Throughput: 0: 41972.9. Samples: 684504040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 02:41:31,994][12645] Avg episode reward: [(0, '0.104')] +[2024-06-18 02:41:34,348][12883] Updated weights for policy 0, policy_version 41781 (0.0047) +[2024-06-18 02:41:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42327.8, 300 sec: 42154.1). Total num frames: 684654592. Throughput: 0: 42066.2. Samples: 684756760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 02:41:36,994][12645] Avg episode reward: [(0, '0.051')] +[2024-06-18 02:41:38,758][12883] Updated weights for policy 0, policy_version 41791 (0.0035) +[2024-06-18 02:41:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42209.6). 
Total num frames: 684851200. Throughput: 0: 42071.6. Samples: 685008820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 02:41:41,994][12645] Avg episode reward: [(0, '0.197')] +[2024-06-18 02:41:42,120][12883] Updated weights for policy 0, policy_version 41801 (0.0047) +[2024-06-18 02:41:46,354][12883] Updated weights for policy 0, policy_version 41811 (0.0029) +[2024-06-18 02:41:46,994][12645] Fps is (10 sec: 39320.9, 60 sec: 41506.0, 300 sec: 42098.5). Total num frames: 685047808. Throughput: 0: 41887.9. Samples: 685129680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 02:41:46,994][12645] Avg episode reward: [(0, '0.056')] +[2024-06-18 02:41:49,972][12883] Updated weights for policy 0, policy_version 41821 (0.0034) +[2024-06-18 02:41:51,160][12862] Signal inference workers to stop experience collection... (9750 times) +[2024-06-18 02:41:51,160][12862] Signal inference workers to resume experience collection... (9750 times) +[2024-06-18 02:41:51,173][12883] InferenceWorker_p0-w0: stopping experience collection (9750 times) +[2024-06-18 02:41:51,174][12883] InferenceWorker_p0-w0: resuming experience collection (9750 times) +[2024-06-18 02:41:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 685277184. Throughput: 0: 41962.7. Samples: 685385980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 02:41:51,994][12645] Avg episode reward: [(0, '0.040')] +[2024-06-18 02:41:54,069][12883] Updated weights for policy 0, policy_version 41831 (0.0037) +[2024-06-18 02:41:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 685473792. Throughput: 0: 42104.5. Samples: 685641440. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 02:41:56,994][12645] Avg episode reward: [(0, '0.016')] +[2024-06-18 02:41:58,045][12883] Updated weights for policy 0, policy_version 41841 (0.0035) +[2024-06-18 02:42:01,783][12883] Updated weights for policy 0, policy_version 41851 (0.0037) +[2024-06-18 02:42:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 685703168. Throughput: 0: 41858.2. Samples: 685758180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 02:42:01,994][12645] Avg episode reward: [(0, '0.138')] +[2024-06-18 02:42:05,646][12883] Updated weights for policy 0, policy_version 41861 (0.0030) +[2024-06-18 02:42:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 685916160. Throughput: 0: 41978.2. Samples: 686014340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 02:42:06,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 02:42:09,465][12883] Updated weights for policy 0, policy_version 41871 (0.0029) +[2024-06-18 02:42:11,998][12645] Fps is (10 sec: 39306.6, 60 sec: 42049.6, 300 sec: 42153.5). Total num frames: 686096384. Throughput: 0: 42147.6. Samples: 686272100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 02:42:11,998][12645] Avg episode reward: [(0, '0.051')] +[2024-06-18 02:42:13,463][12883] Updated weights for policy 0, policy_version 41881 (0.0026) +[2024-06-18 02:42:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 686325760. Throughput: 0: 41831.9. Samples: 686386480. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 02:42:16,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 02:42:17,207][12883] Updated weights for policy 0, policy_version 41891 (0.0029) +[2024-06-18 02:42:21,600][12883] Updated weights for policy 0, policy_version 41901 (0.0030) +[2024-06-18 02:42:21,994][12645] Fps is (10 sec: 42615.1, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 686522368. Throughput: 0: 41909.8. Samples: 686642700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 02:42:21,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 02:42:24,970][12883] Updated weights for policy 0, policy_version 41911 (0.0040) +[2024-06-18 02:42:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 686735360. Throughput: 0: 41875.9. Samples: 686893240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 02:42:26,994][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 02:42:29,285][12883] Updated weights for policy 0, policy_version 41921 (0.0033) +[2024-06-18 02:42:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 686964736. Throughput: 0: 41977.5. Samples: 687018660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 02:42:31,994][12645] Avg episode reward: [(0, '0.126')] +[2024-06-18 02:42:32,594][12883] Updated weights for policy 0, policy_version 41931 (0.0031) +[2024-06-18 02:42:36,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 687128576. Throughput: 0: 41962.3. Samples: 687274280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 02:42:36,994][12645] Avg episode reward: [(0, '0.058')] +[2024-06-18 02:42:37,166][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041941_687161344.pth... 
+[2024-06-18 02:42:37,170][12883] Updated weights for policy 0, policy_version 41941 (0.0037) +[2024-06-18 02:42:37,222][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041325_677068800.pth +[2024-06-18 02:42:40,398][12883] Updated weights for policy 0, policy_version 41951 (0.0027) +[2024-06-18 02:42:41,994][12645] Fps is (10 sec: 37681.3, 60 sec: 41505.7, 300 sec: 42098.5). Total num frames: 687341568. Throughput: 0: 41801.3. Samples: 687522520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 02:42:41,995][12645] Avg episode reward: [(0, '0.045')] +[2024-06-18 02:42:45,179][12883] Updated weights for policy 0, policy_version 41961 (0.0030) +[2024-06-18 02:42:46,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 687587328. Throughput: 0: 41995.2. Samples: 687647960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:42:46,994][12645] Avg episode reward: [(0, '0.107')] +[2024-06-18 02:42:48,142][12883] Updated weights for policy 0, policy_version 41971 (0.0036) +[2024-06-18 02:42:51,994][12645] Fps is (10 sec: 40961.6, 60 sec: 41233.0, 300 sec: 42098.6). Total num frames: 687751168. Throughput: 0: 41930.2. Samples: 687901200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:42:51,995][12645] Avg episode reward: [(0, '0.108')] +[2024-06-18 02:42:52,771][12883] Updated weights for policy 0, policy_version 41981 (0.0043) +[2024-06-18 02:42:56,128][12883] Updated weights for policy 0, policy_version 41991 (0.0032) +[2024-06-18 02:42:56,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 687980544. Throughput: 0: 41756.9. Samples: 688151000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:42:56,994][12645] Avg episode reward: [(0, '0.070')] +[2024-06-18 02:43:00,662][12883] Updated weights for policy 0, policy_version 42001 (0.0038) +[2024-06-18 02:43:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 688209920. Throughput: 0: 42094.2. Samples: 688280720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:43:01,995][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 02:43:04,297][12883] Updated weights for policy 0, policy_version 42011 (0.0032) +[2024-06-18 02:43:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 42154.1). Total num frames: 688390144. Throughput: 0: 41776.8. Samples: 688522660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:43:06,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 02:43:08,710][12883] Updated weights for policy 0, policy_version 42021 (0.0030) +[2024-06-18 02:43:10,289][12862] Signal inference workers to stop experience collection... (9800 times) +[2024-06-18 02:43:10,316][12883] InferenceWorker_p0-w0: stopping experience collection (9800 times) +[2024-06-18 02:43:10,400][12862] Signal inference workers to resume experience collection... (9800 times) +[2024-06-18 02:43:10,400][12883] InferenceWorker_p0-w0: resuming experience collection (9800 times) +[2024-06-18 02:43:11,935][12883] Updated weights for policy 0, policy_version 42031 (0.0022) +[2024-06-18 02:43:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42328.0, 300 sec: 42209.6). Total num frames: 688635904. Throughput: 0: 41777.7. Samples: 688773240. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) +[2024-06-18 02:43:11,994][12645] Avg episode reward: [(0, '0.119')] +[2024-06-18 02:43:16,690][12883] Updated weights for policy 0, policy_version 42041 (0.0034) +[2024-06-18 02:43:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 688816128. 
Throughput: 0: 41833.8. Samples: 688901180. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) +[2024-06-18 02:43:16,994][12645] Avg episode reward: [(0, '0.039')] +[2024-06-18 02:43:19,909][12883] Updated weights for policy 0, policy_version 42051 (0.0034) +[2024-06-18 02:43:21,996][12645] Fps is (10 sec: 39312.9, 60 sec: 41777.6, 300 sec: 42098.2). Total num frames: 689029120. Throughput: 0: 41624.5. Samples: 689147480. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) +[2024-06-18 02:43:21,997][12645] Avg episode reward: [(0, '0.098')] +[2024-06-18 02:43:24,213][12883] Updated weights for policy 0, policy_version 42061 (0.0037) +[2024-06-18 02:43:26,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 689274880. Throughput: 0: 41630.2. Samples: 689395860. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) +[2024-06-18 02:43:26,994][12645] Avg episode reward: [(0, '0.123')] +[2024-06-18 02:43:27,407][12883] Updated weights for policy 0, policy_version 42071 (0.0033) +[2024-06-18 02:43:31,994][12645] Fps is (10 sec: 40969.8, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 689438720. Throughput: 0: 41787.2. Samples: 689528380. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) +[2024-06-18 02:43:31,994][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 02:43:32,352][12883] Updated weights for policy 0, policy_version 42081 (0.0027) +[2024-06-18 02:43:35,167][12883] Updated weights for policy 0, policy_version 42091 (0.0033) +[2024-06-18 02:43:36,994][12645] Fps is (10 sec: 37682.6, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 689651712. Throughput: 0: 41548.4. Samples: 689770880. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) +[2024-06-18 02:43:36,994][12645] Avg episode reward: [(0, '0.152')] +[2024-06-18 02:43:40,262][12883] Updated weights for policy 0, policy_version 42101 (0.0030) +[2024-06-18 02:43:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.6, 300 sec: 42043.0). Total num frames: 689864704. 
Throughput: 0: 41685.4. Samples: 690026840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 02:43:41,994][12645] Avg episode reward: [(0, '0.036')] +[2024-06-18 02:43:43,139][12883] Updated weights for policy 0, policy_version 42111 (0.0044) +[2024-06-18 02:43:46,997][12645] Fps is (10 sec: 42586.5, 60 sec: 41504.1, 300 sec: 41931.5). Total num frames: 690077696. Throughput: 0: 41596.5. Samples: 690152680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 02:43:46,997][12645] Avg episode reward: [(0, '0.028')] +[2024-06-18 02:43:47,790][12883] Updated weights for policy 0, policy_version 42121 (0.0029) +[2024-06-18 02:43:51,162][12883] Updated weights for policy 0, policy_version 42131 (0.0027) +[2024-06-18 02:43:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 690307072. Throughput: 0: 41766.2. Samples: 690402140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 02:43:51,994][12645] Avg episode reward: [(0, '0.150')] +[2024-06-18 02:43:55,642][12883] Updated weights for policy 0, policy_version 42141 (0.0035) +[2024-06-18 02:43:56,994][12645] Fps is (10 sec: 42610.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 690503680. Throughput: 0: 41828.9. Samples: 690655540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 02:43:56,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 02:43:58,948][12883] Updated weights for policy 0, policy_version 42151 (0.0056) +[2024-06-18 02:44:01,994][12645] Fps is (10 sec: 37683.2, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 690683904. Throughput: 0: 41630.6. Samples: 690774560. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 02:44:01,994][12645] Avg episode reward: [(0, '0.044')] +[2024-06-18 02:44:03,377][12883] Updated weights for policy 0, policy_version 42161 (0.0040) +[2024-06-18 02:44:06,637][12883] Updated weights for policy 0, policy_version 42171 (0.0033) +[2024-06-18 02:44:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 690929664. Throughput: 0: 41855.0. Samples: 691030860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 02:44:06,994][12645] Avg episode reward: [(0, '0.105')] +[2024-06-18 02:44:11,344][12883] Updated weights for policy 0, policy_version 42181 (0.0034) +[2024-06-18 02:44:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 691109888. Throughput: 0: 41896.9. Samples: 691281220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 02:44:11,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 02:44:14,643][12883] Updated weights for policy 0, policy_version 42191 (0.0033) +[2024-06-18 02:44:16,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 691306496. Throughput: 0: 41595.4. Samples: 691400180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 02:44:16,994][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 02:44:19,507][12883] Updated weights for policy 0, policy_version 42201 (0.0051) +[2024-06-18 02:44:21,996][12645] Fps is (10 sec: 42588.9, 60 sec: 41779.2, 300 sec: 41931.6). Total num frames: 691535872. Throughput: 0: 41767.8. Samples: 691650520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 02:44:21,996][12645] Avg episode reward: [(0, '0.044')] +[2024-06-18 02:44:22,562][12883] Updated weights for policy 0, policy_version 42211 (0.0043) +[2024-06-18 02:44:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40959.9, 300 sec: 41931.9). Total num frames: 691732480. Throughput: 0: 41855.0. Samples: 691910320. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 02:44:26,995][12645] Avg episode reward: [(0, '0.061')] +[2024-06-18 02:44:27,110][12883] Updated weights for policy 0, policy_version 42221 (0.0031) +[2024-06-18 02:44:30,574][12883] Updated weights for policy 0, policy_version 42231 (0.0053) +[2024-06-18 02:44:31,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42052.2, 300 sec: 41932.2). Total num frames: 691961856. Throughput: 0: 41741.4. Samples: 692030920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:44:31,994][12645] Avg episode reward: [(0, '0.095')] +[2024-06-18 02:44:34,600][12883] Updated weights for policy 0, policy_version 42241 (0.0028) +[2024-06-18 02:44:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 692174848. Throughput: 0: 41788.5. Samples: 692282620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:44:36,994][12645] Avg episode reward: [(0, '0.076')] +[2024-06-18 02:44:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000042247_692174848.pth... +[2024-06-18 02:44:37,098][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041633_682115072.pth +[2024-06-18 02:44:38,598][12883] Updated weights for policy 0, policy_version 42251 (0.0047) +[2024-06-18 02:44:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 692387840. Throughput: 0: 41867.9. Samples: 692539600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:44:42,000][12645] Avg episode reward: [(0, '0.067')] +[2024-06-18 02:44:42,442][12883] Updated weights for policy 0, policy_version 42261 (0.0034) +[2024-06-18 02:44:46,550][12883] Updated weights for policy 0, policy_version 42271 (0.0038) +[2024-06-18 02:44:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42054.2, 300 sec: 41932.5). Total num frames: 692600832. Throughput: 0: 41906.7. Samples: 692660360. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:44:46,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 02:44:50,072][12883] Updated weights for policy 0, policy_version 42281 (0.0038) +[2024-06-18 02:44:51,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 692813824. Throughput: 0: 41909.0. Samples: 692916760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:44:51,994][12645] Avg episode reward: [(0, '0.039')] +[2024-06-18 02:44:54,169][12883] Updated weights for policy 0, policy_version 42291 (0.0033) +[2024-06-18 02:44:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 41876.7). Total num frames: 692994048. Throughput: 0: 41868.0. Samples: 693165280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:44:56,994][12645] Avg episode reward: [(0, '0.168')] +[2024-06-18 02:44:58,507][12883] Updated weights for policy 0, policy_version 42301 (0.0043) +[2024-06-18 02:44:59,302][12862] Signal inference workers to stop experience collection... (9850 times) +[2024-06-18 02:44:59,302][12862] Signal inference workers to resume experience collection... (9850 times) +[2024-06-18 02:44:59,325][12883] InferenceWorker_p0-w0: stopping experience collection (9850 times) +[2024-06-18 02:44:59,325][12883] InferenceWorker_p0-w0: resuming experience collection (9850 times) +[2024-06-18 02:45:01,879][12883] Updated weights for policy 0, policy_version 42311 (0.0052) +[2024-06-18 02:45:01,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 693223424. Throughput: 0: 41944.0. Samples: 693287660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 02:45:01,994][12645] Avg episode reward: [(0, '0.147')] +[2024-06-18 02:45:06,260][12883] Updated weights for policy 0, policy_version 42321 (0.0047) +[2024-06-18 02:45:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 693420032. 
Throughput: 0: 41980.4. Samples: 693539540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 02:45:06,994][12645] Avg episode reward: [(0, '0.071')] +[2024-06-18 02:45:09,456][12883] Updated weights for policy 0, policy_version 42331 (0.0047) +[2024-06-18 02:45:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 693633024. Throughput: 0: 41956.6. Samples: 693798360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 02:45:11,994][12645] Avg episode reward: [(0, '0.022')] +[2024-06-18 02:45:13,990][12883] Updated weights for policy 0, policy_version 42341 (0.0038) +[2024-06-18 02:45:16,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 693862400. Throughput: 0: 41989.8. Samples: 693920460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 02:45:16,994][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 02:45:17,627][12883] Updated weights for policy 0, policy_version 42351 (0.0030) +[2024-06-18 02:45:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41507.7, 300 sec: 41765.3). Total num frames: 694026240. Throughput: 0: 41819.1. Samples: 694164480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 02:45:21,999][12645] Avg episode reward: [(0, '0.039')] +[2024-06-18 02:45:22,025][12883] Updated weights for policy 0, policy_version 42361 (0.0031) +[2024-06-18 02:45:25,636][12883] Updated weights for policy 0, policy_version 42371 (0.0032) +[2024-06-18 02:45:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 694255616. Throughput: 0: 41722.2. Samples: 694417100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:45:26,994][12645] Avg episode reward: [(0, '0.085')] +[2024-06-18 02:45:29,657][12883] Updated weights for policy 0, policy_version 42381 (0.0043) +[2024-06-18 02:45:31,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 41932.4). Total num frames: 694484992. 
Throughput: 0: 41812.0. Samples: 694541900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:45:31,994][12645] Avg episode reward: [(0, '0.108')] +[2024-06-18 02:45:33,407][12883] Updated weights for policy 0, policy_version 42391 (0.0030) +[2024-06-18 02:45:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 694665216. Throughput: 0: 41709.2. Samples: 694793680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:45:36,994][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 02:45:37,567][12883] Updated weights for policy 0, policy_version 42401 (0.0027) +[2024-06-18 02:45:41,143][12883] Updated weights for policy 0, policy_version 42411 (0.0035) +[2024-06-18 02:45:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 694894592. Throughput: 0: 41637.7. Samples: 695038980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:45:41,994][12645] Avg episode reward: [(0, '0.136')] +[2024-06-18 02:45:45,199][12883] Updated weights for policy 0, policy_version 42421 (0.0031) +[2024-06-18 02:45:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 695107584. Throughput: 0: 41885.5. Samples: 695172500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:45:46,994][12645] Avg episode reward: [(0, '0.152')] +[2024-06-18 02:45:48,932][12883] Updated weights for policy 0, policy_version 42431 (0.0036) +[2024-06-18 02:45:51,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41233.0, 300 sec: 41820.9). Total num frames: 695287808. Throughput: 0: 41959.5. Samples: 695427720. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 02:45:51,994][12645] Avg episode reward: [(0, '0.017')] +[2024-06-18 02:45:53,253][12883] Updated weights for policy 0, policy_version 42441 (0.0045) +[2024-06-18 02:45:56,813][12883] Updated weights for policy 0, policy_version 42451 (0.0041) +[2024-06-18 02:45:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 695517184. Throughput: 0: 41662.1. Samples: 695673160. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 02:45:56,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 02:46:00,974][12883] Updated weights for policy 0, policy_version 42461 (0.0043) +[2024-06-18 02:46:01,993][12645] Fps is (10 sec: 42598.8, 60 sec: 41506.3, 300 sec: 41876.4). Total num frames: 695713792. Throughput: 0: 41736.6. Samples: 695798600. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 02:46:01,994][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 02:46:04,547][12883] Updated weights for policy 0, policy_version 42471 (0.0039) +[2024-06-18 02:46:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41778.9, 300 sec: 41876.4). Total num frames: 695926784. Throughput: 0: 41930.4. Samples: 696051360. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 02:46:06,994][12645] Avg episode reward: [(0, '0.037')] +[2024-06-18 02:46:08,638][12883] Updated weights for policy 0, policy_version 42481 (0.0027) +[2024-06-18 02:46:11,995][12645] Fps is (10 sec: 44228.9, 60 sec: 42051.1, 300 sec: 41820.6). Total num frames: 696156160. Throughput: 0: 41847.0. Samples: 696300280. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 02:46:11,996][12645] Avg episode reward: [(0, '0.172')] +[2024-06-18 02:46:12,258][12883] Updated weights for policy 0, policy_version 42491 (0.0036) +[2024-06-18 02:46:16,246][12883] Updated weights for policy 0, policy_version 42501 (0.0029) +[2024-06-18 02:46:16,996][12645] Fps is (10 sec: 44227.9, 60 sec: 41777.6, 300 sec: 41876.1). Total num frames: 696369152. Throughput: 0: 42051.3. Samples: 696434300. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 02:46:17,005][12645] Avg episode reward: [(0, '0.116')] +[2024-06-18 02:46:20,020][12883] Updated weights for policy 0, policy_version 42511 (0.0040) +[2024-06-18 02:46:21,994][12645] Fps is (10 sec: 40966.3, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 696565760. Throughput: 0: 42127.5. Samples: 696689420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 02:46:21,994][12645] Avg episode reward: [(0, '0.055')] +[2024-06-18 02:46:24,052][12883] Updated weights for policy 0, policy_version 42521 (0.0038) +[2024-06-18 02:46:26,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 696778752. Throughput: 0: 42176.0. Samples: 696936900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 02:46:26,994][12645] Avg episode reward: [(0, '0.125')] +[2024-06-18 02:46:27,987][12883] Updated weights for policy 0, policy_version 42531 (0.0037) +[2024-06-18 02:46:31,990][12883] Updated weights for policy 0, policy_version 42541 (0.0029) +[2024-06-18 02:46:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 696991744. Throughput: 0: 42055.1. Samples: 697064980. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 02:46:31,994][12645] Avg episode reward: [(0, '0.045')] +[2024-06-18 02:46:35,884][12883] Updated weights for policy 0, policy_version 42551 (0.0029) +[2024-06-18 02:46:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 697204736. Throughput: 0: 41954.6. Samples: 697315680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 02:46:36,994][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 02:46:37,053][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000042555_697221120.pth... +[2024-06-18 02:46:37,103][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041941_687161344.pth +[2024-06-18 02:46:39,807][12883] Updated weights for policy 0, policy_version 42561 (0.0032) +[2024-06-18 02:46:41,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 697401344. Throughput: 0: 42119.9. Samples: 697568560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 02:46:41,995][12645] Avg episode reward: [(0, '0.060')] +[2024-06-18 02:46:42,896][12862] Signal inference workers to stop experience collection... (9900 times) +[2024-06-18 02:46:42,921][12883] InferenceWorker_p0-w0: stopping experience collection (9900 times) +[2024-06-18 02:46:43,006][12862] Signal inference workers to resume experience collection... (9900 times) +[2024-06-18 02:46:43,006][12883] InferenceWorker_p0-w0: resuming experience collection (9900 times) +[2024-06-18 02:46:43,371][12883] Updated weights for policy 0, policy_version 42571 (0.0031) +[2024-06-18 02:46:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 697597952. Throughput: 0: 42127.3. Samples: 697694340. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 02:46:46,994][12645] Avg episode reward: [(0, '0.042')] +[2024-06-18 02:46:47,587][12883] Updated weights for policy 0, policy_version 42581 (0.0031) +[2024-06-18 02:46:51,002][12883] Updated weights for policy 0, policy_version 42591 (0.0027) +[2024-06-18 02:46:51,994][12645] Fps is (10 sec: 44238.0, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 697843712. Throughput: 0: 42265.2. Samples: 697953280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 02:46:51,994][12645] Avg episode reward: [(0, '0.073')] +[2024-06-18 02:46:55,340][12883] Updated weights for policy 0, policy_version 42601 (0.0039) +[2024-06-18 02:46:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 698040320. Throughput: 0: 42257.5. Samples: 698201800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 02:46:56,994][12645] Avg episode reward: [(0, '0.052')] +[2024-06-18 02:46:59,078][12883] Updated weights for policy 0, policy_version 42611 (0.0037) +[2024-06-18 02:47:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 698253312. Throughput: 0: 42138.1. Samples: 698330420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 02:47:01,994][12645] Avg episode reward: [(0, '0.048')] +[2024-06-18 02:47:03,074][12883] Updated weights for policy 0, policy_version 42621 (0.0028) +[2024-06-18 02:47:06,904][12883] Updated weights for policy 0, policy_version 42631 (0.0033) +[2024-06-18 02:47:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 41932.5). Total num frames: 698466304. Throughput: 0: 42013.8. Samples: 698580040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 02:47:06,994][12645] Avg episode reward: [(0, '0.075')] +[2024-06-18 02:47:10,919][12883] Updated weights for policy 0, policy_version 42641 (0.0037) +[2024-06-18 02:47:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42053.4, 300 sec: 41876.4). Total num frames: 698679296. Throughput: 0: 42225.3. Samples: 698837040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 02:47:11,994][12645] Avg episode reward: [(0, '0.062')] +[2024-06-18 02:47:14,555][12883] Updated weights for policy 0, policy_version 42651 (0.0031) +[2024-06-18 02:47:16,996][12645] Fps is (10 sec: 42589.5, 60 sec: 42052.3, 300 sec: 41931.6). Total num frames: 698892288. Throughput: 0: 42264.6. Samples: 698966980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 02:47:16,996][12645] Avg episode reward: [(0, '0.107')] +[2024-06-18 02:47:18,575][12883] Updated weights for policy 0, policy_version 42661 (0.0026) +[2024-06-18 02:47:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 699105280. Throughput: 0: 42261.3. Samples: 699217440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 02:47:21,994][12645] Avg episode reward: [(0, '0.137')] +[2024-06-18 02:47:22,172][12883] Updated weights for policy 0, policy_version 42671 (0.0040) +[2024-06-18 02:47:26,159][12883] Updated weights for policy 0, policy_version 42681 (0.0023) +[2024-06-18 02:47:26,998][12645] Fps is (10 sec: 40951.7, 60 sec: 42049.3, 300 sec: 41820.3). Total num frames: 699301888. Throughput: 0: 42388.6. Samples: 699476220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 02:47:26,998][12645] Avg episode reward: [(0, '0.102')] +[2024-06-18 02:47:29,722][12883] Updated weights for policy 0, policy_version 42691 (0.0046) +[2024-06-18 02:47:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 699514880. Throughput: 0: 42376.9. Samples: 699601300. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 02:47:31,994][12645] Avg episode reward: [(0, '0.044')] +[2024-06-18 02:47:33,839][12883] Updated weights for policy 0, policy_version 42701 (0.0041) +[2024-06-18 02:47:36,994][12645] Fps is (10 sec: 44255.7, 60 sec: 42325.4, 300 sec: 42043.1). Total num frames: 699744256. Throughput: 0: 42348.8. Samples: 699858980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 02:47:36,994][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 02:47:37,491][12883] Updated weights for policy 0, policy_version 42711 (0.0028) +[2024-06-18 02:47:41,653][12883] Updated weights for policy 0, policy_version 42721 (0.0040) +[2024-06-18 02:47:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 41876.4). Total num frames: 699940864. Throughput: 0: 42436.1. Samples: 700111420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 02:47:41,994][12645] Avg episode reward: [(0, '0.054')] +[2024-06-18 02:47:45,351][12883] Updated weights for policy 0, policy_version 42731 (0.0029) +[2024-06-18 02:47:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42098.6). Total num frames: 700170240. Throughput: 0: 42360.0. Samples: 700236620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 02:47:46,994][12645] Avg episode reward: [(0, '0.042')] +[2024-06-18 02:47:49,432][12883] Updated weights for policy 0, policy_version 42741 (0.0036) +[2024-06-18 02:47:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.1, 300 sec: 41987.5). Total num frames: 700366848. Throughput: 0: 42338.7. Samples: 700485280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 02:47:51,994][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 02:47:53,313][12883] Updated weights for policy 0, policy_version 42751 (0.0040) +[2024-06-18 02:47:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 700579840. Throughput: 0: 42347.3. Samples: 700742660. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 02:47:56,994][12645] Avg episode reward: [(0, '0.125')] +[2024-06-18 02:47:57,454][12883] Updated weights for policy 0, policy_version 42761 (0.0030) +[2024-06-18 02:48:01,118][12883] Updated weights for policy 0, policy_version 42771 (0.0045) +[2024-06-18 02:48:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 700792832. Throughput: 0: 42227.4. Samples: 700867120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 02:48:01,994][12645] Avg episode reward: [(0, '0.108')] +[2024-06-18 02:48:05,394][12883] Updated weights for policy 0, policy_version 42781 (0.0032) +[2024-06-18 02:48:06,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 700989440. Throughput: 0: 42199.1. Samples: 701116400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 02:48:06,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 02:48:08,893][12883] Updated weights for policy 0, policy_version 42791 (0.0031) +[2024-06-18 02:48:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 701202432. Throughput: 0: 42077.7. Samples: 701369540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 02:48:11,994][12645] Avg episode reward: [(0, '0.037')] +[2024-06-18 02:48:13,132][12883] Updated weights for policy 0, policy_version 42801 (0.0046) +[2024-06-18 02:48:16,633][12883] Updated weights for policy 0, policy_version 42811 (0.0040) +[2024-06-18 02:48:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42053.8, 300 sec: 41987.8). Total num frames: 701415424. Throughput: 0: 42082.7. Samples: 701495020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 02:48:16,994][12645] Avg episode reward: [(0, '0.038')] +[2024-06-18 02:48:20,744][12883] Updated weights for policy 0, policy_version 42821 (0.0028) +[2024-06-18 02:48:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 41931.9). Total num frames: 701644800. Throughput: 0: 42058.3. Samples: 701751600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 02:48:21,994][12645] Avg episode reward: [(0, '0.093')] +[2024-06-18 02:48:24,602][12883] Updated weights for policy 0, policy_version 42831 (0.0041) +[2024-06-18 02:48:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42601.4, 300 sec: 42098.5). Total num frames: 701857792. Throughput: 0: 41973.3. Samples: 702000220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 02:48:26,994][12645] Avg episode reward: [(0, '0.130')] +[2024-06-18 02:48:28,247][12883] Updated weights for policy 0, policy_version 42841 (0.0043) +[2024-06-18 02:48:31,025][12862] Signal inference workers to stop experience collection... (9950 times) +[2024-06-18 02:48:31,025][12862] Signal inference workers to resume experience collection... (9950 times) +[2024-06-18 02:48:31,055][12883] InferenceWorker_p0-w0: stopping experience collection (9950 times) +[2024-06-18 02:48:31,055][12883] InferenceWorker_p0-w0: resuming experience collection (9950 times) +[2024-06-18 02:48:31,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 702038016. Throughput: 0: 42033.2. Samples: 702128120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 02:48:31,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 02:48:32,100][12862] Saving new best policy, reward=0.363! 
+[2024-06-18 02:48:32,351][12883] Updated weights for policy 0, policy_version 42851 (0.0034) +[2024-06-18 02:48:35,781][12883] Updated weights for policy 0, policy_version 42861 (0.0042) +[2024-06-18 02:48:37,000][12645] Fps is (10 sec: 40934.5, 60 sec: 42047.9, 300 sec: 42042.1). Total num frames: 702267392. Throughput: 0: 41940.5. Samples: 702372860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 02:48:37,000][12645] Avg episode reward: [(0, '0.217')] +[2024-06-18 02:48:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000042863_702267392.pth... +[2024-06-18 02:48:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000042247_692174848.pth +[2024-06-18 02:48:40,360][12883] Updated weights for policy 0, policy_version 42871 (0.0030) +[2024-06-18 02:48:41,996][12645] Fps is (10 sec: 44227.3, 60 sec: 42323.7, 300 sec: 42043.1). Total num frames: 702480384. Throughput: 0: 41926.3. Samples: 702629440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 02:48:41,996][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 02:48:43,497][12883] Updated weights for policy 0, policy_version 42881 (0.0028) +[2024-06-18 02:48:46,994][12645] Fps is (10 sec: 40985.8, 60 sec: 41779.2, 300 sec: 41932.0). Total num frames: 702676992. Throughput: 0: 42118.7. Samples: 702762460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 02:48:46,994][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 02:48:47,790][12883] Updated weights for policy 0, policy_version 42891 (0.0028) +[2024-06-18 02:48:50,867][12883] Updated weights for policy 0, policy_version 42901 (0.0036) +[2024-06-18 02:48:51,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 702889984. Throughput: 0: 42044.5. Samples: 703008400. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 02:48:51,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 02:48:55,426][12883] Updated weights for policy 0, policy_version 42911 (0.0040) +[2024-06-18 02:48:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 703119360. Throughput: 0: 42030.7. Samples: 703260920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 02:48:56,994][12645] Avg episode reward: [(0, '0.034')] +[2024-06-18 02:48:58,969][12883] Updated weights for policy 0, policy_version 42921 (0.0040) +[2024-06-18 02:49:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 703299584. Throughput: 0: 42068.0. Samples: 703388080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 02:49:01,994][12645] Avg episode reward: [(0, '0.187')] +[2024-06-18 02:49:03,514][12883] Updated weights for policy 0, policy_version 42931 (0.0043) +[2024-06-18 02:49:06,765][12883] Updated weights for policy 0, policy_version 42941 (0.0032) +[2024-06-18 02:49:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 703545344. Throughput: 0: 42019.5. Samples: 703642480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:49:06,994][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 02:49:11,139][12883] Updated weights for policy 0, policy_version 42951 (0.0033) +[2024-06-18 02:49:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 703741952. Throughput: 0: 41978.7. Samples: 703889260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:49:11,994][12645] Avg episode reward: [(0, '0.107')] +[2024-06-18 02:49:14,511][12883] Updated weights for policy 0, policy_version 42961 (0.0046) +[2024-06-18 02:49:16,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42043.3). Total num frames: 703938560. Throughput: 0: 41933.8. Samples: 704015140. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:49:16,994][12645] Avg episode reward: [(0, '0.096')] +[2024-06-18 02:49:18,754][12883] Updated weights for policy 0, policy_version 42971 (0.0041) +[2024-06-18 02:49:21,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42323.7, 300 sec: 42209.3). Total num frames: 704184320. Throughput: 0: 42146.9. Samples: 704269300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:49:21,996][12645] Avg episode reward: [(0, '0.031')] +[2024-06-18 02:49:22,583][12883] Updated weights for policy 0, policy_version 42981 (0.0027) +[2024-06-18 02:49:26,446][12883] Updated weights for policy 0, policy_version 42991 (0.0038) +[2024-06-18 02:49:26,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42050.7, 300 sec: 42098.2). Total num frames: 704380928. Throughput: 0: 42034.7. Samples: 704521000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:49:26,996][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 02:49:30,666][12883] Updated weights for policy 0, policy_version 43001 (0.0025) +[2024-06-18 02:49:31,994][12645] Fps is (10 sec: 36052.9, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 704544768. Throughput: 0: 41898.2. Samples: 704647880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 02:49:31,994][12645] Avg episode reward: [(0, '0.110')] +[2024-06-18 02:49:34,071][12883] Updated weights for policy 0, policy_version 43011 (0.0038) +[2024-06-18 02:49:36,994][12645] Fps is (10 sec: 39330.3, 60 sec: 41783.6, 300 sec: 41987.5). Total num frames: 704774144. Throughput: 0: 42019.1. Samples: 704899260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 02:49:36,994][12645] Avg episode reward: [(0, '0.056')] +[2024-06-18 02:49:38,215][12883] Updated weights for policy 0, policy_version 43021 (0.0040) +[2024-06-18 02:49:41,164][12862] Signal inference workers to stop experience collection... 
(10000 times) +[2024-06-18 02:49:41,223][12862] Signal inference workers to resume experience collection... (10000 times) +[2024-06-18 02:49:41,224][12883] InferenceWorker_p0-w0: stopping experience collection (10000 times) +[2024-06-18 02:49:41,240][12883] InferenceWorker_p0-w0: resuming experience collection (10000 times) +[2024-06-18 02:49:41,698][12883] Updated weights for policy 0, policy_version 43031 (0.0027) +[2024-06-18 02:49:41,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42326.9, 300 sec: 42098.6). Total num frames: 705019904. Throughput: 0: 42038.7. Samples: 705152660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 02:49:41,994][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 02:49:46,509][12883] Updated weights for policy 0, policy_version 43041 (0.0048) +[2024-06-18 02:49:46,995][12645] Fps is (10 sec: 40956.6, 60 sec: 41778.6, 300 sec: 41931.8). Total num frames: 705183744. Throughput: 0: 42122.3. Samples: 705283620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 02:49:46,995][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 02:49:49,390][12883] Updated weights for policy 0, policy_version 43051 (0.0040) +[2024-06-18 02:49:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 705429504. Throughput: 0: 42148.4. Samples: 705539160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 02:49:51,994][12645] Avg episode reward: [(0, '0.107')] +[2024-06-18 02:49:53,938][12883] Updated weights for policy 0, policy_version 43061 (0.0029) +[2024-06-18 02:49:56,996][12645] Fps is (10 sec: 47507.1, 60 sec: 42323.8, 300 sec: 42153.8). Total num frames: 705658880. Throughput: 0: 42249.0. Samples: 705790560. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 02:49:56,996][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 02:49:57,225][12883] Updated weights for policy 0, policy_version 43071 (0.0032) +[2024-06-18 02:50:01,763][12883] Updated weights for policy 0, policy_version 43081 (0.0025) +[2024-06-18 02:50:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 705839104. Throughput: 0: 42325.7. Samples: 705919800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:50:01,994][12645] Avg episode reward: [(0, '0.131')] +[2024-06-18 02:50:05,238][12883] Updated weights for policy 0, policy_version 43091 (0.0022) +[2024-06-18 02:50:06,994][12645] Fps is (10 sec: 40968.7, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 706068480. Throughput: 0: 42295.3. Samples: 706172500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:50:06,994][12645] Avg episode reward: [(0, '0.160')] +[2024-06-18 02:50:09,297][12883] Updated weights for policy 0, policy_version 43101 (0.0033) +[2024-06-18 02:50:11,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 706281472. Throughput: 0: 42490.5. Samples: 706432980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:50:11,994][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 02:50:12,820][12883] Updated weights for policy 0, policy_version 43111 (0.0036) +[2024-06-18 02:50:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 706478080. Throughput: 0: 42458.1. Samples: 706558500. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:50:16,994][12645] Avg episode reward: [(0, '0.133')] +[2024-06-18 02:50:17,266][12883] Updated weights for policy 0, policy_version 43121 (0.0038) +[2024-06-18 02:50:20,996][12883] Updated weights for policy 0, policy_version 43131 (0.0034) +[2024-06-18 02:50:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41780.8, 300 sec: 42154.1). Total num frames: 706691072. Throughput: 0: 42445.0. Samples: 706809280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:50:21,994][12645] Avg episode reward: [(0, '0.082')] +[2024-06-18 02:50:24,934][12883] Updated weights for policy 0, policy_version 43141 (0.0033) +[2024-06-18 02:50:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42053.9, 300 sec: 42098.6). Total num frames: 706904064. Throughput: 0: 42555.2. Samples: 707067640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 02:50:26,994][12645] Avg episode reward: [(0, '0.146')] +[2024-06-18 02:50:28,651][12883] Updated weights for policy 0, policy_version 43151 (0.0030) +[2024-06-18 02:50:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 707117056. Throughput: 0: 42324.4. Samples: 707188180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 02:50:31,994][12645] Avg episode reward: [(0, '0.015')] +[2024-06-18 02:50:32,703][12883] Updated weights for policy 0, policy_version 43161 (0.0022) +[2024-06-18 02:50:36,200][12883] Updated weights for policy 0, policy_version 43171 (0.0033) +[2024-06-18 02:50:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 707330048. Throughput: 0: 42291.9. Samples: 707442300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 02:50:36,994][12645] Avg episode reward: [(0, '0.052')] +[2024-06-18 02:50:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000043172_707330048.pth... 
+[2024-06-18 02:50:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000042555_697221120.pth +[2024-06-18 02:50:40,374][12883] Updated weights for policy 0, policy_version 43181 (0.0030) +[2024-06-18 02:50:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 707543040. Throughput: 0: 42377.6. Samples: 707697460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 02:50:41,994][12645] Avg episode reward: [(0, '0.049')] +[2024-06-18 02:50:44,256][12883] Updated weights for policy 0, policy_version 43191 (0.0031) +[2024-06-18 02:50:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42872.1, 300 sec: 42265.2). Total num frames: 707756032. Throughput: 0: 42284.1. Samples: 707822580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 02:50:46,994][12645] Avg episode reward: [(0, '0.038')] +[2024-06-18 02:50:48,262][12883] Updated weights for policy 0, policy_version 43201 (0.0043) +[2024-06-18 02:50:49,159][12862] Signal inference workers to stop experience collection... (10050 times) +[2024-06-18 02:50:49,195][12883] InferenceWorker_p0-w0: stopping experience collection (10050 times) +[2024-06-18 02:50:49,206][12862] Signal inference workers to resume experience collection... (10050 times) +[2024-06-18 02:50:49,216][12883] InferenceWorker_p0-w0: resuming experience collection (10050 times) +[2024-06-18 02:50:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 707952640. Throughput: 0: 42450.8. Samples: 708082780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 02:50:51,994][12645] Avg episode reward: [(0, '0.108')] +[2024-06-18 02:50:52,299][12883] Updated weights for policy 0, policy_version 43211 (0.0034) +[2024-06-18 02:50:56,206][12883] Updated weights for policy 0, policy_version 43221 (0.0031) +[2024-06-18 02:50:56,993][12645] Fps is (10 sec: 40960.5, 60 sec: 41780.8, 300 sec: 42209.6). Total num frames: 708165632. 
Throughput: 0: 42257.0. Samples: 708334540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) +[2024-06-18 02:50:56,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 02:50:59,926][12883] Updated weights for policy 0, policy_version 43231 (0.0030) +[2024-06-18 02:51:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42209.7). Total num frames: 708378624. Throughput: 0: 42250.3. Samples: 708459760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) +[2024-06-18 02:51:01,994][12645] Avg episode reward: [(0, '0.183')] +[2024-06-18 02:51:03,882][12883] Updated weights for policy 0, policy_version 43241 (0.0039) +[2024-06-18 02:51:06,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41779.2, 300 sec: 42098.8). Total num frames: 708575232. Throughput: 0: 42309.6. Samples: 708713220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) +[2024-06-18 02:51:06,994][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 02:51:07,631][12883] Updated weights for policy 0, policy_version 43251 (0.0027) +[2024-06-18 02:51:11,621][12883] Updated weights for policy 0, policy_version 43261 (0.0024) +[2024-06-18 02:51:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 42154.4). Total num frames: 708804608. Throughput: 0: 42253.5. Samples: 708969060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) +[2024-06-18 02:51:11,994][12645] Avg episode reward: [(0, '0.064')] +[2024-06-18 02:51:15,136][12883] Updated weights for policy 0, policy_version 43271 (0.0044) +[2024-06-18 02:51:16,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 709033984. Throughput: 0: 42407.8. Samples: 709096540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) +[2024-06-18 02:51:16,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 02:51:19,045][12883] Updated weights for policy 0, policy_version 43281 (0.0036) +[2024-06-18 02:51:21,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 709230592. 
Throughput: 0: 42422.2. Samples: 709351300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 02:51:21,995][12645] Avg episode reward: [(0, '0.069')] +[2024-06-18 02:51:22,818][12883] Updated weights for policy 0, policy_version 43291 (0.0033) +[2024-06-18 02:51:26,673][12883] Updated weights for policy 0, policy_version 43301 (0.0032) +[2024-06-18 02:51:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 709443584. Throughput: 0: 42468.9. Samples: 709608560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 02:51:26,995][12645] Avg episode reward: [(0, '0.066')] +[2024-06-18 02:51:30,552][12883] Updated weights for policy 0, policy_version 43311 (0.0028) +[2024-06-18 02:51:31,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 709689344. Throughput: 0: 42540.5. Samples: 709736900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 02:51:31,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:51:34,547][12883] Updated weights for policy 0, policy_version 43321 (0.0027) +[2024-06-18 02:51:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 709853184. Throughput: 0: 42347.0. Samples: 709988400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 02:51:36,994][12645] Avg episode reward: [(0, '0.070')] +[2024-06-18 02:51:38,287][12883] Updated weights for policy 0, policy_version 43331 (0.0031) +[2024-06-18 02:51:41,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 710066176. Throughput: 0: 42370.1. Samples: 710241200. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 02:51:41,994][12645] Avg episode reward: [(0, '0.082')] +[2024-06-18 02:51:42,452][12883] Updated weights for policy 0, policy_version 43341 (0.0028) +[2024-06-18 02:51:46,075][12883] Updated weights for policy 0, policy_version 43351 (0.0037) +[2024-06-18 02:51:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 710295552. Throughput: 0: 42338.2. Samples: 710364980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 02:51:46,994][12645] Avg episode reward: [(0, '0.098')] +[2024-06-18 02:51:50,099][12883] Updated weights for policy 0, policy_version 43361 (0.0037) +[2024-06-18 02:51:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 710508544. Throughput: 0: 42401.3. Samples: 710621280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 02:51:51,996][12645] Avg episode reward: [(0, '0.136')] +[2024-06-18 02:51:53,816][12883] Updated weights for policy 0, policy_version 43371 (0.0043) +[2024-06-18 02:51:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 710705152. Throughput: 0: 42209.1. Samples: 710868460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 02:51:56,994][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 02:51:57,869][12883] Updated weights for policy 0, policy_version 43381 (0.0027) +[2024-06-18 02:52:01,422][12883] Updated weights for policy 0, policy_version 43391 (0.0027) +[2024-06-18 02:52:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 710934528. Throughput: 0: 42168.6. Samples: 710994120. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 02:52:01,994][12645] Avg episode reward: [(0, '0.020')] +[2024-06-18 02:52:05,660][12883] Updated weights for policy 0, policy_version 43401 (0.0032) +[2024-06-18 02:52:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 711114752. Throughput: 0: 42154.3. Samples: 711248240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 02:52:06,994][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 02:52:09,297][12883] Updated weights for policy 0, policy_version 43411 (0.0037) +[2024-06-18 02:52:09,663][12862] Signal inference workers to stop experience collection... (10100 times) +[2024-06-18 02:52:09,663][12862] Signal inference workers to resume experience collection... (10100 times) +[2024-06-18 02:52:09,689][12883] InferenceWorker_p0-w0: stopping experience collection (10100 times) +[2024-06-18 02:52:09,689][12883] InferenceWorker_p0-w0: resuming experience collection (10100 times) +[2024-06-18 02:52:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.4, 300 sec: 42154.4). Total num frames: 711327744. Throughput: 0: 42016.9. Samples: 711499320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 02:52:11,994][12645] Avg episode reward: [(0, '0.142')] +[2024-06-18 02:52:13,829][12883] Updated weights for policy 0, policy_version 43421 (0.0028) +[2024-06-18 02:52:16,990][12883] Updated weights for policy 0, policy_version 43431 (0.0027) +[2024-06-18 02:52:16,995][12645] Fps is (10 sec: 45868.4, 60 sec: 42324.4, 300 sec: 42265.0). Total num frames: 711573504. Throughput: 0: 41997.7. Samples: 711626860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 02:52:16,996][12645] Avg episode reward: [(0, '0.076')] +[2024-06-18 02:52:21,719][12883] Updated weights for policy 0, policy_version 43441 (0.0037) +[2024-06-18 02:52:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42210.2). Total num frames: 711753728. 
Throughput: 0: 42018.7. Samples: 711879240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 02:52:21,994][12645] Avg episode reward: [(0, '0.055')] +[2024-06-18 02:52:24,576][12883] Updated weights for policy 0, policy_version 43451 (0.0029) +[2024-06-18 02:52:26,994][12645] Fps is (10 sec: 39327.3, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 711966720. Throughput: 0: 42012.9. Samples: 712131780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 02:52:26,994][12645] Avg episode reward: [(0, '0.040')] +[2024-06-18 02:52:29,186][12883] Updated weights for policy 0, policy_version 43461 (0.0040) +[2024-06-18 02:52:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.0, 300 sec: 42154.1). Total num frames: 712179712. Throughput: 0: 42044.9. Samples: 712257000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 02:52:31,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 02:52:32,866][12883] Updated weights for policy 0, policy_version 43471 (0.0036) +[2024-06-18 02:52:36,990][12883] Updated weights for policy 0, policy_version 43481 (0.0038) +[2024-06-18 02:52:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 712392704. Throughput: 0: 41921.9. Samples: 712507760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 02:52:36,994][12645] Avg episode reward: [(0, '0.114')] +[2024-06-18 02:52:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000043481_712392704.pth... +[2024-06-18 02:52:37,060][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000042863_702267392.pth +[2024-06-18 02:52:40,724][12883] Updated weights for policy 0, policy_version 43491 (0.0036) +[2024-06-18 02:52:42,000][12645] Fps is (10 sec: 42572.5, 60 sec: 42320.9, 300 sec: 42153.2). Total num frames: 712605696. Throughput: 0: 41907.1. Samples: 712754540. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 02:52:42,000][12645] Avg episode reward: [(0, '0.107')] +[2024-06-18 02:52:44,747][12883] Updated weights for policy 0, policy_version 43501 (0.0026) +[2024-06-18 02:52:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 712802304. Throughput: 0: 42034.1. Samples: 712885660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 02:52:46,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 02:52:48,474][12883] Updated weights for policy 0, policy_version 43511 (0.0036) +[2024-06-18 02:52:51,994][12645] Fps is (10 sec: 42624.7, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 713031680. Throughput: 0: 42092.4. Samples: 713142400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 02:52:51,994][12645] Avg episode reward: [(0, '0.056')] +[2024-06-18 02:52:52,162][12883] Updated weights for policy 0, policy_version 43521 (0.0032) +[2024-06-18 02:52:56,234][12883] Updated weights for policy 0, policy_version 43531 (0.0037) +[2024-06-18 02:52:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 713244672. Throughput: 0: 42004.0. Samples: 713389500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 02:52:56,994][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 02:53:00,117][12883] Updated weights for policy 0, policy_version 43541 (0.0036) +[2024-06-18 02:53:01,996][12645] Fps is (10 sec: 40951.0, 60 sec: 41777.6, 300 sec: 42209.3). Total num frames: 713441280. Throughput: 0: 42052.2. Samples: 713519240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 02:53:01,996][12645] Avg episode reward: [(0, '0.082')] +[2024-06-18 02:53:03,955][12883] Updated weights for policy 0, policy_version 43551 (0.0037) +[2024-06-18 02:53:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 713654272. Throughput: 0: 42093.4. Samples: 713773440. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 02:53:06,994][12645] Avg episode reward: [(0, '0.047')] +[2024-06-18 02:53:07,863][12883] Updated weights for policy 0, policy_version 43561 (0.0025) +[2024-06-18 02:53:11,610][12883] Updated weights for policy 0, policy_version 43571 (0.0035) +[2024-06-18 02:53:11,994][12645] Fps is (10 sec: 45886.0, 60 sec: 42871.6, 300 sec: 42320.7). Total num frames: 713900032. Throughput: 0: 42139.7. Samples: 714028060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:53:11,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 02:53:15,440][12883] Updated weights for policy 0, policy_version 43581 (0.0043) +[2024-06-18 02:53:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41780.2, 300 sec: 42154.1). Total num frames: 714080256. Throughput: 0: 42291.7. Samples: 714160120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:53:16,994][12645] Avg episode reward: [(0, '0.019')] +[2024-06-18 02:53:19,339][12883] Updated weights for policy 0, policy_version 43591 (0.0039) +[2024-06-18 02:53:21,994][12645] Fps is (10 sec: 36044.4, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 714260480. Throughput: 0: 42209.7. Samples: 714407200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:53:21,994][12645] Avg episode reward: [(0, '0.059')] +[2024-06-18 02:53:23,258][12883] Updated weights for policy 0, policy_version 43601 (0.0040) +[2024-06-18 02:53:26,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 714522624. Throughput: 0: 42349.0. Samples: 714659980. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:53:26,994][12645] Avg episode reward: [(0, '0.130')] +[2024-06-18 02:53:26,998][12883] Updated weights for policy 0, policy_version 43611 (0.0029) +[2024-06-18 02:53:30,787][12883] Updated weights for policy 0, policy_version 43621 (0.0028) +[2024-06-18 02:53:31,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42325.4, 300 sec: 42210.5). Total num frames: 714719232. Throughput: 0: 42451.5. Samples: 714795980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:53:31,994][12645] Avg episode reward: [(0, '0.114')] +[2024-06-18 02:53:34,085][12862] Signal inference workers to stop experience collection... (10150 times) +[2024-06-18 02:53:34,132][12883] InferenceWorker_p0-w0: stopping experience collection (10150 times) +[2024-06-18 02:53:34,141][12862] Signal inference workers to resume experience collection... (10150 times) +[2024-06-18 02:53:34,151][12883] InferenceWorker_p0-w0: resuming experience collection (10150 times) +[2024-06-18 02:53:34,719][12883] Updated weights for policy 0, policy_version 43631 (0.0034) +[2024-06-18 02:53:36,994][12645] Fps is (10 sec: 37682.8, 60 sec: 41779.2, 300 sec: 42098.9). Total num frames: 714899456. Throughput: 0: 42213.8. Samples: 715042020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 02:53:36,994][12645] Avg episode reward: [(0, '0.072')] +[2024-06-18 02:53:38,557][12883] Updated weights for policy 0, policy_version 43641 (0.0030) +[2024-06-18 02:53:41,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42602.9, 300 sec: 42320.7). Total num frames: 715161600. Throughput: 0: 42413.4. Samples: 715298100. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 02:53:41,994][12645] Avg episode reward: [(0, '0.076')] +[2024-06-18 02:53:42,178][12883] Updated weights for policy 0, policy_version 43651 (0.0027) +[2024-06-18 02:53:46,227][12883] Updated weights for policy 0, policy_version 43661 (0.0031) +[2024-06-18 02:53:46,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 715358208. Throughput: 0: 42418.6. Samples: 715427980. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 02:53:46,994][12645] Avg episode reward: [(0, '0.061')] +[2024-06-18 02:53:50,218][12883] Updated weights for policy 0, policy_version 43671 (0.0025) +[2024-06-18 02:53:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 715554816. Throughput: 0: 42375.0. Samples: 715680320. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 02:53:51,994][12645] Avg episode reward: [(0, '0.063')] +[2024-06-18 02:53:54,159][12883] Updated weights for policy 0, policy_version 43681 (0.0039) +[2024-06-18 02:53:56,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 715784192. Throughput: 0: 42410.9. Samples: 715936560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 02:53:56,994][12645] Avg episode reward: [(0, '0.056')] +[2024-06-18 02:53:57,876][12883] Updated weights for policy 0, policy_version 43691 (0.0034) +[2024-06-18 02:54:01,743][12883] Updated weights for policy 0, policy_version 43701 (0.0025) +[2024-06-18 02:54:01,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42873.0, 300 sec: 42265.1). Total num frames: 716013568. Throughput: 0: 42482.5. Samples: 716071840. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 02:54:01,994][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 02:54:05,500][12883] Updated weights for policy 0, policy_version 43711 (0.0040) +[2024-06-18 02:54:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 716177408. Throughput: 0: 42456.4. Samples: 716317740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 02:54:06,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 02:54:09,351][12883] Updated weights for policy 0, policy_version 43721 (0.0028) +[2024-06-18 02:54:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 716423168. Throughput: 0: 42574.1. Samples: 716575820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 02:54:11,995][12645] Avg episode reward: [(0, '0.118')] +[2024-06-18 02:54:13,444][12883] Updated weights for policy 0, policy_version 43731 (0.0031) +[2024-06-18 02:54:16,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 42210.0). Total num frames: 716636160. Throughput: 0: 42486.8. Samples: 716707880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 02:54:16,994][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 02:54:17,094][12883] Updated weights for policy 0, policy_version 43741 (0.0034) +[2024-06-18 02:54:21,223][12883] Updated weights for policy 0, policy_version 43751 (0.0035) +[2024-06-18 02:54:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42154.4). Total num frames: 716816384. Throughput: 0: 42451.9. Samples: 716952360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 02:54:21,994][12645] Avg episode reward: [(0, '0.151')] +[2024-06-18 02:54:24,295][12862] Signal inference workers to stop experience collection... (10200 times) +[2024-06-18 02:54:24,296][12862] Signal inference workers to resume experience collection... 
(10200 times) +[2024-06-18 02:54:24,339][12883] InferenceWorker_p0-w0: stopping experience collection (10200 times) +[2024-06-18 02:54:24,339][12883] InferenceWorker_p0-w0: resuming experience collection (10200 times) +[2024-06-18 02:54:24,757][12883] Updated weights for policy 0, policy_version 43761 (0.0028) +[2024-06-18 02:54:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 717062144. Throughput: 0: 42643.1. Samples: 717217040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 02:54:26,994][12645] Avg episode reward: [(0, '0.116')] +[2024-06-18 02:54:28,846][12883] Updated weights for policy 0, policy_version 43771 (0.0027) +[2024-06-18 02:54:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 717258752. Throughput: 0: 42687.9. Samples: 717348940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 02:54:31,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 02:54:32,571][12883] Updated weights for policy 0, policy_version 43781 (0.0022) +[2024-06-18 02:54:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 717455360. Throughput: 0: 42461.8. Samples: 717591100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 02:54:36,994][12645] Avg episode reward: [(0, '0.038')] +[2024-06-18 02:54:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000043790_717455360.pth... +[2024-06-18 02:54:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000043172_707330048.pth +[2024-06-18 02:54:37,235][12883] Updated weights for policy 0, policy_version 43791 (0.0044) +[2024-06-18 02:54:40,413][12883] Updated weights for policy 0, policy_version 43801 (0.0032) +[2024-06-18 02:54:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42431.9). Total num frames: 717701120. Throughput: 0: 42451.7. Samples: 717846880. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 02:54:41,994][12645] Avg episode reward: [(0, '0.131')] +[2024-06-18 02:54:44,896][12883] Updated weights for policy 0, policy_version 43811 (0.0030) +[2024-06-18 02:54:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 717881344. Throughput: 0: 42289.1. Samples: 717974840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 02:54:46,994][12645] Avg episode reward: [(0, '0.160')] +[2024-06-18 02:54:48,229][12883] Updated weights for policy 0, policy_version 43821 (0.0042) +[2024-06-18 02:54:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42209.9). Total num frames: 718110720. Throughput: 0: 42256.1. Samples: 718219260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 02:54:51,994][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 02:54:52,798][12883] Updated weights for policy 0, policy_version 43831 (0.0035) +[2024-06-18 02:54:55,825][12883] Updated weights for policy 0, policy_version 43841 (0.0033) +[2024-06-18 02:54:56,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 718340096. Throughput: 0: 42301.3. Samples: 718479380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 02:54:56,994][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 02:55:00,551][12883] Updated weights for policy 0, policy_version 43851 (0.0039) +[2024-06-18 02:55:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 718520320. Throughput: 0: 42416.3. Samples: 718616620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:55:01,994][12645] Avg episode reward: [(0, '0.125')] +[2024-06-18 02:55:03,400][12883] Updated weights for policy 0, policy_version 43861 (0.0037) +[2024-06-18 02:55:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 718733312. Throughput: 0: 42401.4. Samples: 718860420. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:55:06,994][12645] Avg episode reward: [(0, '0.139')] +[2024-06-18 02:55:08,202][12883] Updated weights for policy 0, policy_version 43871 (0.0033) +[2024-06-18 02:55:11,342][12883] Updated weights for policy 0, policy_version 43881 (0.0029) +[2024-06-18 02:55:11,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 718979072. Throughput: 0: 42110.6. Samples: 719112020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:55:11,994][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 02:55:15,918][12883] Updated weights for policy 0, policy_version 43891 (0.0035) +[2024-06-18 02:55:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 719175680. Throughput: 0: 42106.2. Samples: 719243720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:55:16,994][12645] Avg episode reward: [(0, '0.161')] +[2024-06-18 02:55:19,175][12883] Updated weights for policy 0, policy_version 43901 (0.0031) +[2024-06-18 02:55:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 719388672. Throughput: 0: 42197.7. Samples: 719490000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:55:21,994][12645] Avg episode reward: [(0, '0.161')] +[2024-06-18 02:55:23,639][12883] Updated weights for policy 0, policy_version 43911 (0.0038) +[2024-06-18 02:55:26,979][12883] Updated weights for policy 0, policy_version 43921 (0.0043) +[2024-06-18 02:55:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 719601664. Throughput: 0: 42312.4. Samples: 719750940. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 02:55:26,994][12645] Avg episode reward: [(0, '0.090')] +[2024-06-18 02:55:31,313][12883] Updated weights for policy 0, policy_version 43931 (0.0028) +[2024-06-18 02:55:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 719798272. Throughput: 0: 42120.3. Samples: 719870260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) +[2024-06-18 02:55:31,994][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 02:55:34,858][12883] Updated weights for policy 0, policy_version 43941 (0.0031) +[2024-06-18 02:55:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 720027648. Throughput: 0: 42338.3. Samples: 720124480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) +[2024-06-18 02:55:36,994][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 02:55:38,442][12862] Signal inference workers to stop experience collection... (10250 times) +[2024-06-18 02:55:38,443][12862] Signal inference workers to resume experience collection... (10250 times) +[2024-06-18 02:55:38,480][12883] InferenceWorker_p0-w0: stopping experience collection (10250 times) +[2024-06-18 02:55:38,480][12883] InferenceWorker_p0-w0: resuming experience collection (10250 times) +[2024-06-18 02:55:38,821][12883] Updated weights for policy 0, policy_version 43951 (0.0032) +[2024-06-18 02:55:41,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 720224256. Throughput: 0: 42340.5. Samples: 720384700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) +[2024-06-18 02:55:42,000][12645] Avg episode reward: [(0, '0.104')] +[2024-06-18 02:55:42,557][12883] Updated weights for policy 0, policy_version 43961 (0.0039) +[2024-06-18 02:55:46,492][12883] Updated weights for policy 0, policy_version 43971 (0.0031) +[2024-06-18 02:55:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 720420864. 
Throughput: 0: 42059.5. Samples: 720509300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) +[2024-06-18 02:55:46,994][12645] Avg episode reward: [(0, '0.058')] +[2024-06-18 02:55:49,935][12883] Updated weights for policy 0, policy_version 43981 (0.0028) +[2024-06-18 02:55:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 720666624. Throughput: 0: 42225.4. Samples: 720760560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) +[2024-06-18 02:55:51,994][12645] Avg episode reward: [(0, '0.137')] +[2024-06-18 02:55:54,687][12883] Updated weights for policy 0, policy_version 43991 (0.0038) +[2024-06-18 02:55:56,998][12645] Fps is (10 sec: 44219.6, 60 sec: 42049.5, 300 sec: 42320.1). Total num frames: 720863232. Throughput: 0: 42256.6. Samples: 721013740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:55:56,998][12645] Avg episode reward: [(0, '0.088')] +[2024-06-18 02:55:58,219][12883] Updated weights for policy 0, policy_version 44001 (0.0041) +[2024-06-18 02:56:01,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 721059840. Throughput: 0: 42122.2. Samples: 721139220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:56:01,994][12645] Avg episode reward: [(0, '0.174')] +[2024-06-18 02:56:02,259][12883] Updated weights for policy 0, policy_version 44011 (0.0029) +[2024-06-18 02:56:05,873][12883] Updated weights for policy 0, policy_version 44021 (0.0054) +[2024-06-18 02:56:06,994][12645] Fps is (10 sec: 40976.0, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 721272832. Throughput: 0: 42240.4. Samples: 721390820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:56:06,994][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 02:56:10,131][12883] Updated weights for policy 0, policy_version 44031 (0.0038) +[2024-06-18 02:56:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 721485824. 
Throughput: 0: 42104.8. Samples: 721645660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:56:11,994][12645] Avg episode reward: [(0, '0.106')] +[2024-06-18 02:56:13,699][12883] Updated weights for policy 0, policy_version 44041 (0.0044) +[2024-06-18 02:56:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 721698816. Throughput: 0: 42140.5. Samples: 721766580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:56:16,994][12645] Avg episode reward: [(0, '0.058')] +[2024-06-18 02:56:17,867][12883] Updated weights for policy 0, policy_version 44051 (0.0034) +[2024-06-18 02:56:21,418][12883] Updated weights for policy 0, policy_version 44061 (0.0028) +[2024-06-18 02:56:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 721895424. Throughput: 0: 41941.7. Samples: 722011860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 02:56:21,994][12645] Avg episode reward: [(0, '0.064')] +[2024-06-18 02:56:25,644][12883] Updated weights for policy 0, policy_version 44071 (0.0027) +[2024-06-18 02:56:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 722108416. Throughput: 0: 42109.6. Samples: 722279640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 02:56:26,994][12645] Avg episode reward: [(0, '0.075')] +[2024-06-18 02:56:29,070][12883] Updated weights for policy 0, policy_version 44081 (0.0039) +[2024-06-18 02:56:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 722321408. Throughput: 0: 42056.6. Samples: 722401840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 02:56:31,994][12645] Avg episode reward: [(0, '0.042')] +[2024-06-18 02:56:33,153][12883] Updated weights for policy 0, policy_version 44091 (0.0023) +[2024-06-18 02:56:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.1, 300 sec: 42265.1). Total num frames: 722534400. 
Throughput: 0: 41931.0. Samples: 722647460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 02:56:36,995][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 02:56:37,106][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000044101_722550784.pth... +[2024-06-18 02:56:37,121][12883] Updated weights for policy 0, policy_version 44101 (0.0042) +[2024-06-18 02:56:37,162][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000043481_712392704.pth +[2024-06-18 02:56:40,822][12883] Updated weights for policy 0, policy_version 44111 (0.0034) +[2024-06-18 02:56:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 722731008. Throughput: 0: 42049.2. Samples: 722905780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 02:56:41,994][12645] Avg episode reward: [(0, '0.041')] +[2024-06-18 02:56:45,212][12883] Updated weights for policy 0, policy_version 44121 (0.0053) +[2024-06-18 02:56:46,994][12645] Fps is (10 sec: 39322.3, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 722927616. Throughput: 0: 41817.4. Samples: 723021000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 02:56:46,994][12645] Avg episode reward: [(0, '0.138')] +[2024-06-18 02:56:47,651][12862] Signal inference workers to stop experience collection... (10300 times) +[2024-06-18 02:56:47,651][12862] Signal inference workers to resume experience collection... (10300 times) +[2024-06-18 02:56:47,700][12883] InferenceWorker_p0-w0: stopping experience collection (10300 times) +[2024-06-18 02:56:47,700][12883] InferenceWorker_p0-w0: resuming experience collection (10300 times) +[2024-06-18 02:56:48,855][12883] Updated weights for policy 0, policy_version 44131 (0.0032) +[2024-06-18 02:56:51,996][12645] Fps is (10 sec: 44226.5, 60 sec: 41777.6, 300 sec: 42264.9). Total num frames: 723173376. Throughput: 0: 41910.0. Samples: 723276860. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:56:51,996][12645] Avg episode reward: [(0, '0.164')] +[2024-06-18 02:56:52,791][12883] Updated weights for policy 0, policy_version 44141 (0.0033) +[2024-06-18 02:56:56,741][12883] Updated weights for policy 0, policy_version 44151 (0.0034) +[2024-06-18 02:56:56,994][12645] Fps is (10 sec: 44236.0, 60 sec: 41781.9, 300 sec: 42154.1). Total num frames: 723369984. Throughput: 0: 42003.0. Samples: 723535800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:56:56,994][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 02:57:00,438][12883] Updated weights for policy 0, policy_version 44161 (0.0041) +[2024-06-18 02:57:01,994][12645] Fps is (10 sec: 39330.2, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 723566592. Throughput: 0: 42132.5. Samples: 723662540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:57:01,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 02:57:04,343][12883] Updated weights for policy 0, policy_version 44171 (0.0043) +[2024-06-18 02:57:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 723795968. Throughput: 0: 42252.4. Samples: 723913220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:57:06,994][12645] Avg episode reward: [(0, '0.097')] +[2024-06-18 02:57:08,258][12883] Updated weights for policy 0, policy_version 44181 (0.0029) +[2024-06-18 02:57:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42154.3). Total num frames: 724008960. Throughput: 0: 41799.8. Samples: 724160620. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:57:11,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 02:57:12,711][12883] Updated weights for policy 0, policy_version 44191 (0.0036) +[2024-06-18 02:57:16,419][12883] Updated weights for policy 0, policy_version 44201 (0.0040) +[2024-06-18 02:57:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 724189184. Throughput: 0: 41782.6. Samples: 724282060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) +[2024-06-18 02:57:16,994][12645] Avg episode reward: [(0, '0.073')] +[2024-06-18 02:57:20,266][12883] Updated weights for policy 0, policy_version 44211 (0.0038) +[2024-06-18 02:57:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 724434944. Throughput: 0: 41978.3. Samples: 724536480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:57:21,994][12645] Avg episode reward: [(0, '0.120')] +[2024-06-18 02:57:24,645][12883] Updated weights for policy 0, policy_version 44221 (0.0054) +[2024-06-18 02:57:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 724631552. Throughput: 0: 41758.1. Samples: 724784900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:57:26,994][12645] Avg episode reward: [(0, '0.197')] +[2024-06-18 02:57:27,882][12883] Updated weights for policy 0, policy_version 44231 (0.0034) +[2024-06-18 02:57:31,996][12645] Fps is (10 sec: 39313.1, 60 sec: 41777.6, 300 sec: 42153.8). Total num frames: 724828160. Throughput: 0: 42005.4. Samples: 724911340. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:57:31,996][12645] Avg episode reward: [(0, '0.160')] +[2024-06-18 02:57:32,257][12883] Updated weights for policy 0, policy_version 44241 (0.0041) +[2024-06-18 02:57:35,594][12883] Updated weights for policy 0, policy_version 44251 (0.0031) +[2024-06-18 02:57:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42155.0). Total num frames: 725041152. Throughput: 0: 41990.6. Samples: 725166340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:57:36,994][12645] Avg episode reward: [(0, '0.064')] +[2024-06-18 02:57:40,224][12883] Updated weights for policy 0, policy_version 44261 (0.0026) +[2024-06-18 02:57:41,996][12645] Fps is (10 sec: 44236.7, 60 sec: 42323.7, 300 sec: 42264.9). Total num frames: 725270528. Throughput: 0: 41758.5. Samples: 725415020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:57:41,996][12645] Avg episode reward: [(0, '0.082')] +[2024-06-18 02:57:43,402][12883] Updated weights for policy 0, policy_version 44271 (0.0039) +[2024-06-18 02:57:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 725467136. Throughput: 0: 41867.5. Samples: 725546580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 02:57:46,994][12645] Avg episode reward: [(0, '0.147')] +[2024-06-18 02:57:47,624][12883] Updated weights for policy 0, policy_version 44281 (0.0027) +[2024-06-18 02:57:51,429][12883] Updated weights for policy 0, policy_version 44291 (0.0051) +[2024-06-18 02:57:51,994][12645] Fps is (10 sec: 39330.2, 60 sec: 41507.7, 300 sec: 42098.5). Total num frames: 725663744. Throughput: 0: 41774.3. Samples: 725793060. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 02:57:51,994][12645] Avg episode reward: [(0, '0.128')] +[2024-06-18 02:57:55,350][12883] Updated weights for policy 0, policy_version 44301 (0.0034) +[2024-06-18 02:57:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42209.9). Total num frames: 725893120. Throughput: 0: 41907.4. Samples: 726046460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 02:57:56,996][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 02:57:58,855][12883] Updated weights for policy 0, policy_version 44311 (0.0040) +[2024-06-18 02:58:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 726089728. Throughput: 0: 42170.7. Samples: 726179740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 02:58:01,994][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 02:58:03,110][12883] Updated weights for policy 0, policy_version 44321 (0.0032) +[2024-06-18 02:58:06,369][12883] Updated weights for policy 0, policy_version 44331 (0.0020) +[2024-06-18 02:58:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 726319104. Throughput: 0: 42206.7. Samples: 726435780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 02:58:06,994][12645] Avg episode reward: [(0, '0.133')] +[2024-06-18 02:58:10,794][12883] Updated weights for policy 0, policy_version 44341 (0.0031) +[2024-06-18 02:58:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 726532096. Throughput: 0: 42307.5. Samples: 726688740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 02:58:11,994][12645] Avg episode reward: [(0, '0.119')] +[2024-06-18 02:58:14,387][12883] Updated weights for policy 0, policy_version 44351 (0.0041) +[2024-06-18 02:58:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 726728704. Throughput: 0: 42275.3. Samples: 726813640. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) +[2024-06-18 02:58:16,994][12645] Avg episode reward: [(0, '0.154')] +[2024-06-18 02:58:18,228][12883] Updated weights for policy 0, policy_version 44361 (0.0034) +[2024-06-18 02:58:19,558][12862] Signal inference workers to stop experience collection... (10350 times) +[2024-06-18 02:58:19,558][12862] Signal inference workers to resume experience collection... (10350 times) +[2024-06-18 02:58:19,588][12883] InferenceWorker_p0-w0: stopping experience collection (10350 times) +[2024-06-18 02:58:19,589][12883] InferenceWorker_p0-w0: resuming experience collection (10350 times) +[2024-06-18 02:58:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 726958080. Throughput: 0: 42256.8. Samples: 727067900. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) +[2024-06-18 02:58:21,994][12645] Avg episode reward: [(0, '0.104')] +[2024-06-18 02:58:22,049][12883] Updated weights for policy 0, policy_version 44371 (0.0040) +[2024-06-18 02:58:26,888][12883] Updated weights for policy 0, policy_version 44381 (0.0043) +[2024-06-18 02:58:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 727138304. Throughput: 0: 42468.7. Samples: 727326020. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) +[2024-06-18 02:58:26,994][12645] Avg episode reward: [(0, '0.083')] +[2024-06-18 02:58:29,625][12883] Updated weights for policy 0, policy_version 44391 (0.0038) +[2024-06-18 02:58:31,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42326.8, 300 sec: 42265.1). Total num frames: 727367680. Throughput: 0: 42071.5. Samples: 727439800. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) +[2024-06-18 02:58:31,994][12645] Avg episode reward: [(0, '0.144')] +[2024-06-18 02:58:34,789][12883] Updated weights for policy 0, policy_version 44401 (0.0038) +[2024-06-18 02:58:36,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 727597056. 
Throughput: 0: 42280.9. Samples: 727695700. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) +[2024-06-18 02:58:36,994][12645] Avg episode reward: [(0, '0.171')] +[2024-06-18 02:58:37,149][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000044410_727613440.pth... +[2024-06-18 02:58:37,201][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000043790_717455360.pth +[2024-06-18 02:58:37,906][12883] Updated weights for policy 0, policy_version 44411 (0.0036) +[2024-06-18 02:58:41,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41780.8, 300 sec: 42098.5). Total num frames: 727777280. Throughput: 0: 42346.8. Samples: 727952060. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) +[2024-06-18 02:58:41,994][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 02:58:42,313][12883] Updated weights for policy 0, policy_version 44421 (0.0032) +[2024-06-18 02:58:45,400][12883] Updated weights for policy 0, policy_version 44431 (0.0037) +[2024-06-18 02:58:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 728023040. Throughput: 0: 42167.5. Samples: 728077280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 02:58:46,994][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 02:58:50,100][12883] Updated weights for policy 0, policy_version 44441 (0.0029) +[2024-06-18 02:58:51,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 728219648. Throughput: 0: 42303.4. Samples: 728339440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 02:58:51,995][12645] Avg episode reward: [(0, '0.088')] +[2024-06-18 02:58:53,062][12883] Updated weights for policy 0, policy_version 44451 (0.0039) +[2024-06-18 02:58:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 728432640. Throughput: 0: 42284.9. Samples: 728591560. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 02:58:56,994][12645] Avg episode reward: [(0, '0.061')] +[2024-06-18 02:58:57,820][12883] Updated weights for policy 0, policy_version 44461 (0.0052) +[2024-06-18 02:59:00,601][12883] Updated weights for policy 0, policy_version 44471 (0.0033) +[2024-06-18 02:59:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 728662016. Throughput: 0: 42340.9. Samples: 728718980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 02:59:02,003][12645] Avg episode reward: [(0, '0.075')] +[2024-06-18 02:59:05,481][12883] Updated weights for policy 0, policy_version 44481 (0.0041) +[2024-06-18 02:59:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 728842240. Throughput: 0: 42414.2. Samples: 728976540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 02:59:06,994][12645] Avg episode reward: [(0, '0.092')] +[2024-06-18 02:59:08,210][12883] Updated weights for policy 0, policy_version 44491 (0.0033) +[2024-06-18 02:59:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 729055232. Throughput: 0: 42140.8. Samples: 729222360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 02:59:11,994][12645] Avg episode reward: [(0, '0.150')] +[2024-06-18 02:59:13,218][12883] Updated weights for policy 0, policy_version 44501 (0.0030) +[2024-06-18 02:59:16,625][12883] Updated weights for policy 0, policy_version 44511 (0.0035) +[2024-06-18 02:59:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 729284608. Throughput: 0: 42613.1. Samples: 729357380. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 02:59:16,994][12645] Avg episode reward: [(0, '0.119')] +[2024-06-18 02:59:20,907][12883] Updated weights for policy 0, policy_version 44521 (0.0039) +[2024-06-18 02:59:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 729464832. Throughput: 0: 42481.2. Samples: 729607360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 02:59:22,003][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 02:59:24,175][12883] Updated weights for policy 0, policy_version 44531 (0.0028) +[2024-06-18 02:59:26,995][12645] Fps is (10 sec: 42592.1, 60 sec: 42870.5, 300 sec: 42209.4). Total num frames: 729710592. Throughput: 0: 42334.6. Samples: 729857180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 02:59:26,996][12645] Avg episode reward: [(0, '0.033')] +[2024-06-18 02:59:28,692][12883] Updated weights for policy 0, policy_version 44541 (0.0042) +[2024-06-18 02:59:30,869][12862] Signal inference workers to stop experience collection... (10400 times) +[2024-06-18 02:59:30,917][12883] InferenceWorker_p0-w0: stopping experience collection (10400 times) +[2024-06-18 02:59:30,989][12862] Signal inference workers to resume experience collection... (10400 times) +[2024-06-18 02:59:30,989][12883] InferenceWorker_p0-w0: resuming experience collection (10400 times) +[2024-06-18 02:59:31,963][12883] Updated weights for policy 0, policy_version 44551 (0.0036) +[2024-06-18 02:59:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 729923584. Throughput: 0: 42555.5. Samples: 729992280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 02:59:31,994][12645] Avg episode reward: [(0, '0.144')] +[2024-06-18 02:59:36,423][12883] Updated weights for policy 0, policy_version 44561 (0.0027) +[2024-06-18 02:59:36,994][12645] Fps is (10 sec: 39326.8, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 730103808. 
Throughput: 0: 42293.4. Samples: 730242640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 02:59:36,994][12645] Avg episode reward: [(0, '0.205')] +[2024-06-18 02:59:39,526][12883] Updated weights for policy 0, policy_version 44571 (0.0040) +[2024-06-18 02:59:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 730349568. Throughput: 0: 42238.7. Samples: 730492300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 02:59:41,994][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 02:59:44,191][12883] Updated weights for policy 0, policy_version 44581 (0.0038) +[2024-06-18 02:59:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 730546176. Throughput: 0: 42322.8. Samples: 730623500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 02:59:46,994][12645] Avg episode reward: [(0, '0.109')] +[2024-06-18 02:59:47,232][12883] Updated weights for policy 0, policy_version 44591 (0.0035) +[2024-06-18 02:59:51,763][12883] Updated weights for policy 0, policy_version 44601 (0.0045) +[2024-06-18 02:59:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 730742784. Throughput: 0: 42184.0. Samples: 730874820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 02:59:51,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 02:59:54,869][12883] Updated weights for policy 0, policy_version 44611 (0.0038) +[2024-06-18 02:59:56,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42596.8, 300 sec: 42264.9). Total num frames: 730988544. Throughput: 0: 42343.4. Samples: 731127900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 02:59:56,997][12645] Avg episode reward: [(0, '0.099')] +[2024-06-18 02:59:59,533][12883] Updated weights for policy 0, policy_version 44621 (0.0039) +[2024-06-18 03:00:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 731185152. 
Throughput: 0: 42270.2. Samples: 731259540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 03:00:01,994][12645] Avg episode reward: [(0, '0.069')] +[2024-06-18 03:00:02,757][12883] Updated weights for policy 0, policy_version 44631 (0.0026) +[2024-06-18 03:00:06,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 731381760. Throughput: 0: 42237.9. Samples: 731508060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 03:00:06,994][12645] Avg episode reward: [(0, '0.151')] +[2024-06-18 03:00:07,350][12883] Updated weights for policy 0, policy_version 44641 (0.0034) +[2024-06-18 03:00:10,802][12883] Updated weights for policy 0, policy_version 44651 (0.0041) +[2024-06-18 03:00:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42209.6). Total num frames: 731627520. Throughput: 0: 42256.5. Samples: 731758660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 03:00:11,994][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 03:00:15,213][12883] Updated weights for policy 0, policy_version 44661 (0.0030) +[2024-06-18 03:00:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 731807744. Throughput: 0: 42256.9. Samples: 731893840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 03:00:16,994][12645] Avg episode reward: [(0, '0.149')] +[2024-06-18 03:00:18,403][12883] Updated weights for policy 0, policy_version 44671 (0.0039) +[2024-06-18 03:00:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 732020736. Throughput: 0: 42265.0. Samples: 732144560. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 03:00:21,994][12645] Avg episode reward: [(0, '0.181')] +[2024-06-18 03:00:22,857][12883] Updated weights for policy 0, policy_version 44681 (0.0030) +[2024-06-18 03:00:26,050][12883] Updated weights for policy 0, policy_version 44691 (0.0026) +[2024-06-18 03:00:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42326.3, 300 sec: 42209.6). Total num frames: 732250112. Throughput: 0: 42339.6. Samples: 732397580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 03:00:26,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 03:00:30,381][12883] Updated weights for policy 0, policy_version 44701 (0.0030) +[2024-06-18 03:00:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 732430336. Throughput: 0: 42391.5. Samples: 732531120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 03:00:31,994][12645] Avg episode reward: [(0, '0.063')] +[2024-06-18 03:00:33,623][12883] Updated weights for policy 0, policy_version 44711 (0.0036) +[2024-06-18 03:00:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 732659712. Throughput: 0: 42425.2. Samples: 732783960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 03:00:36,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 03:00:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000044718_732659712.pth... +[2024-06-18 03:00:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000044101_722550784.pth +[2024-06-18 03:00:38,196][12883] Updated weights for policy 0, policy_version 44721 (0.0022) +[2024-06-18 03:00:41,106][12883] Updated weights for policy 0, policy_version 44731 (0.0033) +[2024-06-18 03:00:41,994][12645] Fps is (10 sec: 47513.0, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 732905472. Throughput: 0: 42472.6. Samples: 733039080. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 03:00:41,994][12645] Avg episode reward: [(0, '0.099')] +[2024-06-18 03:00:45,915][12883] Updated weights for policy 0, policy_version 44741 (0.0026) +[2024-06-18 03:00:46,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 733085696. Throughput: 0: 42389.9. Samples: 733167080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 03:00:46,994][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 03:00:49,226][12883] Updated weights for policy 0, policy_version 44751 (0.0035) +[2024-06-18 03:00:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42210.2). Total num frames: 733315072. Throughput: 0: 42432.8. Samples: 733417540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 03:00:51,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 03:00:53,447][12883] Updated weights for policy 0, policy_version 44761 (0.0026) +[2024-06-18 03:00:56,723][12883] Updated weights for policy 0, policy_version 44771 (0.0028) +[2024-06-18 03:00:56,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42326.8, 300 sec: 42265.2). Total num frames: 733528064. Throughput: 0: 42588.3. Samples: 733675140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 03:00:56,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 03:01:01,178][12883] Updated weights for policy 0, policy_version 44781 (0.0031) +[2024-06-18 03:01:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 733724672. Throughput: 0: 42392.0. Samples: 733801480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 03:01:01,994][12645] Avg episode reward: [(0, '0.151')] +[2024-06-18 03:01:04,465][12883] Updated weights for policy 0, policy_version 44791 (0.0035) +[2024-06-18 03:01:06,081][12862] Signal inference workers to stop experience collection... 
(10450 times) +[2024-06-18 03:01:06,119][12883] InferenceWorker_p0-w0: stopping experience collection (10450 times) +[2024-06-18 03:01:06,139][12862] Signal inference workers to resume experience collection... (10450 times) +[2024-06-18 03:01:06,144][12883] InferenceWorker_p0-w0: resuming experience collection (10450 times) +[2024-06-18 03:01:06,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 733954048. Throughput: 0: 42516.0. Samples: 734057780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:01:06,994][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 03:01:08,889][12883] Updated weights for policy 0, policy_version 44801 (0.0037) +[2024-06-18 03:01:11,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 734167040. Throughput: 0: 42457.8. Samples: 734308180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:01:11,994][12645] Avg episode reward: [(0, '0.081')] +[2024-06-18 03:01:12,193][12883] Updated weights for policy 0, policy_version 44811 (0.0044) +[2024-06-18 03:01:16,526][12883] Updated weights for policy 0, policy_version 44821 (0.0049) +[2024-06-18 03:01:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42320.7). Total num frames: 734380032. Throughput: 0: 42359.6. Samples: 734437300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:01:16,994][12645] Avg episode reward: [(0, '0.081')] +[2024-06-18 03:01:19,751][12883] Updated weights for policy 0, policy_version 44831 (0.0024) +[2024-06-18 03:01:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 734593024. Throughput: 0: 42440.0. Samples: 734693760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:01:21,994][12645] Avg episode reward: [(0, '0.053')] +[2024-06-18 03:01:24,162][12883] Updated weights for policy 0, policy_version 44841 (0.0032) +[2024-06-18 03:01:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 734822400. Throughput: 0: 42488.1. Samples: 734951040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:01:26,994][12645] Avg episode reward: [(0, '0.084')] +[2024-06-18 03:01:27,394][12883] Updated weights for policy 0, policy_version 44851 (0.0043) +[2024-06-18 03:01:31,768][12883] Updated weights for policy 0, policy_version 44861 (0.0024) +[2024-06-18 03:01:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 735002624. Throughput: 0: 42554.1. Samples: 735082020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:01:31,994][12645] Avg episode reward: [(0, '0.073')] +[2024-06-18 03:01:35,219][12883] Updated weights for policy 0, policy_version 44871 (0.0042) +[2024-06-18 03:01:36,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 42265.1). Total num frames: 735199232. Throughput: 0: 42592.9. Samples: 735334220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:01:36,994][12645] Avg episode reward: [(0, '0.028')] +[2024-06-18 03:01:39,818][12883] Updated weights for policy 0, policy_version 44881 (0.0037) +[2024-06-18 03:01:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 735444992. Throughput: 0: 42429.5. Samples: 735584460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:01:41,994][12645] Avg episode reward: [(0, '0.104')] +[2024-06-18 03:01:43,176][12883] Updated weights for policy 0, policy_version 44891 (0.0042) +[2024-06-18 03:01:46,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.1, 300 sec: 42154.4). Total num frames: 735608832. Throughput: 0: 42377.3. Samples: 735708460. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:01:46,995][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 03:01:47,698][12883] Updated weights for policy 0, policy_version 44901 (0.0038) +[2024-06-18 03:01:51,340][12883] Updated weights for policy 0, policy_version 44911 (0.0042) +[2024-06-18 03:01:51,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 735854592. Throughput: 0: 42393.2. Samples: 735965480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:01:51,994][12645] Avg episode reward: [(0, '0.119')] +[2024-06-18 03:01:55,364][12883] Updated weights for policy 0, policy_version 44921 (0.0028) +[2024-06-18 03:01:56,994][12645] Fps is (10 sec: 47514.4, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 736083968. Throughput: 0: 42437.0. Samples: 736217840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:01:56,994][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 03:01:59,110][12883] Updated weights for policy 0, policy_version 44931 (0.0032) +[2024-06-18 03:02:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 736264192. Throughput: 0: 42635.0. Samples: 736355880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:02:01,994][12645] Avg episode reward: [(0, '0.028')] +[2024-06-18 03:02:02,844][12883] Updated weights for policy 0, policy_version 44941 (0.0039) +[2024-06-18 03:02:06,779][12883] Updated weights for policy 0, policy_version 44951 (0.0037) +[2024-06-18 03:02:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 736493568. Throughput: 0: 42539.7. Samples: 736608040. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 03:02:06,994][12645] Avg episode reward: [(0, '0.048')] +[2024-06-18 03:02:10,527][12883] Updated weights for policy 0, policy_version 44961 (0.0036) +[2024-06-18 03:02:11,996][12645] Fps is (10 sec: 45865.2, 60 sec: 42596.8, 300 sec: 42487.0). Total num frames: 736722944. Throughput: 0: 42398.3. Samples: 736859060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 03:02:11,996][12645] Avg episode reward: [(0, '0.241')] +[2024-06-18 03:02:14,474][12883] Updated weights for policy 0, policy_version 44971 (0.0042) +[2024-06-18 03:02:16,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 736886784. Throughput: 0: 42457.3. Samples: 736992600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 03:02:16,994][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 03:02:17,993][12883] Updated weights for policy 0, policy_version 44981 (0.0034) +[2024-06-18 03:02:21,994][12645] Fps is (10 sec: 39330.7, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 737116160. Throughput: 0: 42443.6. Samples: 737244180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 03:02:21,994][12645] Avg episode reward: [(0, '0.063')] +[2024-06-18 03:02:22,047][12883] Updated weights for policy 0, policy_version 44991 (0.0040) +[2024-06-18 03:02:25,618][12883] Updated weights for policy 0, policy_version 45001 (0.0033) +[2024-06-18 03:02:26,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 737361920. Throughput: 0: 42571.5. Samples: 737500180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 03:02:26,994][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 03:02:29,747][12883] Updated weights for policy 0, policy_version 45011 (0.0047) +[2024-06-18 03:02:31,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 737542144. Throughput: 0: 42746.6. Samples: 737632060. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 03:02:31,995][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 03:02:33,248][12883] Updated weights for policy 0, policy_version 45021 (0.0040) +[2024-06-18 03:02:34,643][12862] Signal inference workers to stop experience collection... (10500 times) +[2024-06-18 03:02:34,664][12883] InferenceWorker_p0-w0: stopping experience collection (10500 times) +[2024-06-18 03:02:34,757][12862] Signal inference workers to resume experience collection... (10500 times) +[2024-06-18 03:02:34,757][12883] InferenceWorker_p0-w0: resuming experience collection (10500 times) +[2024-06-18 03:02:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42321.0). Total num frames: 737755136. Throughput: 0: 42503.2. Samples: 737878120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 03:02:36,994][12645] Avg episode reward: [(0, '0.118')] +[2024-06-18 03:02:37,113][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045030_737771520.pth... +[2024-06-18 03:02:37,172][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000044410_727613440.pth +[2024-06-18 03:02:37,433][12883] Updated weights for policy 0, policy_version 45031 (0.0036) +[2024-06-18 03:02:41,149][12883] Updated weights for policy 0, policy_version 45041 (0.0046) +[2024-06-18 03:02:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 737984512. Throughput: 0: 42480.3. Samples: 738129460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 03:02:41,994][12645] Avg episode reward: [(0, '0.159')] +[2024-06-18 03:02:45,361][12883] Updated weights for policy 0, policy_version 45051 (0.0036) +[2024-06-18 03:02:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 738181120. Throughput: 0: 42313.8. Samples: 738260000. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 03:02:46,994][12645] Avg episode reward: [(0, '0.117')] +[2024-06-18 03:02:48,837][12883] Updated weights for policy 0, policy_version 45061 (0.0050) +[2024-06-18 03:02:51,996][12645] Fps is (10 sec: 39313.1, 60 sec: 42050.8, 300 sec: 42320.4). Total num frames: 738377728. Throughput: 0: 42191.7. Samples: 738506760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 03:02:51,997][12645] Avg episode reward: [(0, '0.081')] +[2024-06-18 03:02:53,476][12883] Updated weights for policy 0, policy_version 45071 (0.0036) +[2024-06-18 03:02:56,321][12883] Updated weights for policy 0, policy_version 45081 (0.0026) +[2024-06-18 03:02:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 738607104. Throughput: 0: 42350.5. Samples: 738764740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 03:02:56,994][12645] Avg episode reward: [(0, '0.062')] +[2024-06-18 03:03:01,112][12883] Updated weights for policy 0, policy_version 45091 (0.0042) +[2024-06-18 03:03:01,994][12645] Fps is (10 sec: 44247.0, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 738820096. Throughput: 0: 42392.6. Samples: 738900260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:03:01,994][12645] Avg episode reward: [(0, '0.164')] +[2024-06-18 03:03:03,899][12883] Updated weights for policy 0, policy_version 45101 (0.0031) +[2024-06-18 03:03:06,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 739016704. Throughput: 0: 42338.7. Samples: 739149420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:03:06,994][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 03:03:08,917][12883] Updated weights for policy 0, policy_version 45111 (0.0038) +[2024-06-18 03:03:11,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42053.8, 300 sec: 42431.8). Total num frames: 739246080. Throughput: 0: 42192.4. Samples: 739398840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:03:11,994][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 03:03:12,153][12883] Updated weights for policy 0, policy_version 45121 (0.0034) +[2024-06-18 03:03:16,537][12883] Updated weights for policy 0, policy_version 45131 (0.0034) +[2024-06-18 03:03:17,000][12645] Fps is (10 sec: 44208.8, 60 sec: 42867.1, 300 sec: 42375.3). Total num frames: 739459072. Throughput: 0: 42274.7. Samples: 739534680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:03:17,001][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 03:03:19,819][12883] Updated weights for policy 0, policy_version 45141 (0.0038) +[2024-06-18 03:03:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 739672064. Throughput: 0: 42484.0. Samples: 739789900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:03:21,994][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 03:03:24,461][12883] Updated weights for policy 0, policy_version 45151 (0.0043) +[2024-06-18 03:03:26,994][12645] Fps is (10 sec: 44264.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 739901440. Throughput: 0: 42440.4. Samples: 740039280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:03:26,994][12645] Avg episode reward: [(0, '0.117')] +[2024-06-18 03:03:27,222][12883] Updated weights for policy 0, policy_version 45161 (0.0037) +[2024-06-18 03:03:31,973][12883] Updated weights for policy 0, policy_version 45171 (0.0033) +[2024-06-18 03:03:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 740081664. Throughput: 0: 42423.5. Samples: 740169060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:03:31,994][12645] Avg episode reward: [(0, '0.011')] +[2024-06-18 03:03:35,334][12883] Updated weights for policy 0, policy_version 45181 (0.0034) +[2024-06-18 03:03:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 740294656. Throughput: 0: 42535.8. Samples: 740420780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:03:36,994][12645] Avg episode reward: [(0, '0.093')] +[2024-06-18 03:03:40,039][12883] Updated weights for policy 0, policy_version 45191 (0.0034) +[2024-06-18 03:03:41,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42323.8, 300 sec: 42375.9). Total num frames: 740524032. Throughput: 0: 42266.3. Samples: 740666820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:03:41,997][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 03:03:43,371][12883] Updated weights for policy 0, policy_version 45201 (0.0037) +[2024-06-18 03:03:46,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42323.8, 300 sec: 42375.9). Total num frames: 740720640. Throughput: 0: 42269.8. Samples: 740802500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:03:46,996][12645] Avg episode reward: [(0, '0.058')] +[2024-06-18 03:03:47,513][12883] Updated weights for policy 0, policy_version 45211 (0.0029) +[2024-06-18 03:03:49,071][12862] Signal inference workers to stop experience collection... (10550 times) +[2024-06-18 03:03:49,072][12862] Signal inference workers to resume experience collection... (10550 times) +[2024-06-18 03:03:49,101][12883] InferenceWorker_p0-w0: stopping experience collection (10550 times) +[2024-06-18 03:03:49,101][12883] InferenceWorker_p0-w0: resuming experience collection (10550 times) +[2024-06-18 03:03:51,158][12883] Updated weights for policy 0, policy_version 45221 (0.0035) +[2024-06-18 03:03:51,994][12645] Fps is (10 sec: 39330.9, 60 sec: 42327.0, 300 sec: 42320.7). Total num frames: 740917248. 
Throughput: 0: 42328.0. Samples: 741054180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:03:51,994][12645] Avg episode reward: [(0, '0.120')] +[2024-06-18 03:03:54,946][12883] Updated weights for policy 0, policy_version 45231 (0.0039) +[2024-06-18 03:03:56,994][12645] Fps is (10 sec: 44246.7, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 741163008. Throughput: 0: 42296.5. Samples: 741302180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:03:56,994][12645] Avg episode reward: [(0, '0.120')] +[2024-06-18 03:03:59,083][12883] Updated weights for policy 0, policy_version 45241 (0.0038) +[2024-06-18 03:04:01,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 741359616. Throughput: 0: 42246.3. Samples: 741435500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 03:04:01,994][12645] Avg episode reward: [(0, '0.090')] +[2024-06-18 03:04:02,446][12883] Updated weights for policy 0, policy_version 45251 (0.0030) +[2024-06-18 03:04:06,690][12883] Updated weights for policy 0, policy_version 45261 (0.0034) +[2024-06-18 03:04:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42376.3). Total num frames: 741556224. Throughput: 0: 42267.1. Samples: 741691920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 03:04:06,994][12645] Avg episode reward: [(0, '0.084')] +[2024-06-18 03:04:10,326][12883] Updated weights for policy 0, policy_version 45271 (0.0040) +[2024-06-18 03:04:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 741801984. Throughput: 0: 42206.7. Samples: 741938580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 03:04:11,994][12645] Avg episode reward: [(0, '0.043')] +[2024-06-18 03:04:14,420][12883] Updated weights for policy 0, policy_version 45281 (0.0031) +[2024-06-18 03:04:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42056.6, 300 sec: 42431.8). Total num frames: 741982208. 
Throughput: 0: 42220.9. Samples: 742069000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 03:04:16,994][12645] Avg episode reward: [(0, '0.117')] +[2024-06-18 03:04:18,139][12883] Updated weights for policy 0, policy_version 45291 (0.0027) +[2024-06-18 03:04:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42320.9). Total num frames: 742195200. Throughput: 0: 42173.8. Samples: 742318600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 03:04:21,995][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 03:04:22,341][12883] Updated weights for policy 0, policy_version 45301 (0.0030) +[2024-06-18 03:04:25,846][12883] Updated weights for policy 0, policy_version 45311 (0.0037) +[2024-06-18 03:04:26,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 742440960. Throughput: 0: 42360.7. Samples: 742572960. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) +[2024-06-18 03:04:26,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 03:04:29,847][12883] Updated weights for policy 0, policy_version 45321 (0.0025) +[2024-06-18 03:04:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 742637568. Throughput: 0: 42396.6. Samples: 742710260. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) +[2024-06-18 03:04:31,995][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 03:04:33,297][12883] Updated weights for policy 0, policy_version 45331 (0.0028) +[2024-06-18 03:04:36,994][12645] Fps is (10 sec: 37684.1, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 742817792. Throughput: 0: 42271.1. Samples: 742956380. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) +[2024-06-18 03:04:36,994][12645] Avg episode reward: [(0, '0.115')] +[2024-06-18 03:04:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045339_742834176.pth... 
+[2024-06-18 03:04:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000044718_732659712.pth +[2024-06-18 03:04:38,031][12883] Updated weights for policy 0, policy_version 45341 (0.0028) +[2024-06-18 03:04:41,151][12883] Updated weights for policy 0, policy_version 45351 (0.0037) +[2024-06-18 03:04:41,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42326.9, 300 sec: 42431.8). Total num frames: 743063552. Throughput: 0: 42345.4. Samples: 743207720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) +[2024-06-18 03:04:41,994][12645] Avg episode reward: [(0, '0.125')] +[2024-06-18 03:04:45,613][12883] Updated weights for policy 0, policy_version 45361 (0.0034) +[2024-06-18 03:04:46,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 743276544. Throughput: 0: 42464.1. Samples: 743346380. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) +[2024-06-18 03:04:46,994][12645] Avg episode reward: [(0, '0.170')] +[2024-06-18 03:04:48,979][12883] Updated weights for policy 0, policy_version 45371 (0.0042) +[2024-06-18 03:04:51,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42265.5). Total num frames: 743456768. Throughput: 0: 42202.6. Samples: 743591040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) +[2024-06-18 03:04:51,994][12645] Avg episode reward: [(0, '0.181')] +[2024-06-18 03:04:53,353][12883] Updated weights for policy 0, policy_version 45381 (0.0030) +[2024-06-18 03:04:56,586][12883] Updated weights for policy 0, policy_version 45391 (0.0029) +[2024-06-18 03:04:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 743686144. Throughput: 0: 42275.1. Samples: 743840960. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:04:56,996][12645] Avg episode reward: [(0, '0.139')] +[2024-06-18 03:05:01,088][12883] Updated weights for policy 0, policy_version 45401 (0.0031) +[2024-06-18 03:05:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 743882752. Throughput: 0: 42383.5. Samples: 743976260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:05:01,994][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 03:05:04,313][12883] Updated weights for policy 0, policy_version 45411 (0.0023) +[2024-06-18 03:05:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 744095744. Throughput: 0: 42383.5. Samples: 744225860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:05:06,994][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 03:05:08,662][12883] Updated weights for policy 0, policy_version 45421 (0.0028) +[2024-06-18 03:05:11,933][12883] Updated weights for policy 0, policy_version 45431 (0.0032) +[2024-06-18 03:05:11,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 744341504. Throughput: 0: 42381.5. Samples: 744480120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:05:11,994][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 03:05:15,909][12862] Signal inference workers to stop experience collection... (10600 times) +[2024-06-18 03:05:15,910][12862] Signal inference workers to resume experience collection... (10600 times) +[2024-06-18 03:05:15,932][12883] InferenceWorker_p0-w0: stopping experience collection (10600 times) +[2024-06-18 03:05:15,936][12883] InferenceWorker_p0-w0: resuming experience collection (10600 times) +[2024-06-18 03:05:16,432][12883] Updated weights for policy 0, policy_version 45441 (0.0032) +[2024-06-18 03:05:16,997][12645] Fps is (10 sec: 44220.9, 60 sec: 42595.8, 300 sec: 42431.2). Total num frames: 744538112. 
Throughput: 0: 42280.7. Samples: 744613040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:05:16,998][12645] Avg episode reward: [(0, '0.159')] +[2024-06-18 03:05:19,650][12883] Updated weights for policy 0, policy_version 45451 (0.0039) +[2024-06-18 03:05:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 744751104. Throughput: 0: 42442.6. Samples: 744866300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 03:05:21,994][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 03:05:24,041][12883] Updated weights for policy 0, policy_version 45461 (0.0031) +[2024-06-18 03:05:26,994][12645] Fps is (10 sec: 44253.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 744980480. Throughput: 0: 42503.5. Samples: 745120380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 03:05:26,994][12645] Avg episode reward: [(0, '0.160')] +[2024-06-18 03:05:27,182][12883] Updated weights for policy 0, policy_version 45471 (0.0028) +[2024-06-18 03:05:31,687][12883] Updated weights for policy 0, policy_version 45481 (0.0042) +[2024-06-18 03:05:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 745160704. Throughput: 0: 42290.6. Samples: 745249460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 03:05:31,994][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 03:05:34,849][12883] Updated weights for policy 0, policy_version 45491 (0.0027) +[2024-06-18 03:05:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42376.3). Total num frames: 745406464. Throughput: 0: 42516.1. Samples: 745504260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 03:05:36,994][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 03:05:39,590][12883] Updated weights for policy 0, policy_version 45501 (0.0035) +[2024-06-18 03:05:41,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 745619456. 
Throughput: 0: 42752.9. Samples: 745764840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 03:05:41,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 03:05:42,510][12883] Updated weights for policy 0, policy_version 45511 (0.0032) +[2024-06-18 03:05:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 745799680. Throughput: 0: 42364.0. Samples: 745882640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 03:05:46,994][12645] Avg episode reward: [(0, '0.067')] +[2024-06-18 03:05:47,237][12883] Updated weights for policy 0, policy_version 45521 (0.0031) +[2024-06-18 03:05:50,799][12883] Updated weights for policy 0, policy_version 45531 (0.0037) +[2024-06-18 03:05:51,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42376.3). Total num frames: 746029056. Throughput: 0: 42529.5. Samples: 746139680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 03:05:51,994][12645] Avg episode reward: [(0, '0.026')] +[2024-06-18 03:05:54,964][12883] Updated weights for policy 0, policy_version 45541 (0.0038) +[2024-06-18 03:05:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 746225664. Throughput: 0: 42519.1. Samples: 746393480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 03:05:56,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 03:05:58,605][12883] Updated weights for policy 0, policy_version 45551 (0.0034) +[2024-06-18 03:06:01,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 746438656. Throughput: 0: 42436.8. Samples: 746522540. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 03:06:01,998][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 03:06:02,663][12883] Updated weights for policy 0, policy_version 45561 (0.0038) +[2024-06-18 03:06:06,074][12883] Updated weights for policy 0, policy_version 45571 (0.0048) +[2024-06-18 03:06:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 746668032. Throughput: 0: 42380.9. Samples: 746773440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 03:06:06,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 03:06:10,291][12883] Updated weights for policy 0, policy_version 45581 (0.0035) +[2024-06-18 03:06:11,997][12645] Fps is (10 sec: 42583.3, 60 sec: 42049.7, 300 sec: 42320.2). Total num frames: 746864640. Throughput: 0: 42291.3. Samples: 747023640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 03:06:12,006][12645] Avg episode reward: [(0, '0.224')] +[2024-06-18 03:06:13,746][12883] Updated weights for policy 0, policy_version 45591 (0.0040) +[2024-06-18 03:06:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42054.9, 300 sec: 42265.2). Total num frames: 747061248. Throughput: 0: 42337.9. Samples: 747154660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 03:06:16,994][12645] Avg episode reward: [(0, '0.132')] +[2024-06-18 03:06:17,779][12883] Updated weights for policy 0, policy_version 45601 (0.0031) +[2024-06-18 03:06:21,656][12883] Updated weights for policy 0, policy_version 45611 (0.0030) +[2024-06-18 03:06:21,994][12645] Fps is (10 sec: 42613.8, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 747290624. Throughput: 0: 42240.9. Samples: 747405100. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 03:06:21,994][12645] Avg episode reward: [(0, '0.067')] +[2024-06-18 03:06:25,312][12883] Updated weights for policy 0, policy_version 45621 (0.0032) +[2024-06-18 03:06:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 747503616. Throughput: 0: 42205.4. Samples: 747664080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 03:06:26,994][12645] Avg episode reward: [(0, '0.189')] +[2024-06-18 03:06:29,322][12883] Updated weights for policy 0, policy_version 45631 (0.0027) +[2024-06-18 03:06:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 747700224. Throughput: 0: 42429.8. Samples: 747791980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 03:06:31,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 03:06:32,241][12862] Signal inference workers to stop experience collection... (10650 times) +[2024-06-18 03:06:32,241][12862] Signal inference workers to resume experience collection... (10650 times) +[2024-06-18 03:06:32,267][12883] InferenceWorker_p0-w0: stopping experience collection (10650 times) +[2024-06-18 03:06:32,268][12883] InferenceWorker_p0-w0: resuming experience collection (10650 times) +[2024-06-18 03:06:33,054][12883] Updated weights for policy 0, policy_version 45641 (0.0036) +[2024-06-18 03:06:36,947][12883] Updated weights for policy 0, policy_version 45651 (0.0025) +[2024-06-18 03:06:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 747945984. Throughput: 0: 42397.7. Samples: 748047580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 03:06:36,994][12645] Avg episode reward: [(0, '0.086')] +[2024-06-18 03:06:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045651_747945984.pth... 
+[2024-06-18 03:06:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045030_737771520.pth +[2024-06-18 03:06:40,615][12883] Updated weights for policy 0, policy_version 45661 (0.0036) +[2024-06-18 03:06:41,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 748142592. Throughput: 0: 42398.6. Samples: 748301420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 03:06:41,994][12645] Avg episode reward: [(0, '0.054')] +[2024-06-18 03:06:45,056][12883] Updated weights for policy 0, policy_version 45671 (0.0037) +[2024-06-18 03:06:46,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 748339200. Throughput: 0: 42436.4. Samples: 748432180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 03:06:46,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 03:06:48,281][12883] Updated weights for policy 0, policy_version 45681 (0.0032) +[2024-06-18 03:06:51,999][12645] Fps is (10 sec: 42576.2, 60 sec: 42321.5, 300 sec: 42319.9). Total num frames: 748568576. Throughput: 0: 42431.9. Samples: 748683100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:06:51,999][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 03:06:52,832][12883] Updated weights for policy 0, policy_version 45691 (0.0033) +[2024-06-18 03:06:56,255][12883] Updated weights for policy 0, policy_version 45701 (0.0022) +[2024-06-18 03:06:56,994][12645] Fps is (10 sec: 44237.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 748781568. Throughput: 0: 42645.7. Samples: 748942540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:06:56,994][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 03:07:00,441][12883] Updated weights for policy 0, policy_version 45711 (0.0033) +[2024-06-18 03:07:01,994][12645] Fps is (10 sec: 42620.8, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 748994560. Throughput: 0: 42449.6. Samples: 749064900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:07:01,994][12645] Avg episode reward: [(0, '0.060')] +[2024-06-18 03:07:04,223][12883] Updated weights for policy 0, policy_version 45721 (0.0033) +[2024-06-18 03:07:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42321.0). Total num frames: 749207552. Throughput: 0: 42526.2. Samples: 749318780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:07:06,994][12645] Avg episode reward: [(0, '0.125')] +[2024-06-18 03:07:08,157][12883] Updated weights for policy 0, policy_version 45731 (0.0032) +[2024-06-18 03:07:11,877][12883] Updated weights for policy 0, policy_version 45741 (0.0042) +[2024-06-18 03:07:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42600.9, 300 sec: 42487.3). Total num frames: 749420544. Throughput: 0: 42325.2. Samples: 749568720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:07:11,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 03:07:15,836][12883] Updated weights for policy 0, policy_version 45751 (0.0030) +[2024-06-18 03:07:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 749617152. Throughput: 0: 42306.2. Samples: 749695760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:07:16,994][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 03:07:20,010][12883] Updated weights for policy 0, policy_version 45761 (0.0029) +[2024-06-18 03:07:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 749830144. Throughput: 0: 42162.7. Samples: 749944900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:07:21,994][12645] Avg episode reward: [(0, '0.051')] +[2024-06-18 03:07:23,787][12883] Updated weights for policy 0, policy_version 45771 (0.0031) +[2024-06-18 03:07:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 750026752. Throughput: 0: 42191.2. Samples: 750200020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:07:26,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 03:07:27,744][12883] Updated weights for policy 0, policy_version 45781 (0.0038) +[2024-06-18 03:07:31,458][12883] Updated weights for policy 0, policy_version 45791 (0.0042) +[2024-06-18 03:07:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 750256128. Throughput: 0: 42024.6. Samples: 750323280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:07:32,003][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 03:07:35,771][12883] Updated weights for policy 0, policy_version 45801 (0.0042) +[2024-06-18 03:07:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41779.1, 300 sec: 42265.2). Total num frames: 750452736. Throughput: 0: 42031.0. Samples: 750574280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:07:36,995][12645] Avg episode reward: [(0, '0.093')] +[2024-06-18 03:07:39,202][12883] Updated weights for policy 0, policy_version 45811 (0.0029) +[2024-06-18 03:07:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 750665728. Throughput: 0: 41733.8. Samples: 750820560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:07:41,994][12645] Avg episode reward: [(0, '0.096')] +[2024-06-18 03:07:43,703][12883] Updated weights for policy 0, policy_version 45821 (0.0037) +[2024-06-18 03:07:46,991][12883] Updated weights for policy 0, policy_version 45831 (0.0030) +[2024-06-18 03:07:46,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42432.1). Total num frames: 750895104. Throughput: 0: 41909.0. Samples: 750950800. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 03:07:47,002][12645] Avg episode reward: [(0, '0.167')] +[2024-06-18 03:07:51,566][12883] Updated weights for policy 0, policy_version 45841 (0.0040) +[2024-06-18 03:07:51,994][12645] Fps is (10 sec: 39320.8, 60 sec: 41509.7, 300 sec: 42209.6). Total num frames: 751058944. Throughput: 0: 41929.3. Samples: 751205600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 03:07:51,994][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 03:07:54,670][12883] Updated weights for policy 0, policy_version 45851 (0.0052) +[2024-06-18 03:07:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 751288320. Throughput: 0: 41917.5. Samples: 751455000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 03:07:56,994][12645] Avg episode reward: [(0, '0.209')] +[2024-06-18 03:07:59,437][12883] Updated weights for policy 0, policy_version 45861 (0.0032) +[2024-06-18 03:08:01,994][12645] Fps is (10 sec: 45876.2, 60 sec: 42052.4, 300 sec: 42376.2). Total num frames: 751517696. Throughput: 0: 42008.9. Samples: 751586160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 03:08:01,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 03:08:02,487][12883] Updated weights for policy 0, policy_version 45871 (0.0035) +[2024-06-18 03:08:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 42209.6). Total num frames: 751697920. Throughput: 0: 42049.9. Samples: 751837140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 03:08:06,994][12645] Avg episode reward: [(0, '0.129')] +[2024-06-18 03:08:07,210][12883] Updated weights for policy 0, policy_version 45881 (0.0029) +[2024-06-18 03:08:10,366][12883] Updated weights for policy 0, policy_version 45891 (0.0036) +[2024-06-18 03:08:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.3, 300 sec: 42266.1). Total num frames: 751927296. Throughput: 0: 41912.1. Samples: 752086060. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 03:08:11,994][12645] Avg episode reward: [(0, '0.156')] +[2024-06-18 03:08:14,801][12883] Updated weights for policy 0, policy_version 45901 (0.0033) +[2024-06-18 03:08:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 752140288. Throughput: 0: 42069.3. Samples: 752216400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 03:08:16,994][12645] Avg episode reward: [(0, '0.054')] +[2024-06-18 03:08:17,987][12883] Updated weights for policy 0, policy_version 45911 (0.0037) +[2024-06-18 03:08:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 752353280. Throughput: 0: 42209.1. Samples: 752473680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 03:08:21,994][12645] Avg episode reward: [(0, '0.119')] +[2024-06-18 03:08:22,431][12883] Updated weights for policy 0, policy_version 45921 (0.0024) +[2024-06-18 03:08:23,669][12862] Signal inference workers to stop experience collection... (10700 times) +[2024-06-18 03:08:23,669][12862] Signal inference workers to resume experience collection... (10700 times) +[2024-06-18 03:08:23,685][12883] InferenceWorker_p0-w0: stopping experience collection (10700 times) +[2024-06-18 03:08:23,692][12883] InferenceWorker_p0-w0: resuming experience collection (10700 times) +[2024-06-18 03:08:25,707][12883] Updated weights for policy 0, policy_version 45931 (0.0040) +[2024-06-18 03:08:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 752582656. Throughput: 0: 42089.6. Samples: 752714600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 03:08:26,994][12645] Avg episode reward: [(0, '0.178')] +[2024-06-18 03:08:30,163][12883] Updated weights for policy 0, policy_version 45941 (0.0034) +[2024-06-18 03:08:31,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 752779264. 
Throughput: 0: 42175.4. Samples: 752848700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 03:08:31,994][12645] Avg episode reward: [(0, '0.052')] +[2024-06-18 03:08:33,451][12883] Updated weights for policy 0, policy_version 45951 (0.0042) +[2024-06-18 03:08:36,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42052.4, 300 sec: 42210.0). Total num frames: 752975872. Throughput: 0: 42123.3. Samples: 753101140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 03:08:36,994][12645] Avg episode reward: [(0, '0.033')] +[2024-06-18 03:08:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045958_752975872.pth... +[2024-06-18 03:08:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045339_742834176.pth +[2024-06-18 03:08:37,918][12883] Updated weights for policy 0, policy_version 45961 (0.0028) +[2024-06-18 03:08:41,251][12883] Updated weights for policy 0, policy_version 45971 (0.0036) +[2024-06-18 03:08:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42321.0). Total num frames: 753205248. Throughput: 0: 41968.8. Samples: 753343600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 03:08:41,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 03:08:45,669][12883] Updated weights for policy 0, policy_version 45981 (0.0031) +[2024-06-18 03:08:46,996][12645] Fps is (10 sec: 40950.5, 60 sec: 41504.6, 300 sec: 42264.8). Total num frames: 753385472. Throughput: 0: 42003.1. Samples: 753476400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 03:08:46,997][12645] Avg episode reward: [(0, '0.107')] +[2024-06-18 03:08:49,128][12883] Updated weights for policy 0, policy_version 45991 (0.0031) +[2024-06-18 03:08:51,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 753582080. Throughput: 0: 41824.4. Samples: 753719240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 03:08:51,994][12645] Avg episode reward: [(0, '0.099')] +[2024-06-18 03:08:53,640][12883] Updated weights for policy 0, policy_version 46001 (0.0035) +[2024-06-18 03:08:56,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 753827840. Throughput: 0: 41948.0. Samples: 753973720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 03:08:56,994][12645] Avg episode reward: [(0, '0.052')] +[2024-06-18 03:08:57,034][12883] Updated weights for policy 0, policy_version 46011 (0.0030) +[2024-06-18 03:09:01,604][12883] Updated weights for policy 0, policy_version 46021 (0.0032) +[2024-06-18 03:09:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.0, 300 sec: 42209.6). Total num frames: 754008064. Throughput: 0: 41898.6. Samples: 754101840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 03:09:01,994][12645] Avg episode reward: [(0, '0.109')] +[2024-06-18 03:09:04,931][12883] Updated weights for policy 0, policy_version 46031 (0.0037) +[2024-06-18 03:09:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 754221056. Throughput: 0: 41531.0. Samples: 754342580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 03:09:06,994][12645] Avg episode reward: [(0, '0.097')] +[2024-06-18 03:09:09,463][12883] Updated weights for policy 0, policy_version 46041 (0.0038) +[2024-06-18 03:09:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 754450432. Throughput: 0: 41888.5. Samples: 754599580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) +[2024-06-18 03:09:12,006][12645] Avg episode reward: [(0, '0.062')] +[2024-06-18 03:09:12,930][12883] Updated weights for policy 0, policy_version 46051 (0.0027) +[2024-06-18 03:09:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 754647040. Throughput: 0: 41717.1. Samples: 754725960. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 03:09:16,994][12645] Avg episode reward: [(0, '0.114')] +[2024-06-18 03:09:17,093][12883] Updated weights for policy 0, policy_version 46061 (0.0023) +[2024-06-18 03:09:20,981][12883] Updated weights for policy 0, policy_version 46071 (0.0039) +[2024-06-18 03:09:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 754860032. Throughput: 0: 41560.7. Samples: 754971380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 03:09:21,994][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 03:09:24,965][12883] Updated weights for policy 0, policy_version 46081 (0.0045) +[2024-06-18 03:09:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.3, 300 sec: 42154.1). Total num frames: 755073024. Throughput: 0: 41870.3. Samples: 755227760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 03:09:26,994][12645] Avg episode reward: [(0, '0.086')] +[2024-06-18 03:09:28,802][12883] Updated weights for policy 0, policy_version 46091 (0.0032) +[2024-06-18 03:09:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 42154.1). Total num frames: 755253248. Throughput: 0: 41540.6. Samples: 755345640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 03:09:31,994][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 03:09:32,809][12883] Updated weights for policy 0, policy_version 46101 (0.0039) +[2024-06-18 03:09:36,929][12883] Updated weights for policy 0, policy_version 46111 (0.0029) +[2024-06-18 03:09:36,996][12645] Fps is (10 sec: 40949.1, 60 sec: 41777.3, 300 sec: 42098.2). Total num frames: 755482624. Throughput: 0: 41711.8. Samples: 755596380. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 03:09:36,997][12645] Avg episode reward: [(0, '0.180')] +[2024-06-18 03:09:40,615][12883] Updated weights for policy 0, policy_version 46121 (0.0042) +[2024-06-18 03:09:41,994][12645] Fps is (10 sec: 44237.5, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 755695616. Throughput: 0: 41741.3. Samples: 755852080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 03:09:41,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 03:09:44,488][12883] Updated weights for policy 0, policy_version 46131 (0.0034) +[2024-06-18 03:09:46,994][12645] Fps is (10 sec: 40970.5, 60 sec: 41780.7, 300 sec: 42154.1). Total num frames: 755892224. Throughput: 0: 41560.1. Samples: 755972040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 03:09:46,994][12645] Avg episode reward: [(0, '0.302')] +[2024-06-18 03:09:48,659][12883] Updated weights for policy 0, policy_version 46141 (0.0042) +[2024-06-18 03:09:50,194][12862] Signal inference workers to stop experience collection... (10750 times) +[2024-06-18 03:09:50,195][12862] Signal inference workers to resume experience collection... (10750 times) +[2024-06-18 03:09:50,218][12883] InferenceWorker_p0-w0: stopping experience collection (10750 times) +[2024-06-18 03:09:50,218][12883] InferenceWorker_p0-w0: resuming experience collection (10750 times) +[2024-06-18 03:09:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 756121600. Throughput: 0: 41764.0. Samples: 756221960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 03:09:51,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 03:09:52,635][12883] Updated weights for policy 0, policy_version 46151 (0.0030) +[2024-06-18 03:09:56,503][12883] Updated weights for policy 0, policy_version 46161 (0.0032) +[2024-06-18 03:09:56,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41232.9, 300 sec: 42098.5). Total num frames: 756301824. 
Throughput: 0: 41502.1. Samples: 756467180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 03:09:56,994][12645] Avg episode reward: [(0, '0.152')] +[2024-06-18 03:10:00,298][12883] Updated weights for policy 0, policy_version 46171 (0.0038) +[2024-06-18 03:10:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 756531200. Throughput: 0: 41568.3. Samples: 756596540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 03:10:02,000][12645] Avg episode reward: [(0, '0.095')] +[2024-06-18 03:10:04,485][12883] Updated weights for policy 0, policy_version 46181 (0.0028) +[2024-06-18 03:10:06,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 756727808. Throughput: 0: 41694.3. Samples: 756847620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 03:10:06,994][12645] Avg episode reward: [(0, '0.056')] +[2024-06-18 03:10:08,284][12883] Updated weights for policy 0, policy_version 46191 (0.0032) +[2024-06-18 03:10:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 42043.5). Total num frames: 756940800. Throughput: 0: 41535.5. Samples: 757096860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 03:10:11,994][12645] Avg episode reward: [(0, '0.185')] +[2024-06-18 03:10:12,833][12883] Updated weights for policy 0, policy_version 46201 (0.0037) +[2024-06-18 03:10:16,050][12883] Updated weights for policy 0, policy_version 46211 (0.0037) +[2024-06-18 03:10:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 757137408. Throughput: 0: 41596.1. Samples: 757217460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 03:10:16,994][12645] Avg episode reward: [(0, '0.165')] +[2024-06-18 03:10:20,431][12883] Updated weights for policy 0, policy_version 46221 (0.0039) +[2024-06-18 03:10:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 757350400. 
Throughput: 0: 41637.6. Samples: 757469960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 03:10:21,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 03:10:23,876][12883] Updated weights for policy 0, policy_version 46231 (0.0037) +[2024-06-18 03:10:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40959.9, 300 sec: 41931.9). Total num frames: 757530624. Throughput: 0: 41598.2. Samples: 757724000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 03:10:26,994][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 03:10:28,067][12883] Updated weights for policy 0, policy_version 46241 (0.0032) +[2024-06-18 03:10:31,487][12883] Updated weights for policy 0, policy_version 46251 (0.0033) +[2024-06-18 03:10:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 757776384. Throughput: 0: 41543.6. Samples: 757841500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 03:10:31,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 03:10:35,877][12883] Updated weights for policy 0, policy_version 46261 (0.0025) +[2024-06-18 03:10:36,994][12645] Fps is (10 sec: 45875.8, 60 sec: 41781.0, 300 sec: 41931.9). Total num frames: 757989376. Throughput: 0: 41726.3. Samples: 758099640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) +[2024-06-18 03:10:36,994][12645] Avg episode reward: [(0, '0.248')] +[2024-06-18 03:10:37,104][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000046265_758005760.pth... +[2024-06-18 03:10:37,156][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045651_747945984.pth +[2024-06-18 03:10:39,159][12883] Updated weights for policy 0, policy_version 46271 (0.0040) +[2024-06-18 03:10:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 758169600. Throughput: 0: 41885.5. Samples: 758352020. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:10:41,994][12645] Avg episode reward: [(0, '0.135')] +[2024-06-18 03:10:43,771][12883] Updated weights for policy 0, policy_version 46281 (0.0038) +[2024-06-18 03:10:46,859][12883] Updated weights for policy 0, policy_version 46291 (0.0032) +[2024-06-18 03:10:47,000][12645] Fps is (10 sec: 44208.9, 60 sec: 42321.0, 300 sec: 42042.1). Total num frames: 758431744. Throughput: 0: 41847.5. Samples: 758479940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:10:47,009][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 03:10:51,993][12883] Updated weights for policy 0, policy_version 46301 (0.0038) +[2024-06-18 03:10:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 758595584. Throughput: 0: 41873.3. Samples: 758731920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:10:51,994][12645] Avg episode reward: [(0, '0.066')] +[2024-06-18 03:10:54,669][12883] Updated weights for policy 0, policy_version 46311 (0.0026) +[2024-06-18 03:10:56,994][12645] Fps is (10 sec: 39346.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 758824960. Throughput: 0: 41839.1. Samples: 758979620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:10:56,994][12645] Avg episode reward: [(0, '0.060')] +[2024-06-18 03:10:58,081][12862] Signal inference workers to stop experience collection... (10800 times) +[2024-06-18 03:10:58,082][12862] Signal inference workers to resume experience collection... (10800 times) +[2024-06-18 03:10:58,093][12883] InferenceWorker_p0-w0: stopping experience collection (10800 times) +[2024-06-18 03:10:58,093][12883] InferenceWorker_p0-w0: resuming experience collection (10800 times) +[2024-06-18 03:10:59,629][12883] Updated weights for policy 0, policy_version 46321 (0.0039) +[2024-06-18 03:11:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 759037952. 
Throughput: 0: 42192.4. Samples: 759116120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:11:01,994][12645] Avg episode reward: [(0, '0.159')] +[2024-06-18 03:11:02,699][12883] Updated weights for policy 0, policy_version 46331 (0.0028) +[2024-06-18 03:11:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41876.9). Total num frames: 759218176. Throughput: 0: 42163.1. Samples: 759367300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:11:06,994][12645] Avg episode reward: [(0, '0.140')] +[2024-06-18 03:11:07,275][12883] Updated weights for policy 0, policy_version 46341 (0.0032) +[2024-06-18 03:11:10,669][12883] Updated weights for policy 0, policy_version 46351 (0.0031) +[2024-06-18 03:11:11,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42050.7, 300 sec: 42042.7). Total num frames: 759463936. Throughput: 0: 41834.9. Samples: 759606660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 03:11:11,997][12645] Avg episode reward: [(0, '0.203')] +[2024-06-18 03:11:15,063][12883] Updated weights for policy 0, policy_version 46361 (0.0031) +[2024-06-18 03:11:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 759644160. Throughput: 0: 42186.7. Samples: 759739900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 03:11:16,994][12645] Avg episode reward: [(0, '0.085')] +[2024-06-18 03:11:18,320][12883] Updated weights for policy 0, policy_version 46371 (0.0035) +[2024-06-18 03:11:21,994][12645] Fps is (10 sec: 40968.6, 60 sec: 42052.1, 300 sec: 41931.9). Total num frames: 759873536. Throughput: 0: 42045.2. Samples: 759991680. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 03:11:21,994][12645] Avg episode reward: [(0, '0.066')] +[2024-06-18 03:11:22,730][12883] Updated weights for policy 0, policy_version 46381 (0.0033) +[2024-06-18 03:11:26,126][12883] Updated weights for policy 0, policy_version 46391 (0.0055) +[2024-06-18 03:11:26,994][12645] Fps is (10 sec: 47514.0, 60 sec: 43144.6, 300 sec: 42098.6). Total num frames: 760119296. Throughput: 0: 41942.7. Samples: 760239440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 03:11:26,994][12645] Avg episode reward: [(0, '0.104')] +[2024-06-18 03:11:30,391][12883] Updated weights for policy 0, policy_version 46401 (0.0036) +[2024-06-18 03:11:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 760283136. Throughput: 0: 42048.9. Samples: 760371880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 03:11:31,994][12645] Avg episode reward: [(0, '0.051')] +[2024-06-18 03:11:33,803][12883] Updated weights for policy 0, policy_version 46411 (0.0034) +[2024-06-18 03:11:36,994][12645] Fps is (10 sec: 37682.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 760496128. Throughput: 0: 41919.6. Samples: 760618300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 03:11:36,994][12645] Avg episode reward: [(0, '0.098')] +[2024-06-18 03:11:38,529][12883] Updated weights for policy 0, policy_version 46421 (0.0040) +[2024-06-18 03:11:41,483][12883] Updated weights for policy 0, policy_version 46431 (0.0023) +[2024-06-18 03:11:41,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 760725504. Throughput: 0: 41873.9. Samples: 760863940. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:11:41,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 03:11:46,415][12883] Updated weights for policy 0, policy_version 46441 (0.0024) +[2024-06-18 03:11:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41510.4, 300 sec: 41877.1). Total num frames: 760922112. Throughput: 0: 41765.3. Samples: 760995560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:11:46,994][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 03:11:49,848][12883] Updated weights for policy 0, policy_version 46451 (0.0040) +[2024-06-18 03:11:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 761118720. Throughput: 0: 41739.6. Samples: 761245580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:11:51,994][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 03:11:54,183][12883] Updated weights for policy 0, policy_version 46461 (0.0032) +[2024-06-18 03:11:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 761348096. Throughput: 0: 42060.0. Samples: 761499260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:11:56,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 03:11:57,635][12883] Updated weights for policy 0, policy_version 46471 (0.0033) +[2024-06-18 03:12:01,938][12883] Updated weights for policy 0, policy_version 46481 (0.0028) +[2024-06-18 03:12:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 761544704. Throughput: 0: 41848.4. Samples: 761623080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:12:02,004][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 03:12:05,731][12883] Updated weights for policy 0, policy_version 46491 (0.0029) +[2024-06-18 03:12:06,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 41876.4). Total num frames: 761774080. Throughput: 0: 41712.0. Samples: 761868720. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:12:07,003][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 03:12:09,710][12883] Updated weights for policy 0, policy_version 46501 (0.0032) +[2024-06-18 03:12:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41507.7, 300 sec: 41820.8). Total num frames: 761954304. Throughput: 0: 41859.5. Samples: 762123120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 03:12:11,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 03:12:13,583][12883] Updated weights for policy 0, policy_version 46511 (0.0034) +[2024-06-18 03:12:16,343][12862] Signal inference workers to stop experience collection... (10850 times) +[2024-06-18 03:12:16,399][12883] InferenceWorker_p0-w0: stopping experience collection (10850 times) +[2024-06-18 03:12:16,401][12862] Signal inference workers to resume experience collection... (10850 times) +[2024-06-18 03:12:16,409][12883] InferenceWorker_p0-w0: resuming experience collection (10850 times) +[2024-06-18 03:12:16,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 762183680. Throughput: 0: 41533.0. Samples: 762240860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 03:12:16,994][12645] Avg episode reward: [(0, '0.037')] +[2024-06-18 03:12:17,480][12883] Updated weights for policy 0, policy_version 46521 (0.0040) +[2024-06-18 03:12:21,557][12883] Updated weights for policy 0, policy_version 46531 (0.0034) +[2024-06-18 03:12:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.3, 300 sec: 41820.9). Total num frames: 762363904. Throughput: 0: 41732.0. Samples: 762496240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 03:12:21,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 03:12:25,295][12883] Updated weights for policy 0, policy_version 46541 (0.0032) +[2024-06-18 03:12:26,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40686.8, 300 sec: 41709.8). Total num frames: 762560512. 
Throughput: 0: 41813.2. Samples: 762745540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 03:12:26,994][12645] Avg episode reward: [(0, '0.144')] +[2024-06-18 03:12:29,388][12883] Updated weights for policy 0, policy_version 46551 (0.0031) +[2024-06-18 03:12:31,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 762822656. Throughput: 0: 41667.1. Samples: 762870580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 03:12:31,994][12645] Avg episode reward: [(0, '0.059')] +[2024-06-18 03:12:33,170][12883] Updated weights for policy 0, policy_version 46561 (0.0029) +[2024-06-18 03:12:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 763002880. Throughput: 0: 41848.8. Samples: 763128780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 03:12:36,994][12645] Avg episode reward: [(0, '0.091')] +[2024-06-18 03:12:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000046570_763002880.pth... +[2024-06-18 03:12:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045958_752975872.pth +[2024-06-18 03:12:37,562][12883] Updated weights for policy 0, policy_version 46571 (0.0041) +[2024-06-18 03:12:41,499][12883] Updated weights for policy 0, policy_version 46581 (0.0031) +[2024-06-18 03:12:41,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 763215872. Throughput: 0: 41672.3. Samples: 763374520. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) +[2024-06-18 03:12:41,994][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 03:12:45,325][12883] Updated weights for policy 0, policy_version 46591 (0.0030) +[2024-06-18 03:12:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 763445248. Throughput: 0: 41673.8. Samples: 763498400. 
Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) +[2024-06-18 03:12:46,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 03:12:49,171][12883] Updated weights for policy 0, policy_version 46601 (0.0025) +[2024-06-18 03:12:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 763625472. Throughput: 0: 41802.3. Samples: 763749820. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) +[2024-06-18 03:12:51,995][12645] Avg episode reward: [(0, '0.200')] +[2024-06-18 03:12:52,872][12883] Updated weights for policy 0, policy_version 46611 (0.0036) +[2024-06-18 03:12:56,819][12883] Updated weights for policy 0, policy_version 46621 (0.0033) +[2024-06-18 03:12:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 763854848. Throughput: 0: 41700.4. Samples: 763999640. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) +[2024-06-18 03:12:56,995][12645] Avg episode reward: [(0, '0.125')] +[2024-06-18 03:13:00,425][12883] Updated weights for policy 0, policy_version 46631 (0.0032) +[2024-06-18 03:13:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 764051456. Throughput: 0: 41914.6. Samples: 764127020. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) +[2024-06-18 03:13:01,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 03:13:04,413][12883] Updated weights for policy 0, policy_version 46641 (0.0032) +[2024-06-18 03:13:06,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41233.2, 300 sec: 41765.3). Total num frames: 764248064. Throughput: 0: 41872.0. Samples: 764380480. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) +[2024-06-18 03:13:06,994][12645] Avg episode reward: [(0, '0.135')] +[2024-06-18 03:13:08,139][12883] Updated weights for policy 0, policy_version 46651 (0.0039) +[2024-06-18 03:13:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 764461056. Throughput: 0: 41842.3. Samples: 764628440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 03:13:11,994][12645] Avg episode reward: [(0, '0.076')] +[2024-06-18 03:13:12,415][12883] Updated weights for policy 0, policy_version 46661 (0.0028) +[2024-06-18 03:13:15,930][12883] Updated weights for policy 0, policy_version 46671 (0.0031) +[2024-06-18 03:13:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 764690432. Throughput: 0: 41945.4. Samples: 764758120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 03:13:16,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 03:13:20,170][12883] Updated weights for policy 0, policy_version 46681 (0.0044) +[2024-06-18 03:13:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41654.3). Total num frames: 764870656. Throughput: 0: 41809.9. Samples: 765010220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 03:13:21,994][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 03:13:23,635][12883] Updated weights for policy 0, policy_version 46691 (0.0035) +[2024-06-18 03:13:27,000][12645] Fps is (10 sec: 40934.3, 60 sec: 42321.0, 300 sec: 41764.5). Total num frames: 765100032. Throughput: 0: 41976.1. Samples: 765263700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 03:13:27,009][12645] Avg episode reward: [(0, '0.069')] +[2024-06-18 03:13:27,777][12883] Updated weights for policy 0, policy_version 46701 (0.0035) +[2024-06-18 03:13:30,496][12862] Signal inference workers to stop experience collection... (10900 times) +[2024-06-18 03:13:30,497][12862] Signal inference workers to resume experience collection... 
(10900 times) +[2024-06-18 03:13:30,526][12883] InferenceWorker_p0-w0: stopping experience collection (10900 times) +[2024-06-18 03:13:30,526][12883] InferenceWorker_p0-w0: resuming experience collection (10900 times) +[2024-06-18 03:13:31,350][12883] Updated weights for policy 0, policy_version 46711 (0.0037) +[2024-06-18 03:13:31,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 765345792. Throughput: 0: 42263.0. Samples: 765400240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 03:13:31,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 03:13:35,714][12883] Updated weights for policy 0, policy_version 46721 (0.0033) +[2024-06-18 03:13:36,994][12645] Fps is (10 sec: 40985.3, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 765509632. Throughput: 0: 42064.5. Samples: 765642720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:13:36,994][12645] Avg episode reward: [(0, '0.027')] +[2024-06-18 03:13:39,198][12883] Updated weights for policy 0, policy_version 46731 (0.0032) +[2024-06-18 03:13:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42052.4, 300 sec: 41876.7). Total num frames: 765739008. Throughput: 0: 42217.4. Samples: 765899420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:13:41,994][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 03:13:43,547][12883] Updated weights for policy 0, policy_version 46741 (0.0038) +[2024-06-18 03:13:46,929][12883] Updated weights for policy 0, policy_version 46751 (0.0035) +[2024-06-18 03:13:46,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 765968384. Throughput: 0: 42225.4. Samples: 766027160. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:13:46,994][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 03:13:51,369][12883] Updated weights for policy 0, policy_version 46761 (0.0032) +[2024-06-18 03:13:51,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 766148608. Throughput: 0: 42247.3. Samples: 766281620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:13:51,994][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 03:13:54,641][12883] Updated weights for policy 0, policy_version 46771 (0.0037) +[2024-06-18 03:13:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 766377984. Throughput: 0: 42153.8. Samples: 766525360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:13:56,994][12645] Avg episode reward: [(0, '0.114')] +[2024-06-18 03:13:59,345][12883] Updated weights for policy 0, policy_version 46781 (0.0028) +[2024-06-18 03:14:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 766574592. Throughput: 0: 42217.2. Samples: 766657900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:14:01,994][12645] Avg episode reward: [(0, '0.025')] +[2024-06-18 03:14:02,565][12883] Updated weights for policy 0, policy_version 46791 (0.0030) +[2024-06-18 03:14:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 766771200. Throughput: 0: 42096.1. Samples: 766904540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 03:14:06,994][12645] Avg episode reward: [(0, '0.182')] +[2024-06-18 03:14:07,048][12883] Updated weights for policy 0, policy_version 46801 (0.0027) +[2024-06-18 03:14:10,351][12883] Updated weights for policy 0, policy_version 46811 (0.0031) +[2024-06-18 03:14:11,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 767016960. Throughput: 0: 42019.6. Samples: 767154320. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 03:14:11,994][12645] Avg episode reward: [(0, '0.073')] +[2024-06-18 03:14:14,765][12883] Updated weights for policy 0, policy_version 46821 (0.0038) +[2024-06-18 03:14:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 767197184. Throughput: 0: 41888.1. Samples: 767285200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 03:14:16,994][12645] Avg episode reward: [(0, '0.101')] +[2024-06-18 03:14:18,280][12883] Updated weights for policy 0, policy_version 46831 (0.0029) +[2024-06-18 03:14:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 767410176. Throughput: 0: 42021.4. Samples: 767533680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 03:14:21,994][12645] Avg episode reward: [(0, '0.236')] +[2024-06-18 03:14:22,384][12883] Updated weights for policy 0, policy_version 46841 (0.0043) +[2024-06-18 03:14:26,239][12883] Updated weights for policy 0, policy_version 46851 (0.0037) +[2024-06-18 03:14:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42329.7, 300 sec: 41987.5). Total num frames: 767639552. Throughput: 0: 42127.5. Samples: 767795160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 03:14:26,994][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 03:14:29,998][12883] Updated weights for policy 0, policy_version 46861 (0.0033) +[2024-06-18 03:14:31,994][12645] Fps is (10 sec: 42597.5, 60 sec: 41506.0, 300 sec: 41876.7). Total num frames: 767836160. Throughput: 0: 41991.3. Samples: 767916780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 03:14:31,995][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 03:14:34,136][12883] Updated weights for policy 0, policy_version 46871 (0.0054) +[2024-06-18 03:14:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 768049152. Throughput: 0: 41829.0. Samples: 768163920. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 03:14:36,994][12645] Avg episode reward: [(0, '0.094')] +[2024-06-18 03:14:37,192][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000046880_768081920.pth... +[2024-06-18 03:14:37,239][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000046265_758005760.pth +[2024-06-18 03:14:38,223][12883] Updated weights for policy 0, policy_version 46881 (0.0032) +[2024-06-18 03:14:41,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 768245760. Throughput: 0: 42133.7. Samples: 768421380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 03:14:42,000][12645] Avg episode reward: [(0, '0.059')] +[2024-06-18 03:14:42,062][12883] Updated weights for policy 0, policy_version 46891 (0.0033) +[2024-06-18 03:14:45,775][12883] Updated weights for policy 0, policy_version 46901 (0.0038) +[2024-06-18 03:14:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 768458752. Throughput: 0: 41866.7. Samples: 768541900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 03:14:46,994][12645] Avg episode reward: [(0, '0.088')] +[2024-06-18 03:14:49,780][12883] Updated weights for policy 0, policy_version 46911 (0.0027) +[2024-06-18 03:14:51,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 768704512. Throughput: 0: 42107.8. Samples: 768799400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 03:14:51,994][12645] Avg episode reward: [(0, '0.117')] +[2024-06-18 03:14:53,275][12883] Updated weights for policy 0, policy_version 46921 (0.0036) +[2024-06-18 03:14:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 768884736. Throughput: 0: 42155.9. Samples: 769051340. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 03:14:56,994][12645] Avg episode reward: [(0, '0.107')] +[2024-06-18 03:14:57,681][12883] Updated weights for policy 0, policy_version 46931 (0.0022) +[2024-06-18 03:15:00,656][12883] Updated weights for policy 0, policy_version 46941 (0.0028) +[2024-06-18 03:15:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 769081344. Throughput: 0: 41963.5. Samples: 769173560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 03:15:01,994][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 03:15:02,875][12862] Signal inference workers to stop experience collection... (10950 times) +[2024-06-18 03:15:02,875][12862] Signal inference workers to resume experience collection... (10950 times) +[2024-06-18 03:15:02,917][12883] InferenceWorker_p0-w0: stopping experience collection (10950 times) +[2024-06-18 03:15:02,917][12883] InferenceWorker_p0-w0: resuming experience collection (10950 times) +[2024-06-18 03:15:05,366][12883] Updated weights for policy 0, policy_version 46951 (0.0028) +[2024-06-18 03:15:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 769310720. Throughput: 0: 42206.8. Samples: 769432980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 03:15:06,994][12645] Avg episode reward: [(0, '0.099')] +[2024-06-18 03:15:08,433][12883] Updated weights for policy 0, policy_version 46961 (0.0031) +[2024-06-18 03:15:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 769523712. Throughput: 0: 41882.7. Samples: 769679880. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 03:15:11,994][12645] Avg episode reward: [(0, '0.156')] +[2024-06-18 03:15:13,234][12883] Updated weights for policy 0, policy_version 46971 (0.0029) +[2024-06-18 03:15:16,515][12883] Updated weights for policy 0, policy_version 46981 (0.0034) +[2024-06-18 03:15:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 769736704. Throughput: 0: 42058.9. Samples: 769809420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 03:15:16,994][12645] Avg episode reward: [(0, '0.086')] +[2024-06-18 03:15:21,029][12883] Updated weights for policy 0, policy_version 46991 (0.0037) +[2024-06-18 03:15:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 769933312. Throughput: 0: 42304.4. Samples: 770067620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 03:15:21,994][12645] Avg episode reward: [(0, '0.081')] +[2024-06-18 03:15:24,091][12883] Updated weights for policy 0, policy_version 47001 (0.0035) +[2024-06-18 03:15:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 770146304. Throughput: 0: 42140.0. Samples: 770317680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 03:15:26,995][12645] Avg episode reward: [(0, '0.054')] +[2024-06-18 03:15:28,963][12883] Updated weights for policy 0, policy_version 47011 (0.0033) +[2024-06-18 03:15:31,821][12883] Updated weights for policy 0, policy_version 47021 (0.0031) +[2024-06-18 03:15:31,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42871.6, 300 sec: 42098.5). Total num frames: 770408448. Throughput: 0: 42359.1. Samples: 770448060. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 03:15:31,994][12645] Avg episode reward: [(0, '0.094')] +[2024-06-18 03:15:36,653][12883] Updated weights for policy 0, policy_version 47031 (0.0031) +[2024-06-18 03:15:36,998][12645] Fps is (10 sec: 42580.8, 60 sec: 42049.4, 300 sec: 42042.4). Total num frames: 770572288. Throughput: 0: 42262.0. Samples: 770701360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 03:15:36,998][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 03:15:39,458][12883] Updated weights for policy 0, policy_version 47041 (0.0029) +[2024-06-18 03:15:41,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 41877.3). Total num frames: 770785280. Throughput: 0: 42201.0. Samples: 770950380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 03:15:41,994][12645] Avg episode reward: [(0, '0.108')] +[2024-06-18 03:15:44,306][12883] Updated weights for policy 0, policy_version 47051 (0.0031) +[2024-06-18 03:15:46,994][12645] Fps is (10 sec: 45894.6, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 771031040. Throughput: 0: 42332.1. Samples: 771078500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 03:15:46,996][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 03:15:47,013][12862] Saving new best policy, reward=0.369! +[2024-06-18 03:15:47,330][12883] Updated weights for policy 0, policy_version 47061 (0.0035) +[2024-06-18 03:15:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41932.0). Total num frames: 771194880. Throughput: 0: 42202.2. Samples: 771332080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 03:15:51,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 03:15:52,024][12883] Updated weights for policy 0, policy_version 47071 (0.0037) +[2024-06-18 03:15:55,293][12883] Updated weights for policy 0, policy_version 47081 (0.0032) +[2024-06-18 03:15:56,994][12645] Fps is (10 sec: 37682.9, 60 sec: 42052.3, 300 sec: 41931.9). 
Total num frames: 771407872. Throughput: 0: 42120.5. Samples: 771575300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 03:15:56,994][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 03:16:00,415][12883] Updated weights for policy 0, policy_version 47091 (0.0023) +[2024-06-18 03:16:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 771620864. Throughput: 0: 42090.3. Samples: 771703480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 03:16:01,994][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 03:16:03,328][12883] Updated weights for policy 0, policy_version 47101 (0.0032) +[2024-06-18 03:16:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41876.7). Total num frames: 771817472. Throughput: 0: 41885.9. Samples: 771952480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 03:16:06,994][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 03:16:07,933][12883] Updated weights for policy 0, policy_version 47111 (0.0036) +[2024-06-18 03:16:11,209][12883] Updated weights for policy 0, policy_version 47121 (0.0033) +[2024-06-18 03:16:11,994][12645] Fps is (10 sec: 44235.6, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 772063232. Throughput: 0: 41719.9. Samples: 772195080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 03:16:11,994][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 03:16:15,796][12883] Updated weights for policy 0, policy_version 47131 (0.0031) +[2024-06-18 03:16:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 772259840. Throughput: 0: 41923.6. Samples: 772334620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 03:16:16,994][12645] Avg episode reward: [(0, '0.160')] +[2024-06-18 03:16:18,861][12883] Updated weights for policy 0, policy_version 47141 (0.0038) +[2024-06-18 03:16:21,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42052.4, 300 sec: 41820.9). 
Total num frames: 772456448. Throughput: 0: 41801.7. Samples: 772582260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 03:16:21,994][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 03:16:23,479][12883] Updated weights for policy 0, policy_version 47151 (0.0039) +[2024-06-18 03:16:25,190][12862] Signal inference workers to stop experience collection... (11000 times) +[2024-06-18 03:16:25,190][12862] Signal inference workers to resume experience collection... (11000 times) +[2024-06-18 03:16:25,205][12883] InferenceWorker_p0-w0: stopping experience collection (11000 times) +[2024-06-18 03:16:25,205][12883] InferenceWorker_p0-w0: resuming experience collection (11000 times) +[2024-06-18 03:16:26,922][12883] Updated weights for policy 0, policy_version 47161 (0.0038) +[2024-06-18 03:16:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 772685824. Throughput: 0: 41805.8. Samples: 772831640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 03:16:26,994][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 03:16:31,307][12883] Updated weights for policy 0, policy_version 47171 (0.0030) +[2024-06-18 03:16:31,994][12645] Fps is (10 sec: 40959.2, 60 sec: 40959.9, 300 sec: 41931.9). Total num frames: 772866048. Throughput: 0: 41918.5. Samples: 772964840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 03:16:31,994][12645] Avg episode reward: [(0, '0.069')] +[2024-06-18 03:16:34,602][12883] Updated weights for policy 0, policy_version 47181 (0.0035) +[2024-06-18 03:16:36,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42053.6, 300 sec: 41931.6). Total num frames: 773095424. Throughput: 0: 41934.3. Samples: 773219220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:16:36,996][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 03:16:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000047186_773095424.pth... 
+[2024-06-18 03:16:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000046570_763002880.pth +[2024-06-18 03:16:39,411][12883] Updated weights for policy 0, policy_version 47191 (0.0030) +[2024-06-18 03:16:41,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 773324800. Throughput: 0: 41747.5. Samples: 773453940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:16:41,995][12645] Avg episode reward: [(0, '0.090')] +[2024-06-18 03:16:42,143][12883] Updated weights for policy 0, policy_version 47201 (0.0035) +[2024-06-18 03:16:46,994][12645] Fps is (10 sec: 39329.9, 60 sec: 40959.9, 300 sec: 41931.9). Total num frames: 773488640. Throughput: 0: 41774.9. Samples: 773583360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:16:46,994][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 03:16:47,465][12883] Updated weights for policy 0, policy_version 47211 (0.0046) +[2024-06-18 03:16:50,591][12883] Updated weights for policy 0, policy_version 47221 (0.0031) +[2024-06-18 03:16:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 773734400. Throughput: 0: 41997.7. Samples: 773842380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:16:51,998][12645] Avg episode reward: [(0, '0.057')] +[2024-06-18 03:16:55,118][12883] Updated weights for policy 0, policy_version 47231 (0.0038) +[2024-06-18 03:16:56,994][12645] Fps is (10 sec: 47514.4, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 773963776. Throughput: 0: 42115.7. Samples: 774090280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:16:56,994][12645] Avg episode reward: [(0, '0.179')] +[2024-06-18 03:16:58,144][12883] Updated weights for policy 0, policy_version 47241 (0.0046) +[2024-06-18 03:17:01,995][12645] Fps is (10 sec: 40954.2, 60 sec: 42051.1, 300 sec: 41931.7). Total num frames: 774144000. Throughput: 0: 41925.2. Samples: 774221320. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:17:01,996][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 03:17:02,703][12883] Updated weights for policy 0, policy_version 47251 (0.0036) +[2024-06-18 03:17:05,671][12883] Updated weights for policy 0, policy_version 47261 (0.0045) +[2024-06-18 03:17:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 774373376. Throughput: 0: 42058.7. Samples: 774474900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) +[2024-06-18 03:17:06,994][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 03:17:10,383][12883] Updated weights for policy 0, policy_version 47271 (0.0044) +[2024-06-18 03:17:11,994][12645] Fps is (10 sec: 44243.2, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 774586368. Throughput: 0: 42137.7. Samples: 774727840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) +[2024-06-18 03:17:11,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 03:17:13,604][12883] Updated weights for policy 0, policy_version 47281 (0.0037) +[2024-06-18 03:17:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 774782976. Throughput: 0: 41983.2. Samples: 774854080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) +[2024-06-18 03:17:16,994][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 03:17:18,076][12883] Updated weights for policy 0, policy_version 47291 (0.0036) +[2024-06-18 03:17:18,610][12862] Signal inference workers to stop experience collection... (11050 times) +[2024-06-18 03:17:18,610][12862] Signal inference workers to resume experience collection... 
(11050 times) +[2024-06-18 03:17:18,637][12883] InferenceWorker_p0-w0: stopping experience collection (11050 times) +[2024-06-18 03:17:18,637][12883] InferenceWorker_p0-w0: resuming experience collection (11050 times) +[2024-06-18 03:17:21,135][12883] Updated weights for policy 0, policy_version 47301 (0.0037) +[2024-06-18 03:17:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 775012352. Throughput: 0: 42033.6. Samples: 775110640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) +[2024-06-18 03:17:21,996][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 03:17:25,543][12883] Updated weights for policy 0, policy_version 47311 (0.0039) +[2024-06-18 03:17:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 775208960. Throughput: 0: 42466.3. Samples: 775364920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) +[2024-06-18 03:17:26,994][12645] Avg episode reward: [(0, '0.136')] +[2024-06-18 03:17:28,787][12883] Updated weights for policy 0, policy_version 47321 (0.0036) +[2024-06-18 03:17:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 775405568. Throughput: 0: 42303.5. Samples: 775487020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) +[2024-06-18 03:17:31,994][12645] Avg episode reward: [(0, '0.055')] +[2024-06-18 03:17:33,400][12883] Updated weights for policy 0, policy_version 47331 (0.0033) +[2024-06-18 03:17:36,385][12883] Updated weights for policy 0, policy_version 47341 (0.0032) +[2024-06-18 03:17:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42599.9, 300 sec: 42154.1). Total num frames: 775651328. Throughput: 0: 42283.1. Samples: 775745120. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 03:17:36,994][12645] Avg episode reward: [(0, '0.071')] +[2024-06-18 03:17:41,051][12883] Updated weights for policy 0, policy_version 47351 (0.0038) +[2024-06-18 03:17:41,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 775847936. Throughput: 0: 42503.1. Samples: 776002920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 03:17:41,994][12645] Avg episode reward: [(0, '0.045')] +[2024-06-18 03:17:44,144][12883] Updated weights for policy 0, policy_version 47361 (0.0039) +[2024-06-18 03:17:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 776044544. Throughput: 0: 42277.3. Samples: 776123740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 03:17:46,994][12645] Avg episode reward: [(0, '0.200')] +[2024-06-18 03:17:48,854][12883] Updated weights for policy 0, policy_version 47371 (0.0042) +[2024-06-18 03:17:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 776273920. Throughput: 0: 42324.3. Samples: 776379500. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 03:17:51,994][12645] Avg episode reward: [(0, '0.209')] +[2024-06-18 03:17:52,015][12883] Updated weights for policy 0, policy_version 47381 (0.0025) +[2024-06-18 03:17:56,716][12883] Updated weights for policy 0, policy_version 47391 (0.0027) +[2024-06-18 03:17:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 776470528. Throughput: 0: 42427.6. Samples: 776637080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 03:17:56,994][12645] Avg episode reward: [(0, '0.137')] +[2024-06-18 03:17:59,997][12883] Updated weights for policy 0, policy_version 47401 (0.0034) +[2024-06-18 03:18:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42326.3, 300 sec: 42154.1). Total num frames: 776683520. Throughput: 0: 42306.2. Samples: 776757860. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 03:18:01,994][12645] Avg episode reward: [(0, '0.114')] +[2024-06-18 03:18:04,301][12883] Updated weights for policy 0, policy_version 47411 (0.0030) +[2024-06-18 03:18:06,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42323.7, 300 sec: 42209.3). Total num frames: 776912896. Throughput: 0: 42230.8. Samples: 777011120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) +[2024-06-18 03:18:06,996][12645] Avg episode reward: [(0, '0.130')] +[2024-06-18 03:18:07,571][12883] Updated weights for policy 0, policy_version 47421 (0.0040) +[2024-06-18 03:18:11,913][12883] Updated weights for policy 0, policy_version 47431 (0.0029) +[2024-06-18 03:18:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 777109504. Throughput: 0: 42323.9. Samples: 777269500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) +[2024-06-18 03:18:11,994][12645] Avg episode reward: [(0, '0.276')] +[2024-06-18 03:18:15,289][12883] Updated weights for policy 0, policy_version 47441 (0.0051) +[2024-06-18 03:18:16,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 777338880. Throughput: 0: 42349.0. Samples: 777392720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) +[2024-06-18 03:18:16,994][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 03:18:19,590][12883] Updated weights for policy 0, policy_version 47451 (0.0022) +[2024-06-18 03:18:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42099.4). Total num frames: 777519104. Throughput: 0: 42247.9. Samples: 777646280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) +[2024-06-18 03:18:22,000][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 03:18:23,241][12883] Updated weights for policy 0, policy_version 47461 (0.0031) +[2024-06-18 03:18:26,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 777732096. Throughput: 0: 42175.1. Samples: 777900800. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) +[2024-06-18 03:18:26,994][12645] Avg episode reward: [(0, '0.023')] +[2024-06-18 03:18:27,558][12883] Updated weights for policy 0, policy_version 47471 (0.0028) +[2024-06-18 03:18:30,774][12883] Updated weights for policy 0, policy_version 47481 (0.0036) +[2024-06-18 03:18:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 777961472. Throughput: 0: 42289.8. Samples: 778026780. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) +[2024-06-18 03:18:31,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 03:18:35,191][12883] Updated weights for policy 0, policy_version 47491 (0.0035) +[2024-06-18 03:18:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 778158080. Throughput: 0: 42263.2. Samples: 778281340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 03:18:36,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 03:18:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000047495_778158080.pth... +[2024-06-18 03:18:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000046880_768081920.pth +[2024-06-18 03:18:38,681][12883] Updated weights for policy 0, policy_version 47501 (0.0044) +[2024-06-18 03:18:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 778371072. Throughput: 0: 42100.4. Samples: 778531600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 03:18:41,994][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 03:18:42,506][12862] Signal inference workers to stop experience collection... (11100 times) +[2024-06-18 03:18:42,506][12862] Signal inference workers to resume experience collection... 
(11100 times) +[2024-06-18 03:18:42,525][12883] InferenceWorker_p0-w0: stopping experience collection (11100 times) +[2024-06-18 03:18:42,525][12883] InferenceWorker_p0-w0: resuming experience collection (11100 times) +[2024-06-18 03:18:42,805][12883] Updated weights for policy 0, policy_version 47511 (0.0028) +[2024-06-18 03:18:46,728][12883] Updated weights for policy 0, policy_version 47521 (0.0035) +[2024-06-18 03:18:46,994][12645] Fps is (10 sec: 42597.3, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 778584064. Throughput: 0: 42157.7. Samples: 778654960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 03:18:46,995][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 03:18:50,565][12883] Updated weights for policy 0, policy_version 47531 (0.0033) +[2024-06-18 03:18:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 778780672. Throughput: 0: 42079.8. Samples: 778904620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 03:18:51,994][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 03:18:54,570][12883] Updated weights for policy 0, policy_version 47541 (0.0040) +[2024-06-18 03:18:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 778993664. Throughput: 0: 42043.6. Samples: 779161460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 03:18:56,994][12645] Avg episode reward: [(0, '0.126')] +[2024-06-18 03:18:58,209][12883] Updated weights for policy 0, policy_version 47551 (0.0042) +[2024-06-18 03:19:01,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 779223040. Throughput: 0: 42179.7. Samples: 779290800. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 03:19:01,994][12645] Avg episode reward: [(0, '0.118')] +[2024-06-18 03:19:02,543][12883] Updated weights for policy 0, policy_version 47561 (0.0023) +[2024-06-18 03:19:06,204][12883] Updated weights for policy 0, policy_version 47571 (0.0036) +[2024-06-18 03:19:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41780.7, 300 sec: 42043.0). Total num frames: 779419648. Throughput: 0: 42155.7. Samples: 779543280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 03:19:06,994][12645] Avg episode reward: [(0, '0.185')] +[2024-06-18 03:19:10,304][12883] Updated weights for policy 0, policy_version 47581 (0.0040) +[2024-06-18 03:19:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 779632640. Throughput: 0: 41978.1. Samples: 779789820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 03:19:11,994][12645] Avg episode reward: [(0, '0.200')] +[2024-06-18 03:19:14,076][12883] Updated weights for policy 0, policy_version 47591 (0.0041) +[2024-06-18 03:19:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 779845632. Throughput: 0: 42037.4. Samples: 779918460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 03:19:16,994][12645] Avg episode reward: [(0, '0.116')] +[2024-06-18 03:19:17,885][12883] Updated weights for policy 0, policy_version 47601 (0.0040) +[2024-06-18 03:19:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 780042240. Throughput: 0: 42130.6. Samples: 780177220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 03:19:21,994][12645] Avg episode reward: [(0, '0.137')] +[2024-06-18 03:19:21,997][12883] Updated weights for policy 0, policy_version 47611 (0.0037) +[2024-06-18 03:19:25,978][12883] Updated weights for policy 0, policy_version 47621 (0.0026) +[2024-06-18 03:19:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 780255232. Throughput: 0: 42132.1. Samples: 780427540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 03:19:26,994][12645] Avg episode reward: [(0, '0.161')] +[2024-06-18 03:19:29,878][12883] Updated weights for policy 0, policy_version 47631 (0.0032) +[2024-06-18 03:19:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 780484608. Throughput: 0: 42179.8. Samples: 780553040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 03:19:31,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 03:19:33,611][12883] Updated weights for policy 0, policy_version 47641 (0.0051) +[2024-06-18 03:19:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 780664832. Throughput: 0: 42280.5. Samples: 780807240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 03:19:36,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 03:19:37,484][12883] Updated weights for policy 0, policy_version 47651 (0.0038) +[2024-06-18 03:19:41,242][12883] Updated weights for policy 0, policy_version 47661 (0.0024) +[2024-06-18 03:19:41,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 780877824. Throughput: 0: 42221.0. Samples: 781061400. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 03:19:41,994][12645] Avg episode reward: [(0, '0.179')] +[2024-06-18 03:19:45,490][12883] Updated weights for policy 0, policy_version 47671 (0.0037) +[2024-06-18 03:19:46,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 781123584. Throughput: 0: 42195.0. Samples: 781189580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 03:19:46,994][12645] Avg episode reward: [(0, '0.110')] +[2024-06-18 03:19:49,307][12883] Updated weights for policy 0, policy_version 47681 (0.0026) +[2024-06-18 03:19:51,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 781320192. Throughput: 0: 42155.4. Samples: 781440280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 03:19:51,994][12645] Avg episode reward: [(0, '0.136')] +[2024-06-18 03:19:52,987][12883] Updated weights for policy 0, policy_version 47691 (0.0031) +[2024-06-18 03:19:56,910][12883] Updated weights for policy 0, policy_version 47701 (0.0037) +[2024-06-18 03:19:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 781533184. Throughput: 0: 42326.2. Samples: 781694500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 03:19:56,994][12645] Avg episode reward: [(0, '0.171')] +[2024-06-18 03:19:59,592][12862] Signal inference workers to stop experience collection... (11150 times) +[2024-06-18 03:19:59,593][12862] Signal inference workers to resume experience collection... (11150 times) +[2024-06-18 03:19:59,643][12883] InferenceWorker_p0-w0: stopping experience collection (11150 times) +[2024-06-18 03:19:59,643][12883] InferenceWorker_p0-w0: resuming experience collection (11150 times) +[2024-06-18 03:20:00,510][12883] Updated weights for policy 0, policy_version 47711 (0.0033) +[2024-06-18 03:20:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 781762560. 
Throughput: 0: 42313.8. Samples: 781822580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 03:20:01,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 03:20:04,567][12883] Updated weights for policy 0, policy_version 47721 (0.0035) +[2024-06-18 03:20:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 781942784. Throughput: 0: 42253.8. Samples: 782078640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:20:06,994][12645] Avg episode reward: [(0, '0.178')] +[2024-06-18 03:20:08,251][12883] Updated weights for policy 0, policy_version 47731 (0.0026) +[2024-06-18 03:20:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 782155776. Throughput: 0: 42160.3. Samples: 782324760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:20:11,994][12645] Avg episode reward: [(0, '0.034')] +[2024-06-18 03:20:12,241][12883] Updated weights for policy 0, policy_version 47741 (0.0037) +[2024-06-18 03:20:15,927][12883] Updated weights for policy 0, policy_version 47751 (0.0030) +[2024-06-18 03:20:16,996][12645] Fps is (10 sec: 45865.0, 60 sec: 42596.8, 300 sec: 42264.9). Total num frames: 782401536. Throughput: 0: 42303.2. Samples: 782456780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:20:16,997][12645] Avg episode reward: [(0, '0.042')] +[2024-06-18 03:20:19,816][12883] Updated weights for policy 0, policy_version 47761 (0.0038) +[2024-06-18 03:20:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 782581760. Throughput: 0: 42285.8. Samples: 782710100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:20:21,994][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 03:20:23,704][12883] Updated weights for policy 0, policy_version 47771 (0.0039) +[2024-06-18 03:20:26,994][12645] Fps is (10 sec: 39330.8, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 782794752. 
Throughput: 0: 42194.7. Samples: 782960160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:20:26,994][12645] Avg episode reward: [(0, '0.108')] +[2024-06-18 03:20:27,543][12883] Updated weights for policy 0, policy_version 47781 (0.0033) +[2024-06-18 03:20:31,695][12883] Updated weights for policy 0, policy_version 47791 (0.0030) +[2024-06-18 03:20:31,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42210.2). Total num frames: 783024128. Throughput: 0: 42098.3. Samples: 783084000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:20:31,994][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 03:20:35,819][12883] Updated weights for policy 0, policy_version 47801 (0.0032) +[2024-06-18 03:20:36,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 783204352. Throughput: 0: 42179.6. Samples: 783338360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 03:20:36,994][12645] Avg episode reward: [(0, '0.159')] +[2024-06-18 03:20:37,120][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000047804_783220736.pth... +[2024-06-18 03:20:37,185][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000047186_773095424.pth +[2024-06-18 03:20:39,387][12883] Updated weights for policy 0, policy_version 47811 (0.0035) +[2024-06-18 03:20:41,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 783417344. Throughput: 0: 41983.2. Samples: 783583740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 03:20:41,994][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 03:20:43,679][12883] Updated weights for policy 0, policy_version 47821 (0.0025) +[2024-06-18 03:20:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 783646720. Throughput: 0: 42106.5. Samples: 783717380. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 03:20:46,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 03:20:47,113][12883] Updated weights for policy 0, policy_version 47831 (0.0045) +[2024-06-18 03:20:51,610][12883] Updated weights for policy 0, policy_version 47841 (0.0034) +[2024-06-18 03:20:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 783843328. Throughput: 0: 41909.9. Samples: 783964580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 03:20:51,994][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 03:20:54,771][12883] Updated weights for policy 0, policy_version 47851 (0.0042) +[2024-06-18 03:20:54,965][12862] Signal inference workers to stop experience collection... (11200 times) +[2024-06-18 03:20:54,971][12862] Signal inference workers to resume experience collection... (11200 times) +[2024-06-18 03:20:54,999][12883] InferenceWorker_p0-w0: stopping experience collection (11200 times) +[2024-06-18 03:20:55,004][12883] InferenceWorker_p0-w0: resuming experience collection (11200 times) +[2024-06-18 03:20:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 784072704. Throughput: 0: 41903.6. Samples: 784210420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 03:20:56,994][12645] Avg episode reward: [(0, '0.102')] +[2024-06-18 03:20:59,450][12883] Updated weights for policy 0, policy_version 47861 (0.0037) +[2024-06-18 03:21:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 784252928. Throughput: 0: 41895.5. Samples: 784341980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 03:21:01,994][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 03:21:02,740][12883] Updated weights for policy 0, policy_version 47871 (0.0034) +[2024-06-18 03:21:06,994][12645] Fps is (10 sec: 37683.6, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 784449536. 
Throughput: 0: 41797.4. Samples: 784590980. Policy #0 lag: (min: 0.0, avg: 14.0, max: 25.0) +[2024-06-18 03:21:06,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 03:21:07,230][12883] Updated weights for policy 0, policy_version 47881 (0.0032) +[2024-06-18 03:21:10,504][12883] Updated weights for policy 0, policy_version 47891 (0.0038) +[2024-06-18 03:21:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 784695296. Throughput: 0: 41592.5. Samples: 784831820. Policy #0 lag: (min: 0.0, avg: 14.0, max: 25.0) +[2024-06-18 03:21:11,994][12645] Avg episode reward: [(0, '0.167')] +[2024-06-18 03:21:15,104][12883] Updated weights for policy 0, policy_version 47901 (0.0038) +[2024-06-18 03:21:16,994][12645] Fps is (10 sec: 42597.3, 60 sec: 41234.5, 300 sec: 42098.5). Total num frames: 784875520. Throughput: 0: 41850.4. Samples: 784967280. Policy #0 lag: (min: 0.0, avg: 14.0, max: 25.0) +[2024-06-18 03:21:16,994][12645] Avg episode reward: [(0, '0.283')] +[2024-06-18 03:21:18,352][12883] Updated weights for policy 0, policy_version 47911 (0.0039) +[2024-06-18 03:21:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 785088512. Throughput: 0: 41638.4. Samples: 785212080. Policy #0 lag: (min: 0.0, avg: 14.0, max: 25.0) +[2024-06-18 03:21:21,994][12645] Avg episode reward: [(0, '0.072')] +[2024-06-18 03:21:23,518][12883] Updated weights for policy 0, policy_version 47921 (0.0036) +[2024-06-18 03:21:26,262][12883] Updated weights for policy 0, policy_version 47931 (0.0031) +[2024-06-18 03:21:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 785317888. Throughput: 0: 41628.8. Samples: 785457040. 
Policy #0 lag: (min: 0.0, avg: 14.0, max: 25.0) +[2024-06-18 03:21:26,994][12645] Avg episode reward: [(0, '0.052')] +[2024-06-18 03:21:31,211][12883] Updated weights for policy 0, policy_version 47941 (0.0033) +[2024-06-18 03:21:31,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40959.9, 300 sec: 41987.8). Total num frames: 785481728. Throughput: 0: 41613.8. Samples: 785590000. Policy #0 lag: (min: 0.0, avg: 14.0, max: 25.0) +[2024-06-18 03:21:31,994][12645] Avg episode reward: [(0, '0.227')] +[2024-06-18 03:21:34,018][12883] Updated weights for policy 0, policy_version 47951 (0.0037) +[2024-06-18 03:21:36,994][12645] Fps is (10 sec: 37683.4, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 785694720. Throughput: 0: 41592.8. Samples: 785836260. Policy #0 lag: (min: 1.0, avg: 12.7, max: 27.0) +[2024-06-18 03:21:36,994][12645] Avg episode reward: [(0, '0.030')] +[2024-06-18 03:21:39,387][12883] Updated weights for policy 0, policy_version 47961 (0.0047) +[2024-06-18 03:21:41,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 785940480. Throughput: 0: 41534.3. Samples: 786079460. Policy #0 lag: (min: 1.0, avg: 12.7, max: 27.0) +[2024-06-18 03:21:41,994][12645] Avg episode reward: [(0, '0.052')] +[2024-06-18 03:21:42,052][12883] Updated weights for policy 0, policy_version 47971 (0.0025) +[2024-06-18 03:21:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 41931.9). Total num frames: 786104320. Throughput: 0: 41660.3. Samples: 786216700. Policy #0 lag: (min: 1.0, avg: 12.7, max: 27.0) +[2024-06-18 03:21:46,994][12645] Avg episode reward: [(0, '0.150')] +[2024-06-18 03:21:47,269][12883] Updated weights for policy 0, policy_version 47981 (0.0036) +[2024-06-18 03:21:49,883][12883] Updated weights for policy 0, policy_version 47991 (0.0032) +[2024-06-18 03:21:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 786350080. Throughput: 0: 41560.9. Samples: 786461220. 
Policy #0 lag: (min: 1.0, avg: 12.7, max: 27.0) +[2024-06-18 03:21:51,994][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 03:21:54,802][12883] Updated weights for policy 0, policy_version 48001 (0.0047) +[2024-06-18 03:21:56,994][12645] Fps is (10 sec: 47514.2, 60 sec: 41779.3, 300 sec: 42154.3). Total num frames: 786579456. Throughput: 0: 41821.7. Samples: 786713800. Policy #0 lag: (min: 1.0, avg: 12.7, max: 27.0) +[2024-06-18 03:21:56,994][12645] Avg episode reward: [(0, '0.116')] +[2024-06-18 03:21:57,628][12883] Updated weights for policy 0, policy_version 48011 (0.0029) +[2024-06-18 03:21:57,668][12862] Signal inference workers to stop experience collection... (11250 times) +[2024-06-18 03:21:57,668][12862] Signal inference workers to resume experience collection... (11250 times) +[2024-06-18 03:21:57,685][12883] InferenceWorker_p0-w0: stopping experience collection (11250 times) +[2024-06-18 03:21:57,686][12883] InferenceWorker_p0-w0: resuming experience collection (11250 times) +[2024-06-18 03:22:01,994][12645] Fps is (10 sec: 37682.8, 60 sec: 41233.0, 300 sec: 41876.4). Total num frames: 786726912. Throughput: 0: 41690.4. Samples: 786843340. Policy #0 lag: (min: 1.0, avg: 12.7, max: 27.0) +[2024-06-18 03:22:01,994][12645] Avg episode reward: [(0, '0.085')] +[2024-06-18 03:22:02,656][12883] Updated weights for policy 0, policy_version 48021 (0.0044) +[2024-06-18 03:22:05,254][12883] Updated weights for policy 0, policy_version 48031 (0.0035) +[2024-06-18 03:22:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 786989056. Throughput: 0: 41685.3. Samples: 787087920. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) +[2024-06-18 03:22:06,994][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 03:22:10,613][12883] Updated weights for policy 0, policy_version 48041 (0.0035) +[2024-06-18 03:22:11,994][12645] Fps is (10 sec: 45875.5, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 787185664. 
Throughput: 0: 41962.8. Samples: 787345360. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) +[2024-06-18 03:22:11,994][12645] Avg episode reward: [(0, '0.118')] +[2024-06-18 03:22:12,958][12883] Updated weights for policy 0, policy_version 48051 (0.0028) +[2024-06-18 03:22:16,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41506.3, 300 sec: 41876.4). Total num frames: 787365888. Throughput: 0: 41741.4. Samples: 787468360. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) +[2024-06-18 03:22:16,994][12645] Avg episode reward: [(0, '0.041')] +[2024-06-18 03:22:18,059][12883] Updated weights for policy 0, policy_version 48061 (0.0029) +[2024-06-18 03:22:21,287][12883] Updated weights for policy 0, policy_version 48071 (0.0027) +[2024-06-18 03:22:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 787611648. Throughput: 0: 41781.3. Samples: 787716420. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) +[2024-06-18 03:22:21,994][12645] Avg episode reward: [(0, '0.071')] +[2024-06-18 03:22:25,548][12883] Updated weights for policy 0, policy_version 48081 (0.0036) +[2024-06-18 03:22:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 787808256. Throughput: 0: 42114.6. Samples: 787974620. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) +[2024-06-18 03:22:26,994][12645] Avg episode reward: [(0, '0.070')] +[2024-06-18 03:22:29,454][12883] Updated weights for policy 0, policy_version 48091 (0.0043) +[2024-06-18 03:22:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 788021248. Throughput: 0: 41924.9. Samples: 788103320. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) +[2024-06-18 03:22:31,994][12645] Avg episode reward: [(0, '0.054')] +[2024-06-18 03:22:33,299][12883] Updated weights for policy 0, policy_version 48101 (0.0044) +[2024-06-18 03:22:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 788234240. 
Throughput: 0: 42080.9. Samples: 788354860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:22:36,994][12645] Avg episode reward: [(0, '0.084')] +[2024-06-18 03:22:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000048111_788250624.pth... +[2024-06-18 03:22:37,029][12883] Updated weights for policy 0, policy_version 48111 (0.0036) +[2024-06-18 03:22:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000047495_778158080.pth +[2024-06-18 03:22:41,003][12883] Updated weights for policy 0, policy_version 48121 (0.0044) +[2024-06-18 03:22:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41233.0, 300 sec: 41931.9). Total num frames: 788414464. Throughput: 0: 42181.8. Samples: 788611980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:22:41,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 03:22:44,656][12883] Updated weights for policy 0, policy_version 48131 (0.0042) +[2024-06-18 03:22:46,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 788660224. Throughput: 0: 41960.0. Samples: 788731540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:22:46,994][12645] Avg episode reward: [(0, '0.069')] +[2024-06-18 03:22:48,793][12883] Updated weights for policy 0, policy_version 48141 (0.0043) +[2024-06-18 03:22:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 788873216. Throughput: 0: 42231.1. Samples: 788988320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:22:51,994][12645] Avg episode reward: [(0, '0.069')] +[2024-06-18 03:22:52,580][12883] Updated weights for policy 0, policy_version 48151 (0.0036) +[2024-06-18 03:22:56,886][12883] Updated weights for policy 0, policy_version 48161 (0.0024) +[2024-06-18 03:22:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 789069824. Throughput: 0: 42141.3. 
Samples: 789241720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:22:56,994][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 03:23:00,049][12883] Updated weights for policy 0, policy_version 48171 (0.0034) +[2024-06-18 03:23:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 41987.8). Total num frames: 789299200. Throughput: 0: 42174.5. Samples: 789366220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:23:01,994][12645] Avg episode reward: [(0, '0.106')] +[2024-06-18 03:23:04,454][12883] Updated weights for policy 0, policy_version 48181 (0.0036) +[2024-06-18 03:23:07,000][12645] Fps is (10 sec: 44209.1, 60 sec: 42047.9, 300 sec: 42042.1). Total num frames: 789512192. Throughput: 0: 42433.8. Samples: 789626200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 03:23:07,000][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 03:23:07,711][12883] Updated weights for policy 0, policy_version 48191 (0.0035) +[2024-06-18 03:23:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 789708800. Throughput: 0: 42255.5. Samples: 789876120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 03:23:11,994][12645] Avg episode reward: [(0, '0.095')] +[2024-06-18 03:23:12,135][12883] Updated weights for policy 0, policy_version 48201 (0.0030) +[2024-06-18 03:23:14,357][12862] Signal inference workers to stop experience collection... (11300 times) +[2024-06-18 03:23:14,407][12883] InferenceWorker_p0-w0: stopping experience collection (11300 times) +[2024-06-18 03:23:14,414][12862] Signal inference workers to resume experience collection... (11300 times) +[2024-06-18 03:23:14,424][12883] InferenceWorker_p0-w0: resuming experience collection (11300 times) +[2024-06-18 03:23:15,539][12883] Updated weights for policy 0, policy_version 48211 (0.0058) +[2024-06-18 03:23:16,994][12645] Fps is (10 sec: 40985.9, 60 sec: 42598.4, 300 sec: 42043.0). 
Total num frames: 789921792. Throughput: 0: 42141.0. Samples: 789999660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 03:23:16,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 03:23:20,544][12883] Updated weights for policy 0, policy_version 48221 (0.0034) +[2024-06-18 03:23:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 790134784. Throughput: 0: 42365.3. Samples: 790261300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 03:23:21,994][12645] Avg episode reward: [(0, '0.068')] +[2024-06-18 03:23:23,313][12883] Updated weights for policy 0, policy_version 48231 (0.0030) +[2024-06-18 03:23:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 790347776. Throughput: 0: 42099.7. Samples: 790506460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 03:23:26,994][12645] Avg episode reward: [(0, '0.038')] +[2024-06-18 03:23:28,082][12883] Updated weights for policy 0, policy_version 48241 (0.0034) +[2024-06-18 03:23:31,743][12883] Updated weights for policy 0, policy_version 48251 (0.0035) +[2024-06-18 03:23:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 790560768. Throughput: 0: 42348.1. Samples: 790637200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 03:23:31,994][12645] Avg episode reward: [(0, '0.128')] +[2024-06-18 03:23:36,414][12883] Updated weights for policy 0, policy_version 48261 (0.0042) +[2024-06-18 03:23:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 790757376. Throughput: 0: 42332.9. Samples: 790893300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:23:36,994][12645] Avg episode reward: [(0, '0.146')] +[2024-06-18 03:23:39,457][12883] Updated weights for policy 0, policy_version 48271 (0.0036) +[2024-06-18 03:23:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42098.6). 
Total num frames: 791003136. Throughput: 0: 42075.2. Samples: 791135100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:23:41,994][12645] Avg episode reward: [(0, '0.032')] +[2024-06-18 03:23:44,073][12883] Updated weights for policy 0, policy_version 48281 (0.0045) +[2024-06-18 03:23:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 791183360. Throughput: 0: 42313.9. Samples: 791270340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:23:46,994][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 03:23:47,026][12883] Updated weights for policy 0, policy_version 48291 (0.0034) +[2024-06-18 03:23:51,684][12883] Updated weights for policy 0, policy_version 48301 (0.0053) +[2024-06-18 03:23:51,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 791379968. Throughput: 0: 42053.0. Samples: 791518320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:23:51,994][12645] Avg episode reward: [(0, '0.124')] +[2024-06-18 03:23:54,779][12883] Updated weights for policy 0, policy_version 48311 (0.0048) +[2024-06-18 03:23:56,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42098.5). Total num frames: 791642112. Throughput: 0: 41987.6. Samples: 791765560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:23:56,994][12645] Avg episode reward: [(0, '0.170')] +[2024-06-18 03:23:59,306][12883] Updated weights for policy 0, policy_version 48321 (0.0028) +[2024-06-18 03:24:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 791822336. Throughput: 0: 42355.0. Samples: 791905640. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 03:24:01,994][12645] Avg episode reward: [(0, '0.050')] +[2024-06-18 03:24:02,469][12883] Updated weights for policy 0, policy_version 48331 (0.0049) +[2024-06-18 03:24:06,724][12883] Updated weights for policy 0, policy_version 48341 (0.0038) +[2024-06-18 03:24:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42056.6, 300 sec: 42043.0). Total num frames: 792035328. Throughput: 0: 41979.1. Samples: 792150360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 03:24:06,996][12645] Avg episode reward: [(0, '0.053')] +[2024-06-18 03:24:10,559][12883] Updated weights for policy 0, policy_version 48351 (0.0031) +[2024-06-18 03:24:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 792264704. Throughput: 0: 42134.6. Samples: 792402520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 03:24:11,994][12645] Avg episode reward: [(0, '0.106')] +[2024-06-18 03:24:14,499][12883] Updated weights for policy 0, policy_version 48361 (0.0036) +[2024-06-18 03:24:16,997][12645] Fps is (10 sec: 39309.0, 60 sec: 41776.9, 300 sec: 41987.0). Total num frames: 792428544. Throughput: 0: 42215.2. Samples: 792537020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 03:24:17,004][12645] Avg episode reward: [(0, '0.092')] +[2024-06-18 03:24:17,184][12862] Signal inference workers to stop experience collection... (11350 times) +[2024-06-18 03:24:17,208][12883] InferenceWorker_p0-w0: stopping experience collection (11350 times) +[2024-06-18 03:24:17,245][12862] Signal inference workers to resume experience collection... (11350 times) +[2024-06-18 03:24:17,245][12883] InferenceWorker_p0-w0: resuming experience collection (11350 times) +[2024-06-18 03:24:18,060][12883] Updated weights for policy 0, policy_version 48371 (0.0033) +[2024-06-18 03:24:21,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 792657920. 
Throughput: 0: 41947.0. Samples: 792780920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 03:24:21,995][12645] Avg episode reward: [(0, '0.071')] +[2024-06-18 03:24:22,066][12883] Updated weights for policy 0, policy_version 48381 (0.0031) +[2024-06-18 03:24:25,732][12883] Updated weights for policy 0, policy_version 48391 (0.0032) +[2024-06-18 03:24:26,994][12645] Fps is (10 sec: 47529.0, 60 sec: 42598.3, 300 sec: 42098.6). Total num frames: 792903680. Throughput: 0: 42286.2. Samples: 793037980. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 03:24:26,994][12645] Avg episode reward: [(0, '0.053')] +[2024-06-18 03:24:29,710][12883] Updated weights for policy 0, policy_version 48401 (0.0029) +[2024-06-18 03:24:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 793067520. Throughput: 0: 42110.6. Samples: 793165320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 03:24:31,996][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 03:24:33,843][12883] Updated weights for policy 0, policy_version 48411 (0.0041) +[2024-06-18 03:24:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 793313280. Throughput: 0: 41990.5. Samples: 793407900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) +[2024-06-18 03:24:36,994][12645] Avg episode reward: [(0, '0.157')] +[2024-06-18 03:24:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000048420_793313280.pth... +[2024-06-18 03:24:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000047804_783220736.pth +[2024-06-18 03:24:37,359][12883] Updated weights for policy 0, policy_version 48421 (0.0033) +[2024-06-18 03:24:41,776][12883] Updated weights for policy 0, policy_version 48431 (0.0030) +[2024-06-18 03:24:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 793493504. Throughput: 0: 42366.7. 
Samples: 793672060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) +[2024-06-18 03:24:41,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 03:24:45,022][12883] Updated weights for policy 0, policy_version 48441 (0.0032) +[2024-06-18 03:24:46,994][12645] Fps is (10 sec: 37683.6, 60 sec: 41779.2, 300 sec: 41932.0). Total num frames: 793690112. Throughput: 0: 41764.1. Samples: 793785020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) +[2024-06-18 03:24:46,994][12645] Avg episode reward: [(0, '0.021')] +[2024-06-18 03:24:49,593][12883] Updated weights for policy 0, policy_version 48451 (0.0040) +[2024-06-18 03:24:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 793935872. Throughput: 0: 42036.4. Samples: 794042000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) +[2024-06-18 03:24:51,994][12645] Avg episode reward: [(0, '0.101')] +[2024-06-18 03:24:52,679][12883] Updated weights for policy 0, policy_version 48461 (0.0021) +[2024-06-18 03:24:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 794116096. Throughput: 0: 42475.1. Samples: 794313900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) +[2024-06-18 03:24:56,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 03:24:57,233][12883] Updated weights for policy 0, policy_version 48471 (0.0044) +[2024-06-18 03:25:00,199][12883] Updated weights for policy 0, policy_version 48481 (0.0031) +[2024-06-18 03:25:01,996][12645] Fps is (10 sec: 40951.4, 60 sec: 42050.8, 300 sec: 42042.7). Total num frames: 794345472. Throughput: 0: 42051.2. Samples: 794429280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) +[2024-06-18 03:25:01,996][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 03:25:05,036][12883] Updated weights for policy 0, policy_version 48491 (0.0025) +[2024-06-18 03:25:06,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 794591232. Throughput: 0: 42255.7. 
Samples: 794682420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) +[2024-06-18 03:25:06,994][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 03:25:08,107][12883] Updated weights for policy 0, policy_version 48501 (0.0036) +[2024-06-18 03:25:11,994][12645] Fps is (10 sec: 39329.5, 60 sec: 41232.9, 300 sec: 41821.2). Total num frames: 794738688. Throughput: 0: 42240.3. Samples: 794938800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) +[2024-06-18 03:25:11,994][12645] Avg episode reward: [(0, '0.110')] +[2024-06-18 03:25:12,151][12862] Signal inference workers to stop experience collection... (11400 times) +[2024-06-18 03:25:12,152][12862] Signal inference workers to resume experience collection... (11400 times) +[2024-06-18 03:25:12,195][12883] InferenceWorker_p0-w0: stopping experience collection (11400 times) +[2024-06-18 03:25:12,195][12883] InferenceWorker_p0-w0: resuming experience collection (11400 times) +[2024-06-18 03:25:12,930][12883] Updated weights for policy 0, policy_version 48511 (0.0043) +[2024-06-18 03:25:15,917][12883] Updated weights for policy 0, policy_version 48521 (0.0030) +[2024-06-18 03:25:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42873.8, 300 sec: 42098.6). Total num frames: 795000832. Throughput: 0: 41960.5. Samples: 795053540. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) +[2024-06-18 03:25:16,994][12645] Avg episode reward: [(0, '0.174')] +[2024-06-18 03:25:20,638][12883] Updated weights for policy 0, policy_version 48531 (0.0035) +[2024-06-18 03:25:21,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 795197440. Throughput: 0: 42465.2. Samples: 795318840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) +[2024-06-18 03:25:21,994][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 03:25:23,641][12883] Updated weights for policy 0, policy_version 48541 (0.0031) +[2024-06-18 03:25:26,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41233.1, 300 sec: 41876.4). 
Total num frames: 795377664. Throughput: 0: 42290.2. Samples: 795575120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) +[2024-06-18 03:25:26,994][12645] Avg episode reward: [(0, '0.133')] +[2024-06-18 03:25:28,305][12883] Updated weights for policy 0, policy_version 48551 (0.0032) +[2024-06-18 03:25:31,295][12883] Updated weights for policy 0, policy_version 48561 (0.0043) +[2024-06-18 03:25:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 795639808. Throughput: 0: 42428.8. Samples: 795694320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) +[2024-06-18 03:25:31,994][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 03:25:36,021][12883] Updated weights for policy 0, policy_version 48571 (0.0038) +[2024-06-18 03:25:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 795820032. Throughput: 0: 42413.4. Samples: 795950600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 03:25:36,994][12645] Avg episode reward: [(0, '0.120')] +[2024-06-18 03:25:39,270][12883] Updated weights for policy 0, policy_version 48581 (0.0030) +[2024-06-18 03:25:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 796033024. Throughput: 0: 41841.3. Samples: 796196760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 03:25:41,994][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 03:25:43,927][12883] Updated weights for policy 0, policy_version 48591 (0.0031) +[2024-06-18 03:25:46,802][12883] Updated weights for policy 0, policy_version 48601 (0.0027) +[2024-06-18 03:25:46,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42154.1). Total num frames: 796278784. Throughput: 0: 42168.2. Samples: 796326760. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 03:25:46,994][12645] Avg episode reward: [(0, '0.185')] +[2024-06-18 03:25:51,697][12883] Updated weights for policy 0, policy_version 48611 (0.0033) +[2024-06-18 03:25:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 796442624. Throughput: 0: 42100.0. Samples: 796576920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 03:25:51,994][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 03:25:54,473][12883] Updated weights for policy 0, policy_version 48621 (0.0026) +[2024-06-18 03:25:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 796672000. Throughput: 0: 41987.2. Samples: 796828220. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 03:25:56,994][12645] Avg episode reward: [(0, '0.069')] +[2024-06-18 03:25:59,744][12883] Updated weights for policy 0, policy_version 48631 (0.0042) +[2024-06-18 03:26:01,996][12645] Fps is (10 sec: 45864.9, 60 sec: 42598.3, 300 sec: 42209.3). Total num frames: 796901376. Throughput: 0: 42380.1. Samples: 796960740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 03:26:01,997][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 03:26:02,310][12883] Updated weights for policy 0, policy_version 48641 (0.0032) +[2024-06-18 03:26:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 797065216. Throughput: 0: 42063.8. Samples: 797211700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 03:26:06,994][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 03:26:07,439][12883] Updated weights for policy 0, policy_version 48651 (0.0029) +[2024-06-18 03:26:10,345][12883] Updated weights for policy 0, policy_version 48661 (0.0034) +[2024-06-18 03:26:11,994][12645] Fps is (10 sec: 42607.9, 60 sec: 43144.6, 300 sec: 42209.7). Total num frames: 797327360. Throughput: 0: 41804.8. Samples: 797456340. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 03:26:11,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 03:26:15,190][12883] Updated weights for policy 0, policy_version 48671 (0.0041) +[2024-06-18 03:26:16,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 797523968. Throughput: 0: 42225.3. Samples: 797594460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 03:26:16,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 03:26:18,199][12883] Updated weights for policy 0, policy_version 48681 (0.0030) +[2024-06-18 03:26:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 797720576. Throughput: 0: 42043.4. Samples: 797842560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 03:26:22,004][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 03:26:22,743][12883] Updated weights for policy 0, policy_version 48691 (0.0031) +[2024-06-18 03:26:25,788][12883] Updated weights for policy 0, policy_version 48701 (0.0036) +[2024-06-18 03:26:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 797949952. Throughput: 0: 42249.7. Samples: 798098000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 03:26:26,994][12645] Avg episode reward: [(0, '0.085')] +[2024-06-18 03:26:28,654][12862] Signal inference workers to stop experience collection... (11450 times) +[2024-06-18 03:26:28,655][12862] Signal inference workers to resume experience collection... (11450 times) +[2024-06-18 03:26:28,682][12883] InferenceWorker_p0-w0: stopping experience collection (11450 times) +[2024-06-18 03:26:28,683][12883] InferenceWorker_p0-w0: resuming experience collection (11450 times) +[2024-06-18 03:26:30,247][12883] Updated weights for policy 0, policy_version 48711 (0.0029) +[2024-06-18 03:26:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 798146560. 
Throughput: 0: 42279.1. Samples: 798229320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 03:26:31,994][12645] Avg episode reward: [(0, '0.076')] +[2024-06-18 03:26:33,434][12883] Updated weights for policy 0, policy_version 48721 (0.0033) +[2024-06-18 03:26:36,994][12645] Fps is (10 sec: 40956.8, 60 sec: 42324.7, 300 sec: 42098.4). Total num frames: 798359552. Throughput: 0: 42289.9. Samples: 798480000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 03:26:36,995][12645] Avg episode reward: [(0, '0.051')] +[2024-06-18 03:26:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000048728_798359552.pth... +[2024-06-18 03:26:37,058][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000048111_788250624.pth +[2024-06-18 03:26:37,995][12883] Updated weights for policy 0, policy_version 48731 (0.0026) +[2024-06-18 03:26:41,760][12883] Updated weights for policy 0, policy_version 48741 (0.0033) +[2024-06-18 03:26:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 798572544. Throughput: 0: 42379.6. Samples: 798735300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 03:26:41,994][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 03:26:45,625][12883] Updated weights for policy 0, policy_version 48751 (0.0034) +[2024-06-18 03:26:46,994][12645] Fps is (10 sec: 42601.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 798785536. Throughput: 0: 42235.8. Samples: 798861260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 03:26:46,994][12645] Avg episode reward: [(0, '0.131')] +[2024-06-18 03:26:49,585][12883] Updated weights for policy 0, policy_version 48761 (0.0025) +[2024-06-18 03:26:51,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42598.2, 300 sec: 42098.5). Total num frames: 798998528. Throughput: 0: 42368.2. Samples: 799118280. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 03:26:51,994][12645] Avg episode reward: [(0, '0.102')] +[2024-06-18 03:26:53,179][12883] Updated weights for policy 0, policy_version 48771 (0.0026) +[2024-06-18 03:26:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 799211520. Throughput: 0: 42481.3. Samples: 799368000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 03:26:56,994][12645] Avg episode reward: [(0, '0.130')] +[2024-06-18 03:26:57,341][12883] Updated weights for policy 0, policy_version 48781 (0.0042) +[2024-06-18 03:27:01,014][12883] Updated weights for policy 0, policy_version 48791 (0.0026) +[2024-06-18 03:27:01,994][12645] Fps is (10 sec: 40960.9, 60 sec: 41780.7, 300 sec: 42098.5). Total num frames: 799408128. Throughput: 0: 42187.2. Samples: 799492880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 03:27:01,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 03:27:05,172][12883] Updated weights for policy 0, policy_version 48801 (0.0030) +[2024-06-18 03:27:06,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 799621120. Throughput: 0: 42272.1. Samples: 799744800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 03:27:06,994][12645] Avg episode reward: [(0, '0.101')] +[2024-06-18 03:27:08,774][12883] Updated weights for policy 0, policy_version 48811 (0.0035) +[2024-06-18 03:27:11,996][12645] Fps is (10 sec: 42588.9, 60 sec: 41777.6, 300 sec: 42264.8). Total num frames: 799834112. Throughput: 0: 42221.9. Samples: 799998080. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 03:27:11,996][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 03:27:13,050][12883] Updated weights for policy 0, policy_version 48821 (0.0037) +[2024-06-18 03:27:16,523][12883] Updated weights for policy 0, policy_version 48831 (0.0033) +[2024-06-18 03:27:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 800047104. Throughput: 0: 42076.1. Samples: 800122740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 03:27:17,000][12645] Avg episode reward: [(0, '0.215')] +[2024-06-18 03:27:21,069][12883] Updated weights for policy 0, policy_version 48841 (0.0034) +[2024-06-18 03:27:21,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 800276480. Throughput: 0: 42290.1. Samples: 800383020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 03:27:21,994][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 03:27:24,222][12883] Updated weights for policy 0, policy_version 48851 (0.0033) +[2024-06-18 03:27:26,994][12645] Fps is (10 sec: 40959.0, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 800456704. Throughput: 0: 42186.9. Samples: 800633720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 03:27:26,994][12645] Avg episode reward: [(0, '0.040')] +[2024-06-18 03:27:28,925][12883] Updated weights for policy 0, policy_version 48861 (0.0028) +[2024-06-18 03:27:31,823][12883] Updated weights for policy 0, policy_version 48871 (0.0037) +[2024-06-18 03:27:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 800702464. Throughput: 0: 42175.7. Samples: 800759160. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 03:27:31,994][12645] Avg episode reward: [(0, '0.029')] +[2024-06-18 03:27:36,595][12883] Updated weights for policy 0, policy_version 48881 (0.0033) +[2024-06-18 03:27:36,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42325.9, 300 sec: 42320.7). Total num frames: 800899072. Throughput: 0: 42256.6. Samples: 801019820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 03:27:36,994][12645] Avg episode reward: [(0, '0.040')] +[2024-06-18 03:27:39,610][12883] Updated weights for policy 0, policy_version 48891 (0.0040) +[2024-06-18 03:27:42,000][12645] Fps is (10 sec: 40934.0, 60 sec: 42320.9, 300 sec: 42208.7). Total num frames: 801112064. Throughput: 0: 42056.9. Samples: 801260820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 03:27:42,000][12645] Avg episode reward: [(0, '0.224')] +[2024-06-18 03:27:44,280][12883] Updated weights for policy 0, policy_version 48901 (0.0040) +[2024-06-18 03:27:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 801325056. Throughput: 0: 42165.8. Samples: 801390340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 03:27:46,994][12645] Avg episode reward: [(0, '0.146')] +[2024-06-18 03:27:47,682][12883] Updated weights for policy 0, policy_version 48911 (0.0032) +[2024-06-18 03:27:51,994][12645] Fps is (10 sec: 39346.5, 60 sec: 41779.4, 300 sec: 42154.1). Total num frames: 801505280. Throughput: 0: 42155.1. Samples: 801641780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 03:27:51,994][12645] Avg episode reward: [(0, '0.040')] +[2024-06-18 03:27:52,021][12883] Updated weights for policy 0, policy_version 48921 (0.0027) +[2024-06-18 03:27:55,498][12883] Updated weights for policy 0, policy_version 48931 (0.0033) +[2024-06-18 03:27:56,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 801718272. Throughput: 0: 42110.0. Samples: 801892940. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 03:27:56,994][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 03:27:59,606][12862] Signal inference workers to stop experience collection... (11500 times) +[2024-06-18 03:27:59,627][12883] InferenceWorker_p0-w0: stopping experience collection (11500 times) +[2024-06-18 03:27:59,660][12862] Signal inference workers to resume experience collection... (11500 times) +[2024-06-18 03:27:59,663][12883] InferenceWorker_p0-w0: resuming experience collection (11500 times) +[2024-06-18 03:27:59,666][12883] Updated weights for policy 0, policy_version 48941 (0.0034) +[2024-06-18 03:28:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42099.4). Total num frames: 801931264. Throughput: 0: 42131.1. Samples: 802018640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 03:28:01,994][12645] Avg episode reward: [(0, '0.205')] +[2024-06-18 03:28:03,297][12883] Updated weights for policy 0, policy_version 48951 (0.0035) +[2024-06-18 03:28:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 802160640. Throughput: 0: 42035.6. Samples: 802274620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) +[2024-06-18 03:28:06,994][12645] Avg episode reward: [(0, '0.056')] +[2024-06-18 03:28:07,128][12883] Updated weights for policy 0, policy_version 48961 (0.0040) +[2024-06-18 03:28:11,273][12883] Updated weights for policy 0, policy_version 48971 (0.0039) +[2024-06-18 03:28:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42053.8, 300 sec: 42154.1). Total num frames: 802357248. Throughput: 0: 42123.7. Samples: 802529280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 03:28:11,994][12645] Avg episode reward: [(0, '0.092')] +[2024-06-18 03:28:14,713][12883] Updated weights for policy 0, policy_version 48981 (0.0031) +[2024-06-18 03:28:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 802570240. 
Throughput: 0: 42001.7. Samples: 802649240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 03:28:16,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 03:28:18,924][12883] Updated weights for policy 0, policy_version 48991 (0.0029) +[2024-06-18 03:28:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 802783232. Throughput: 0: 42018.6. Samples: 802910660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 03:28:21,994][12645] Avg episode reward: [(0, '0.190')] +[2024-06-18 03:28:22,374][12883] Updated weights for policy 0, policy_version 49001 (0.0030) +[2024-06-18 03:28:26,741][12883] Updated weights for policy 0, policy_version 49011 (0.0027) +[2024-06-18 03:28:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 802996224. Throughput: 0: 42175.6. Samples: 803158460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 03:28:26,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 03:28:30,379][12883] Updated weights for policy 0, policy_version 49021 (0.0021) +[2024-06-18 03:28:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 803192832. Throughput: 0: 42134.6. Samples: 803286400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 03:28:31,994][12645] Avg episode reward: [(0, '0.135')] +[2024-06-18 03:28:34,418][12883] Updated weights for policy 0, policy_version 49031 (0.0039) +[2024-06-18 03:28:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 803405824. Throughput: 0: 42230.0. Samples: 803542140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 03:28:36,994][12645] Avg episode reward: [(0, '0.066')] +[2024-06-18 03:28:37,039][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049037_803422208.pth... 
+[2024-06-18 03:28:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000048420_793313280.pth +[2024-06-18 03:28:38,163][12883] Updated weights for policy 0, policy_version 49041 (0.0042) +[2024-06-18 03:28:41,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42055.1, 300 sec: 42209.3). Total num frames: 803635200. Throughput: 0: 42147.3. Samples: 803789660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 03:28:41,997][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 03:28:42,333][12883] Updated weights for policy 0, policy_version 49051 (0.0024) +[2024-06-18 03:28:46,183][12883] Updated weights for policy 0, policy_version 49061 (0.0047) +[2024-06-18 03:28:46,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 803831808. Throughput: 0: 42241.0. Samples: 803919480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 03:28:46,994][12645] Avg episode reward: [(0, '0.047')] +[2024-06-18 03:28:50,258][12883] Updated weights for policy 0, policy_version 49071 (0.0022) +[2024-06-18 03:28:51,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 804044800. Throughput: 0: 42284.4. Samples: 804177420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 03:28:51,994][12645] Avg episode reward: [(0, '0.102')] +[2024-06-18 03:28:54,229][12883] Updated weights for policy 0, policy_version 49081 (0.0032) +[2024-06-18 03:28:56,996][12645] Fps is (10 sec: 45864.6, 60 sec: 42869.9, 300 sec: 42264.8). Total num frames: 804290560. Throughput: 0: 42099.7. Samples: 804423860. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 03:28:56,996][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 03:28:58,171][12883] Updated weights for policy 0, policy_version 49091 (0.0038) +[2024-06-18 03:29:01,924][12883] Updated weights for policy 0, policy_version 49101 (0.0044) +[2024-06-18 03:29:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 804470784. Throughput: 0: 42418.3. Samples: 804558060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 03:29:01,994][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 03:29:05,877][12883] Updated weights for policy 0, policy_version 49111 (0.0028) +[2024-06-18 03:29:06,994][12645] Fps is (10 sec: 39330.7, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 804683776. Throughput: 0: 42313.9. Samples: 804814780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 03:29:06,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 03:29:09,456][12883] Updated weights for policy 0, policy_version 49121 (0.0035) +[2024-06-18 03:29:10,302][12862] Signal inference workers to stop experience collection... (11550 times) +[2024-06-18 03:29:10,303][12862] Signal inference workers to resume experience collection... (11550 times) +[2024-06-18 03:29:10,337][12883] InferenceWorker_p0-w0: stopping experience collection (11550 times) +[2024-06-18 03:29:10,337][12883] InferenceWorker_p0-w0: resuming experience collection (11550 times) +[2024-06-18 03:29:11,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42376.7). Total num frames: 804929536. Throughput: 0: 42273.4. Samples: 805060760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) +[2024-06-18 03:29:11,994][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 03:29:13,790][12883] Updated weights for policy 0, policy_version 49131 (0.0050) +[2024-06-18 03:29:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 805109760. 
Throughput: 0: 42385.3. Samples: 805193740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) +[2024-06-18 03:29:16,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 03:29:17,556][12883] Updated weights for policy 0, policy_version 49141 (0.0059) +[2024-06-18 03:29:21,382][12883] Updated weights for policy 0, policy_version 49151 (0.0046) +[2024-06-18 03:29:22,000][12645] Fps is (10 sec: 37658.5, 60 sec: 42047.7, 300 sec: 42042.1). Total num frames: 805306368. Throughput: 0: 42232.6. Samples: 805442880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) +[2024-06-18 03:29:22,000][12645] Avg episode reward: [(0, '0.181')] +[2024-06-18 03:29:25,094][12883] Updated weights for policy 0, policy_version 49161 (0.0034) +[2024-06-18 03:29:26,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 805568512. Throughput: 0: 42250.9. Samples: 805690860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) +[2024-06-18 03:29:26,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 03:29:27,016][12862] Saving new best policy, reward=0.390! +[2024-06-18 03:29:29,143][12883] Updated weights for policy 0, policy_version 49171 (0.0039) +[2024-06-18 03:29:31,994][12645] Fps is (10 sec: 42626.4, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 805732352. Throughput: 0: 42521.8. Samples: 805832960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) +[2024-06-18 03:29:31,994][12645] Avg episode reward: [(0, '0.117')] +[2024-06-18 03:29:32,493][12883] Updated weights for policy 0, policy_version 49181 (0.0030) +[2024-06-18 03:29:36,834][12883] Updated weights for policy 0, policy_version 49191 (0.0043) +[2024-06-18 03:29:36,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 805945344. Throughput: 0: 42246.6. Samples: 806078520. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) +[2024-06-18 03:29:36,994][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 03:29:40,227][12883] Updated weights for policy 0, policy_version 49201 (0.0038) +[2024-06-18 03:29:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42600.0, 300 sec: 42376.2). Total num frames: 806191104. Throughput: 0: 42290.6. Samples: 806326840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 03:29:41,994][12645] Avg episode reward: [(0, '0.118')] +[2024-06-18 03:29:44,562][12883] Updated weights for policy 0, policy_version 49211 (0.0037) +[2024-06-18 03:29:47,000][12645] Fps is (10 sec: 42572.2, 60 sec: 42320.9, 300 sec: 42153.2). Total num frames: 806371328. Throughput: 0: 42393.2. Samples: 806466020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 03:29:47,000][12645] Avg episode reward: [(0, '0.074')] +[2024-06-18 03:29:47,953][12883] Updated weights for policy 0, policy_version 49221 (0.0031) +[2024-06-18 03:29:51,999][12645] Fps is (10 sec: 39300.7, 60 sec: 42321.6, 300 sec: 42264.4). Total num frames: 806584320. Throughput: 0: 42166.1. Samples: 806712480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 03:29:51,999][12645] Avg episode reward: [(0, '0.250')] +[2024-06-18 03:29:52,151][12883] Updated weights for policy 0, policy_version 49231 (0.0022) +[2024-06-18 03:29:55,636][12883] Updated weights for policy 0, policy_version 49241 (0.0044) +[2024-06-18 03:29:56,994][12645] Fps is (10 sec: 45903.7, 60 sec: 42326.9, 300 sec: 42321.0). Total num frames: 806830080. Throughput: 0: 42206.2. Samples: 806960040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 03:29:56,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 03:29:59,571][12883] Updated weights for policy 0, policy_version 49251 (0.0037) +[2024-06-18 03:30:01,994][12645] Fps is (10 sec: 40981.4, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 806993920. Throughput: 0: 42278.1. Samples: 807096260. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 03:30:01,994][12645] Avg episode reward: [(0, '0.117')] +[2024-06-18 03:30:03,163][12883] Updated weights for policy 0, policy_version 49261 (0.0036) +[2024-06-18 03:30:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42376.3). Total num frames: 807239680. Throughput: 0: 42516.4. Samples: 807355840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 03:30:06,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 03:30:07,351][12883] Updated weights for policy 0, policy_version 49271 (0.0028) +[2024-06-18 03:30:11,107][12883] Updated weights for policy 0, policy_version 49281 (0.0027) +[2024-06-18 03:30:11,994][12645] Fps is (10 sec: 49152.3, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 807485440. Throughput: 0: 42490.2. Samples: 807602920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) +[2024-06-18 03:30:11,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 03:30:14,120][12862] Signal inference workers to stop experience collection... (11600 times) +[2024-06-18 03:30:14,154][12883] InferenceWorker_p0-w0: stopping experience collection (11600 times) +[2024-06-18 03:30:14,235][12862] Signal inference workers to resume experience collection... (11600 times) +[2024-06-18 03:30:14,235][12883] InferenceWorker_p0-w0: resuming experience collection (11600 times) +[2024-06-18 03:30:14,842][12883] Updated weights for policy 0, policy_version 49291 (0.0033) +[2024-06-18 03:30:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 807649280. Throughput: 0: 42237.3. Samples: 807733640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) +[2024-06-18 03:30:16,994][12645] Avg episode reward: [(0, '0.105')] +[2024-06-18 03:30:18,905][12883] Updated weights for policy 0, policy_version 49301 (0.0036) +[2024-06-18 03:30:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42876.2, 300 sec: 42376.2). Total num frames: 807878656. 
Throughput: 0: 42342.3. Samples: 807983920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) +[2024-06-18 03:30:21,994][12645] Avg episode reward: [(0, '0.118')] +[2024-06-18 03:30:22,842][12883] Updated weights for policy 0, policy_version 49311 (0.0030) +[2024-06-18 03:30:26,743][12883] Updated weights for policy 0, policy_version 49321 (0.0028) +[2024-06-18 03:30:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 808091648. Throughput: 0: 42568.4. Samples: 808242420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) +[2024-06-18 03:30:26,994][12645] Avg episode reward: [(0, '0.129')] +[2024-06-18 03:30:30,612][12883] Updated weights for policy 0, policy_version 49331 (0.0035) +[2024-06-18 03:30:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 808271872. Throughput: 0: 42296.6. Samples: 808369100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) +[2024-06-18 03:30:31,994][12645] Avg episode reward: [(0, '0.170')] +[2024-06-18 03:30:34,253][12883] Updated weights for policy 0, policy_version 49341 (0.0025) +[2024-06-18 03:30:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 808484864. Throughput: 0: 42421.9. Samples: 808621240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) +[2024-06-18 03:30:36,994][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 03:30:37,173][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049347_808501248.pth... +[2024-06-18 03:30:37,216][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000048728_798359552.pth +[2024-06-18 03:30:38,137][12883] Updated weights for policy 0, policy_version 49351 (0.0033) +[2024-06-18 03:30:41,996][12645] Fps is (10 sec: 44226.5, 60 sec: 42050.7, 300 sec: 42153.8). Total num frames: 808714240. Throughput: 0: 42536.5. Samples: 808874280. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) +[2024-06-18 03:30:41,997][12645] Avg episode reward: [(0, '0.157')] +[2024-06-18 03:30:42,203][12883] Updated weights for policy 0, policy_version 49361 (0.0033) +[2024-06-18 03:30:45,684][12883] Updated weights for policy 0, policy_version 49371 (0.0035) +[2024-06-18 03:30:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42329.8, 300 sec: 42265.2). Total num frames: 808910848. Throughput: 0: 42341.1. Samples: 809001600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:30:46,994][12645] Avg episode reward: [(0, '0.149')] +[2024-06-18 03:30:49,797][12883] Updated weights for policy 0, policy_version 49381 (0.0042) +[2024-06-18 03:30:51,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42602.2, 300 sec: 42265.2). Total num frames: 809140224. Throughput: 0: 42178.7. Samples: 809253880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:30:51,994][12645] Avg episode reward: [(0, '0.074')] +[2024-06-18 03:30:53,376][12883] Updated weights for policy 0, policy_version 49391 (0.0039) +[2024-06-18 03:30:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42154.4). Total num frames: 809336832. Throughput: 0: 42267.2. Samples: 809504940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:30:56,994][12645] Avg episode reward: [(0, '0.278')] +[2024-06-18 03:30:57,509][12883] Updated weights for policy 0, policy_version 49401 (0.0038) +[2024-06-18 03:31:01,293][12883] Updated weights for policy 0, policy_version 49411 (0.0034) +[2024-06-18 03:31:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 809549824. Throughput: 0: 42087.6. Samples: 809627580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:31:01,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 03:31:05,359][12883] Updated weights for policy 0, policy_version 49421 (0.0034) +[2024-06-18 03:31:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 809762816. Throughput: 0: 42110.2. Samples: 809878880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:31:06,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 03:31:09,038][12883] Updated weights for policy 0, policy_version 49431 (0.0039) +[2024-06-18 03:31:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41233.2, 300 sec: 42154.1). Total num frames: 809959424. Throughput: 0: 41957.5. Samples: 810130500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:31:11,994][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 03:31:13,477][12883] Updated weights for policy 0, policy_version 49441 (0.0040) +[2024-06-18 03:31:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 810205184. Throughput: 0: 41938.6. Samples: 810256340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) +[2024-06-18 03:31:16,994][12645] Avg episode reward: [(0, '0.180')] +[2024-06-18 03:31:16,997][12883] Updated weights for policy 0, policy_version 49451 (0.0033) +[2024-06-18 03:31:21,148][12883] Updated weights for policy 0, policy_version 49461 (0.0025) +[2024-06-18 03:31:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 810385408. Throughput: 0: 41991.2. Samples: 810510840. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) +[2024-06-18 03:31:21,994][12645] Avg episode reward: [(0, '0.068')] +[2024-06-18 03:31:25,055][12883] Updated weights for policy 0, policy_version 49471 (0.0036) +[2024-06-18 03:31:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 810598400. Throughput: 0: 41937.1. Samples: 810761360. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) +[2024-06-18 03:31:26,994][12645] Avg episode reward: [(0, '0.092')] +[2024-06-18 03:31:28,828][12883] Updated weights for policy 0, policy_version 49481 (0.0050) +[2024-06-18 03:31:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42209.7). Total num frames: 810811392. Throughput: 0: 41967.4. Samples: 810890140. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) +[2024-06-18 03:31:31,994][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 03:31:32,909][12883] Updated weights for policy 0, policy_version 49491 (0.0039) +[2024-06-18 03:31:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 811008000. Throughput: 0: 41956.0. Samples: 811141900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) +[2024-06-18 03:31:36,994][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 03:31:37,035][12883] Updated weights for policy 0, policy_version 49501 (0.0028) +[2024-06-18 03:31:40,645][12883] Updated weights for policy 0, policy_version 49511 (0.0028) +[2024-06-18 03:31:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41780.8, 300 sec: 42154.1). Total num frames: 811220992. Throughput: 0: 41815.1. Samples: 811386620. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) +[2024-06-18 03:31:41,994][12645] Avg episode reward: [(0, '0.066')] +[2024-06-18 03:31:45,050][12883] Updated weights for policy 0, policy_version 49521 (0.0035) +[2024-06-18 03:31:45,512][12862] Signal inference workers to stop experience collection... (11650 times) +[2024-06-18 03:31:45,562][12862] Signal inference workers to resume experience collection... (11650 times) +[2024-06-18 03:31:45,563][12883] InferenceWorker_p0-w0: stopping experience collection (11650 times) +[2024-06-18 03:31:45,591][12883] InferenceWorker_p0-w0: resuming experience collection (11650 times) +[2024-06-18 03:31:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 811433984. 
Throughput: 0: 42019.2. Samples: 811518440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 03:31:46,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 03:31:48,582][12883] Updated weights for policy 0, policy_version 49531 (0.0034) +[2024-06-18 03:31:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 42098.6). Total num frames: 811630592. Throughput: 0: 41959.1. Samples: 811767040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 03:31:51,994][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 03:31:52,769][12883] Updated weights for policy 0, policy_version 49541 (0.0037) +[2024-06-18 03:31:56,002][12883] Updated weights for policy 0, policy_version 49551 (0.0043) +[2024-06-18 03:31:56,996][12645] Fps is (10 sec: 44226.2, 60 sec: 42323.7, 300 sec: 42264.8). Total num frames: 811876352. Throughput: 0: 41803.5. Samples: 812011760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 03:31:56,996][12645] Avg episode reward: [(0, '0.185')] +[2024-06-18 03:32:00,658][12883] Updated weights for policy 0, policy_version 49561 (0.0039) +[2024-06-18 03:32:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 812056576. Throughput: 0: 42020.9. Samples: 812147280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 03:32:01,994][12645] Avg episode reward: [(0, '0.248')] +[2024-06-18 03:32:03,895][12883] Updated weights for policy 0, policy_version 49571 (0.0029) +[2024-06-18 03:32:06,994][12645] Fps is (10 sec: 39331.1, 60 sec: 41779.3, 300 sec: 42154.4). Total num frames: 812269568. Throughput: 0: 41937.8. Samples: 812398040. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 03:32:06,994][12645] Avg episode reward: [(0, '0.200')] +[2024-06-18 03:32:08,393][12883] Updated weights for policy 0, policy_version 49581 (0.0036) +[2024-06-18 03:32:11,615][12883] Updated weights for policy 0, policy_version 49591 (0.0046) +[2024-06-18 03:32:11,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 812531712. Throughput: 0: 41884.5. Samples: 812646160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 03:32:11,994][12645] Avg episode reward: [(0, '0.150')] +[2024-06-18 03:32:16,062][12883] Updated weights for policy 0, policy_version 49601 (0.0046) +[2024-06-18 03:32:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 812695552. Throughput: 0: 41933.7. Samples: 812777160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 03:32:16,994][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 03:32:19,258][12883] Updated weights for policy 0, policy_version 49611 (0.0047) +[2024-06-18 03:32:21,994][12645] Fps is (10 sec: 37682.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 812908544. Throughput: 0: 41808.8. Samples: 813023300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 03:32:21,994][12645] Avg episode reward: [(0, '0.074')] +[2024-06-18 03:32:23,690][12883] Updated weights for policy 0, policy_version 49621 (0.0038) +[2024-06-18 03:32:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 813137920. Throughput: 0: 41978.1. Samples: 813275640. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 03:32:26,994][12645] Avg episode reward: [(0, '0.058')] +[2024-06-18 03:32:27,407][12883] Updated weights for policy 0, policy_version 49631 (0.0024) +[2024-06-18 03:32:31,685][12883] Updated weights for policy 0, policy_version 49641 (0.0042) +[2024-06-18 03:32:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 813318144. Throughput: 0: 41910.6. Samples: 813404420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 03:32:31,994][12645] Avg episode reward: [(0, '0.031')] +[2024-06-18 03:32:35,099][12883] Updated weights for policy 0, policy_version 49651 (0.0050) +[2024-06-18 03:32:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42155.0). Total num frames: 813547520. Throughput: 0: 41842.1. Samples: 813649940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 03:32:36,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 03:32:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049655_813547520.pth... +[2024-06-18 03:32:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049037_803422208.pth +[2024-06-18 03:32:39,440][12883] Updated weights for policy 0, policy_version 49661 (0.0029) +[2024-06-18 03:32:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 813744128. Throughput: 0: 42099.9. Samples: 813906160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 03:32:41,995][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 03:32:42,967][12883] Updated weights for policy 0, policy_version 49671 (0.0029) +[2024-06-18 03:32:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.0, 300 sec: 42154.0). Total num frames: 813940736. Throughput: 0: 41918.5. Samples: 814033620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 03:32:46,994][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 03:32:47,394][12883] Updated weights for policy 0, policy_version 49681 (0.0037) +[2024-06-18 03:32:50,742][12883] Updated weights for policy 0, policy_version 49691 (0.0030) +[2024-06-18 03:32:51,998][12645] Fps is (10 sec: 42578.6, 60 sec: 42322.0, 300 sec: 42209.0). Total num frames: 814170112. Throughput: 0: 41922.7. Samples: 814284760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 03:32:51,999][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 03:32:55,139][12883] Updated weights for policy 0, policy_version 49701 (0.0050) +[2024-06-18 03:32:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41507.7, 300 sec: 42154.1). Total num frames: 814366720. Throughput: 0: 42069.2. Samples: 814539280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 03:32:56,994][12645] Avg episode reward: [(0, '0.111')] +[2024-06-18 03:32:58,521][12883] Updated weights for policy 0, policy_version 49711 (0.0047) +[2024-06-18 03:33:01,994][12645] Fps is (10 sec: 39340.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 814563328. Throughput: 0: 41809.4. Samples: 814658580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 03:33:01,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 03:33:03,178][12883] Updated weights for policy 0, policy_version 49721 (0.0030) +[2024-06-18 03:33:06,431][12883] Updated weights for policy 0, policy_version 49731 (0.0037) +[2024-06-18 03:33:06,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 814825472. Throughput: 0: 41986.8. Samples: 814912700. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 03:33:06,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 03:33:10,925][12883] Updated weights for policy 0, policy_version 49741 (0.0044) +[2024-06-18 03:33:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40960.0, 300 sec: 42098.6). Total num frames: 814989312. Throughput: 0: 41823.3. Samples: 815157680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 03:33:11,994][12645] Avg episode reward: [(0, '0.140')] +[2024-06-18 03:33:14,448][12883] Updated weights for policy 0, policy_version 49751 (0.0053) +[2024-06-18 03:33:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 815202304. Throughput: 0: 41689.4. Samples: 815280440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 03:33:16,994][12645] Avg episode reward: [(0, '0.086')] +[2024-06-18 03:33:18,533][12883] Updated weights for policy 0, policy_version 49761 (0.0038) +[2024-06-18 03:33:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 815415296. Throughput: 0: 41949.6. Samples: 815537660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 03:33:21,994][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 03:33:22,328][12883] Updated weights for policy 0, policy_version 49771 (0.0034) +[2024-06-18 03:33:22,519][12862] Signal inference workers to stop experience collection... (11700 times) +[2024-06-18 03:33:22,519][12862] Signal inference workers to resume experience collection... (11700 times) +[2024-06-18 03:33:22,557][12883] InferenceWorker_p0-w0: stopping experience collection (11700 times) +[2024-06-18 03:33:22,557][12883] InferenceWorker_p0-w0: resuming experience collection (11700 times) +[2024-06-18 03:33:26,197][12883] Updated weights for policy 0, policy_version 49781 (0.0039) +[2024-06-18 03:33:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 815628288. 
Throughput: 0: 41800.9. Samples: 815787200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 03:33:26,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 03:33:30,147][12883] Updated weights for policy 0, policy_version 49791 (0.0045) +[2024-06-18 03:33:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 815841280. Throughput: 0: 41719.8. Samples: 815911000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 03:33:31,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 03:33:34,441][12883] Updated weights for policy 0, policy_version 49801 (0.0043) +[2024-06-18 03:33:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 42043.3). Total num frames: 816037888. Throughput: 0: 41669.7. Samples: 816159700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 03:33:36,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 03:33:37,919][12883] Updated weights for policy 0, policy_version 49811 (0.0026) +[2024-06-18 03:33:41,995][12645] Fps is (10 sec: 40955.6, 60 sec: 41778.5, 300 sec: 42098.4). Total num frames: 816250880. Throughput: 0: 41663.1. Samples: 816414160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 03:33:41,995][12645] Avg episode reward: [(0, '0.111')] +[2024-06-18 03:33:42,151][12883] Updated weights for policy 0, policy_version 49821 (0.0032) +[2024-06-18 03:33:45,689][12883] Updated weights for policy 0, policy_version 49831 (0.0030) +[2024-06-18 03:33:46,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42050.9, 300 sec: 42098.2). Total num frames: 816463872. Throughput: 0: 41813.5. Samples: 816540280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 03:33:46,996][12645] Avg episode reward: [(0, '0.027')] +[2024-06-18 03:33:50,055][12883] Updated weights for policy 0, policy_version 49841 (0.0045) +[2024-06-18 03:33:51,996][12645] Fps is (10 sec: 42594.7, 60 sec: 41781.1, 300 sec: 41987.5). Total num frames: 816676864. 
Throughput: 0: 41750.7. Samples: 816791560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 03:33:51,996][12645] Avg episode reward: [(0, '0.060')] +[2024-06-18 03:33:53,743][12883] Updated weights for policy 0, policy_version 49851 (0.0035) +[2024-06-18 03:33:56,996][12645] Fps is (10 sec: 42598.4, 60 sec: 42050.8, 300 sec: 42098.2). Total num frames: 816889856. Throughput: 0: 41844.6. Samples: 817040780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 03:33:56,996][12645] Avg episode reward: [(0, '0.120')] +[2024-06-18 03:33:57,769][12883] Updated weights for policy 0, policy_version 49861 (0.0031) +[2024-06-18 03:34:01,516][12883] Updated weights for policy 0, policy_version 49871 (0.0037) +[2024-06-18 03:34:01,994][12645] Fps is (10 sec: 40968.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 817086464. Throughput: 0: 41972.4. Samples: 817169200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 03:34:01,994][12645] Avg episode reward: [(0, '0.041')] +[2024-06-18 03:34:05,470][12883] Updated weights for policy 0, policy_version 49881 (0.0029) +[2024-06-18 03:34:06,994][12645] Fps is (10 sec: 42607.6, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 817315840. Throughput: 0: 41827.8. Samples: 817419920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 03:34:06,994][12645] Avg episode reward: [(0, '0.182')] +[2024-06-18 03:34:09,787][12883] Updated weights for policy 0, policy_version 49891 (0.0039) +[2024-06-18 03:34:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 817512448. Throughput: 0: 41825.4. Samples: 817669340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 03:34:11,994][12645] Avg episode reward: [(0, '0.109')] +[2024-06-18 03:34:13,192][12883] Updated weights for policy 0, policy_version 49901 (0.0035) +[2024-06-18 03:34:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42099.5). Total num frames: 817725440. 
Throughput: 0: 41827.6. Samples: 817793240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 03:34:16,994][12645] Avg episode reward: [(0, '0.075')] +[2024-06-18 03:34:17,544][12883] Updated weights for policy 0, policy_version 49911 (0.0047) +[2024-06-18 03:34:20,870][12883] Updated weights for policy 0, policy_version 49921 (0.0045) +[2024-06-18 03:34:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 817938432. Throughput: 0: 42018.6. Samples: 818050540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 03:34:21,994][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 03:34:25,179][12883] Updated weights for policy 0, policy_version 49931 (0.0030) +[2024-06-18 03:34:26,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 818151424. Throughput: 0: 42022.6. Samples: 818305140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 03:34:26,994][12645] Avg episode reward: [(0, '0.050')] +[2024-06-18 03:34:28,626][12883] Updated weights for policy 0, policy_version 49941 (0.0039) +[2024-06-18 03:34:32,000][12645] Fps is (10 sec: 42572.1, 60 sec: 42047.9, 300 sec: 42097.7). Total num frames: 818364416. Throughput: 0: 41982.5. Samples: 818429660. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 03:34:32,000][12645] Avg episode reward: [(0, '0.056')] +[2024-06-18 03:34:32,727][12883] Updated weights for policy 0, policy_version 49951 (0.0038) +[2024-06-18 03:34:36,713][12883] Updated weights for policy 0, policy_version 49961 (0.0029) +[2024-06-18 03:34:36,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 818561024. Throughput: 0: 42173.4. Samples: 818689280. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 03:34:36,994][12645] Avg episode reward: [(0, '0.268')] +[2024-06-18 03:34:37,078][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049962_818577408.pth... 
+[2024-06-18 03:34:37,142][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049347_808501248.pth +[2024-06-18 03:34:40,410][12883] Updated weights for policy 0, policy_version 49971 (0.0030) +[2024-06-18 03:34:41,994][12645] Fps is (10 sec: 37707.0, 60 sec: 41506.9, 300 sec: 41932.8). Total num frames: 818741248. Throughput: 0: 42147.0. Samples: 818937300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 03:34:41,994][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 03:34:44,539][12883] Updated weights for policy 0, policy_version 49981 (0.0032) +[2024-06-18 03:34:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42326.9, 300 sec: 42099.3). Total num frames: 819003392. Throughput: 0: 41882.2. Samples: 819053900. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 03:34:46,994][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 03:34:48,355][12883] Updated weights for policy 0, policy_version 49991 (0.0034) +[2024-06-18 03:34:51,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42053.5, 300 sec: 41931.9). Total num frames: 819200000. Throughput: 0: 42144.4. Samples: 819316420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 03:34:51,994][12645] Avg episode reward: [(0, '0.179')] +[2024-06-18 03:34:52,234][12883] Updated weights for policy 0, policy_version 50001 (0.0034) +[2024-06-18 03:34:52,472][12862] Signal inference workers to stop experience collection... (11750 times) +[2024-06-18 03:34:52,472][12862] Signal inference workers to resume experience collection... (11750 times) +[2024-06-18 03:34:52,495][12883] InferenceWorker_p0-w0: stopping experience collection (11750 times) +[2024-06-18 03:34:52,496][12883] InferenceWorker_p0-w0: resuming experience collection (11750 times) +[2024-06-18 03:34:55,898][12883] Updated weights for policy 0, policy_version 50011 (0.0032) +[2024-06-18 03:34:56,996][12645] Fps is (10 sec: 37674.7, 60 sec: 41506.1, 300 sec: 41987.2). Total num frames: 819380224. 
Throughput: 0: 42196.9. Samples: 819568300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 03:34:56,996][12645] Avg episode reward: [(0, '0.183')] +[2024-06-18 03:34:59,972][12883] Updated weights for policy 0, policy_version 50021 (0.0025) +[2024-06-18 03:35:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 819625984. Throughput: 0: 42261.2. Samples: 819695000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 03:35:01,994][12645] Avg episode reward: [(0, '0.236')] +[2024-06-18 03:35:03,662][12883] Updated weights for policy 0, policy_version 50031 (0.0032) +[2024-06-18 03:35:06,996][12645] Fps is (10 sec: 44236.9, 60 sec: 41777.7, 300 sec: 41820.5). Total num frames: 819822592. Throughput: 0: 42237.9. Samples: 819951340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 03:35:06,997][12645] Avg episode reward: [(0, '0.125')] +[2024-06-18 03:35:07,617][12883] Updated weights for policy 0, policy_version 50041 (0.0033) +[2024-06-18 03:35:11,507][12883] Updated weights for policy 0, policy_version 50051 (0.0035) +[2024-06-18 03:35:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 820035584. Throughput: 0: 42087.7. Samples: 820199080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 03:35:11,994][12645] Avg episode reward: [(0, '0.149')] +[2024-06-18 03:35:15,229][12883] Updated weights for policy 0, policy_version 50061 (0.0047) +[2024-06-18 03:35:16,994][12645] Fps is (10 sec: 44246.3, 60 sec: 42325.2, 300 sec: 41987.4). Total num frames: 820264960. Throughput: 0: 42220.8. Samples: 820329340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 03:35:16,994][12645] Avg episode reward: [(0, '0.051')] +[2024-06-18 03:35:19,590][12883] Updated weights for policy 0, policy_version 50071 (0.0033) +[2024-06-18 03:35:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 820445184. 
Throughput: 0: 42111.1. Samples: 820584280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) +[2024-06-18 03:35:21,994][12645] Avg episode reward: [(0, '0.110')] +[2024-06-18 03:35:22,975][12883] Updated weights for policy 0, policy_version 50081 (0.0021) +[2024-06-18 03:35:26,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 820674560. Throughput: 0: 42128.0. Samples: 820833060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 03:35:26,994][12645] Avg episode reward: [(0, '0.154')] +[2024-06-18 03:35:27,355][12883] Updated weights for policy 0, policy_version 50091 (0.0038) +[2024-06-18 03:35:30,632][12883] Updated weights for policy 0, policy_version 50101 (0.0037) +[2024-06-18 03:35:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42056.6, 300 sec: 42043.0). Total num frames: 820887552. Throughput: 0: 42306.7. Samples: 820957700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 03:35:31,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 03:35:35,297][12883] Updated weights for policy 0, policy_version 50111 (0.0028) +[2024-06-18 03:35:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 41987.8). Total num frames: 821100544. Throughput: 0: 42187.2. Samples: 821214840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 03:35:36,994][12645] Avg episode reward: [(0, '0.115')] +[2024-06-18 03:35:38,442][12883] Updated weights for policy 0, policy_version 50121 (0.0035) +[2024-06-18 03:35:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 41987.4). Total num frames: 821297152. Throughput: 0: 42168.7. Samples: 821465800. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 03:35:41,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 03:35:42,940][12883] Updated weights for policy 0, policy_version 50131 (0.0044) +[2024-06-18 03:35:46,127][12883] Updated weights for policy 0, policy_version 50141 (0.0033) +[2024-06-18 03:35:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 821526528. Throughput: 0: 42095.7. Samples: 821589300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 03:35:46,994][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 03:35:51,073][12883] Updated weights for policy 0, policy_version 50151 (0.0028) +[2024-06-18 03:35:51,993][12645] Fps is (10 sec: 42599.5, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 821723136. Throughput: 0: 42048.0. Samples: 821843400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 03:35:51,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 03:35:53,842][12883] Updated weights for policy 0, policy_version 50161 (0.0036) +[2024-06-18 03:35:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42327.0, 300 sec: 41931.9). Total num frames: 821919744. Throughput: 0: 42230.7. Samples: 822099460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) +[2024-06-18 03:35:56,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 03:35:58,867][12883] Updated weights for policy 0, policy_version 50171 (0.0029) +[2024-06-18 03:36:01,985][12883] Updated weights for policy 0, policy_version 50181 (0.0033) +[2024-06-18 03:36:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 822165504. Throughput: 0: 41989.5. Samples: 822218860. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) +[2024-06-18 03:36:01,994][12645] Avg episode reward: [(0, '0.150')] +[2024-06-18 03:36:06,579][12883] Updated weights for policy 0, policy_version 50191 (0.0027) +[2024-06-18 03:36:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41780.9, 300 sec: 41931.9). Total num frames: 822329344. Throughput: 0: 41954.3. Samples: 822472220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) +[2024-06-18 03:36:06,994][12645] Avg episode reward: [(0, '0.250')] +[2024-06-18 03:36:09,866][12883] Updated weights for policy 0, policy_version 50201 (0.0029) +[2024-06-18 03:36:11,999][12645] Fps is (10 sec: 39298.7, 60 sec: 42048.2, 300 sec: 41875.6). Total num frames: 822558720. Throughput: 0: 41949.6. Samples: 822721040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) +[2024-06-18 03:36:12,000][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 03:36:14,261][12883] Updated weights for policy 0, policy_version 50211 (0.0038) +[2024-06-18 03:36:16,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 822788096. Throughput: 0: 42039.2. Samples: 822849460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) +[2024-06-18 03:36:16,994][12645] Avg episode reward: [(0, '0.171')] +[2024-06-18 03:36:17,629][12883] Updated weights for policy 0, policy_version 50221 (0.0034) +[2024-06-18 03:36:21,994][12645] Fps is (10 sec: 40984.0, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 822968320. Throughput: 0: 41996.0. Samples: 823104660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) +[2024-06-18 03:36:21,994][12645] Avg episode reward: [(0, '0.171')] +[2024-06-18 03:36:22,301][12883] Updated weights for policy 0, policy_version 50231 (0.0029) +[2024-06-18 03:36:25,832][12883] Updated weights for policy 0, policy_version 50241 (0.0029) +[2024-06-18 03:36:26,994][12645] Fps is (10 sec: 42596.6, 60 sec: 42325.1, 300 sec: 42043.0). Total num frames: 823214080. Throughput: 0: 41820.3. Samples: 823347720. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 03:36:26,995][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 03:36:29,845][12883] Updated weights for policy 0, policy_version 50251 (0.0040) +[2024-06-18 03:36:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 823394304. Throughput: 0: 42119.5. Samples: 823484680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 03:36:31,994][12645] Avg episode reward: [(0, '0.133')] +[2024-06-18 03:36:33,543][12883] Updated weights for policy 0, policy_version 50261 (0.0039) +[2024-06-18 03:36:34,660][12862] Signal inference workers to stop experience collection... (11800 times) +[2024-06-18 03:36:34,661][12862] Signal inference workers to resume experience collection... (11800 times) +[2024-06-18 03:36:34,700][12883] InferenceWorker_p0-w0: stopping experience collection (11800 times) +[2024-06-18 03:36:34,700][12883] InferenceWorker_p0-w0: resuming experience collection (11800 times) +[2024-06-18 03:36:36,994][12645] Fps is (10 sec: 37684.5, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 823590912. Throughput: 0: 41962.1. Samples: 823731700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 03:36:36,994][12645] Avg episode reward: [(0, '0.057')] +[2024-06-18 03:36:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000050268_823590912.pth... +[2024-06-18 03:36:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049655_813547520.pth +[2024-06-18 03:36:37,680][12883] Updated weights for policy 0, policy_version 50271 (0.0032) +[2024-06-18 03:36:41,029][12883] Updated weights for policy 0, policy_version 50281 (0.0037) +[2024-06-18 03:36:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 823836672. Throughput: 0: 41740.3. Samples: 823977780. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 03:36:41,996][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 03:36:45,386][12883] Updated weights for policy 0, policy_version 50291 (0.0028) +[2024-06-18 03:36:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 824033280. Throughput: 0: 42083.6. Samples: 824112620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 03:36:46,994][12645] Avg episode reward: [(0, '0.090')] +[2024-06-18 03:36:48,803][12883] Updated weights for policy 0, policy_version 50301 (0.0039) +[2024-06-18 03:36:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 41932.3). Total num frames: 824246272. Throughput: 0: 41941.7. Samples: 824359600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 03:36:51,994][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 03:36:53,352][12883] Updated weights for policy 0, policy_version 50311 (0.0045) +[2024-06-18 03:36:56,468][12883] Updated weights for policy 0, policy_version 50321 (0.0034) +[2024-06-18 03:36:56,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 824475648. Throughput: 0: 42022.3. Samples: 824611800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 03:36:56,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 03:37:01,093][12883] Updated weights for policy 0, policy_version 50331 (0.0035) +[2024-06-18 03:37:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 824655872. Throughput: 0: 41996.3. Samples: 824739300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 03:37:01,994][12645] Avg episode reward: [(0, '0.135')] +[2024-06-18 03:37:04,336][12883] Updated weights for policy 0, policy_version 50341 (0.0034) +[2024-06-18 03:37:06,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42596.8, 300 sec: 41876.1). Total num frames: 824885248. Throughput: 0: 41893.9. Samples: 824989980. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 03:37:06,996][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 03:37:08,816][12883] Updated weights for policy 0, policy_version 50351 (0.0030) +[2024-06-18 03:37:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42056.3, 300 sec: 41987.5). Total num frames: 825081856. Throughput: 0: 42094.9. Samples: 825241980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 03:37:11,995][12645] Avg episode reward: [(0, '0.047')] +[2024-06-18 03:37:12,217][12883] Updated weights for policy 0, policy_version 50361 (0.0033) +[2024-06-18 03:37:16,603][12883] Updated weights for policy 0, policy_version 50371 (0.0039) +[2024-06-18 03:37:16,994][12645] Fps is (10 sec: 40968.8, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 825294848. Throughput: 0: 41783.1. Samples: 825364920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 03:37:16,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 03:37:20,429][12883] Updated weights for policy 0, policy_version 50381 (0.0041) +[2024-06-18 03:37:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 825524224. Throughput: 0: 41971.9. Samples: 825620440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 03:37:21,994][12645] Avg episode reward: [(0, '0.106')] +[2024-06-18 03:37:24,374][12883] Updated weights for policy 0, policy_version 50391 (0.0030) +[2024-06-18 03:37:26,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41506.4, 300 sec: 41987.5). Total num frames: 825704448. Throughput: 0: 42217.9. Samples: 825877580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 03:37:26,994][12645] Avg episode reward: [(0, '0.130')] +[2024-06-18 03:37:28,097][12883] Updated weights for policy 0, policy_version 50401 (0.0033) +[2024-06-18 03:37:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 825917440. Throughput: 0: 41820.3. Samples: 825994540. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) +[2024-06-18 03:37:32,003][12645] Avg episode reward: [(0, '0.159')] +[2024-06-18 03:37:32,185][12883] Updated weights for policy 0, policy_version 50411 (0.0036) +[2024-06-18 03:37:36,116][12883] Updated weights for policy 0, policy_version 50421 (0.0034) +[2024-06-18 03:37:36,996][12645] Fps is (10 sec: 45864.5, 60 sec: 42869.8, 300 sec: 42098.2). Total num frames: 826163200. Throughput: 0: 42146.3. Samples: 826256280. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) +[2024-06-18 03:37:36,997][12645] Avg episode reward: [(0, '0.115')] +[2024-06-18 03:37:39,899][12883] Updated weights for policy 0, policy_version 50431 (0.0034) +[2024-06-18 03:37:41,994][12645] Fps is (10 sec: 42599.3, 60 sec: 41779.3, 300 sec: 42043.1). Total num frames: 826343424. Throughput: 0: 42181.0. Samples: 826509940. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) +[2024-06-18 03:37:41,994][12645] Avg episode reward: [(0, '0.072')] +[2024-06-18 03:37:43,751][12883] Updated weights for policy 0, policy_version 50441 (0.0033) +[2024-06-18 03:37:46,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.2, 300 sec: 42043.7). Total num frames: 826572800. Throughput: 0: 42065.3. Samples: 826632240. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) +[2024-06-18 03:37:46,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 03:37:47,491][12883] Updated weights for policy 0, policy_version 50451 (0.0031) +[2024-06-18 03:37:51,495][12883] Updated weights for policy 0, policy_version 50461 (0.0039) +[2024-06-18 03:37:51,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 826769408. Throughput: 0: 42195.8. Samples: 826888700. 
Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) +[2024-06-18 03:37:51,994][12645] Avg episode reward: [(0, '0.037')] +[2024-06-18 03:37:55,723][12883] Updated weights for policy 0, policy_version 50471 (0.0044) +[2024-06-18 03:37:56,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 826949632. Throughput: 0: 42082.7. Samples: 827135700. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) +[2024-06-18 03:37:56,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 03:37:59,497][12883] Updated weights for policy 0, policy_version 50481 (0.0031) +[2024-06-18 03:38:01,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 827211776. Throughput: 0: 42193.3. Samples: 827263620. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) +[2024-06-18 03:38:01,994][12645] Avg episode reward: [(0, '0.133')] +[2024-06-18 03:38:03,308][12883] Updated weights for policy 0, policy_version 50491 (0.0030) +[2024-06-18 03:38:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41507.7, 300 sec: 41987.5). Total num frames: 827375616. Throughput: 0: 42295.6. Samples: 827523740. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) +[2024-06-18 03:38:06,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 03:38:07,097][12862] Signal inference workers to stop experience collection... (11850 times) +[2024-06-18 03:38:07,097][12862] Signal inference workers to resume experience collection... (11850 times) +[2024-06-18 03:38:07,119][12883] InferenceWorker_p0-w0: stopping experience collection (11850 times) +[2024-06-18 03:38:07,119][12883] InferenceWorker_p0-w0: resuming experience collection (11850 times) +[2024-06-18 03:38:07,278][12883] Updated weights for policy 0, policy_version 50501 (0.0035) +[2024-06-18 03:38:11,176][12883] Updated weights for policy 0, policy_version 50511 (0.0046) +[2024-06-18 03:38:11,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 827604992. 
Throughput: 0: 42076.9. Samples: 827771040. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) +[2024-06-18 03:38:11,994][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 03:38:14,953][12883] Updated weights for policy 0, policy_version 50521 (0.0041) +[2024-06-18 03:38:16,994][12645] Fps is (10 sec: 49152.1, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 827867136. Throughput: 0: 42437.4. Samples: 827904220. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) +[2024-06-18 03:38:16,994][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 03:38:18,608][12883] Updated weights for policy 0, policy_version 50531 (0.0039) +[2024-06-18 03:38:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 828014592. Throughput: 0: 42319.0. Samples: 828160540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) +[2024-06-18 03:38:21,994][12645] Avg episode reward: [(0, '0.061')] +[2024-06-18 03:38:22,797][12883] Updated weights for policy 0, policy_version 50541 (0.0038) +[2024-06-18 03:38:26,080][12883] Updated weights for policy 0, policy_version 50551 (0.0031) +[2024-06-18 03:38:26,994][12645] Fps is (10 sec: 37682.8, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 828243968. Throughput: 0: 42137.1. Samples: 828406120. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) +[2024-06-18 03:38:26,994][12645] Avg episode reward: [(0, '0.164')] +[2024-06-18 03:38:30,546][12883] Updated weights for policy 0, policy_version 50561 (0.0026) +[2024-06-18 03:38:31,994][12645] Fps is (10 sec: 47514.0, 60 sec: 42871.6, 300 sec: 42209.6). Total num frames: 828489728. Throughput: 0: 42449.9. Samples: 828542480. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) +[2024-06-18 03:38:31,994][12645] Avg episode reward: [(0, '0.129')] +[2024-06-18 03:38:33,720][12883] Updated weights for policy 0, policy_version 50571 (0.0037) +[2024-06-18 03:38:36,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40961.6, 300 sec: 41932.1). Total num frames: 828620800. 
Throughput: 0: 42329.9. Samples: 828793540. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) +[2024-06-18 03:38:36,994][12645] Avg episode reward: [(0, '0.187')] +[2024-06-18 03:38:37,046][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000050576_828637184.pth... +[2024-06-18 03:38:37,129][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049962_818577408.pth +[2024-06-18 03:38:38,282][12883] Updated weights for policy 0, policy_version 50581 (0.0025) +[2024-06-18 03:38:41,758][12883] Updated weights for policy 0, policy_version 50591 (0.0034) +[2024-06-18 03:38:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42098.9). Total num frames: 828882944. Throughput: 0: 42297.3. Samples: 829039080. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) +[2024-06-18 03:38:41,994][12645] Avg episode reward: [(0, '0.092')] +[2024-06-18 03:38:45,997][12883] Updated weights for policy 0, policy_version 50601 (0.0030) +[2024-06-18 03:38:46,994][12645] Fps is (10 sec: 49151.6, 60 sec: 42325.4, 300 sec: 42154.4). Total num frames: 829112320. Throughput: 0: 42571.2. Samples: 829179320. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) +[2024-06-18 03:38:46,994][12645] Avg episode reward: [(0, '0.091')] +[2024-06-18 03:38:49,092][12883] Updated weights for policy 0, policy_version 50611 (0.0036) +[2024-06-18 03:38:51,996][12645] Fps is (10 sec: 37675.2, 60 sec: 41504.7, 300 sec: 41931.9). Total num frames: 829259776. Throughput: 0: 42269.6. Samples: 829425960. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) +[2024-06-18 03:38:51,996][12645] Avg episode reward: [(0, '0.172')] +[2024-06-18 03:38:53,614][12883] Updated weights for policy 0, policy_version 50621 (0.0047) +[2024-06-18 03:38:56,794][12883] Updated weights for policy 0, policy_version 50631 (0.0035) +[2024-06-18 03:38:56,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42209.6). Total num frames: 829538304. Throughput: 0: 42277.6. 
Samples: 829673540. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) +[2024-06-18 03:38:57,000][12645] Avg episode reward: [(0, '0.057')] +[2024-06-18 03:39:01,377][12883] Updated weights for policy 0, policy_version 50641 (0.0033) +[2024-06-18 03:39:01,994][12645] Fps is (10 sec: 47523.3, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 829734912. Throughput: 0: 42443.9. Samples: 829814200. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) +[2024-06-18 03:39:01,994][12645] Avg episode reward: [(0, '0.063')] +[2024-06-18 03:39:02,304][12862] Signal inference workers to stop experience collection... (11900 times) +[2024-06-18 03:39:02,304][12862] Signal inference workers to resume experience collection... (11900 times) +[2024-06-18 03:39:02,340][12883] InferenceWorker_p0-w0: stopping experience collection (11900 times) +[2024-06-18 03:39:02,341][12883] InferenceWorker_p0-w0: resuming experience collection (11900 times) +[2024-06-18 03:39:04,311][12883] Updated weights for policy 0, policy_version 50651 (0.0031) +[2024-06-18 03:39:06,994][12645] Fps is (10 sec: 37683.9, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 829915136. Throughput: 0: 42226.2. Samples: 830060720. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) +[2024-06-18 03:39:06,994][12645] Avg episode reward: [(0, '0.102')] +[2024-06-18 03:39:09,528][12883] Updated weights for policy 0, policy_version 50661 (0.0029) +[2024-06-18 03:39:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 830177280. Throughput: 0: 42137.4. Samples: 830302300. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) +[2024-06-18 03:39:11,994][12645] Avg episode reward: [(0, '0.090')] +[2024-06-18 03:39:12,223][12883] Updated weights for policy 0, policy_version 50671 (0.0032) +[2024-06-18 03:39:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 830341120. Throughput: 0: 42082.2. Samples: 830436180. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) +[2024-06-18 03:39:16,994][12645] Avg episode reward: [(0, '0.084')] +[2024-06-18 03:39:17,256][12883] Updated weights for policy 0, policy_version 50681 (0.0033) +[2024-06-18 03:39:20,048][12883] Updated weights for policy 0, policy_version 50691 (0.0041) +[2024-06-18 03:39:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 830570496. Throughput: 0: 42066.1. Samples: 830686520. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) +[2024-06-18 03:39:21,994][12645] Avg episode reward: [(0, '0.165')] +[2024-06-18 03:39:25,053][12883] Updated weights for policy 0, policy_version 50701 (0.0037) +[2024-06-18 03:39:26,994][12645] Fps is (10 sec: 49151.5, 60 sec: 43144.5, 300 sec: 42266.0). Total num frames: 830832640. Throughput: 0: 42137.7. Samples: 830935280. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) +[2024-06-18 03:39:26,994][12645] Avg episode reward: [(0, '0.106')] +[2024-06-18 03:39:27,966][12883] Updated weights for policy 0, policy_version 50711 (0.0035) +[2024-06-18 03:39:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41232.9, 300 sec: 42043.0). Total num frames: 830963712. Throughput: 0: 41999.4. Samples: 831069300. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) +[2024-06-18 03:39:31,994][12645] Avg episode reward: [(0, '0.108')] +[2024-06-18 03:39:32,671][12883] Updated weights for policy 0, policy_version 50721 (0.0041) +[2024-06-18 03:39:35,622][12883] Updated weights for policy 0, policy_version 50731 (0.0035) +[2024-06-18 03:39:36,994][12645] Fps is (10 sec: 37683.2, 60 sec: 43144.4, 300 sec: 42265.1). Total num frames: 831209472. Throughput: 0: 42142.8. Samples: 831322300. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) +[2024-06-18 03:39:36,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 03:39:40,385][12883] Updated weights for policy 0, policy_version 50741 (0.0047) +[2024-06-18 03:39:41,994][12645] Fps is (10 sec: 47514.8, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 831438848. Throughput: 0: 42441.1. Samples: 831583380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) +[2024-06-18 03:39:41,994][12645] Avg episode reward: [(0, '0.155')] +[2024-06-18 03:39:43,422][12883] Updated weights for policy 0, policy_version 50751 (0.0028) +[2024-06-18 03:39:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 831602688. Throughput: 0: 41983.1. Samples: 831703440. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) +[2024-06-18 03:39:46,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 03:39:48,271][12883] Updated weights for policy 0, policy_version 50761 (0.0038) +[2024-06-18 03:39:51,355][12883] Updated weights for policy 0, policy_version 50771 (0.0033) +[2024-06-18 03:39:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43146.1, 300 sec: 42265.5). Total num frames: 831848448. Throughput: 0: 41924.1. Samples: 831947300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) +[2024-06-18 03:39:51,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 03:39:56,263][12883] Updated weights for policy 0, policy_version 50781 (0.0038) +[2024-06-18 03:39:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 832028672. Throughput: 0: 42451.9. Samples: 832212640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) +[2024-06-18 03:39:56,994][12645] Avg episode reward: [(0, '0.044')] +[2024-06-18 03:39:59,115][12883] Updated weights for policy 0, policy_version 50791 (0.0031) +[2024-06-18 03:40:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42154.4). Total num frames: 832258048. Throughput: 0: 42046.7. Samples: 832328280. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) +[2024-06-18 03:40:01,994][12645] Avg episode reward: [(0, '0.085')] +[2024-06-18 03:40:04,091][12883] Updated weights for policy 0, policy_version 50801 (0.0030) +[2024-06-18 03:40:06,199][12862] Signal inference workers to stop experience collection... (11950 times) +[2024-06-18 03:40:06,199][12862] Signal inference workers to resume experience collection... (11950 times) +[2024-06-18 03:40:06,253][12883] InferenceWorker_p0-w0: stopping experience collection (11950 times) +[2024-06-18 03:40:06,253][12883] InferenceWorker_p0-w0: resuming experience collection (11950 times) +[2024-06-18 03:40:06,947][12883] Updated weights for policy 0, policy_version 50811 (0.0030) +[2024-06-18 03:40:06,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 832487424. Throughput: 0: 42211.7. Samples: 832586040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) +[2024-06-18 03:40:06,994][12645] Avg episode reward: [(0, '0.133')] +[2024-06-18 03:40:11,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40960.1, 300 sec: 41932.0). Total num frames: 832634880. Throughput: 0: 42295.3. Samples: 832838560. Policy #0 lag: (min: 1.0, avg: 12.7, max: 21.0) +[2024-06-18 03:40:11,994][12645] Avg episode reward: [(0, '0.053')] +[2024-06-18 03:40:12,009][12883] Updated weights for policy 0, policy_version 50821 (0.0029) +[2024-06-18 03:40:14,757][12883] Updated weights for policy 0, policy_version 50831 (0.0031) +[2024-06-18 03:40:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 832897024. Throughput: 0: 41880.1. Samples: 832953900. Policy #0 lag: (min: 1.0, avg: 12.7, max: 21.0) +[2024-06-18 03:40:16,994][12645] Avg episode reward: [(0, '0.056')] +[2024-06-18 03:40:19,672][12883] Updated weights for policy 0, policy_version 50841 (0.0032) +[2024-06-18 03:40:21,993][12645] Fps is (10 sec: 47513.9, 60 sec: 42325.5, 300 sec: 42154.1). Total num frames: 833110016. 
Throughput: 0: 42184.7. Samples: 833220600. Policy #0 lag: (min: 1.0, avg: 12.7, max: 21.0) +[2024-06-18 03:40:21,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 03:40:22,547][12883] Updated weights for policy 0, policy_version 50851 (0.0032) +[2024-06-18 03:40:26,994][12645] Fps is (10 sec: 36044.7, 60 sec: 40413.9, 300 sec: 41931.9). Total num frames: 833257472. Throughput: 0: 41826.9. Samples: 833465600. Policy #0 lag: (min: 1.0, avg: 12.7, max: 21.0) +[2024-06-18 03:40:26,994][12645] Avg episode reward: [(0, '0.165')] +[2024-06-18 03:40:27,536][12883] Updated weights for policy 0, policy_version 50861 (0.0035) +[2024-06-18 03:40:30,516][12883] Updated weights for policy 0, policy_version 50871 (0.0053) +[2024-06-18 03:40:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.6, 300 sec: 42154.1). Total num frames: 833536000. Throughput: 0: 41814.3. Samples: 833585080. Policy #0 lag: (min: 1.0, avg: 12.7, max: 21.0) +[2024-06-18 03:40:31,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 03:40:35,140][12883] Updated weights for policy 0, policy_version 50881 (0.0032) +[2024-06-18 03:40:36,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 833699840. Throughput: 0: 42019.1. Samples: 833838160. Policy #0 lag: (min: 1.0, avg: 12.7, max: 21.0) +[2024-06-18 03:40:36,994][12645] Avg episode reward: [(0, '0.120')] +[2024-06-18 03:40:37,091][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000050886_833716224.pth... +[2024-06-18 03:40:37,157][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000050268_823590912.pth +[2024-06-18 03:40:38,387][12883] Updated weights for policy 0, policy_version 50891 (0.0037) +[2024-06-18 03:40:41,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41233.0, 300 sec: 41987.5). Total num frames: 833912832. Throughput: 0: 41638.3. Samples: 834086360. 
Policy #0 lag: (min: 1.0, avg: 13.0, max: 28.0) +[2024-06-18 03:40:41,994][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 03:40:43,221][12883] Updated weights for policy 0, policy_version 50901 (0.0032) +[2024-06-18 03:40:46,067][12883] Updated weights for policy 0, policy_version 50911 (0.0034) +[2024-06-18 03:40:46,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 834158592. Throughput: 0: 41925.2. Samples: 834214920. Policy #0 lag: (min: 1.0, avg: 13.0, max: 28.0) +[2024-06-18 03:40:46,994][12645] Avg episode reward: [(0, '0.171')] +[2024-06-18 03:40:50,977][12883] Updated weights for policy 0, policy_version 50921 (0.0038) +[2024-06-18 03:40:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 42043.0). Total num frames: 834322432. Throughput: 0: 41855.9. Samples: 834469560. Policy #0 lag: (min: 1.0, avg: 13.0, max: 28.0) +[2024-06-18 03:40:51,994][12645] Avg episode reward: [(0, '0.101')] +[2024-06-18 03:40:53,908][12883] Updated weights for policy 0, policy_version 50931 (0.0025) +[2024-06-18 03:40:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 834551808. Throughput: 0: 41658.6. Samples: 834713200. Policy #0 lag: (min: 1.0, avg: 13.0, max: 28.0) +[2024-06-18 03:40:56,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 03:40:58,613][12883] Updated weights for policy 0, policy_version 50941 (0.0042) +[2024-06-18 03:41:01,910][12883] Updated weights for policy 0, policy_version 50951 (0.0035) +[2024-06-18 03:41:01,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 834781184. Throughput: 0: 42064.6. Samples: 834846800. 
Policy #0 lag: (min: 1.0, avg: 13.0, max: 28.0) +[2024-06-18 03:41:01,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 03:41:06,373][12883] Updated weights for policy 0, policy_version 50961 (0.0031) +[2024-06-18 03:41:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40960.0, 300 sec: 41988.3). Total num frames: 834945024. Throughput: 0: 41775.9. Samples: 835100520. Policy #0 lag: (min: 1.0, avg: 13.0, max: 28.0) +[2024-06-18 03:41:06,994][12645] Avg episode reward: [(0, '0.190')] +[2024-06-18 03:41:07,847][12862] Signal inference workers to stop experience collection... (12000 times) +[2024-06-18 03:41:07,900][12883] InferenceWorker_p0-w0: stopping experience collection (12000 times) +[2024-06-18 03:41:07,966][12862] Signal inference workers to resume experience collection... (12000 times) +[2024-06-18 03:41:07,966][12883] InferenceWorker_p0-w0: resuming experience collection (12000 times) +[2024-06-18 03:41:09,641][12883] Updated weights for policy 0, policy_version 50971 (0.0036) +[2024-06-18 03:41:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 835190784. Throughput: 0: 41670.8. Samples: 835340780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:41:11,994][12645] Avg episode reward: [(0, '0.207')] +[2024-06-18 03:41:14,122][12883] Updated weights for policy 0, policy_version 50981 (0.0037) +[2024-06-18 03:41:16,994][12645] Fps is (10 sec: 45874.4, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 835403776. Throughput: 0: 42041.3. Samples: 835476940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:41:16,994][12645] Avg episode reward: [(0, '0.161')] +[2024-06-18 03:41:17,624][12883] Updated weights for policy 0, policy_version 50991 (0.0030) +[2024-06-18 03:41:21,949][12883] Updated weights for policy 0, policy_version 51001 (0.0034) +[2024-06-18 03:41:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 835600384. 
Throughput: 0: 41948.0. Samples: 835725820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:41:21,994][12645] Avg episode reward: [(0, '0.081')] +[2024-06-18 03:41:25,459][12883] Updated weights for policy 0, policy_version 51011 (0.0024) +[2024-06-18 03:41:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 835829760. Throughput: 0: 41918.2. Samples: 835972680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:41:26,994][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 03:41:29,852][12883] Updated weights for policy 0, policy_version 51021 (0.0037) +[2024-06-18 03:41:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 836042752. Throughput: 0: 42020.0. Samples: 836105820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:41:31,994][12645] Avg episode reward: [(0, '0.171')] +[2024-06-18 03:41:33,239][12883] Updated weights for policy 0, policy_version 51031 (0.0036) +[2024-06-18 03:41:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 836206592. Throughput: 0: 41899.1. Samples: 836355020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:41:36,994][12645] Avg episode reward: [(0, '0.135')] +[2024-06-18 03:41:37,848][12883] Updated weights for policy 0, policy_version 51041 (0.0034) +[2024-06-18 03:41:41,094][12883] Updated weights for policy 0, policy_version 51051 (0.0038) +[2024-06-18 03:41:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 836452352. Throughput: 0: 42046.8. Samples: 836605300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 03:41:41,994][12645] Avg episode reward: [(0, '0.150')] +[2024-06-18 03:41:45,528][12883] Updated weights for policy 0, policy_version 51061 (0.0024) +[2024-06-18 03:41:46,994][12645] Fps is (10 sec: 45875.4, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 836665344. 
Throughput: 0: 41978.6. Samples: 836735840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-18 03:41:46,994][12645] Avg episode reward: [(0, '0.171')] +[2024-06-18 03:41:49,030][12883] Updated weights for policy 0, policy_version 51071 (0.0040) +[2024-06-18 03:41:51,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 836845568. Throughput: 0: 41760.4. Samples: 836979740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-18 03:41:51,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 03:41:53,351][12883] Updated weights for policy 0, policy_version 51081 (0.0027) +[2024-06-18 03:41:56,627][12883] Updated weights for policy 0, policy_version 51091 (0.0032) +[2024-06-18 03:41:56,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 837091328. Throughput: 0: 42059.9. Samples: 837233480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-18 03:41:56,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 03:42:01,388][12883] Updated weights for policy 0, policy_version 51101 (0.0027) +[2024-06-18 03:42:01,996][12645] Fps is (10 sec: 44226.9, 60 sec: 41777.6, 300 sec: 42043.0). Total num frames: 837287936. Throughput: 0: 41930.9. Samples: 837363920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-18 03:42:01,996][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 03:42:04,562][12883] Updated weights for policy 0, policy_version 51111 (0.0027) +[2024-06-18 03:42:06,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 837484544. Throughput: 0: 41970.7. Samples: 837614500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-18 03:42:06,994][12645] Avg episode reward: [(0, '0.155')] +[2024-06-18 03:42:08,922][12883] Updated weights for policy 0, policy_version 51121 (0.0038) +[2024-06-18 03:42:11,994][12645] Fps is (10 sec: 40969.1, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 837697536. 
Throughput: 0: 42152.9. Samples: 837869560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) +[2024-06-18 03:42:11,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 03:42:12,434][12883] Updated weights for policy 0, policy_version 51131 (0.0032) +[2024-06-18 03:42:14,823][12862] Signal inference workers to stop experience collection... (12050 times) +[2024-06-18 03:42:14,824][12862] Signal inference workers to resume experience collection... (12050 times) +[2024-06-18 03:42:14,852][12883] InferenceWorker_p0-w0: stopping experience collection (12050 times) +[2024-06-18 03:42:14,852][12883] InferenceWorker_p0-w0: resuming experience collection (12050 times) +[2024-06-18 03:42:16,665][12883] Updated weights for policy 0, policy_version 51141 (0.0030) +[2024-06-18 03:42:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 837910528. Throughput: 0: 41988.5. Samples: 837995300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:42:16,994][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 03:42:19,943][12883] Updated weights for policy 0, policy_version 51151 (0.0028) +[2024-06-18 03:42:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 838123520. Throughput: 0: 42087.2. Samples: 838248940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:42:21,994][12645] Avg episode reward: [(0, '0.181')] +[2024-06-18 03:42:24,176][12883] Updated weights for policy 0, policy_version 51161 (0.0027) +[2024-06-18 03:42:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 838336512. Throughput: 0: 42167.1. Samples: 838502820. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:42:26,994][12645] Avg episode reward: [(0, '0.181')] +[2024-06-18 03:42:27,711][12883] Updated weights for policy 0, policy_version 51171 (0.0028) +[2024-06-18 03:42:31,835][12883] Updated weights for policy 0, policy_version 51181 (0.0039) +[2024-06-18 03:42:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41987.8). Total num frames: 838549504. Throughput: 0: 42040.4. Samples: 838627660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:42:31,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 03:42:35,308][12883] Updated weights for policy 0, policy_version 51191 (0.0039) +[2024-06-18 03:42:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 838762496. Throughput: 0: 42189.8. Samples: 838878280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:42:36,994][12645] Avg episode reward: [(0, '0.161')] +[2024-06-18 03:42:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000051194_838762496.pth... +[2024-06-18 03:42:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000050576_828637184.pth +[2024-06-18 03:42:39,338][12883] Updated weights for policy 0, policy_version 51201 (0.0034) +[2024-06-18 03:42:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 838975488. Throughput: 0: 42161.4. Samples: 839130740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:42:41,996][12645] Avg episode reward: [(0, '0.150')] +[2024-06-18 03:42:43,134][12883] Updated weights for policy 0, policy_version 51211 (0.0033) +[2024-06-18 03:42:46,847][12883] Updated weights for policy 0, policy_version 51221 (0.0043) +[2024-06-18 03:42:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 839204864. Throughput: 0: 42183.4. Samples: 839262080. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 03:42:46,994][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 03:42:50,832][12883] Updated weights for policy 0, policy_version 51231 (0.0028) +[2024-06-18 03:42:52,000][12645] Fps is (10 sec: 40934.6, 60 sec: 42320.9, 300 sec: 42153.2). Total num frames: 839385088. Throughput: 0: 42213.7. Samples: 839514380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 03:42:52,000][12645] Avg episode reward: [(0, '0.114')] +[2024-06-18 03:42:54,454][12883] Updated weights for policy 0, policy_version 51241 (0.0044) +[2024-06-18 03:42:56,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42050.7, 300 sec: 42042.7). Total num frames: 839614464. Throughput: 0: 42112.1. Samples: 839764700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 03:42:56,997][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 03:42:59,013][12883] Updated weights for policy 0, policy_version 51251 (0.0034) +[2024-06-18 03:43:01,994][12645] Fps is (10 sec: 44264.6, 60 sec: 42326.9, 300 sec: 42209.6). Total num frames: 839827456. Throughput: 0: 42248.5. Samples: 839896480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 03:43:01,994][12645] Avg episode reward: [(0, '0.219')] +[2024-06-18 03:43:02,287][12883] Updated weights for policy 0, policy_version 51261 (0.0032) +[2024-06-18 03:43:06,938][12883] Updated weights for policy 0, policy_version 51271 (0.0040) +[2024-06-18 03:43:06,994][12645] Fps is (10 sec: 40969.5, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 840024064. Throughput: 0: 42092.0. Samples: 840143080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 03:43:06,994][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 03:43:10,603][12883] Updated weights for policy 0, policy_version 51281 (0.0029) +[2024-06-18 03:43:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 840237056. Throughput: 0: 41950.5. Samples: 840390600. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 03:43:11,994][12645] Avg episode reward: [(0, '0.109')] +[2024-06-18 03:43:14,763][12883] Updated weights for policy 0, policy_version 51291 (0.0020) +[2024-06-18 03:43:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 840433664. Throughput: 0: 41989.0. Samples: 840517160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 03:43:16,994][12645] Avg episode reward: [(0, '0.146')] +[2024-06-18 03:43:18,229][12883] Updated weights for policy 0, policy_version 51301 (0.0031) +[2024-06-18 03:43:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 840663040. Throughput: 0: 42103.5. Samples: 840772940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 03:43:21,994][12645] Avg episode reward: [(0, '0.268')] +[2024-06-18 03:43:22,523][12883] Updated weights for policy 0, policy_version 51311 (0.0033) +[2024-06-18 03:43:26,418][12883] Updated weights for policy 0, policy_version 51321 (0.0031) +[2024-06-18 03:43:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 840859648. Throughput: 0: 42025.8. Samples: 841021900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 03:43:26,994][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 03:43:30,059][12883] Updated weights for policy 0, policy_version 51331 (0.0034) +[2024-06-18 03:43:32,000][12645] Fps is (10 sec: 40934.7, 60 sec: 42047.9, 300 sec: 42208.7). Total num frames: 841072640. Throughput: 0: 41815.6. Samples: 841144040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 03:43:32,000][12645] Avg episode reward: [(0, '0.190')] +[2024-06-18 03:43:34,034][12883] Updated weights for policy 0, policy_version 51341 (0.0045) +[2024-06-18 03:43:36,886][12862] Signal inference workers to stop experience collection... 
(12100 times) +[2024-06-18 03:43:36,920][12883] InferenceWorker_p0-w0: stopping experience collection (12100 times) +[2024-06-18 03:43:36,951][12862] Signal inference workers to resume experience collection... (12100 times) +[2024-06-18 03:43:36,952][12883] InferenceWorker_p0-w0: resuming experience collection (12100 times) +[2024-06-18 03:43:36,996][12645] Fps is (10 sec: 40950.8, 60 sec: 41777.6, 300 sec: 41987.2). Total num frames: 841269248. Throughput: 0: 41941.5. Samples: 841401580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 03:43:36,996][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 03:43:37,690][12883] Updated weights for policy 0, policy_version 51351 (0.0035) +[2024-06-18 03:43:41,928][12883] Updated weights for policy 0, policy_version 51361 (0.0037) +[2024-06-18 03:43:41,994][12645] Fps is (10 sec: 42624.9, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 841498624. Throughput: 0: 41899.0. Samples: 841650060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 03:43:41,996][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 03:43:45,763][12883] Updated weights for policy 0, policy_version 51371 (0.0039) +[2024-06-18 03:43:46,994][12645] Fps is (10 sec: 45885.5, 60 sec: 42052.3, 300 sec: 42265.5). Total num frames: 841728000. Throughput: 0: 41899.5. Samples: 841781960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 03:43:46,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 03:43:49,935][12883] Updated weights for policy 0, policy_version 51381 (0.0031) +[2024-06-18 03:43:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41783.6, 300 sec: 41876.4). Total num frames: 841891840. Throughput: 0: 41878.2. Samples: 842027600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 03:43:51,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 03:43:54,069][12883] Updated weights for policy 0, policy_version 51391 (0.0030) +[2024-06-18 03:43:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41780.8, 300 sec: 41987.5). Total num frames: 842121216. Throughput: 0: 41907.6. Samples: 842276440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 03:43:56,994][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 03:43:57,557][12883] Updated weights for policy 0, policy_version 51401 (0.0032) +[2024-06-18 03:44:01,578][12883] Updated weights for policy 0, policy_version 51411 (0.0032) +[2024-06-18 03:44:01,994][12645] Fps is (10 sec: 44236.1, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 842334208. Throughput: 0: 42039.4. Samples: 842408940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 03:44:01,994][12645] Avg episode reward: [(0, '0.185')] +[2024-06-18 03:44:05,283][12883] Updated weights for policy 0, policy_version 51421 (0.0025) +[2024-06-18 03:44:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 842530816. Throughput: 0: 41868.5. Samples: 842657020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 03:44:06,994][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 03:44:09,386][12883] Updated weights for policy 0, policy_version 51431 (0.0039) +[2024-06-18 03:44:11,997][12645] Fps is (10 sec: 42583.3, 60 sec: 42049.7, 300 sec: 42098.0). Total num frames: 842760192. Throughput: 0: 41864.2. Samples: 842905940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 03:44:11,998][12645] Avg episode reward: [(0, '0.052')] +[2024-06-18 03:44:13,063][12883] Updated weights for policy 0, policy_version 51441 (0.0029) +[2024-06-18 03:44:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.1, 300 sec: 41987.5). Total num frames: 842956800. Throughput: 0: 41998.6. Samples: 843033720. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 03:44:16,995][12645] Avg episode reward: [(0, '0.040')] +[2024-06-18 03:44:17,006][12883] Updated weights for policy 0, policy_version 51451 (0.0025) +[2024-06-18 03:44:21,285][12883] Updated weights for policy 0, policy_version 51461 (0.0037) +[2024-06-18 03:44:21,994][12645] Fps is (10 sec: 39335.5, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 843153408. Throughput: 0: 41859.8. Samples: 843285180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 03:44:22,003][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 03:44:24,868][12883] Updated weights for policy 0, policy_version 51471 (0.0041) +[2024-06-18 03:44:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 843399168. Throughput: 0: 41805.8. Samples: 843531320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 03:44:26,994][12645] Avg episode reward: [(0, '0.267')] +[2024-06-18 03:44:28,998][12883] Updated weights for policy 0, policy_version 51481 (0.0044) +[2024-06-18 03:44:32,000][12645] Fps is (10 sec: 40934.7, 60 sec: 41506.1, 300 sec: 41875.5). Total num frames: 843563008. Throughput: 0: 41789.7. Samples: 843662760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 03:44:32,001][12645] Avg episode reward: [(0, '0.094')] +[2024-06-18 03:44:32,813][12883] Updated weights for policy 0, policy_version 51491 (0.0027) +[2024-06-18 03:44:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41780.7, 300 sec: 41820.8). Total num frames: 843776000. Throughput: 0: 41876.3. Samples: 843912040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 03:44:36,994][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 03:44:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000051500_843776000.pth... 
+[2024-06-18 03:44:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000050886_833716224.pth +[2024-06-18 03:44:37,248][12883] Updated weights for policy 0, policy_version 51501 (0.0032) +[2024-06-18 03:44:40,689][12883] Updated weights for policy 0, policy_version 51511 (0.0034) +[2024-06-18 03:44:41,994][12645] Fps is (10 sec: 45904.0, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 844021760. Throughput: 0: 41876.0. Samples: 844160860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 03:44:41,994][12645] Avg episode reward: [(0, '0.137')] +[2024-06-18 03:44:44,715][12883] Updated weights for policy 0, policy_version 51521 (0.0032) +[2024-06-18 03:44:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 844201984. Throughput: 0: 41961.0. Samples: 844297180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 03:44:46,994][12645] Avg episode reward: [(0, '0.147')] +[2024-06-18 03:44:48,374][12883] Updated weights for policy 0, policy_version 51531 (0.0029) +[2024-06-18 03:44:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 844431360. Throughput: 0: 41944.4. Samples: 844544520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 03:44:51,994][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 03:44:52,285][12883] Updated weights for policy 0, policy_version 51541 (0.0041) +[2024-06-18 03:44:56,287][12883] Updated weights for policy 0, policy_version 51551 (0.0033) +[2024-06-18 03:44:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 844644352. Throughput: 0: 42173.2. Samples: 844803580. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) +[2024-06-18 03:44:56,994][12645] Avg episode reward: [(0, '0.101')] +[2024-06-18 03:45:00,167][12883] Updated weights for policy 0, policy_version 51561 (0.0028) +[2024-06-18 03:45:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 844840960. Throughput: 0: 42108.5. Samples: 844928600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) +[2024-06-18 03:45:01,994][12645] Avg episode reward: [(0, '0.101')] +[2024-06-18 03:45:04,143][12883] Updated weights for policy 0, policy_version 51571 (0.0028) +[2024-06-18 03:45:05,662][12862] Signal inference workers to stop experience collection... (12150 times) +[2024-06-18 03:45:05,663][12862] Signal inference workers to resume experience collection... (12150 times) +[2024-06-18 03:45:05,680][12883] InferenceWorker_p0-w0: stopping experience collection (12150 times) +[2024-06-18 03:45:05,681][12883] InferenceWorker_p0-w0: resuming experience collection (12150 times) +[2024-06-18 03:45:06,996][12645] Fps is (10 sec: 42588.4, 60 sec: 42323.7, 300 sec: 42153.7). Total num frames: 845070336. Throughput: 0: 42019.3. Samples: 845176140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) +[2024-06-18 03:45:06,997][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 03:45:08,423][12883] Updated weights for policy 0, policy_version 51581 (0.0036) +[2024-06-18 03:45:11,838][12883] Updated weights for policy 0, policy_version 51591 (0.0034) +[2024-06-18 03:45:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41781.7, 300 sec: 41931.9). Total num frames: 845266944. Throughput: 0: 42207.2. Samples: 845430640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) +[2024-06-18 03:45:11,994][12645] Avg episode reward: [(0, '0.094')] +[2024-06-18 03:45:16,158][12883] Updated weights for policy 0, policy_version 51601 (0.0044) +[2024-06-18 03:45:16,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42325.3, 300 sec: 41987.4). Total num frames: 845496320. 
Throughput: 0: 42022.7. Samples: 845553520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) +[2024-06-18 03:45:16,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 03:45:19,672][12883] Updated weights for policy 0, policy_version 51611 (0.0035) +[2024-06-18 03:45:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 845692928. Throughput: 0: 42116.5. Samples: 845807280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) +[2024-06-18 03:45:21,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 03:45:23,862][12883] Updated weights for policy 0, policy_version 51621 (0.0028) +[2024-06-18 03:45:26,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 845905920. Throughput: 0: 42245.8. Samples: 846061920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) +[2024-06-18 03:45:26,994][12645] Avg episode reward: [(0, '0.337')] +[2024-06-18 03:45:27,451][12883] Updated weights for policy 0, policy_version 51631 (0.0031) +[2024-06-18 03:45:31,575][12883] Updated weights for policy 0, policy_version 51641 (0.0033) +[2024-06-18 03:45:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42056.7, 300 sec: 41987.5). Total num frames: 846086144. Throughput: 0: 42014.7. Samples: 846187840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 03:45:31,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 03:45:35,290][12883] Updated weights for policy 0, policy_version 51651 (0.0032) +[2024-06-18 03:45:36,996][12645] Fps is (10 sec: 44226.6, 60 sec: 42869.9, 300 sec: 42153.8). Total num frames: 846348288. Throughput: 0: 42111.7. Samples: 846439640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 03:45:36,996][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 03:45:39,427][12883] Updated weights for policy 0, policy_version 51661 (0.0042) +[2024-06-18 03:45:42,000][12645] Fps is (10 sec: 44209.1, 60 sec: 41774.8, 300 sec: 41931.1). Total num frames: 846528512. 
Throughput: 0: 41833.7. Samples: 846686360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 03:45:42,001][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 03:45:43,037][12883] Updated weights for policy 0, policy_version 51671 (0.0038) +[2024-06-18 03:45:46,994][12645] Fps is (10 sec: 37691.3, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 846725120. Throughput: 0: 41803.0. Samples: 846809740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 03:45:46,994][12645] Avg episode reward: [(0, '0.059')] +[2024-06-18 03:45:47,159][12883] Updated weights for policy 0, policy_version 51681 (0.0028) +[2024-06-18 03:45:50,658][12883] Updated weights for policy 0, policy_version 51691 (0.0038) +[2024-06-18 03:45:51,994][12645] Fps is (10 sec: 44264.9, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 846970880. Throughput: 0: 42071.1. Samples: 847069240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 03:45:51,994][12645] Avg episode reward: [(0, '0.055')] +[2024-06-18 03:45:54,650][12883] Updated weights for policy 0, policy_version 51701 (0.0035) +[2024-06-18 03:45:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.1, 300 sec: 41987.4). Total num frames: 847167488. Throughput: 0: 42018.5. Samples: 847321480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 03:45:57,003][12645] Avg episode reward: [(0, '0.239')] +[2024-06-18 03:45:58,151][12883] Updated weights for policy 0, policy_version 51711 (0.0035) +[2024-06-18 03:46:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 847364096. Throughput: 0: 42103.7. Samples: 847448180. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 03:46:01,994][12645] Avg episode reward: [(0, '0.060')] +[2024-06-18 03:46:02,410][12883] Updated weights for policy 0, policy_version 51721 (0.0041) +[2024-06-18 03:46:05,959][12883] Updated weights for policy 0, policy_version 51731 (0.0039) +[2024-06-18 03:46:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42053.8, 300 sec: 42043.0). Total num frames: 847593472. Throughput: 0: 42209.3. Samples: 847706700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 03:46:06,994][12645] Avg episode reward: [(0, '0.110')] +[2024-06-18 03:46:10,161][12883] Updated weights for policy 0, policy_version 51741 (0.0034) +[2024-06-18 03:46:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 847806464. Throughput: 0: 42217.3. Samples: 847961700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 03:46:11,996][12645] Avg episode reward: [(0, '0.144')] +[2024-06-18 03:46:13,590][12883] Updated weights for policy 0, policy_version 51751 (0.0041) +[2024-06-18 03:46:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 848003072. Throughput: 0: 42230.2. Samples: 848088200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 03:46:16,994][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 03:46:17,747][12883] Updated weights for policy 0, policy_version 51761 (0.0033) +[2024-06-18 03:46:21,690][12883] Updated weights for policy 0, policy_version 51771 (0.0027) +[2024-06-18 03:46:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 848232448. Throughput: 0: 42380.8. Samples: 848346680. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 03:46:21,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 03:46:25,683][12883] Updated weights for policy 0, policy_version 51781 (0.0031) +[2024-06-18 03:46:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 848445440. Throughput: 0: 42494.4. Samples: 848598340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 03:46:26,994][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 03:46:27,641][12862] Signal inference workers to stop experience collection... (12200 times) +[2024-06-18 03:46:27,642][12862] Signal inference workers to resume experience collection... (12200 times) +[2024-06-18 03:46:27,685][12883] InferenceWorker_p0-w0: stopping experience collection (12200 times) +[2024-06-18 03:46:27,692][12883] InferenceWorker_p0-w0: resuming experience collection (12200 times) +[2024-06-18 03:46:29,403][12883] Updated weights for policy 0, policy_version 51791 (0.0024) +[2024-06-18 03:46:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 848658432. Throughput: 0: 42630.3. Samples: 848728100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 03:46:31,994][12645] Avg episode reward: [(0, '0.168')] +[2024-06-18 03:46:33,263][12883] Updated weights for policy 0, policy_version 51801 (0.0031) +[2024-06-18 03:46:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41780.7, 300 sec: 42043.0). Total num frames: 848855040. Throughput: 0: 42414.9. Samples: 848977920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 03:46:36,995][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 03:46:37,071][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000051811_848871424.pth... 
+[2024-06-18 03:46:37,078][12883] Updated weights for policy 0, policy_version 51811 (0.0031) +[2024-06-18 03:46:37,116][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000051194_838762496.pth +[2024-06-18 03:46:40,785][12883] Updated weights for policy 0, policy_version 51821 (0.0034) +[2024-06-18 03:46:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42329.8, 300 sec: 42043.0). Total num frames: 849068032. Throughput: 0: 42406.8. Samples: 849229780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 03:46:41,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 03:46:45,019][12883] Updated weights for policy 0, policy_version 51831 (0.0025) +[2024-06-18 03:46:46,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42209.6). Total num frames: 849297408. Throughput: 0: 42568.0. Samples: 849363740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 03:46:46,994][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 03:46:48,669][12883] Updated weights for policy 0, policy_version 51841 (0.0038) +[2024-06-18 03:46:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 849477632. Throughput: 0: 42364.9. Samples: 849613120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 03:46:51,994][12645] Avg episode reward: [(0, '0.133')] +[2024-06-18 03:46:52,727][12883] Updated weights for policy 0, policy_version 51851 (0.0039) +[2024-06-18 03:46:56,367][12883] Updated weights for policy 0, policy_version 51861 (0.0033) +[2024-06-18 03:46:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42325.4, 300 sec: 42098.9). Total num frames: 849707008. Throughput: 0: 42366.2. Samples: 849868180. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 03:46:56,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 03:47:00,566][12883] Updated weights for policy 0, policy_version 51871 (0.0041) +[2024-06-18 03:47:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 849920000. Throughput: 0: 42440.0. Samples: 849998000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 03:47:01,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 03:47:02,149][12862] Saving new best policy, reward=0.416! +[2024-06-18 03:47:04,147][12883] Updated weights for policy 0, policy_version 51881 (0.0025) +[2024-06-18 03:47:06,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42323.8, 300 sec: 42153.8). Total num frames: 850132992. Throughput: 0: 42161.5. Samples: 850244040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 03:47:06,997][12645] Avg episode reward: [(0, '0.215')] +[2024-06-18 03:47:08,487][12883] Updated weights for policy 0, policy_version 51891 (0.0029) +[2024-06-18 03:47:11,754][12883] Updated weights for policy 0, policy_version 51901 (0.0039) +[2024-06-18 03:47:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 850345984. Throughput: 0: 42366.3. Samples: 850504820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 03:47:11,994][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 03:47:16,026][12883] Updated weights for policy 0, policy_version 51911 (0.0033) +[2024-06-18 03:47:16,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 850542592. Throughput: 0: 42236.0. Samples: 850628720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 03:47:17,002][12645] Avg episode reward: [(0, '0.117')] +[2024-06-18 03:47:19,550][12883] Updated weights for policy 0, policy_version 51921 (0.0031) +[2024-06-18 03:47:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42209.6). 
Total num frames: 850788352. Throughput: 0: 42300.1. Samples: 850881420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 03:47:21,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 03:47:22,003][12862] Saving new best policy, reward=0.440! +[2024-06-18 03:47:23,688][12883] Updated weights for policy 0, policy_version 51931 (0.0027) +[2024-06-18 03:47:26,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 850984960. Throughput: 0: 42426.4. Samples: 851138980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 03:47:26,994][12645] Avg episode reward: [(0, '0.145')] +[2024-06-18 03:47:27,341][12883] Updated weights for policy 0, policy_version 51941 (0.0037) +[2024-06-18 03:47:31,350][12883] Updated weights for policy 0, policy_version 51951 (0.0032) +[2024-06-18 03:47:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 851181568. Throughput: 0: 42182.6. Samples: 851261960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 03:47:31,994][12645] Avg episode reward: [(0, '0.145')] +[2024-06-18 03:47:34,905][12883] Updated weights for policy 0, policy_version 51961 (0.0040) +[2024-06-18 03:47:35,992][12862] Signal inference workers to stop experience collection... (12250 times) +[2024-06-18 03:47:35,992][12862] Signal inference workers to resume experience collection... (12250 times) +[2024-06-18 03:47:36,009][12883] InferenceWorker_p0-w0: stopping experience collection (12250 times) +[2024-06-18 03:47:36,009][12883] InferenceWorker_p0-w0: resuming experience collection (12250 times) +[2024-06-18 03:47:36,994][12645] Fps is (10 sec: 42599.6, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 851410944. Throughput: 0: 42381.5. Samples: 851520280. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 03:47:36,994][12645] Avg episode reward: [(0, '0.061')] +[2024-06-18 03:47:39,194][12883] Updated weights for policy 0, policy_version 51971 (0.0034) +[2024-06-18 03:47:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 851607552. Throughput: 0: 42355.1. Samples: 851774160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 03:47:41,994][12645] Avg episode reward: [(0, '0.010')] +[2024-06-18 03:47:42,773][12883] Updated weights for policy 0, policy_version 51981 (0.0025) +[2024-06-18 03:47:46,789][12883] Updated weights for policy 0, policy_version 51991 (0.0037) +[2024-06-18 03:47:46,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42052.1, 300 sec: 42155.0). Total num frames: 851820544. Throughput: 0: 42187.5. Samples: 851896440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 03:47:46,994][12645] Avg episode reward: [(0, '0.051')] +[2024-06-18 03:47:50,418][12883] Updated weights for policy 0, policy_version 52001 (0.0038) +[2024-06-18 03:47:51,995][12645] Fps is (10 sec: 44231.0, 60 sec: 42870.5, 300 sec: 42154.2). Total num frames: 852049920. Throughput: 0: 42367.4. Samples: 852150540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 03:47:51,996][12645] Avg episode reward: [(0, '0.183')] +[2024-06-18 03:47:54,757][12883] Updated weights for policy 0, policy_version 52011 (0.0036) +[2024-06-18 03:47:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 852246528. Throughput: 0: 42354.7. Samples: 852410780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 03:47:56,994][12645] Avg episode reward: [(0, '0.116')] +[2024-06-18 03:47:57,979][12883] Updated weights for policy 0, policy_version 52021 (0.0024) +[2024-06-18 03:48:01,994][12645] Fps is (10 sec: 40965.8, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 852459520. Throughput: 0: 42240.8. Samples: 852529560. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 03:48:01,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 03:48:02,357][12883] Updated weights for policy 0, policy_version 52031 (0.0037) +[2024-06-18 03:48:05,783][12883] Updated weights for policy 0, policy_version 52041 (0.0028) +[2024-06-18 03:48:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42326.9, 300 sec: 42154.1). Total num frames: 852672512. Throughput: 0: 42197.2. Samples: 852780300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 03:48:06,994][12645] Avg episode reward: [(0, '0.168')] +[2024-06-18 03:48:09,860][12883] Updated weights for policy 0, policy_version 52051 (0.0033) +[2024-06-18 03:48:11,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42050.7, 300 sec: 42153.8). Total num frames: 852869120. Throughput: 0: 42140.3. Samples: 853035380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:48:11,996][12645] Avg episode reward: [(0, '0.185')] +[2024-06-18 03:48:13,801][12883] Updated weights for policy 0, policy_version 52061 (0.0034) +[2024-06-18 03:48:16,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42596.8, 300 sec: 42153.8). Total num frames: 853098496. Throughput: 0: 42293.0. Samples: 853165240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:48:16,996][12645] Avg episode reward: [(0, '0.239')] +[2024-06-18 03:48:17,534][12883] Updated weights for policy 0, policy_version 52071 (0.0021) +[2024-06-18 03:48:21,546][12883] Updated weights for policy 0, policy_version 52081 (0.0047) +[2024-06-18 03:48:21,994][12645] Fps is (10 sec: 44246.3, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 853311488. Throughput: 0: 42138.5. Samples: 853416520. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:48:21,994][12645] Avg episode reward: [(0, '0.228')] +[2024-06-18 03:48:25,531][12883] Updated weights for policy 0, policy_version 52091 (0.0038) +[2024-06-18 03:48:26,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42325.5, 300 sec: 42210.5). Total num frames: 853524480. Throughput: 0: 42187.2. Samples: 853672580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:48:26,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 03:48:29,346][12883] Updated weights for policy 0, policy_version 52101 (0.0039) +[2024-06-18 03:48:31,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42154.4). Total num frames: 853704704. Throughput: 0: 42212.6. Samples: 853796000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:48:31,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 03:48:33,250][12883] Updated weights for policy 0, policy_version 52111 (0.0049) +[2024-06-18 03:48:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 853934080. Throughput: 0: 42217.4. Samples: 854050260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:48:36,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 03:48:37,065][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000052121_853950464.pth... +[2024-06-18 03:48:37,070][12883] Updated weights for policy 0, policy_version 52121 (0.0039) +[2024-06-18 03:48:37,113][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000051500_843776000.pth +[2024-06-18 03:48:41,187][12883] Updated weights for policy 0, policy_version 52131 (0.0036) +[2024-06-18 03:48:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 854147072. Throughput: 0: 42005.4. Samples: 854301020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 03:48:41,994][12645] Avg episode reward: [(0, '0.111')] +[2024-06-18 03:48:45,090][12883] Updated weights for policy 0, policy_version 52141 (0.0030) +[2024-06-18 03:48:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 854343680. Throughput: 0: 42168.1. Samples: 854427120. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 03:48:46,994][12645] Avg episode reward: [(0, '0.142')] +[2024-06-18 03:48:48,839][12883] Updated weights for policy 0, policy_version 52151 (0.0052) +[2024-06-18 03:48:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42053.2, 300 sec: 42209.6). Total num frames: 854573056. Throughput: 0: 42228.5. Samples: 854680580. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 03:48:51,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 03:48:52,764][12883] Updated weights for policy 0, policy_version 52161 (0.0031) +[2024-06-18 03:48:56,522][12883] Updated weights for policy 0, policy_version 52171 (0.0038) +[2024-06-18 03:48:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 854786048. Throughput: 0: 42248.8. Samples: 854936480. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 03:48:56,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 03:49:00,540][12883] Updated weights for policy 0, policy_version 52181 (0.0033) +[2024-06-18 03:49:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 854999040. Throughput: 0: 42159.8. Samples: 855062340. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 03:49:01,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 03:49:04,274][12883] Updated weights for policy 0, policy_version 52191 (0.0033) +[2024-06-18 03:49:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42210.1). Total num frames: 855212032. Throughput: 0: 42213.8. Samples: 855316140. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 03:49:06,994][12645] Avg episode reward: [(0, '0.126')] +[2024-06-18 03:49:08,339][12883] Updated weights for policy 0, policy_version 52201 (0.0025) +[2024-06-18 03:49:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42326.9, 300 sec: 42209.6). Total num frames: 855408640. Throughput: 0: 42208.0. Samples: 855571940. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 03:49:11,994][12645] Avg episode reward: [(0, '0.130')] +[2024-06-18 03:49:12,033][12883] Updated weights for policy 0, policy_version 52211 (0.0042) +[2024-06-18 03:49:13,300][12862] Signal inference workers to stop experience collection... (12300 times) +[2024-06-18 03:49:13,301][12862] Signal inference workers to resume experience collection... (12300 times) +[2024-06-18 03:49:13,315][12883] InferenceWorker_p0-w0: stopping experience collection (12300 times) +[2024-06-18 03:49:13,315][12883] InferenceWorker_p0-w0: resuming experience collection (12300 times) +[2024-06-18 03:49:16,095][12883] Updated weights for policy 0, policy_version 52221 (0.0026) +[2024-06-18 03:49:16,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42053.9, 300 sec: 42265.2). Total num frames: 855621632. Throughput: 0: 42265.8. Samples: 855697960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 03:49:16,994][12645] Avg episode reward: [(0, '0.145')] +[2024-06-18 03:49:19,836][12883] Updated weights for policy 0, policy_version 52231 (0.0032) +[2024-06-18 03:49:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 855834624. Throughput: 0: 42237.7. Samples: 855950960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 03:49:21,994][12645] Avg episode reward: [(0, '0.181')] +[2024-06-18 03:49:23,750][12883] Updated weights for policy 0, policy_version 52241 (0.0028) +[2024-06-18 03:49:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42266.1). Total num frames: 856031232. 
Throughput: 0: 42391.5. Samples: 856208640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 03:49:26,994][12645] Avg episode reward: [(0, '0.057')] +[2024-06-18 03:49:27,768][12883] Updated weights for policy 0, policy_version 52251 (0.0042) +[2024-06-18 03:49:31,385][12883] Updated weights for policy 0, policy_version 52261 (0.0031) +[2024-06-18 03:49:31,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42596.7, 300 sec: 42320.4). Total num frames: 856260608. Throughput: 0: 42246.3. Samples: 856328300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 03:49:31,997][12645] Avg episode reward: [(0, '0.064')] +[2024-06-18 03:49:35,384][12883] Updated weights for policy 0, policy_version 52271 (0.0039) +[2024-06-18 03:49:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 856473600. Throughput: 0: 42292.0. Samples: 856583720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 03:49:36,994][12645] Avg episode reward: [(0, '0.075')] +[2024-06-18 03:49:39,457][12883] Updated weights for policy 0, policy_version 52281 (0.0028) +[2024-06-18 03:49:41,994][12645] Fps is (10 sec: 39330.8, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 856653824. Throughput: 0: 42336.1. Samples: 856841600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 03:49:41,994][12645] Avg episode reward: [(0, '0.180')] +[2024-06-18 03:49:43,112][12883] Updated weights for policy 0, policy_version 52291 (0.0031) +[2024-06-18 03:49:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 856883200. Throughput: 0: 42264.9. Samples: 856964260. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 03:49:46,995][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 03:49:47,169][12883] Updated weights for policy 0, policy_version 52301 (0.0035) +[2024-06-18 03:49:50,767][12883] Updated weights for policy 0, policy_version 52311 (0.0028) +[2024-06-18 03:49:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 857096192. Throughput: 0: 42396.6. Samples: 857223980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 03:49:51,994][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 03:49:54,670][12883] Updated weights for policy 0, policy_version 52321 (0.0050) +[2024-06-18 03:49:56,996][12645] Fps is (10 sec: 40950.5, 60 sec: 41777.5, 300 sec: 42209.3). Total num frames: 857292800. Throughput: 0: 42307.1. Samples: 857475860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 03:49:56,997][12645] Avg episode reward: [(0, '0.124')] +[2024-06-18 03:49:58,680][12883] Updated weights for policy 0, policy_version 52331 (0.0033) +[2024-06-18 03:50:01,994][12645] Fps is (10 sec: 44234.7, 60 sec: 42325.1, 300 sec: 42265.4). Total num frames: 857538560. Throughput: 0: 42237.3. Samples: 857598660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 03:50:01,995][12645] Avg episode reward: [(0, '0.092')] +[2024-06-18 03:50:02,811][12883] Updated weights for policy 0, policy_version 52341 (0.0039) +[2024-06-18 03:50:06,428][12883] Updated weights for policy 0, policy_version 52351 (0.0045) +[2024-06-18 03:50:06,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42052.2, 300 sec: 42265.1). Total num frames: 857735168. Throughput: 0: 42336.6. Samples: 857856120. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 03:50:06,994][12645] Avg episode reward: [(0, '0.125')] +[2024-06-18 03:50:10,314][12883] Updated weights for policy 0, policy_version 52361 (0.0032) +[2024-06-18 03:50:11,994][12645] Fps is (10 sec: 40962.0, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 857948160. Throughput: 0: 42177.8. Samples: 858106640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 03:50:11,994][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 03:50:14,295][12883] Updated weights for policy 0, policy_version 52371 (0.0031) +[2024-06-18 03:50:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 858161152. Throughput: 0: 42486.5. Samples: 858240100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 03:50:16,994][12645] Avg episode reward: [(0, '0.076')] +[2024-06-18 03:50:17,879][12883] Updated weights for policy 0, policy_version 52381 (0.0025) +[2024-06-18 03:50:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 858357760. Throughput: 0: 42460.5. Samples: 858494440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 03:50:21,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 03:50:22,140][12883] Updated weights for policy 0, policy_version 52391 (0.0036) +[2024-06-18 03:50:25,661][12883] Updated weights for policy 0, policy_version 52401 (0.0038) +[2024-06-18 03:50:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 858587136. Throughput: 0: 42153.7. Samples: 858738520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 03:50:26,994][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 03:50:29,995][12883] Updated weights for policy 0, policy_version 52411 (0.0039) +[2024-06-18 03:50:30,394][12862] Signal inference workers to stop experience collection... 
(12350 times) +[2024-06-18 03:50:30,395][12862] Signal inference workers to resume experience collection... (12350 times) +[2024-06-18 03:50:30,407][12883] InferenceWorker_p0-w0: stopping experience collection (12350 times) +[2024-06-18 03:50:30,407][12883] InferenceWorker_p0-w0: resuming experience collection (12350 times) +[2024-06-18 03:50:31,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42053.9, 300 sec: 42154.4). Total num frames: 858783744. Throughput: 0: 42385.0. Samples: 858871580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 03:50:31,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 03:50:33,394][12883] Updated weights for policy 0, policy_version 52421 (0.0032) +[2024-06-18 03:50:36,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 42210.5). Total num frames: 858980352. Throughput: 0: 42097.3. Samples: 859118360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 03:50:36,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 03:50:37,109][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000052429_858996736.pth... +[2024-06-18 03:50:37,156][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000051811_848871424.pth +[2024-06-18 03:50:37,695][12883] Updated weights for policy 0, policy_version 52431 (0.0032) +[2024-06-18 03:50:41,013][12883] Updated weights for policy 0, policy_version 52441 (0.0030) +[2024-06-18 03:50:41,995][12645] Fps is (10 sec: 44232.7, 60 sec: 42870.8, 300 sec: 42376.1). Total num frames: 859226112. Throughput: 0: 42131.2. Samples: 859371700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 03:50:41,995][12645] Avg episode reward: [(0, '0.189')] +[2024-06-18 03:50:45,437][12883] Updated weights for policy 0, policy_version 52451 (0.0039) +[2024-06-18 03:50:46,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42596.9, 300 sec: 42264.8). Total num frames: 859439104. Throughput: 0: 42348.5. Samples: 859504420. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 03:50:46,996][12645] Avg episode reward: [(0, '0.189')] +[2024-06-18 03:50:48,662][12883] Updated weights for policy 0, policy_version 52461 (0.0028) +[2024-06-18 03:50:51,996][12645] Fps is (10 sec: 40954.5, 60 sec: 42323.8, 300 sec: 42264.9). Total num frames: 859635712. Throughput: 0: 42159.0. Samples: 859753360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 03:50:51,996][12645] Avg episode reward: [(0, '0.234')] +[2024-06-18 03:50:53,038][12883] Updated weights for policy 0, policy_version 52471 (0.0033) +[2024-06-18 03:50:56,479][12883] Updated weights for policy 0, policy_version 52481 (0.0028) +[2024-06-18 03:50:56,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42873.2, 300 sec: 42376.2). Total num frames: 859865088. Throughput: 0: 42264.4. Samples: 860008540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 03:50:56,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 03:51:01,158][12883] Updated weights for policy 0, policy_version 52491 (0.0031) +[2024-06-18 03:51:01,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42052.6, 300 sec: 42265.2). Total num frames: 860061696. Throughput: 0: 42180.5. Samples: 860138220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 03:51:01,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 03:51:04,081][12883] Updated weights for policy 0, policy_version 52501 (0.0041) +[2024-06-18 03:51:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42265.2). Total num frames: 860274688. Throughput: 0: 42158.3. Samples: 860391560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 03:51:06,994][12645] Avg episode reward: [(0, '0.047')] +[2024-06-18 03:51:08,869][12883] Updated weights for policy 0, policy_version 52511 (0.0029) +[2024-06-18 03:51:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 860487680. Throughput: 0: 42269.4. Samples: 860640640. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 03:51:11,994][12645] Avg episode reward: [(0, '0.085')] +[2024-06-18 03:51:12,340][12883] Updated weights for policy 0, policy_version 52521 (0.0035) +[2024-06-18 03:51:16,515][12883] Updated weights for policy 0, policy_version 52531 (0.0031) +[2024-06-18 03:51:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 860684288. Throughput: 0: 42175.9. Samples: 860769500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 03:51:16,994][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 03:51:19,757][12883] Updated weights for policy 0, policy_version 52541 (0.0035) +[2024-06-18 03:51:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 860913664. Throughput: 0: 42370.6. Samples: 861025040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 03:51:21,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 03:51:24,082][12883] Updated weights for policy 0, policy_version 52551 (0.0037) +[2024-06-18 03:51:26,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 861126656. Throughput: 0: 42494.9. Samples: 861283940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 03:51:26,994][12645] Avg episode reward: [(0, '0.039')] +[2024-06-18 03:51:27,380][12883] Updated weights for policy 0, policy_version 52561 (0.0035) +[2024-06-18 03:51:31,912][12883] Updated weights for policy 0, policy_version 52571 (0.0035) +[2024-06-18 03:51:31,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42323.7, 300 sec: 42264.9). Total num frames: 861323264. Throughput: 0: 42323.6. Samples: 861408980. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 03:51:31,996][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 03:51:35,237][12883] Updated weights for policy 0, policy_version 52581 (0.0033) +[2024-06-18 03:51:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 861552640. Throughput: 0: 42433.6. Samples: 861662780. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 03:51:36,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 03:51:39,606][12883] Updated weights for policy 0, policy_version 52591 (0.0032) +[2024-06-18 03:51:41,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42325.9, 300 sec: 42265.1). Total num frames: 861765632. Throughput: 0: 42352.4. Samples: 861914400. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 03:51:41,994][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 03:51:43,229][12883] Updated weights for policy 0, policy_version 52601 (0.0050) +[2024-06-18 03:51:46,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42053.9, 300 sec: 42320.7). Total num frames: 861962240. Throughput: 0: 42300.5. Samples: 862041740. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 03:51:46,994][12645] Avg episode reward: [(0, '0.213')] +[2024-06-18 03:51:47,211][12883] Updated weights for policy 0, policy_version 52611 (0.0026) +[2024-06-18 03:51:50,697][12883] Updated weights for policy 0, policy_version 52621 (0.0033) +[2024-06-18 03:51:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42600.0, 300 sec: 42320.7). Total num frames: 862191616. Throughput: 0: 42315.5. Samples: 862295760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 03:51:51,994][12645] Avg episode reward: [(0, '0.265')] +[2024-06-18 03:51:55,170][12883] Updated weights for policy 0, policy_version 52631 (0.0038) +[2024-06-18 03:51:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 862388224. Throughput: 0: 42458.3. Samples: 862551260. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 03:51:56,994][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 03:51:58,550][12883] Updated weights for policy 0, policy_version 52641 (0.0035) +[2024-06-18 03:52:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42265.5). Total num frames: 862601216. Throughput: 0: 42289.3. Samples: 862672520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 03:52:01,994][12645] Avg episode reward: [(0, '0.129')] +[2024-06-18 03:52:03,013][12883] Updated weights for policy 0, policy_version 52651 (0.0029) +[2024-06-18 03:52:03,534][12862] Signal inference workers to stop experience collection... (12400 times) +[2024-06-18 03:52:03,552][12883] InferenceWorker_p0-w0: stopping experience collection (12400 times) +[2024-06-18 03:52:03,591][12862] Signal inference workers to resume experience collection... (12400 times) +[2024-06-18 03:52:03,592][12883] InferenceWorker_p0-w0: resuming experience collection (12400 times) +[2024-06-18 03:52:06,706][12883] Updated weights for policy 0, policy_version 52661 (0.0029) +[2024-06-18 03:52:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 862830592. Throughput: 0: 42414.7. Samples: 862933700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 03:52:06,994][12645] Avg episode reward: [(0, '0.219')] +[2024-06-18 03:52:10,635][12883] Updated weights for policy 0, policy_version 52671 (0.0040) +[2024-06-18 03:52:11,995][12645] Fps is (10 sec: 42591.8, 60 sec: 42324.2, 300 sec: 42320.5). Total num frames: 863027200. Throughput: 0: 42101.3. Samples: 863178560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 03:52:11,996][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 03:52:14,387][12883] Updated weights for policy 0, policy_version 52681 (0.0044) +[2024-06-18 03:52:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 863240192. 
Throughput: 0: 42247.3. Samples: 863310020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 03:52:16,994][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 03:52:18,395][12883] Updated weights for policy 0, policy_version 52691 (0.0036) +[2024-06-18 03:52:21,994][12645] Fps is (10 sec: 40966.5, 60 sec: 42052.3, 300 sec: 42209.7). Total num frames: 863436800. Throughput: 0: 42086.3. Samples: 863556660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 03:52:21,994][12645] Avg episode reward: [(0, '0.107')] +[2024-06-18 03:52:22,231][12883] Updated weights for policy 0, policy_version 52701 (0.0033) +[2024-06-18 03:52:26,035][12883] Updated weights for policy 0, policy_version 52711 (0.0033) +[2024-06-18 03:52:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 863666176. Throughput: 0: 42019.2. Samples: 863805260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 03:52:26,994][12645] Avg episode reward: [(0, '0.069')] +[2024-06-18 03:52:30,042][12883] Updated weights for policy 0, policy_version 52721 (0.0037) +[2024-06-18 03:52:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42053.8, 300 sec: 42154.1). Total num frames: 863846400. Throughput: 0: 42099.4. Samples: 863936220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:52:31,994][12645] Avg episode reward: [(0, '0.146')] +[2024-06-18 03:52:33,638][12883] Updated weights for policy 0, policy_version 52731 (0.0039) +[2024-06-18 03:52:36,994][12645] Fps is (10 sec: 39320.9, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 864059392. Throughput: 0: 41916.7. Samples: 864182020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:52:36,994][12645] Avg episode reward: [(0, '0.167')] +[2024-06-18 03:52:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000052738_864059392.pth... 
+[2024-06-18 03:52:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000052121_853950464.pth +[2024-06-18 03:52:38,045][12883] Updated weights for policy 0, policy_version 52741 (0.0028) +[2024-06-18 03:52:41,362][12883] Updated weights for policy 0, policy_version 52751 (0.0032) +[2024-06-18 03:52:41,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 864305152. Throughput: 0: 41818.6. Samples: 864433100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:52:41,994][12645] Avg episode reward: [(0, '0.105')] +[2024-06-18 03:52:45,695][12883] Updated weights for policy 0, policy_version 52761 (0.0034) +[2024-06-18 03:52:46,994][12645] Fps is (10 sec: 40960.9, 60 sec: 41779.2, 300 sec: 42098.8). Total num frames: 864468992. Throughput: 0: 42037.9. Samples: 864564220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:52:46,994][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 03:52:49,102][12883] Updated weights for policy 0, policy_version 52771 (0.0040) +[2024-06-18 03:52:51,996][12645] Fps is (10 sec: 39312.9, 60 sec: 41777.6, 300 sec: 42209.3). Total num frames: 864698368. Throughput: 0: 41831.7. Samples: 864816220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:52:51,996][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 03:52:53,386][12883] Updated weights for policy 0, policy_version 52781 (0.0051) +[2024-06-18 03:52:56,824][12883] Updated weights for policy 0, policy_version 52791 (0.0042) +[2024-06-18 03:52:56,996][12645] Fps is (10 sec: 45864.3, 60 sec: 42323.7, 300 sec: 42264.8). Total num frames: 864927744. Throughput: 0: 41901.5. Samples: 865064160. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:52:56,997][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 03:53:01,083][12883] Updated weights for policy 0, policy_version 52801 (0.0040) +[2024-06-18 03:53:01,996][12645] Fps is (10 sec: 39321.2, 60 sec: 41504.5, 300 sec: 42098.2). Total num frames: 865091584. Throughput: 0: 41702.4. Samples: 865186720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 03:53:01,997][12645] Avg episode reward: [(0, '0.128')] +[2024-06-18 03:53:04,870][12883] Updated weights for policy 0, policy_version 52811 (0.0036) +[2024-06-18 03:53:06,994][12645] Fps is (10 sec: 40969.1, 60 sec: 41779.1, 300 sec: 42265.5). Total num frames: 865337344. Throughput: 0: 41792.8. Samples: 865437340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 03:53:06,995][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 03:53:09,047][12883] Updated weights for policy 0, policy_version 52821 (0.0044) +[2024-06-18 03:53:11,994][12645] Fps is (10 sec: 44247.5, 60 sec: 41780.3, 300 sec: 42154.4). Total num frames: 865533952. Throughput: 0: 41993.8. Samples: 865694980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 03:53:11,994][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 03:53:12,651][12883] Updated weights for policy 0, policy_version 52831 (0.0023) +[2024-06-18 03:53:16,994][12645] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 865730560. Throughput: 0: 41706.7. Samples: 865813020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 03:53:16,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 03:53:17,759][12883] Updated weights for policy 0, policy_version 52841 (0.0028) +[2024-06-18 03:53:19,991][12862] Signal inference workers to stop experience collection... (12450 times) +[2024-06-18 03:53:19,991][12862] Signal inference workers to resume experience collection... 
(12450 times) +[2024-06-18 03:53:20,012][12883] InferenceWorker_p0-w0: stopping experience collection (12450 times) +[2024-06-18 03:53:20,012][12883] InferenceWorker_p0-w0: resuming experience collection (12450 times) +[2024-06-18 03:53:20,303][12883] Updated weights for policy 0, policy_version 52851 (0.0036) +[2024-06-18 03:53:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 865976320. Throughput: 0: 41854.0. Samples: 866065440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 03:53:21,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 03:53:25,344][12883] Updated weights for policy 0, policy_version 52861 (0.0035) +[2024-06-18 03:53:26,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 42209.6). Total num frames: 866156544. Throughput: 0: 42197.7. Samples: 866332000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 03:53:26,994][12645] Avg episode reward: [(0, '0.274')] +[2024-06-18 03:53:28,072][12883] Updated weights for policy 0, policy_version 52871 (0.0036) +[2024-06-18 03:53:31,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 866353152. Throughput: 0: 42025.3. Samples: 866455360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 03:53:31,994][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 03:53:32,874][12883] Updated weights for policy 0, policy_version 52881 (0.0034) +[2024-06-18 03:53:36,188][12883] Updated weights for policy 0, policy_version 52891 (0.0027) +[2024-06-18 03:53:37,010][12645] Fps is (10 sec: 45799.5, 60 sec: 42586.7, 300 sec: 42262.8). Total num frames: 866615296. Throughput: 0: 41975.1. Samples: 866705700. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) +[2024-06-18 03:53:37,011][12645] Avg episode reward: [(0, '0.137')] +[2024-06-18 03:53:40,499][12883] Updated weights for policy 0, policy_version 52901 (0.0040) +[2024-06-18 03:53:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41506.2, 300 sec: 42209.6). Total num frames: 866795520. Throughput: 0: 42305.3. Samples: 866967800. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) +[2024-06-18 03:53:41,994][12645] Avg episode reward: [(0, '0.217')] +[2024-06-18 03:53:43,760][12883] Updated weights for policy 0, policy_version 52911 (0.0032) +[2024-06-18 03:53:46,994][12645] Fps is (10 sec: 39385.8, 60 sec: 42325.1, 300 sec: 42154.1). Total num frames: 867008512. Throughput: 0: 42177.9. Samples: 867084640. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) +[2024-06-18 03:53:46,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 03:53:48,552][12883] Updated weights for policy 0, policy_version 52921 (0.0030) +[2024-06-18 03:53:51,542][12883] Updated weights for policy 0, policy_version 52931 (0.0030) +[2024-06-18 03:53:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42327.0, 300 sec: 42209.6). Total num frames: 867237888. Throughput: 0: 42402.4. Samples: 867345440. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) +[2024-06-18 03:53:51,994][12645] Avg episode reward: [(0, '0.151')] +[2024-06-18 03:53:56,095][12883] Updated weights for policy 0, policy_version 52941 (0.0032) +[2024-06-18 03:53:56,994][12645] Fps is (10 sec: 42599.7, 60 sec: 41780.8, 300 sec: 42154.1). Total num frames: 867434496. Throughput: 0: 42275.5. Samples: 867597380. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) +[2024-06-18 03:53:56,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 03:53:59,611][12883] Updated weights for policy 0, policy_version 52951 (0.0026) +[2024-06-18 03:54:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42873.1, 300 sec: 42209.6). Total num frames: 867663872. Throughput: 0: 42392.8. Samples: 867720700. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) +[2024-06-18 03:54:01,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 03:54:03,893][12883] Updated weights for policy 0, policy_version 52961 (0.0037) +[2024-06-18 03:54:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 867860480. Throughput: 0: 42561.2. Samples: 867980700. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) +[2024-06-18 03:54:06,998][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 03:54:07,145][12883] Updated weights for policy 0, policy_version 52971 (0.0033) +[2024-06-18 03:54:11,599][12883] Updated weights for policy 0, policy_version 52981 (0.0026) +[2024-06-18 03:54:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 868057088. Throughput: 0: 42359.1. Samples: 868238160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 03:54:11,994][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 03:54:14,791][12883] Updated weights for policy 0, policy_version 52991 (0.0028) +[2024-06-18 03:54:16,993][12645] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 868286464. Throughput: 0: 42252.5. Samples: 868356720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 03:54:16,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 03:54:19,197][12883] Updated weights for policy 0, policy_version 53001 (0.0039) +[2024-06-18 03:54:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42052.1, 300 sec: 42265.1). Total num frames: 868499456. Throughput: 0: 42423.5. Samples: 868614060. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 03:54:21,994][12645] Avg episode reward: [(0, '0.287')] +[2024-06-18 03:54:22,387][12883] Updated weights for policy 0, policy_version 53011 (0.0036) +[2024-06-18 03:54:26,803][12883] Updated weights for policy 0, policy_version 53021 (0.0037) +[2024-06-18 03:54:26,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42154.4). Total num frames: 868696064. Throughput: 0: 42218.6. Samples: 868867640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 03:54:26,994][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 03:54:30,173][12883] Updated weights for policy 0, policy_version 53031 (0.0022) +[2024-06-18 03:54:31,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 868925440. Throughput: 0: 42437.7. Samples: 868994320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 03:54:31,994][12645] Avg episode reward: [(0, '0.213')] +[2024-06-18 03:54:34,371][12883] Updated weights for policy 0, policy_version 53041 (0.0031) +[2024-06-18 03:54:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41517.6, 300 sec: 42209.6). Total num frames: 869105664. Throughput: 0: 42310.6. Samples: 869249420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 03:54:36,994][12645] Avg episode reward: [(0, '0.138')] +[2024-06-18 03:54:37,179][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053048_869138432.pth... +[2024-06-18 03:54:37,238][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000052429_858996736.pth +[2024-06-18 03:54:37,441][12862] Signal inference workers to stop experience collection... (12500 times) +[2024-06-18 03:54:37,441][12862] Signal inference workers to resume experience collection... 
(12500 times) +[2024-06-18 03:54:37,474][12883] InferenceWorker_p0-w0: stopping experience collection (12500 times) +[2024-06-18 03:54:37,474][12883] InferenceWorker_p0-w0: resuming experience collection (12500 times) +[2024-06-18 03:54:37,791][12883] Updated weights for policy 0, policy_version 53051 (0.0034) +[2024-06-18 03:54:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 869335040. Throughput: 0: 42143.5. Samples: 869493840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) +[2024-06-18 03:54:41,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 03:54:42,014][12883] Updated weights for policy 0, policy_version 53061 (0.0036) +[2024-06-18 03:54:45,390][12883] Updated weights for policy 0, policy_version 53071 (0.0024) +[2024-06-18 03:54:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 869548032. Throughput: 0: 42374.6. Samples: 869627560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:54:46,994][12645] Avg episode reward: [(0, '0.147')] +[2024-06-18 03:54:49,813][12883] Updated weights for policy 0, policy_version 53081 (0.0038) +[2024-06-18 03:54:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42265.5). Total num frames: 869761024. Throughput: 0: 42103.2. Samples: 869875340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:54:51,996][12645] Avg episode reward: [(0, '0.132')] +[2024-06-18 03:54:54,052][12883] Updated weights for policy 0, policy_version 53091 (0.0031) +[2024-06-18 03:54:56,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41779.2, 300 sec: 42043.1). Total num frames: 869941248. Throughput: 0: 41987.7. Samples: 870127600. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:54:56,994][12645] Avg episode reward: [(0, '0.207')] +[2024-06-18 03:54:57,788][12883] Updated weights for policy 0, policy_version 53101 (0.0036) +[2024-06-18 03:55:01,743][12883] Updated weights for policy 0, policy_version 53111 (0.0032) +[2024-06-18 03:55:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 870170624. Throughput: 0: 42051.5. Samples: 870249040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:55:01,994][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 03:55:05,543][12883] Updated weights for policy 0, policy_version 53121 (0.0035) +[2024-06-18 03:55:06,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 870383616. Throughput: 0: 41863.2. Samples: 870497900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:55:06,994][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 03:55:09,499][12883] Updated weights for policy 0, policy_version 53131 (0.0024) +[2024-06-18 03:55:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 870563840. Throughput: 0: 41911.6. Samples: 870753660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:55:11,995][12645] Avg episode reward: [(0, '0.243')] +[2024-06-18 03:55:13,330][12883] Updated weights for policy 0, policy_version 53141 (0.0028) +[2024-06-18 03:55:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.1, 300 sec: 42209.6). Total num frames: 870809600. Throughput: 0: 41805.6. Samples: 870875580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 03:55:16,994][12645] Avg episode reward: [(0, '0.116')] +[2024-06-18 03:55:17,196][12883] Updated weights for policy 0, policy_version 53151 (0.0036) +[2024-06-18 03:55:21,361][12883] Updated weights for policy 0, policy_version 53161 (0.0034) +[2024-06-18 03:55:21,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 871022592. Throughput: 0: 41856.9. Samples: 871132980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 03:55:21,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 03:55:24,957][12883] Updated weights for policy 0, policy_version 53171 (0.0041) +[2024-06-18 03:55:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 871202816. Throughput: 0: 41997.4. Samples: 871383720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 03:55:26,994][12645] Avg episode reward: [(0, '0.102')] +[2024-06-18 03:55:29,020][12883] Updated weights for policy 0, policy_version 53181 (0.0047) +[2024-06-18 03:55:31,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42050.6, 300 sec: 42264.8). Total num frames: 871448576. Throughput: 0: 41814.8. Samples: 871509320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 03:55:31,997][12645] Avg episode reward: [(0, '0.086')] +[2024-06-18 03:55:32,974][12883] Updated weights for policy 0, policy_version 53191 (0.0036) +[2024-06-18 03:55:36,836][12883] Updated weights for policy 0, policy_version 53201 (0.0030) +[2024-06-18 03:55:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42098.7). Total num frames: 871645184. Throughput: 0: 42019.1. Samples: 871766200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 03:55:36,999][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 03:55:40,554][12883] Updated weights for policy 0, policy_version 53211 (0.0029) +[2024-06-18 03:55:41,994][12645] Fps is (10 sec: 39330.5, 60 sec: 41779.2, 300 sec: 42043.3). Total num frames: 871841792. Throughput: 0: 41828.4. Samples: 872009880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 03:55:41,994][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 03:55:44,811][12883] Updated weights for policy 0, policy_version 53221 (0.0041) +[2024-06-18 03:55:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42154.4). Total num frames: 872071168. Throughput: 0: 41951.4. Samples: 872136860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 03:55:46,994][12645] Avg episode reward: [(0, '0.143')] +[2024-06-18 03:55:48,587][12883] Updated weights for policy 0, policy_version 53231 (0.0033) +[2024-06-18 03:55:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 872251392. Throughput: 0: 42069.4. Samples: 872391020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 03:55:51,994][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 03:55:52,671][12883] Updated weights for policy 0, policy_version 53241 (0.0039) +[2024-06-18 03:55:56,060][12883] Updated weights for policy 0, policy_version 53251 (0.0035) +[2024-06-18 03:55:56,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 872464384. Throughput: 0: 41961.4. Samples: 872641920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 03:55:56,994][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 03:56:00,482][12883] Updated weights for policy 0, policy_version 53261 (0.0026) +[2024-06-18 03:56:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 872710144. Throughput: 0: 42120.1. Samples: 872770980. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 03:56:01,994][12645] Avg episode reward: [(0, '0.093')] +[2024-06-18 03:56:04,107][12883] Updated weights for policy 0, policy_version 53271 (0.0028) +[2024-06-18 03:56:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 872906752. Throughput: 0: 42008.5. Samples: 873023360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 03:56:06,994][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 03:56:08,222][12883] Updated weights for policy 0, policy_version 53281 (0.0042) +[2024-06-18 03:56:10,230][12862] Signal inference workers to stop experience collection... (12550 times) +[2024-06-18 03:56:10,230][12862] Signal inference workers to resume experience collection... (12550 times) +[2024-06-18 03:56:10,240][12883] InferenceWorker_p0-w0: stopping experience collection (12550 times) +[2024-06-18 03:56:10,244][12883] InferenceWorker_p0-w0: resuming experience collection (12550 times) +[2024-06-18 03:56:11,789][12883] Updated weights for policy 0, policy_version 53291 (0.0039) +[2024-06-18 03:56:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 873119744. Throughput: 0: 41896.4. Samples: 873269060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 03:56:11,994][12645] Avg episode reward: [(0, '0.214')] +[2024-06-18 03:56:16,128][12883] Updated weights for policy 0, policy_version 53301 (0.0023) +[2024-06-18 03:56:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 873332736. Throughput: 0: 41912.3. Samples: 873395280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 03:56:16,994][12645] Avg episode reward: [(0, '0.128')] +[2024-06-18 03:56:19,498][12883] Updated weights for policy 0, policy_version 53311 (0.0032) +[2024-06-18 03:56:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 873512960. 
Throughput: 0: 41878.2. Samples: 873650720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 03:56:21,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 03:56:24,161][12883] Updated weights for policy 0, policy_version 53321 (0.0042) +[2024-06-18 03:56:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42154.4). Total num frames: 873758720. Throughput: 0: 41748.0. Samples: 873888540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 03:56:26,994][12645] Avg episode reward: [(0, '0.182')] +[2024-06-18 03:56:27,546][12883] Updated weights for policy 0, policy_version 53331 (0.0035) +[2024-06-18 03:56:31,970][12883] Updated weights for policy 0, policy_version 53341 (0.0029) +[2024-06-18 03:56:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41507.7, 300 sec: 41987.5). Total num frames: 873938944. Throughput: 0: 41921.8. Samples: 874023340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 03:56:31,995][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 03:56:35,927][12883] Updated weights for policy 0, policy_version 53351 (0.0039) +[2024-06-18 03:56:36,994][12645] Fps is (10 sec: 37682.9, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 874135552. Throughput: 0: 41823.9. Samples: 874273100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 03:56:36,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 03:56:37,114][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053354_874151936.pth... +[2024-06-18 03:56:37,174][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000052738_864059392.pth +[2024-06-18 03:56:39,852][12883] Updated weights for policy 0, policy_version 53361 (0.0031) +[2024-06-18 03:56:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 874381312. Throughput: 0: 41609.3. Samples: 874514340. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 03:56:41,994][12645] Avg episode reward: [(0, '0.132')] +[2024-06-18 03:56:43,749][12883] Updated weights for policy 0, policy_version 53371 (0.0041) +[2024-06-18 03:56:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 874545152. Throughput: 0: 41852.0. Samples: 874654320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 03:56:46,994][12645] Avg episode reward: [(0, '0.138')] +[2024-06-18 03:56:47,678][12883] Updated weights for policy 0, policy_version 53381 (0.0035) +[2024-06-18 03:56:51,431][12883] Updated weights for policy 0, policy_version 53391 (0.0029) +[2024-06-18 03:56:51,994][12645] Fps is (10 sec: 37683.7, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 874758144. Throughput: 0: 41579.6. Samples: 874894440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 03:56:51,994][12645] Avg episode reward: [(0, '0.116')] +[2024-06-18 03:56:55,396][12883] Updated weights for policy 0, policy_version 53401 (0.0034) +[2024-06-18 03:56:56,994][12645] Fps is (10 sec: 47513.1, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 875020288. Throughput: 0: 41729.7. Samples: 875146900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) +[2024-06-18 03:56:56,994][12645] Avg episode reward: [(0, '0.093')] +[2024-06-18 03:56:59,204][12883] Updated weights for policy 0, policy_version 53411 (0.0044) +[2024-06-18 03:57:01,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 875200512. Throughput: 0: 41925.4. Samples: 875281920. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) +[2024-06-18 03:57:01,994][12645] Avg episode reward: [(0, '0.124')] +[2024-06-18 03:57:03,070][12883] Updated weights for policy 0, policy_version 53421 (0.0046) +[2024-06-18 03:57:06,820][12883] Updated weights for policy 0, policy_version 53431 (0.0044) +[2024-06-18 03:57:06,996][12645] Fps is (10 sec: 39313.2, 60 sec: 41777.6, 300 sec: 41987.4). Total num frames: 875413504. Throughput: 0: 41695.8. Samples: 875527120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) +[2024-06-18 03:57:06,996][12645] Avg episode reward: [(0, '0.155')] +[2024-06-18 03:57:10,852][12883] Updated weights for policy 0, policy_version 53441 (0.0035) +[2024-06-18 03:57:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 875626496. Throughput: 0: 42113.9. Samples: 875783660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) +[2024-06-18 03:57:11,994][12645] Avg episode reward: [(0, '0.200')] +[2024-06-18 03:57:14,794][12883] Updated weights for policy 0, policy_version 53451 (0.0038) +[2024-06-18 03:57:16,994][12645] Fps is (10 sec: 42608.0, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 875839488. Throughput: 0: 41948.9. Samples: 875911040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) +[2024-06-18 03:57:16,994][12645] Avg episode reward: [(0, '0.218')] +[2024-06-18 03:57:18,829][12883] Updated weights for policy 0, policy_version 53461 (0.0037) +[2024-06-18 03:57:21,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 876036096. Throughput: 0: 41769.3. Samples: 876152720. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) +[2024-06-18 03:57:21,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 03:57:22,802][12883] Updated weights for policy 0, policy_version 53471 (0.0040) +[2024-06-18 03:57:26,555][12883] Updated weights for policy 0, policy_version 53481 (0.0045) +[2024-06-18 03:57:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 876249088. Throughput: 0: 42000.5. Samples: 876404360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) +[2024-06-18 03:57:26,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 03:57:30,200][12862] Signal inference workers to stop experience collection... (12600 times) +[2024-06-18 03:57:30,201][12862] Signal inference workers to resume experience collection... (12600 times) +[2024-06-18 03:57:30,231][12883] InferenceWorker_p0-w0: stopping experience collection (12600 times) +[2024-06-18 03:57:30,232][12883] InferenceWorker_p0-w0: resuming experience collection (12600 times) +[2024-06-18 03:57:30,556][12883] Updated weights for policy 0, policy_version 53491 (0.0028) +[2024-06-18 03:57:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 876462080. Throughput: 0: 41706.1. Samples: 876531100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 03:57:31,994][12645] Avg episode reward: [(0, '0.116')] +[2024-06-18 03:57:34,168][12883] Updated weights for policy 0, policy_version 53501 (0.0047) +[2024-06-18 03:57:36,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42050.7, 300 sec: 41876.1). Total num frames: 876658688. Throughput: 0: 41867.6. Samples: 876778580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 03:57:36,996][12645] Avg episode reward: [(0, '0.222')] +[2024-06-18 03:57:38,724][12883] Updated weights for policy 0, policy_version 53511 (0.0032) +[2024-06-18 03:57:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 876871680. 
Throughput: 0: 41792.9. Samples: 877027580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 03:57:41,994][12645] Avg episode reward: [(0, '0.165')] +[2024-06-18 03:57:42,107][12883] Updated weights for policy 0, policy_version 53521 (0.0046) +[2024-06-18 03:57:46,436][12883] Updated weights for policy 0, policy_version 53531 (0.0034) +[2024-06-18 03:57:46,994][12645] Fps is (10 sec: 40968.7, 60 sec: 42052.2, 300 sec: 41932.2). Total num frames: 877068288. Throughput: 0: 41728.3. Samples: 877159700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 03:57:46,994][12645] Avg episode reward: [(0, '0.235')] +[2024-06-18 03:57:49,662][12883] Updated weights for policy 0, policy_version 53541 (0.0032) +[2024-06-18 03:57:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 41932.3). Total num frames: 877297664. Throughput: 0: 41720.2. Samples: 877404440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 03:57:51,996][12645] Avg episode reward: [(0, '0.102')] +[2024-06-18 03:57:54,542][12883] Updated weights for policy 0, policy_version 53551 (0.0032) +[2024-06-18 03:57:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 41779.1, 300 sec: 42154.4). Total num frames: 877527040. Throughput: 0: 41594.4. Samples: 877655420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 03:57:56,994][12645] Avg episode reward: [(0, '0.170')] +[2024-06-18 03:57:57,559][12883] Updated weights for policy 0, policy_version 53561 (0.0032) +[2024-06-18 03:58:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 877690880. Throughput: 0: 41585.2. Samples: 877782380. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 03:58:01,994][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 03:58:02,230][12883] Updated weights for policy 0, policy_version 53571 (0.0045) +[2024-06-18 03:58:05,205][12883] Updated weights for policy 0, policy_version 53581 (0.0031) +[2024-06-18 03:58:06,994][12645] Fps is (10 sec: 37683.6, 60 sec: 41507.6, 300 sec: 41931.9). Total num frames: 877903872. Throughput: 0: 41675.6. Samples: 878028120. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 03:58:06,994][12645] Avg episode reward: [(0, '0.059')] +[2024-06-18 03:58:09,837][12883] Updated weights for policy 0, policy_version 53591 (0.0033) +[2024-06-18 03:58:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 878133248. Throughput: 0: 41743.1. Samples: 878282800. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 03:58:11,994][12645] Avg episode reward: [(0, '0.107')] +[2024-06-18 03:58:13,054][12883] Updated weights for policy 0, policy_version 53601 (0.0035) +[2024-06-18 03:58:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 41820.8). Total num frames: 878313472. Throughput: 0: 41724.4. Samples: 878408700. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 03:58:16,994][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 03:58:18,084][12883] Updated weights for policy 0, policy_version 53611 (0.0033) +[2024-06-18 03:58:20,747][12883] Updated weights for policy 0, policy_version 53621 (0.0039) +[2024-06-18 03:58:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 878559232. Throughput: 0: 41666.5. Samples: 878653480. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 03:58:21,994][12645] Avg episode reward: [(0, '0.120')] +[2024-06-18 03:58:25,694][12883] Updated weights for policy 0, policy_version 53631 (0.0037) +[2024-06-18 03:58:26,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41779.0, 300 sec: 42043.0). Total num frames: 878755840. Throughput: 0: 41874.1. Samples: 878911920. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 03:58:26,995][12645] Avg episode reward: [(0, '0.114')] +[2024-06-18 03:58:28,606][12883] Updated weights for policy 0, policy_version 53641 (0.0039) +[2024-06-18 03:58:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41823.2). Total num frames: 878952448. Throughput: 0: 41687.2. Samples: 879035620. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 03:58:31,994][12645] Avg episode reward: [(0, '0.044')] +[2024-06-18 03:58:33,441][12883] Updated weights for policy 0, policy_version 53651 (0.0022) +[2024-06-18 03:58:36,377][12883] Updated weights for policy 0, policy_version 53661 (0.0031) +[2024-06-18 03:58:36,996][12645] Fps is (10 sec: 44227.7, 60 sec: 42325.3, 300 sec: 42042.7). Total num frames: 879198208. Throughput: 0: 41893.0. Samples: 879289720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 03:58:36,997][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 03:58:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053662_879198208.pth... +[2024-06-18 03:58:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053048_869138432.pth +[2024-06-18 03:58:41,109][12883] Updated weights for policy 0, policy_version 53671 (0.0030) +[2024-06-18 03:58:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 879394816. Throughput: 0: 42128.7. Samples: 879551200. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 03:58:41,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 03:58:44,119][12883] Updated weights for policy 0, policy_version 53681 (0.0049) +[2024-06-18 03:58:46,994][12645] Fps is (10 sec: 39330.7, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 879591424. Throughput: 0: 41937.0. Samples: 879669540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 03:58:46,994][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 03:58:47,360][12862] Signal inference workers to stop experience collection... (12650 times) +[2024-06-18 03:58:47,370][12883] InferenceWorker_p0-w0: stopping experience collection (12650 times) +[2024-06-18 03:58:47,418][12862] Signal inference workers to resume experience collection... (12650 times) +[2024-06-18 03:58:47,419][12883] InferenceWorker_p0-w0: resuming experience collection (12650 times) +[2024-06-18 03:58:48,836][12883] Updated weights for policy 0, policy_version 53691 (0.0039) +[2024-06-18 03:58:51,973][12883] Updated weights for policy 0, policy_version 53701 (0.0036) +[2024-06-18 03:58:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 879837184. Throughput: 0: 42152.5. Samples: 879924980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 03:58:51,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 03:58:56,427][12883] Updated weights for policy 0, policy_version 53711 (0.0028) +[2024-06-18 03:58:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 880017408. Throughput: 0: 42148.0. Samples: 880179460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 03:58:56,994][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 03:58:59,646][12883] Updated weights for policy 0, policy_version 53721 (0.0042) +[2024-06-18 03:59:01,994][12645] Fps is (10 sec: 36044.4, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 880197632. 
Throughput: 0: 42050.7. Samples: 880300980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 03:59:01,994][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 03:59:04,355][12883] Updated weights for policy 0, policy_version 53731 (0.0036) +[2024-06-18 03:59:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 880443392. Throughput: 0: 42250.2. Samples: 880554740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 03:59:06,994][12645] Avg episode reward: [(0, '0.091')] +[2024-06-18 03:59:08,152][12883] Updated weights for policy 0, policy_version 53741 (0.0033) +[2024-06-18 03:59:11,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 880640000. Throughput: 0: 42155.8. Samples: 880808920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 03:59:11,994][12645] Avg episode reward: [(0, '0.085')] +[2024-06-18 03:59:12,013][12883] Updated weights for policy 0, policy_version 53751 (0.0027) +[2024-06-18 03:59:15,918][12883] Updated weights for policy 0, policy_version 53761 (0.0027) +[2024-06-18 03:59:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 880852992. Throughput: 0: 42143.0. Samples: 880932060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 03:59:16,994][12645] Avg episode reward: [(0, '0.209')] +[2024-06-18 03:59:19,703][12883] Updated weights for policy 0, policy_version 53771 (0.0031) +[2024-06-18 03:59:21,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 881098752. Throughput: 0: 42181.6. Samples: 881187800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 03:59:21,994][12645] Avg episode reward: [(0, '0.213')] +[2024-06-18 03:59:24,101][12883] Updated weights for policy 0, policy_version 53781 (0.0029) +[2024-06-18 03:59:27,000][12645] Fps is (10 sec: 44209.5, 60 sec: 42321.1, 300 sec: 41931.0). Total num frames: 881295360. 
Throughput: 0: 42057.2. Samples: 881444040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 03:59:27,000][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 03:59:27,286][12883] Updated weights for policy 0, policy_version 53791 (0.0042) +[2024-06-18 03:59:31,917][12883] Updated weights for policy 0, policy_version 53801 (0.0038) +[2024-06-18 03:59:31,998][12645] Fps is (10 sec: 37667.0, 60 sec: 42049.2, 300 sec: 41931.3). Total num frames: 881475584. Throughput: 0: 42154.1. Samples: 881566660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 03:59:31,999][12645] Avg episode reward: [(0, '0.111')] +[2024-06-18 03:59:35,145][12883] Updated weights for policy 0, policy_version 53811 (0.0032) +[2024-06-18 03:59:36,994][12645] Fps is (10 sec: 40985.9, 60 sec: 41780.8, 300 sec: 41931.9). Total num frames: 881704960. Throughput: 0: 42032.5. Samples: 881816440. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 03:59:36,994][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 03:59:39,536][12883] Updated weights for policy 0, policy_version 53821 (0.0027) +[2024-06-18 03:59:41,994][12645] Fps is (10 sec: 44256.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 881917952. Throughput: 0: 42166.7. Samples: 882076960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 03:59:42,000][12645] Avg episode reward: [(0, '0.039')] +[2024-06-18 03:59:42,883][12883] Updated weights for policy 0, policy_version 53831 (0.0030) +[2024-06-18 03:59:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 882114560. Throughput: 0: 42151.3. Samples: 882197780. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 03:59:46,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 03:59:47,400][12883] Updated weights for policy 0, policy_version 53841 (0.0040) +[2024-06-18 03:59:50,469][12883] Updated weights for policy 0, policy_version 53851 (0.0041) +[2024-06-18 03:59:51,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42050.7, 300 sec: 42098.2). Total num frames: 882360320. Throughput: 0: 42036.2. Samples: 882446460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 03:59:51,996][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 03:59:55,261][12883] Updated weights for policy 0, policy_version 53861 (0.0028) +[2024-06-18 03:59:56,994][12645] Fps is (10 sec: 42597.3, 60 sec: 42052.1, 300 sec: 41931.9). Total num frames: 882540544. Throughput: 0: 42104.2. Samples: 882703620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 03:59:56,994][12645] Avg episode reward: [(0, '0.165')] +[2024-06-18 03:59:58,289][12883] Updated weights for policy 0, policy_version 53871 (0.0040) +[2024-06-18 04:00:01,994][12645] Fps is (10 sec: 37691.3, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 882737152. Throughput: 0: 42118.3. Samples: 882827380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 04:00:01,994][12645] Avg episode reward: [(0, '0.102')] +[2024-06-18 04:00:03,035][12883] Updated weights for policy 0, policy_version 53881 (0.0033) +[2024-06-18 04:00:05,909][12883] Updated weights for policy 0, policy_version 53891 (0.0032) +[2024-06-18 04:00:06,994][12645] Fps is (10 sec: 44238.1, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 882982912. Throughput: 0: 42041.5. Samples: 883079660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 04:00:06,994][12645] Avg episode reward: [(0, '0.090')] +[2024-06-18 04:00:10,456][12862] Signal inference workers to stop experience collection... 
(12700 times) +[2024-06-18 04:00:10,456][12862] Signal inference workers to resume experience collection... (12700 times) +[2024-06-18 04:00:10,505][12883] InferenceWorker_p0-w0: stopping experience collection (12700 times) +[2024-06-18 04:00:10,505][12883] InferenceWorker_p0-w0: resuming experience collection (12700 times) +[2024-06-18 04:00:10,786][12883] Updated weights for policy 0, policy_version 53901 (0.0032) +[2024-06-18 04:00:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 883163136. Throughput: 0: 42200.5. Samples: 883342800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 04:00:11,994][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 04:00:13,523][12883] Updated weights for policy 0, policy_version 53911 (0.0030) +[2024-06-18 04:00:16,994][12645] Fps is (10 sec: 37682.6, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 883359744. Throughput: 0: 42150.7. Samples: 883463260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 04:00:17,000][12645] Avg episode reward: [(0, '0.157')] +[2024-06-18 04:00:18,496][12883] Updated weights for policy 0, policy_version 53921 (0.0051) +[2024-06-18 04:00:21,201][12883] Updated weights for policy 0, policy_version 53931 (0.0033) +[2024-06-18 04:00:21,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 883638272. Throughput: 0: 42258.1. Samples: 883718060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 04:00:21,995][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 04:00:26,410][12883] Updated weights for policy 0, policy_version 53941 (0.0042) +[2024-06-18 04:00:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41510.5, 300 sec: 41821.2). Total num frames: 883785728. Throughput: 0: 42159.1. Samples: 883974120. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 04:00:26,994][12645] Avg episode reward: [(0, '0.073')] +[2024-06-18 04:00:28,894][12883] Updated weights for policy 0, policy_version 53951 (0.0023) +[2024-06-18 04:00:31,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42328.5, 300 sec: 41931.9). Total num frames: 884015104. Throughput: 0: 42031.1. Samples: 884089180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 04:00:31,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 04:00:34,205][12883] Updated weights for policy 0, policy_version 53961 (0.0046) +[2024-06-18 04:00:36,802][12883] Updated weights for policy 0, policy_version 53971 (0.0033) +[2024-06-18 04:00:36,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 884260864. Throughput: 0: 42277.1. Samples: 884348840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 04:00:36,994][12645] Avg episode reward: [(0, '0.267')] +[2024-06-18 04:00:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053971_884260864.pth... +[2024-06-18 04:00:37,060][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053354_874151936.pth +[2024-06-18 04:00:41,937][12883] Updated weights for policy 0, policy_version 53981 (0.0044) +[2024-06-18 04:00:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 884424704. Throughput: 0: 42378.8. Samples: 884610660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 04:00:41,994][12645] Avg episode reward: [(0, '0.252')] +[2024-06-18 04:00:44,574][12883] Updated weights for policy 0, policy_version 53991 (0.0027) +[2024-06-18 04:00:46,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42596.8, 300 sec: 42098.2). Total num frames: 884670464. Throughput: 0: 42219.3. Samples: 884727340. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 04:00:46,997][12645] Avg episode reward: [(0, '0.174')] +[2024-06-18 04:00:49,482][12883] Updated weights for policy 0, policy_version 54001 (0.0038) +[2024-06-18 04:00:51,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42326.8, 300 sec: 42154.1). Total num frames: 884899840. Throughput: 0: 42418.1. Samples: 884988480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 04:00:51,994][12645] Avg episode reward: [(0, '0.144')] +[2024-06-18 04:00:52,242][12883] Updated weights for policy 0, policy_version 54011 (0.0041) +[2024-06-18 04:00:56,994][12645] Fps is (10 sec: 36053.3, 60 sec: 41506.4, 300 sec: 41765.3). Total num frames: 885030912. Throughput: 0: 42483.7. Samples: 885254560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 04:00:56,994][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 04:00:57,311][12883] Updated weights for policy 0, policy_version 54021 (0.0042) +[2024-06-18 04:01:00,027][12883] Updated weights for policy 0, policy_version 54031 (0.0039) +[2024-06-18 04:01:01,999][12645] Fps is (10 sec: 40938.3, 60 sec: 42867.7, 300 sec: 42042.2). Total num frames: 885309440. Throughput: 0: 42199.5. Samples: 885362460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 04:01:01,999][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 04:01:04,855][12883] Updated weights for policy 0, policy_version 54041 (0.0028) +[2024-06-18 04:01:06,266][12862] Signal inference workers to stop experience collection... (12750 times) +[2024-06-18 04:01:06,266][12862] Signal inference workers to resume experience collection... (12750 times) +[2024-06-18 04:01:06,277][12883] InferenceWorker_p0-w0: stopping experience collection (12750 times) +[2024-06-18 04:01:06,277][12883] InferenceWorker_p0-w0: resuming experience collection (12750 times) +[2024-06-18 04:01:06,994][12645] Fps is (10 sec: 49152.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 885522432. 
Throughput: 0: 42466.8. Samples: 885629060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 04:01:06,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 04:01:07,667][12883] Updated weights for policy 0, policy_version 54051 (0.0030) +[2024-06-18 04:01:11,994][12645] Fps is (10 sec: 39342.6, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 885702656. Throughput: 0: 42635.5. Samples: 885892720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 04:01:11,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 04:01:12,523][12883] Updated weights for policy 0, policy_version 54061 (0.0023) +[2024-06-18 04:01:15,245][12883] Updated weights for policy 0, policy_version 54071 (0.0027) +[2024-06-18 04:01:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43417.7, 300 sec: 42209.6). Total num frames: 885964800. Throughput: 0: 42640.8. Samples: 886008020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 04:01:16,994][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 04:01:20,348][12883] Updated weights for policy 0, policy_version 54081 (0.0046) +[2024-06-18 04:01:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 886145024. Throughput: 0: 42565.9. Samples: 886264300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) +[2024-06-18 04:01:21,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 04:01:23,145][12883] Updated weights for policy 0, policy_version 54091 (0.0052) +[2024-06-18 04:01:26,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 886341632. Throughput: 0: 42450.7. Samples: 886520940. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 04:01:26,994][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 04:01:28,051][12883] Updated weights for policy 0, policy_version 54101 (0.0042) +[2024-06-18 04:01:30,964][12883] Updated weights for policy 0, policy_version 54111 (0.0029) +[2024-06-18 04:01:31,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 886587392. Throughput: 0: 42641.6. Samples: 886646120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 04:01:31,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 04:01:35,619][12883] Updated weights for policy 0, policy_version 54121 (0.0022) +[2024-06-18 04:01:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 886751232. Throughput: 0: 42377.4. Samples: 886895460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 04:01:36,994][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 04:01:39,094][12883] Updated weights for policy 0, policy_version 54131 (0.0028) +[2024-06-18 04:01:42,000][12645] Fps is (10 sec: 37660.1, 60 sec: 42320.9, 300 sec: 42097.7). Total num frames: 886964224. Throughput: 0: 42030.1. Samples: 887146180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 04:01:42,001][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 04:01:43,351][12883] Updated weights for policy 0, policy_version 54141 (0.0033) +[2024-06-18 04:01:46,905][12883] Updated weights for policy 0, policy_version 54151 (0.0028) +[2024-06-18 04:01:46,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42326.9, 300 sec: 42209.6). Total num frames: 887209984. Throughput: 0: 42553.5. Samples: 887277140. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 04:01:46,994][12645] Avg episode reward: [(0, '0.115')] +[2024-06-18 04:01:50,929][12883] Updated weights for policy 0, policy_version 54161 (0.0032) +[2024-06-18 04:01:51,994][12645] Fps is (10 sec: 42624.7, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 887390208. Throughput: 0: 42190.5. Samples: 887527640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 04:01:51,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 04:01:54,828][12883] Updated weights for policy 0, policy_version 54171 (0.0038) +[2024-06-18 04:01:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42098.5). Total num frames: 887619584. Throughput: 0: 41782.6. Samples: 887772940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 04:01:56,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 04:01:58,880][12883] Updated weights for policy 0, policy_version 54181 (0.0046) +[2024-06-18 04:02:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42056.1, 300 sec: 42098.9). Total num frames: 887832576. Throughput: 0: 42229.4. Samples: 887908340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 04:02:01,994][12645] Avg episode reward: [(0, '0.152')] +[2024-06-18 04:02:02,408][12883] Updated weights for policy 0, policy_version 54191 (0.0039) +[2024-06-18 04:02:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41506.0, 300 sec: 41987.5). Total num frames: 888012800. Throughput: 0: 41812.8. Samples: 888145880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 04:02:06,994][12645] Avg episode reward: [(0, '0.243')] +[2024-06-18 04:02:07,186][12883] Updated weights for policy 0, policy_version 54201 (0.0033) +[2024-06-18 04:02:10,075][12862] Signal inference workers to stop experience collection... (12800 times) +[2024-06-18 04:02:10,076][12862] Signal inference workers to resume experience collection... 
(12800 times) +[2024-06-18 04:02:10,097][12883] InferenceWorker_p0-w0: stopping experience collection (12800 times) +[2024-06-18 04:02:10,097][12883] InferenceWorker_p0-w0: resuming experience collection (12800 times) +[2024-06-18 04:02:10,238][12883] Updated weights for policy 0, policy_version 54211 (0.0024) +[2024-06-18 04:02:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 888258560. Throughput: 0: 41661.3. Samples: 888395700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 04:02:11,994][12645] Avg episode reward: [(0, '0.234')] +[2024-06-18 04:02:14,895][12883] Updated weights for policy 0, policy_version 54221 (0.0041) +[2024-06-18 04:02:16,994][12645] Fps is (10 sec: 42597.5, 60 sec: 41232.9, 300 sec: 42043.0). Total num frames: 888438784. Throughput: 0: 41715.5. Samples: 888523320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 04:02:16,995][12645] Avg episode reward: [(0, '0.145')] +[2024-06-18 04:02:18,503][12883] Updated weights for policy 0, policy_version 54231 (0.0035) +[2024-06-18 04:02:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 888668160. Throughput: 0: 41732.6. Samples: 888773420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 04:02:21,994][12645] Avg episode reward: [(0, '0.203')] +[2024-06-18 04:02:22,501][12883] Updated weights for policy 0, policy_version 54241 (0.0057) +[2024-06-18 04:02:26,472][12883] Updated weights for policy 0, policy_version 54251 (0.0045) +[2024-06-18 04:02:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 888864768. Throughput: 0: 41810.1. Samples: 889027380. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 04:02:26,994][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 04:02:30,217][12883] Updated weights for policy 0, policy_version 54261 (0.0033) +[2024-06-18 04:02:31,994][12645] Fps is (10 sec: 40959.1, 60 sec: 41506.1, 300 sec: 42098.9). Total num frames: 889077760. Throughput: 0: 41606.6. Samples: 889149440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:02:31,994][12645] Avg episode reward: [(0, '0.115')] +[2024-06-18 04:02:34,637][12883] Updated weights for policy 0, policy_version 54271 (0.0037) +[2024-06-18 04:02:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 889290752. Throughput: 0: 41619.1. Samples: 889400500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:02:36,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 04:02:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000054278_889290752.pth... +[2024-06-18 04:02:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053662_879198208.pth +[2024-06-18 04:02:38,079][12883] Updated weights for policy 0, policy_version 54281 (0.0034) +[2024-06-18 04:02:41,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42055.1, 300 sec: 42098.2). Total num frames: 889487360. Throughput: 0: 41904.6. Samples: 889658740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:02:41,997][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 04:02:42,393][12883] Updated weights for policy 0, policy_version 54291 (0.0028) +[2024-06-18 04:02:45,674][12883] Updated weights for policy 0, policy_version 54301 (0.0032) +[2024-06-18 04:02:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 889700352. Throughput: 0: 41694.1. Samples: 889784580. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:02:46,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 04:02:50,094][12883] Updated weights for policy 0, policy_version 54311 (0.0040) +[2024-06-18 04:02:51,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 889929728. Throughput: 0: 42003.1. Samples: 890036020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:02:51,994][12645] Avg episode reward: [(0, '0.086')] +[2024-06-18 04:02:53,550][12883] Updated weights for policy 0, policy_version 54321 (0.0034) +[2024-06-18 04:02:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 890142720. Throughput: 0: 42139.5. Samples: 890291980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:02:56,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 04:02:57,718][12883] Updated weights for policy 0, policy_version 54331 (0.0031) +[2024-06-18 04:03:01,168][12883] Updated weights for policy 0, policy_version 54341 (0.0036) +[2024-06-18 04:03:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 890355712. Throughput: 0: 42093.1. Samples: 890417500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:03:01,994][12645] Avg episode reward: [(0, '0.160')] +[2024-06-18 04:03:05,591][12883] Updated weights for policy 0, policy_version 54351 (0.0027) +[2024-06-18 04:03:06,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42596.8, 300 sec: 42153.8). Total num frames: 890568704. Throughput: 0: 42144.5. Samples: 890670020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 04:03:06,997][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 04:03:08,893][12883] Updated weights for policy 0, policy_version 54361 (0.0034) +[2024-06-18 04:03:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 890765312. Throughput: 0: 42045.0. Samples: 890919400. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 04:03:11,994][12645] Avg episode reward: [(0, '0.118')] +[2024-06-18 04:03:13,516][12883] Updated weights for policy 0, policy_version 54371 (0.0028) +[2024-06-18 04:03:16,695][12883] Updated weights for policy 0, policy_version 54381 (0.0038) +[2024-06-18 04:03:16,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 890994688. Throughput: 0: 42112.9. Samples: 891044520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 04:03:16,994][12645] Avg episode reward: [(0, '0.110')] +[2024-06-18 04:03:21,258][12883] Updated weights for policy 0, policy_version 54391 (0.0037) +[2024-06-18 04:03:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 891174912. Throughput: 0: 42175.2. Samples: 891298380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 04:03:21,994][12645] Avg episode reward: [(0, '0.207')] +[2024-06-18 04:03:24,414][12883] Updated weights for policy 0, policy_version 54401 (0.0042) +[2024-06-18 04:03:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 891387904. Throughput: 0: 42199.5. Samples: 891557620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 04:03:26,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 04:03:29,023][12883] Updated weights for policy 0, policy_version 54411 (0.0030) +[2024-06-18 04:03:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42098.9). Total num frames: 891617280. Throughput: 0: 42096.6. Samples: 891678920. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 04:03:31,994][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 04:03:32,105][12883] Updated weights for policy 0, policy_version 54421 (0.0038) +[2024-06-18 04:03:36,780][12883] Updated weights for policy 0, policy_version 54431 (0.0035) +[2024-06-18 04:03:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 891813888. Throughput: 0: 42055.5. Samples: 891928520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 04:03:36,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 04:03:38,159][12862] Signal inference workers to stop experience collection... (12850 times) +[2024-06-18 04:03:38,160][12862] Signal inference workers to resume experience collection... (12850 times) +[2024-06-18 04:03:38,180][12883] InferenceWorker_p0-w0: stopping experience collection (12850 times) +[2024-06-18 04:03:38,181][12883] InferenceWorker_p0-w0: resuming experience collection (12850 times) +[2024-06-18 04:03:39,845][12883] Updated weights for policy 0, policy_version 54441 (0.0038) +[2024-06-18 04:03:41,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42325.3, 300 sec: 42153.8). Total num frames: 892026880. Throughput: 0: 42134.8. Samples: 892188140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 04:03:41,997][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 04:03:44,484][12883] Updated weights for policy 0, policy_version 54451 (0.0027) +[2024-06-18 04:03:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 892239872. Throughput: 0: 42081.0. Samples: 892311140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 04:03:46,994][12645] Avg episode reward: [(0, '0.156')] +[2024-06-18 04:03:47,824][12883] Updated weights for policy 0, policy_version 54461 (0.0039) +[2024-06-18 04:03:51,994][12645] Fps is (10 sec: 40968.8, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 892436480. 
Throughput: 0: 42025.1. Samples: 892561060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 04:03:51,994][12645] Avg episode reward: [(0, '0.136')] +[2024-06-18 04:03:52,059][12883] Updated weights for policy 0, policy_version 54471 (0.0033) +[2024-06-18 04:03:55,848][12883] Updated weights for policy 0, policy_version 54481 (0.0036) +[2024-06-18 04:03:56,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 892649472. Throughput: 0: 42245.2. Samples: 892820440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 04:03:56,994][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 04:04:00,165][12883] Updated weights for policy 0, policy_version 54491 (0.0044) +[2024-06-18 04:04:01,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 892862464. Throughput: 0: 42248.6. Samples: 892945700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 04:04:01,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 04:04:03,526][12883] Updated weights for policy 0, policy_version 54501 (0.0038) +[2024-06-18 04:04:06,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41780.7, 300 sec: 42154.1). Total num frames: 893075456. Throughput: 0: 42171.5. Samples: 893196100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 04:04:06,994][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 04:04:07,941][12883] Updated weights for policy 0, policy_version 54511 (0.0039) +[2024-06-18 04:04:11,422][12883] Updated weights for policy 0, policy_version 54521 (0.0038) +[2024-06-18 04:04:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 893288448. Throughput: 0: 42072.0. Samples: 893450860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 04:04:11,994][12645] Avg episode reward: [(0, '0.236')] +[2024-06-18 04:04:15,450][12883] Updated weights for policy 0, policy_version 54531 (0.0023) +[2024-06-18 04:04:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 893517824. Throughput: 0: 42221.7. Samples: 893578900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 04:04:16,994][12645] Avg episode reward: [(0, '0.243')] +[2024-06-18 04:04:19,158][12883] Updated weights for policy 0, policy_version 54541 (0.0030) +[2024-06-18 04:04:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42099.4). Total num frames: 893714432. Throughput: 0: 42287.1. Samples: 893831440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 04:04:21,994][12645] Avg episode reward: [(0, '0.197')] +[2024-06-18 04:04:22,995][12883] Updated weights for policy 0, policy_version 54551 (0.0032) +[2024-06-18 04:04:26,885][12883] Updated weights for policy 0, policy_version 54561 (0.0038) +[2024-06-18 04:04:26,994][12645] Fps is (10 sec: 40957.5, 60 sec: 42324.9, 300 sec: 42210.2). Total num frames: 893927424. Throughput: 0: 42130.0. Samples: 894083920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 04:04:26,995][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 04:04:31,052][12883] Updated weights for policy 0, policy_version 54571 (0.0031) +[2024-06-18 04:04:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 894156800. Throughput: 0: 42219.9. Samples: 894211040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 04:04:31,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 04:04:34,601][12883] Updated weights for policy 0, policy_version 54581 (0.0034) +[2024-06-18 04:04:36,994][12645] Fps is (10 sec: 39323.9, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 894320640. Throughput: 0: 42180.1. Samples: 894459160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 04:04:36,994][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 04:04:37,065][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000054586_894337024.pth... +[2024-06-18 04:04:37,119][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053971_884260864.pth +[2024-06-18 04:04:38,507][12883] Updated weights for policy 0, policy_version 54591 (0.0031) +[2024-06-18 04:04:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42053.9, 300 sec: 42154.1). Total num frames: 894550016. Throughput: 0: 42132.6. Samples: 894716400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 04:04:41,994][12645] Avg episode reward: [(0, '0.219')] +[2024-06-18 04:04:42,543][12883] Updated weights for policy 0, policy_version 54601 (0.0037) +[2024-06-18 04:04:45,996][12883] Updated weights for policy 0, policy_version 54611 (0.0036) +[2024-06-18 04:04:46,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42325.4, 300 sec: 42098.9). Total num frames: 894779392. Throughput: 0: 42233.4. Samples: 894846200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 04:04:46,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 04:04:50,341][12883] Updated weights for policy 0, policy_version 54621 (0.0034) +[2024-06-18 04:04:51,994][12645] Fps is (10 sec: 42597.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 894976000. Throughput: 0: 42271.0. Samples: 895098300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 04:04:51,995][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 04:04:53,879][12883] Updated weights for policy 0, policy_version 54631 (0.0031) +[2024-06-18 04:04:56,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42323.9, 300 sec: 42209.3). Total num frames: 895188992. Throughput: 0: 42073.5. Samples: 895344260. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 04:04:56,996][12645] Avg episode reward: [(0, '0.084')] +[2024-06-18 04:04:58,065][12883] Updated weights for policy 0, policy_version 54641 (0.0030) +[2024-06-18 04:05:01,407][12883] Updated weights for policy 0, policy_version 54651 (0.0041) +[2024-06-18 04:05:01,996][12645] Fps is (10 sec: 42589.6, 60 sec: 42323.7, 300 sec: 42098.2). Total num frames: 895401984. Throughput: 0: 42253.9. Samples: 895480420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 04:05:01,997][12645] Avg episode reward: [(0, '0.174')] +[2024-06-18 04:05:04,900][12862] Signal inference workers to stop experience collection... (12900 times) +[2024-06-18 04:05:04,901][12862] Signal inference workers to resume experience collection... (12900 times) +[2024-06-18 04:05:04,955][12883] InferenceWorker_p0-w0: stopping experience collection (12900 times) +[2024-06-18 04:05:04,955][12883] InferenceWorker_p0-w0: resuming experience collection (12900 times) +[2024-06-18 04:05:05,802][12883] Updated weights for policy 0, policy_version 54661 (0.0026) +[2024-06-18 04:05:06,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 895598592. Throughput: 0: 42264.0. Samples: 895733320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 04:05:06,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 04:05:09,235][12883] Updated weights for policy 0, policy_version 54671 (0.0034) +[2024-06-18 04:05:11,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 895811584. Throughput: 0: 42305.9. Samples: 895987660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 04:05:11,994][12645] Avg episode reward: [(0, '0.217')] +[2024-06-18 04:05:13,483][12883] Updated weights for policy 0, policy_version 54681 (0.0037) +[2024-06-18 04:05:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 896024576. 
Throughput: 0: 42210.3. Samples: 896110500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 04:05:16,994][12645] Avg episode reward: [(0, '0.132')] +[2024-06-18 04:05:17,444][12883] Updated weights for policy 0, policy_version 54691 (0.0029) +[2024-06-18 04:05:21,141][12883] Updated weights for policy 0, policy_version 54701 (0.0039) +[2024-06-18 04:05:21,996][12645] Fps is (10 sec: 40950.6, 60 sec: 41777.6, 300 sec: 42153.8). Total num frames: 896221184. Throughput: 0: 42109.0. Samples: 896354160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 04:05:21,997][12645] Avg episode reward: [(0, '0.083')] +[2024-06-18 04:05:25,102][12883] Updated weights for policy 0, policy_version 54711 (0.0031) +[2024-06-18 04:05:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.7, 300 sec: 42154.1). Total num frames: 896450560. Throughput: 0: 42093.7. Samples: 896610620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:05:26,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 04:05:28,713][12883] Updated weights for policy 0, policy_version 54721 (0.0025) +[2024-06-18 04:05:31,996][12645] Fps is (10 sec: 42598.4, 60 sec: 41504.6, 300 sec: 41987.2). Total num frames: 896647168. Throughput: 0: 42136.0. Samples: 896742420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:05:31,997][12645] Avg episode reward: [(0, '0.216')] +[2024-06-18 04:05:32,820][12883] Updated weights for policy 0, policy_version 54731 (0.0026) +[2024-06-18 04:05:36,247][12883] Updated weights for policy 0, policy_version 54741 (0.0033) +[2024-06-18 04:05:36,993][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42209.7). Total num frames: 896876544. Throughput: 0: 41822.1. Samples: 896980280. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:05:36,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 04:05:40,494][12883] Updated weights for policy 0, policy_version 54751 (0.0043) +[2024-06-18 04:05:41,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42052.2, 300 sec: 42043.3). Total num frames: 897073152. Throughput: 0: 42087.4. Samples: 897238100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:05:41,994][12645] Avg episode reward: [(0, '0.187')] +[2024-06-18 04:05:44,181][12883] Updated weights for policy 0, policy_version 54761 (0.0034) +[2024-06-18 04:05:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41932.0). Total num frames: 897269760. Throughput: 0: 41891.5. Samples: 897365440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:05:46,994][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 04:05:48,044][12883] Updated weights for policy 0, policy_version 54771 (0.0037) +[2024-06-18 04:05:51,659][12883] Updated weights for policy 0, policy_version 54781 (0.0025) +[2024-06-18 04:05:51,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.6, 300 sec: 42376.2). Total num frames: 897531904. Throughput: 0: 42010.7. Samples: 897623800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:05:51,994][12645] Avg episode reward: [(0, '0.078')] +[2024-06-18 04:05:53,487][12862] Signal inference workers to stop experience collection... (12950 times) +[2024-06-18 04:05:53,543][12862] Signal inference workers to resume experience collection... (12950 times) +[2024-06-18 04:05:53,548][12883] InferenceWorker_p0-w0: stopping experience collection (12950 times) +[2024-06-18 04:05:53,572][12883] InferenceWorker_p0-w0: resuming experience collection (12950 times) +[2024-06-18 04:05:55,787][12883] Updated weights for policy 0, policy_version 54791 (0.0037) +[2024-06-18 04:05:56,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42053.8, 300 sec: 42043.8). Total num frames: 897712128. 
Throughput: 0: 42066.6. Samples: 897880660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 04:05:56,994][12645] Avg episode reward: [(0, '0.067')] +[2024-06-18 04:05:59,249][12883] Updated weights for policy 0, policy_version 54801 (0.0035) +[2024-06-18 04:06:01,993][12645] Fps is (10 sec: 36045.3, 60 sec: 41507.8, 300 sec: 41931.9). Total num frames: 897892352. Throughput: 0: 41964.5. Samples: 897998900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 04:06:01,994][12645] Avg episode reward: [(0, '0.183')] +[2024-06-18 04:06:03,828][12883] Updated weights for policy 0, policy_version 54811 (0.0032) +[2024-06-18 04:06:06,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 898170880. Throughput: 0: 42282.9. Samples: 898256800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 04:06:06,994][12645] Avg episode reward: [(0, '0.131')] +[2024-06-18 04:06:07,008][12883] Updated weights for policy 0, policy_version 54821 (0.0024) +[2024-06-18 04:06:11,597][12883] Updated weights for policy 0, policy_version 54831 (0.0040) +[2024-06-18 04:06:12,000][12645] Fps is (10 sec: 45845.9, 60 sec: 42320.9, 300 sec: 41986.6). Total num frames: 898351104. Throughput: 0: 42116.8. Samples: 898506140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 04:06:12,000][12645] Avg episode reward: [(0, '0.170')] +[2024-06-18 04:06:15,775][12883] Updated weights for policy 0, policy_version 54841 (0.0033) +[2024-06-18 04:06:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 898547712. Throughput: 0: 41921.2. Samples: 898628780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 04:06:16,999][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 04:06:19,555][12883] Updated weights for policy 0, policy_version 54851 (0.0030) +[2024-06-18 04:06:21,993][12645] Fps is (10 sec: 42625.6, 60 sec: 42600.1, 300 sec: 42154.1). Total num frames: 898777088. 
Throughput: 0: 42452.9. Samples: 898890660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 04:06:21,994][12645] Avg episode reward: [(0, '0.447')] +[2024-06-18 04:06:21,995][12862] Saving new best policy, reward=0.447! +[2024-06-18 04:06:23,994][12883] Updated weights for policy 0, policy_version 54861 (0.0026) +[2024-06-18 04:06:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 898990080. Throughput: 0: 42271.1. Samples: 899140300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 04:06:26,994][12645] Avg episode reward: [(0, '0.190')] +[2024-06-18 04:06:27,652][12883] Updated weights for policy 0, policy_version 54871 (0.0031) +[2024-06-18 04:06:31,695][12883] Updated weights for policy 0, policy_version 54881 (0.0029) +[2024-06-18 04:06:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42053.9, 300 sec: 42098.6). Total num frames: 899170304. Throughput: 0: 42305.7. Samples: 899269200. Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) +[2024-06-18 04:06:31,994][12645] Avg episode reward: [(0, '0.146')] +[2024-06-18 04:06:35,095][12883] Updated weights for policy 0, policy_version 54891 (0.0029) +[2024-06-18 04:06:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.2, 300 sec: 42210.5). Total num frames: 899416064. Throughput: 0: 42260.9. Samples: 899525540. Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) +[2024-06-18 04:06:36,994][12645] Avg episode reward: [(0, '0.180')] +[2024-06-18 04:06:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000054896_899416064.pth... +[2024-06-18 04:06:37,054][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000054278_889290752.pth +[2024-06-18 04:06:39,135][12883] Updated weights for policy 0, policy_version 54901 (0.0034) +[2024-06-18 04:06:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 899612672. Throughput: 0: 42178.7. Samples: 899778700. 
Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) +[2024-06-18 04:06:41,994][12645] Avg episode reward: [(0, '0.267')] +[2024-06-18 04:06:42,792][12883] Updated weights for policy 0, policy_version 54911 (0.0026) +[2024-06-18 04:06:46,668][12883] Updated weights for policy 0, policy_version 54921 (0.0033) +[2024-06-18 04:06:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 899825664. Throughput: 0: 42382.1. Samples: 899906100. Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) +[2024-06-18 04:06:46,994][12645] Avg episode reward: [(0, '0.183')] +[2024-06-18 04:06:50,687][12883] Updated weights for policy 0, policy_version 54931 (0.0039) +[2024-06-18 04:06:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 900055040. Throughput: 0: 42395.2. Samples: 900164580. Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) +[2024-06-18 04:06:51,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 04:06:54,351][12883] Updated weights for policy 0, policy_version 54941 (0.0033) +[2024-06-18 04:06:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 900251648. Throughput: 0: 42465.5. Samples: 900416820. Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) +[2024-06-18 04:06:56,994][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 04:06:58,368][12883] Updated weights for policy 0, policy_version 54951 (0.0030) +[2024-06-18 04:07:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 900464640. Throughput: 0: 42545.9. Samples: 900543340. 
Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) +[2024-06-18 04:07:01,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 04:07:02,227][12883] Updated weights for policy 0, policy_version 54961 (0.0048) +[2024-06-18 04:07:06,171][12883] Updated weights for policy 0, policy_version 54971 (0.0050) +[2024-06-18 04:07:06,996][12645] Fps is (10 sec: 42588.1, 60 sec: 41777.7, 300 sec: 42098.2). Total num frames: 900677632. Throughput: 0: 42401.7. Samples: 900798840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:07:06,997][12645] Avg episode reward: [(0, '0.109')] +[2024-06-18 04:07:09,770][12883] Updated weights for policy 0, policy_version 54981 (0.0023) +[2024-06-18 04:07:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42329.8, 300 sec: 42209.7). Total num frames: 900890624. Throughput: 0: 42367.7. Samples: 901046840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:07:11,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 04:07:13,700][12883] Updated weights for policy 0, policy_version 54991 (0.0032) +[2024-06-18 04:07:16,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 901103616. Throughput: 0: 42273.3. Samples: 901171500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:07:16,994][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 04:07:17,801][12883] Updated weights for policy 0, policy_version 55001 (0.0039) +[2024-06-18 04:07:18,561][12862] Signal inference workers to stop experience collection... (13000 times) +[2024-06-18 04:07:18,562][12862] Signal inference workers to resume experience collection... 
(13000 times) +[2024-06-18 04:07:18,577][12883] InferenceWorker_p0-w0: stopping experience collection (13000 times) +[2024-06-18 04:07:18,577][12883] InferenceWorker_p0-w0: resuming experience collection (13000 times) +[2024-06-18 04:07:21,599][12883] Updated weights for policy 0, policy_version 55011 (0.0029) +[2024-06-18 04:07:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42209.7). Total num frames: 901316608. Throughput: 0: 42389.0. Samples: 901433040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:07:21,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 04:07:25,472][12883] Updated weights for policy 0, policy_version 55021 (0.0046) +[2024-06-18 04:07:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42209.7). Total num frames: 901529600. Throughput: 0: 42253.3. Samples: 901680100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:07:26,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 04:07:29,351][12883] Updated weights for policy 0, policy_version 55031 (0.0032) +[2024-06-18 04:07:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 901742592. Throughput: 0: 42283.6. Samples: 901808860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:07:31,994][12645] Avg episode reward: [(0, '0.228')] +[2024-06-18 04:07:33,395][12883] Updated weights for policy 0, policy_version 55041 (0.0032) +[2024-06-18 04:07:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42210.0). Total num frames: 901939200. Throughput: 0: 42211.7. Samples: 902064100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:07:36,994][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 04:07:37,015][12883] Updated weights for policy 0, policy_version 55051 (0.0028) +[2024-06-18 04:07:41,163][12883] Updated weights for policy 0, policy_version 55061 (0.0037) +[2024-06-18 04:07:41,993][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42209.7). Total num frames: 902152192. Throughput: 0: 42210.3. Samples: 902316280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 04:07:41,994][12645] Avg episode reward: [(0, '0.052')] +[2024-06-18 04:07:44,877][12883] Updated weights for policy 0, policy_version 55071 (0.0028) +[2024-06-18 04:07:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 902365184. Throughput: 0: 42257.8. Samples: 902444940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 04:07:46,994][12645] Avg episode reward: [(0, '0.125')] +[2024-06-18 04:07:48,951][12883] Updated weights for policy 0, policy_version 55081 (0.0038) +[2024-06-18 04:07:51,996][12645] Fps is (10 sec: 44226.2, 60 sec: 42323.8, 300 sec: 42209.3). Total num frames: 902594560. Throughput: 0: 42182.3. Samples: 902697040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 04:07:51,996][12645] Avg episode reward: [(0, '0.181')] +[2024-06-18 04:07:52,650][12883] Updated weights for policy 0, policy_version 55091 (0.0029) +[2024-06-18 04:07:56,707][12883] Updated weights for policy 0, policy_version 55101 (0.0041) +[2024-06-18 04:07:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 902774784. Throughput: 0: 42381.4. Samples: 902954000. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 04:07:56,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 04:08:00,373][12883] Updated weights for policy 0, policy_version 55111 (0.0038) +[2024-06-18 04:08:01,994][12645] Fps is (10 sec: 40969.6, 60 sec: 42325.3, 300 sec: 42154.4). Total num frames: 903004160. Throughput: 0: 42334.7. Samples: 903076560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 04:08:01,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 04:08:04,210][12883] Updated weights for policy 0, policy_version 55121 (0.0029) +[2024-06-18 04:08:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42053.9, 300 sec: 42154.1). Total num frames: 903200768. Throughput: 0: 42065.2. Samples: 903325980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 04:08:06,994][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 04:08:08,109][12883] Updated weights for policy 0, policy_version 55131 (0.0048) +[2024-06-18 04:08:11,993][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 903413760. Throughput: 0: 42256.5. Samples: 903581640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 04:08:11,994][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 04:08:12,055][12883] Updated weights for policy 0, policy_version 55141 (0.0030) +[2024-06-18 04:08:16,193][12883] Updated weights for policy 0, policy_version 55151 (0.0033) +[2024-06-18 04:08:16,994][12645] Fps is (10 sec: 42595.8, 60 sec: 42051.8, 300 sec: 42209.5). Total num frames: 903626752. Throughput: 0: 42186.5. Samples: 903707280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:08:16,995][12645] Avg episode reward: [(0, '0.203')] +[2024-06-18 04:08:19,586][12883] Updated weights for policy 0, policy_version 55161 (0.0027) +[2024-06-18 04:08:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 903839744. Throughput: 0: 42156.0. Samples: 903961120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:08:21,994][12645] Avg episode reward: [(0, '0.097')] +[2024-06-18 04:08:23,646][12883] Updated weights for policy 0, policy_version 55171 (0.0045) +[2024-06-18 04:08:26,994][12645] Fps is (10 sec: 44239.9, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 904069120. Throughput: 0: 42126.2. Samples: 904211960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:08:26,994][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 04:08:27,316][12883] Updated weights for policy 0, policy_version 55181 (0.0025) +[2024-06-18 04:08:31,809][12883] Updated weights for policy 0, policy_version 55191 (0.0040) +[2024-06-18 04:08:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 904249344. Throughput: 0: 42108.0. Samples: 904339800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:08:31,994][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 04:08:35,112][12883] Updated weights for policy 0, policy_version 55201 (0.0026) +[2024-06-18 04:08:36,996][12645] Fps is (10 sec: 39312.2, 60 sec: 42050.6, 300 sec: 42154.1). Total num frames: 904462336. Throughput: 0: 42059.5. Samples: 904589720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:08:36,996][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 04:08:37,055][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000055205_904478720.pth... +[2024-06-18 04:08:37,130][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000054586_894337024.pth +[2024-06-18 04:08:39,591][12883] Updated weights for policy 0, policy_version 55211 (0.0039) +[2024-06-18 04:08:41,996][12645] Fps is (10 sec: 44226.2, 60 sec: 42323.6, 300 sec: 42209.3). Total num frames: 904691712. Throughput: 0: 41796.0. Samples: 904834920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:08:41,997][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 04:08:43,007][12883] Updated weights for policy 0, policy_version 55221 (0.0028) +[2024-06-18 04:08:46,996][12645] Fps is (10 sec: 42598.4, 60 sec: 42050.6, 300 sec: 42209.3). Total num frames: 904888320. Throughput: 0: 42075.1. Samples: 904970040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:08:46,996][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 04:08:47,251][12883] Updated weights for policy 0, policy_version 55231 (0.0041) +[2024-06-18 04:08:47,757][12862] Signal inference workers to stop experience collection... (13050 times) +[2024-06-18 04:08:47,815][12883] InferenceWorker_p0-w0: stopping experience collection (13050 times) +[2024-06-18 04:08:47,874][12862] Signal inference workers to resume experience collection... (13050 times) +[2024-06-18 04:08:47,874][12883] InferenceWorker_p0-w0: resuming experience collection (13050 times) +[2024-06-18 04:08:51,127][12883] Updated weights for policy 0, policy_version 55241 (0.0029) +[2024-06-18 04:08:51,996][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42209.3). Total num frames: 905101312. Throughput: 0: 42015.2. Samples: 905216760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:08:51,997][12645] Avg episode reward: [(0, '0.190')] +[2024-06-18 04:08:55,258][12883] Updated weights for policy 0, policy_version 55251 (0.0027) +[2024-06-18 04:08:56,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 905314304. Throughput: 0: 41921.7. Samples: 905468120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:08:56,994][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 04:08:58,931][12883] Updated weights for policy 0, policy_version 55261 (0.0054) +[2024-06-18 04:09:01,996][12645] Fps is (10 sec: 40959.6, 60 sec: 41777.5, 300 sec: 42153.8). Total num frames: 905510912. 
Throughput: 0: 41975.3. Samples: 905596240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:09:01,997][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 04:09:03,241][12883] Updated weights for policy 0, policy_version 55271 (0.0032) +[2024-06-18 04:09:06,564][12883] Updated weights for policy 0, policy_version 55281 (0.0053) +[2024-06-18 04:09:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 905740288. Throughput: 0: 41958.1. Samples: 905849240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:09:06,994][12645] Avg episode reward: [(0, '0.091')] +[2024-06-18 04:09:10,796][12883] Updated weights for policy 0, policy_version 55291 (0.0037) +[2024-06-18 04:09:11,993][12645] Fps is (10 sec: 44247.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 905953280. Throughput: 0: 42051.6. Samples: 906104280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:09:11,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 04:09:14,313][12883] Updated weights for policy 0, policy_version 55301 (0.0041) +[2024-06-18 04:09:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.7, 300 sec: 42154.1). Total num frames: 906149888. Throughput: 0: 42137.3. Samples: 906235980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:09:16,994][12645] Avg episode reward: [(0, '0.142')] +[2024-06-18 04:09:18,252][12883] Updated weights for policy 0, policy_version 55311 (0.0049) +[2024-06-18 04:09:21,996][12645] Fps is (10 sec: 40950.3, 60 sec: 42050.6, 300 sec: 42153.9). Total num frames: 906362880. Throughput: 0: 42064.9. Samples: 906482640. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:09:21,997][12645] Avg episode reward: [(0, '0.129')] +[2024-06-18 04:09:22,346][12883] Updated weights for policy 0, policy_version 55321 (0.0027) +[2024-06-18 04:09:25,877][12883] Updated weights for policy 0, policy_version 55331 (0.0027) +[2024-06-18 04:09:26,993][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 906575872. Throughput: 0: 42386.3. Samples: 906742200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:09:26,994][12645] Avg episode reward: [(0, '0.118')] +[2024-06-18 04:09:30,004][12883] Updated weights for policy 0, policy_version 55341 (0.0037) +[2024-06-18 04:09:31,993][12645] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 906788864. Throughput: 0: 42302.3. Samples: 906873540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:09:31,994][12645] Avg episode reward: [(0, '0.136')] +[2024-06-18 04:09:33,424][12883] Updated weights for policy 0, policy_version 55351 (0.0032) +[2024-06-18 04:09:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42327.0, 300 sec: 42209.6). Total num frames: 907001856. Throughput: 0: 42443.5. Samples: 907126620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:09:36,994][12645] Avg episode reward: [(0, '0.126')] +[2024-06-18 04:09:37,558][12883] Updated weights for policy 0, policy_version 55361 (0.0025) +[2024-06-18 04:09:41,004][12883] Updated weights for policy 0, policy_version 55371 (0.0041) +[2024-06-18 04:09:42,000][12645] Fps is (10 sec: 44208.5, 60 sec: 42322.5, 300 sec: 42208.7). Total num frames: 907231232. Throughput: 0: 42553.2. Samples: 907383280. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:09:42,000][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 04:09:45,180][12883] Updated weights for policy 0, policy_version 55381 (0.0041) +[2024-06-18 04:09:46,997][12645] Fps is (10 sec: 42582.3, 60 sec: 42324.3, 300 sec: 42209.1). Total num frames: 907427840. Throughput: 0: 42563.5. Samples: 907511660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:09:46,998][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 04:09:48,776][12883] Updated weights for policy 0, policy_version 55391 (0.0029) +[2024-06-18 04:09:51,994][12645] Fps is (10 sec: 40985.9, 60 sec: 42327.0, 300 sec: 42209.9). Total num frames: 907640832. Throughput: 0: 42618.7. Samples: 907767080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:09:51,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 04:09:52,788][12883] Updated weights for policy 0, policy_version 55401 (0.0038) +[2024-06-18 04:09:56,442][12883] Updated weights for policy 0, policy_version 55411 (0.0037) +[2024-06-18 04:09:56,994][12645] Fps is (10 sec: 42614.7, 60 sec: 42325.4, 300 sec: 42210.0). Total num frames: 907853824. Throughput: 0: 42497.3. Samples: 908016660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:09:56,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 04:10:00,681][12883] Updated weights for policy 0, policy_version 55421 (0.0028) +[2024-06-18 04:10:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42320.7). Total num frames: 908083200. Throughput: 0: 42460.4. Samples: 908146700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 04:10:01,994][12645] Avg episode reward: [(0, '0.167')] +[2024-06-18 04:10:04,151][12883] Updated weights for policy 0, policy_version 55431 (0.0033) +[2024-06-18 04:10:05,212][12862] Signal inference workers to stop experience collection... 
(13100 times) +[2024-06-18 04:10:05,260][12862] Signal inference workers to resume experience collection... (13100 times) +[2024-06-18 04:10:05,261][12883] InferenceWorker_p0-w0: stopping experience collection (13100 times) +[2024-06-18 04:10:05,275][12883] InferenceWorker_p0-w0: resuming experience collection (13100 times) +[2024-06-18 04:10:06,995][12645] Fps is (10 sec: 40953.8, 60 sec: 42051.3, 300 sec: 42209.4). Total num frames: 908263424. Throughput: 0: 42491.9. Samples: 908394740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 04:10:06,995][12645] Avg episode reward: [(0, '0.149')] +[2024-06-18 04:10:08,315][12883] Updated weights for policy 0, policy_version 55441 (0.0024) +[2024-06-18 04:10:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 908492800. Throughput: 0: 42414.6. Samples: 908650860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 04:10:11,994][12645] Avg episode reward: [(0, '0.167')] +[2024-06-18 04:10:12,289][12883] Updated weights for policy 0, policy_version 55451 (0.0043) +[2024-06-18 04:10:16,397][12883] Updated weights for policy 0, policy_version 55461 (0.0028) +[2024-06-18 04:10:16,997][12645] Fps is (10 sec: 44227.4, 60 sec: 42595.8, 300 sec: 42320.5). Total num frames: 908705792. Throughput: 0: 42417.4. Samples: 908782480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 04:10:16,998][12645] Avg episode reward: [(0, '0.097')] +[2024-06-18 04:10:19,885][12883] Updated weights for policy 0, policy_version 55471 (0.0032) +[2024-06-18 04:10:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42265.2). Total num frames: 908918784. Throughput: 0: 42395.0. Samples: 909034400. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 04:10:21,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 04:10:23,918][12883] Updated weights for policy 0, policy_version 55481 (0.0032) +[2024-06-18 04:10:26,994][12645] Fps is (10 sec: 42613.9, 60 sec: 42598.4, 300 sec: 42321.0). Total num frames: 909131776. Throughput: 0: 42357.1. Samples: 909289080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 04:10:26,994][12645] Avg episode reward: [(0, '0.110')] +[2024-06-18 04:10:27,412][12883] Updated weights for policy 0, policy_version 55491 (0.0030) +[2024-06-18 04:10:31,581][12883] Updated weights for policy 0, policy_version 55501 (0.0039) +[2024-06-18 04:10:31,993][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 909328384. Throughput: 0: 42316.1. Samples: 909415720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:10:31,994][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 04:10:35,418][12883] Updated weights for policy 0, policy_version 55511 (0.0045) +[2024-06-18 04:10:36,993][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 909557760. Throughput: 0: 42210.7. Samples: 909666560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:10:36,994][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 04:10:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000055515_909557760.pth... +[2024-06-18 04:10:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000054896_899416064.pth +[2024-06-18 04:10:39,392][12883] Updated weights for policy 0, policy_version 55521 (0.0032) +[2024-06-18 04:10:41,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42328.2, 300 sec: 42375.9). Total num frames: 909770752. Throughput: 0: 42322.7. Samples: 909921280. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:10:41,996][12645] Avg episode reward: [(0, '0.103')] +[2024-06-18 04:10:42,930][12883] Updated weights for policy 0, policy_version 55531 (0.0032) +[2024-06-18 04:10:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42328.0, 300 sec: 42154.1). Total num frames: 909967360. Throughput: 0: 42250.7. Samples: 910047980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:10:46,994][12645] Avg episode reward: [(0, '0.227')] +[2024-06-18 04:10:47,284][12883] Updated weights for policy 0, policy_version 55541 (0.0035) +[2024-06-18 04:10:50,685][12883] Updated weights for policy 0, policy_version 55551 (0.0043) +[2024-06-18 04:10:51,994][12645] Fps is (10 sec: 44246.7, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 910213120. Throughput: 0: 42344.5. Samples: 910300180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:10:51,994][12645] Avg episode reward: [(0, '0.186')] +[2024-06-18 04:10:54,935][12883] Updated weights for policy 0, policy_version 55561 (0.0028) +[2024-06-18 04:10:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 910376960. Throughput: 0: 42401.8. Samples: 910558940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:10:56,994][12645] Avg episode reward: [(0, '0.075')] +[2024-06-18 04:10:58,674][12883] Updated weights for policy 0, policy_version 55571 (0.0030) +[2024-06-18 04:11:01,993][12645] Fps is (10 sec: 37683.6, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 910589952. Throughput: 0: 42061.7. Samples: 910675100. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:11:01,994][12645] Avg episode reward: [(0, '0.052')] +[2024-06-18 04:11:02,687][12883] Updated weights for policy 0, policy_version 55581 (0.0031) +[2024-06-18 04:11:06,259][12883] Updated weights for policy 0, policy_version 55591 (0.0031) +[2024-06-18 04:11:06,993][12645] Fps is (10 sec: 45875.4, 60 sec: 42872.6, 300 sec: 42321.6). Total num frames: 910835712. Throughput: 0: 42047.7. Samples: 910926540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 04:11:06,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 04:11:10,748][12883] Updated weights for policy 0, policy_version 55601 (0.0037) +[2024-06-18 04:11:11,993][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 42209.7). Total num frames: 910999552. Throughput: 0: 42095.6. Samples: 911183380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 04:11:11,994][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 04:11:14,263][12883] Updated weights for policy 0, policy_version 55611 (0.0043) +[2024-06-18 04:11:16,994][12645] Fps is (10 sec: 37682.8, 60 sec: 41781.7, 300 sec: 42154.1). Total num frames: 911212544. Throughput: 0: 41973.2. Samples: 911304520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 04:11:16,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 04:11:18,397][12883] Updated weights for policy 0, policy_version 55621 (0.0032) +[2024-06-18 04:11:21,993][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.5, 300 sec: 42265.2). Total num frames: 911458304. Throughput: 0: 42097.4. Samples: 911560940. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 04:11:21,994][12645] Avg episode reward: [(0, '0.278')] +[2024-06-18 04:11:21,998][12883] Updated weights for policy 0, policy_version 55631 (0.0025) +[2024-06-18 04:11:26,053][12883] Updated weights for policy 0, policy_version 55641 (0.0027) +[2024-06-18 04:11:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 911654912. Throughput: 0: 41974.9. Samples: 911810060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 04:11:27,003][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 04:11:29,604][12883] Updated weights for policy 0, policy_version 55651 (0.0033) +[2024-06-18 04:11:32,000][12645] Fps is (10 sec: 39296.6, 60 sec: 42047.8, 300 sec: 42153.2). Total num frames: 911851520. Throughput: 0: 41946.6. Samples: 911935840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 04:11:32,000][12645] Avg episode reward: [(0, '0.069')] +[2024-06-18 04:11:34,077][12883] Updated weights for policy 0, policy_version 55661 (0.0033) +[2024-06-18 04:11:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 912080896. Throughput: 0: 42005.8. Samples: 912190440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 04:11:36,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 04:11:37,487][12883] Updated weights for policy 0, policy_version 55671 (0.0039) +[2024-06-18 04:11:41,641][12883] Updated weights for policy 0, policy_version 55681 (0.0038) +[2024-06-18 04:11:41,993][12645] Fps is (10 sec: 42625.5, 60 sec: 41780.8, 300 sec: 42209.6). Total num frames: 912277504. Throughput: 0: 41960.1. Samples: 912447140. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:11:41,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 04:11:45,103][12883] Updated weights for policy 0, policy_version 55691 (0.0038) +[2024-06-18 04:11:46,994][12645] Fps is (10 sec: 40956.5, 60 sec: 42051.7, 300 sec: 42154.0). Total num frames: 912490496. Throughput: 0: 42105.4. Samples: 912569880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:11:46,995][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 04:11:48,483][12862] Signal inference workers to stop experience collection... (13150 times) +[2024-06-18 04:11:48,488][12862] Signal inference workers to resume experience collection... (13150 times) +[2024-06-18 04:11:48,532][12883] InferenceWorker_p0-w0: stopping experience collection (13150 times) +[2024-06-18 04:11:48,532][12883] InferenceWorker_p0-w0: resuming experience collection (13150 times) +[2024-06-18 04:11:49,148][12883] Updated weights for policy 0, policy_version 55701 (0.0043) +[2024-06-18 04:11:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 912719872. Throughput: 0: 42331.5. Samples: 912831460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:11:51,994][12645] Avg episode reward: [(0, '0.250')] +[2024-06-18 04:11:52,619][12883] Updated weights for policy 0, policy_version 55711 (0.0047) +[2024-06-18 04:11:56,857][12883] Updated weights for policy 0, policy_version 55721 (0.0039) +[2024-06-18 04:11:56,994][12645] Fps is (10 sec: 44240.3, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 912932864. Throughput: 0: 42324.8. Samples: 913088000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:11:56,994][12645] Avg episode reward: [(0, '0.085')] +[2024-06-18 04:12:00,302][12883] Updated weights for policy 0, policy_version 55731 (0.0042) +[2024-06-18 04:12:01,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42323.7, 300 sec: 42209.6). Total num frames: 913129472. 
Throughput: 0: 42396.6. Samples: 913212460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:12:01,996][12645] Avg episode reward: [(0, '0.239')] +[2024-06-18 04:12:04,690][12883] Updated weights for policy 0, policy_version 55741 (0.0032) +[2024-06-18 04:12:06,998][12645] Fps is (10 sec: 42578.2, 60 sec: 42048.9, 300 sec: 42264.5). Total num frames: 913358848. Throughput: 0: 42366.9. Samples: 913467660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:12:06,999][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 04:12:08,209][12883] Updated weights for policy 0, policy_version 55751 (0.0039) +[2024-06-18 04:12:11,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 913571840. Throughput: 0: 42530.8. Samples: 913723940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:12:11,994][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 04:12:12,138][12883] Updated weights for policy 0, policy_version 55761 (0.0045) +[2024-06-18 04:12:15,771][12883] Updated weights for policy 0, policy_version 55771 (0.0030) +[2024-06-18 04:12:16,994][12645] Fps is (10 sec: 40979.5, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 913768448. Throughput: 0: 42473.9. Samples: 913846900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 04:12:16,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 04:12:20,609][12883] Updated weights for policy 0, policy_version 55781 (0.0045) +[2024-06-18 04:12:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 913997824. Throughput: 0: 42634.7. Samples: 914109000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 04:12:21,994][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 04:12:23,350][12883] Updated weights for policy 0, policy_version 55791 (0.0030) +[2024-06-18 04:12:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 914194432. 
Throughput: 0: 42586.1. Samples: 914363520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 04:12:26,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 04:12:28,179][12883] Updated weights for policy 0, policy_version 55801 (0.0045) +[2024-06-18 04:12:31,339][12883] Updated weights for policy 0, policy_version 55811 (0.0030) +[2024-06-18 04:12:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42602.8, 300 sec: 42265.2). Total num frames: 914407424. Throughput: 0: 42474.5. Samples: 914481200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 04:12:31,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 04:12:35,851][12883] Updated weights for policy 0, policy_version 55821 (0.0033) +[2024-06-18 04:12:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 914620416. Throughput: 0: 42337.8. Samples: 914736660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 04:12:36,994][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 04:12:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000055825_914636800.pth... +[2024-06-18 04:12:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000055205_904478720.pth +[2024-06-18 04:12:39,350][12883] Updated weights for policy 0, policy_version 55831 (0.0036) +[2024-06-18 04:12:41,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 914817024. Throughput: 0: 42214.3. Samples: 914987640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 04:12:41,994][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 04:12:43,466][12883] Updated weights for policy 0, policy_version 55841 (0.0027) +[2024-06-18 04:12:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.9, 300 sec: 42209.9). Total num frames: 915046400. Throughput: 0: 42307.0. Samples: 915116180. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 04:12:46,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 04:12:47,168][12883] Updated weights for policy 0, policy_version 55851 (0.0044) +[2024-06-18 04:12:51,091][12883] Updated weights for policy 0, policy_version 55861 (0.0046) +[2024-06-18 04:12:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 915259392. Throughput: 0: 42355.7. Samples: 915373460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:12:51,994][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 04:12:54,983][12883] Updated weights for policy 0, policy_version 55871 (0.0035) +[2024-06-18 04:12:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 915472384. Throughput: 0: 42108.4. Samples: 915618820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:12:56,994][12645] Avg episode reward: [(0, '0.268')] +[2024-06-18 04:12:58,988][12883] Updated weights for policy 0, policy_version 55881 (0.0038) +[2024-06-18 04:13:00,588][12862] Signal inference workers to stop experience collection... (13200 times) +[2024-06-18 04:13:00,589][12862] Signal inference workers to resume experience collection... (13200 times) +[2024-06-18 04:13:00,618][12883] InferenceWorker_p0-w0: stopping experience collection (13200 times) +[2024-06-18 04:13:00,618][12883] InferenceWorker_p0-w0: resuming experience collection (13200 times) +[2024-06-18 04:13:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42326.9, 300 sec: 42265.2). Total num frames: 915668992. Throughput: 0: 42206.3. Samples: 915746180. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:13:01,994][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 04:13:02,699][12883] Updated weights for policy 0, policy_version 55891 (0.0024) +[2024-06-18 04:13:06,844][12883] Updated weights for policy 0, policy_version 55901 (0.0031) +[2024-06-18 04:13:06,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42054.0, 300 sec: 42264.8). Total num frames: 915881984. Throughput: 0: 42060.0. Samples: 916001800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:13:06,997][12645] Avg episode reward: [(0, '0.235')] +[2024-06-18 04:13:10,437][12883] Updated weights for policy 0, policy_version 55911 (0.0042) +[2024-06-18 04:13:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42209.7). Total num frames: 916078592. Throughput: 0: 42049.4. Samples: 916255740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:13:11,994][12645] Avg episode reward: [(0, '0.213')] +[2024-06-18 04:13:14,453][12883] Updated weights for policy 0, policy_version 55921 (0.0031) +[2024-06-18 04:13:16,994][12645] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 916307968. Throughput: 0: 42223.2. Samples: 916381240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:13:16,994][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 04:13:18,226][12883] Updated weights for policy 0, policy_version 55931 (0.0029) +[2024-06-18 04:13:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 916520960. Throughput: 0: 42149.7. Samples: 916633400. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:13:21,994][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 04:13:22,449][12883] Updated weights for policy 0, policy_version 55941 (0.0037) +[2024-06-18 04:13:26,067][12883] Updated weights for policy 0, policy_version 55951 (0.0037) +[2024-06-18 04:13:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 916717568. Throughput: 0: 41975.9. Samples: 916876560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:13:26,994][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 04:13:30,139][12883] Updated weights for policy 0, policy_version 55961 (0.0028) +[2024-06-18 04:13:31,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42050.7, 300 sec: 42265.2). Total num frames: 916930560. Throughput: 0: 41999.7. Samples: 917006260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:13:31,996][12645] Avg episode reward: [(0, '0.167')] +[2024-06-18 04:13:33,798][12883] Updated weights for policy 0, policy_version 55971 (0.0032) +[2024-06-18 04:13:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42210.0). Total num frames: 917143552. Throughput: 0: 41875.5. Samples: 917257860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:13:36,994][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 04:13:38,107][12883] Updated weights for policy 0, policy_version 55981 (0.0032) +[2024-06-18 04:13:41,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42052.2, 300 sec: 42210.0). Total num frames: 917340160. Throughput: 0: 41805.3. Samples: 917500060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:13:41,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 04:13:42,266][12883] Updated weights for policy 0, policy_version 55991 (0.0033) +[2024-06-18 04:13:45,851][12883] Updated weights for policy 0, policy_version 56001 (0.0035) +[2024-06-18 04:13:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42210.0). Total num frames: 917553152. Throughput: 0: 41715.6. Samples: 917623380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:13:46,994][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 04:13:49,943][12883] Updated weights for policy 0, policy_version 56011 (0.0027) +[2024-06-18 04:13:51,996][12645] Fps is (10 sec: 42588.6, 60 sec: 41777.5, 300 sec: 42209.3). Total num frames: 917766144. Throughput: 0: 41875.1. Samples: 917886180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:13:51,996][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 04:13:53,811][12883] Updated weights for policy 0, policy_version 56021 (0.0029) +[2024-06-18 04:13:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 42265.5). Total num frames: 917979136. Throughput: 0: 41672.8. Samples: 918131020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:13:56,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 04:13:57,594][12883] Updated weights for policy 0, policy_version 56031 (0.0027) +[2024-06-18 04:14:01,684][12883] Updated weights for policy 0, policy_version 56041 (0.0030) +[2024-06-18 04:14:01,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 918192128. Throughput: 0: 41677.7. Samples: 918256740. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 04:14:01,994][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 04:14:05,965][12883] Updated weights for policy 0, policy_version 56051 (0.0036) +[2024-06-18 04:14:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41780.9, 300 sec: 42154.1). Total num frames: 918388736. Throughput: 0: 41768.5. Samples: 918512980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 04:14:06,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 04:14:09,430][12883] Updated weights for policy 0, policy_version 56061 (0.0042) +[2024-06-18 04:14:11,993][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 918601728. Throughput: 0: 41869.9. Samples: 918760700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 04:14:11,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 04:14:13,688][12883] Updated weights for policy 0, policy_version 56071 (0.0032) +[2024-06-18 04:14:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42210.0). Total num frames: 918814720. Throughput: 0: 41781.7. Samples: 918886340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 04:14:16,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 04:14:17,337][12883] Updated weights for policy 0, policy_version 56081 (0.0031) +[2024-06-18 04:14:21,851][12883] Updated weights for policy 0, policy_version 56091 (0.0048) +[2024-06-18 04:14:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41233.1, 300 sec: 42098.5). Total num frames: 918994944. Throughput: 0: 41729.4. Samples: 919135680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 04:14:21,994][12645] Avg episode reward: [(0, '0.068')] +[2024-06-18 04:14:25,258][12883] Updated weights for policy 0, policy_version 56101 (0.0041) +[2024-06-18 04:14:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 919224320. Throughput: 0: 41781.4. Samples: 919380220. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 04:14:27,000][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 04:14:27,445][12862] Signal inference workers to stop experience collection... (13250 times) +[2024-06-18 04:14:27,445][12862] Signal inference workers to resume experience collection... (13250 times) +[2024-06-18 04:14:27,478][12883] InferenceWorker_p0-w0: stopping experience collection (13250 times) +[2024-06-18 04:14:27,479][12883] InferenceWorker_p0-w0: resuming experience collection (13250 times) +[2024-06-18 04:14:29,539][12883] Updated weights for policy 0, policy_version 56111 (0.0057) +[2024-06-18 04:14:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41507.7, 300 sec: 42098.6). Total num frames: 919420928. Throughput: 0: 41883.6. Samples: 919508140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 04:14:31,994][12645] Avg episode reward: [(0, '0.090')] +[2024-06-18 04:14:33,113][12883] Updated weights for policy 0, policy_version 56121 (0.0031) +[2024-06-18 04:14:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 42043.9). Total num frames: 919633920. Throughput: 0: 41442.1. Samples: 919750980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) +[2024-06-18 04:14:36,994][12645] Avg episode reward: [(0, '0.055')] +[2024-06-18 04:14:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000056130_919633920.pth... +[2024-06-18 04:14:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000055515_909557760.pth +[2024-06-18 04:14:37,279][12883] Updated weights for policy 0, policy_version 56131 (0.0039) +[2024-06-18 04:14:41,341][12883] Updated weights for policy 0, policy_version 56141 (0.0036) +[2024-06-18 04:14:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42154.6). Total num frames: 919863296. Throughput: 0: 41621.8. Samples: 920004000. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) +[2024-06-18 04:14:41,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 04:14:45,534][12883] Updated weights for policy 0, policy_version 56151 (0.0034) +[2024-06-18 04:14:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 920059904. Throughput: 0: 41548.1. Samples: 920126400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) +[2024-06-18 04:14:46,994][12645] Avg episode reward: [(0, '0.337')] +[2024-06-18 04:14:49,107][12883] Updated weights for policy 0, policy_version 56161 (0.0034) +[2024-06-18 04:14:51,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41234.7, 300 sec: 41987.5). Total num frames: 920240128. Throughput: 0: 41413.8. Samples: 920376600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) +[2024-06-18 04:14:51,994][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 04:14:53,225][12883] Updated weights for policy 0, policy_version 56171 (0.0035) +[2024-06-18 04:14:56,793][12883] Updated weights for policy 0, policy_version 56181 (0.0044) +[2024-06-18 04:14:56,997][12645] Fps is (10 sec: 42582.3, 60 sec: 41776.6, 300 sec: 42042.5). Total num frames: 920485888. Throughput: 0: 41411.1. Samples: 920624360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) +[2024-06-18 04:14:56,998][12645] Avg episode reward: [(0, '0.075')] +[2024-06-18 04:15:00,793][12883] Updated weights for policy 0, policy_version 56191 (0.0031) +[2024-06-18 04:15:01,993][12645] Fps is (10 sec: 45875.3, 60 sec: 41779.3, 300 sec: 42154.3). Total num frames: 920698880. Throughput: 0: 41506.7. Samples: 920754140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) +[2024-06-18 04:15:01,994][12645] Avg episode reward: [(0, '0.101')] +[2024-06-18 04:15:04,461][12883] Updated weights for policy 0, policy_version 56201 (0.0036) +[2024-06-18 04:15:06,994][12645] Fps is (10 sec: 37697.0, 60 sec: 41232.9, 300 sec: 41931.9). Total num frames: 920862720. Throughput: 0: 41599.4. Samples: 921007660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) +[2024-06-18 04:15:06,994][12645] Avg episode reward: [(0, '0.145')] +[2024-06-18 04:15:08,555][12883] Updated weights for policy 0, policy_version 56211 (0.0028) +[2024-06-18 04:15:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42043.5). Total num frames: 921108480. Throughput: 0: 41756.0. Samples: 921259240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) +[2024-06-18 04:15:11,994][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 04:15:12,113][12883] Updated weights for policy 0, policy_version 56221 (0.0037) +[2024-06-18 04:15:16,364][12883] Updated weights for policy 0, policy_version 56231 (0.0039) +[2024-06-18 04:15:16,993][12645] Fps is (10 sec: 45876.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 921321472. Throughput: 0: 41755.2. Samples: 921387120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) +[2024-06-18 04:15:16,994][12645] Avg episode reward: [(0, '0.109')] +[2024-06-18 04:15:19,935][12883] Updated weights for policy 0, policy_version 56241 (0.0044) +[2024-06-18 04:15:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 921518080. Throughput: 0: 41769.4. Samples: 921630600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) +[2024-06-18 04:15:21,994][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 04:15:24,239][12883] Updated weights for policy 0, policy_version 56251 (0.0031) +[2024-06-18 04:15:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 921731072. Throughput: 0: 41805.8. Samples: 921885260. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) +[2024-06-18 04:15:26,994][12645] Avg episode reward: [(0, '0.278')] +[2024-06-18 04:15:27,905][12883] Updated weights for policy 0, policy_version 56261 (0.0023) +[2024-06-18 04:15:31,912][12883] Updated weights for policy 0, policy_version 56271 (0.0030) +[2024-06-18 04:15:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41987.4). Total num frames: 921944064. Throughput: 0: 41899.1. Samples: 922011860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) +[2024-06-18 04:15:31,994][12645] Avg episode reward: [(0, '0.276')] +[2024-06-18 04:15:35,817][12883] Updated weights for policy 0, policy_version 56281 (0.0027) +[2024-06-18 04:15:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41932.2). Total num frames: 922140672. Throughput: 0: 41887.5. Samples: 922261540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) +[2024-06-18 04:15:36,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 04:15:39,690][12883] Updated weights for policy 0, policy_version 56291 (0.0038) +[2024-06-18 04:15:41,996][12645] Fps is (10 sec: 40950.8, 60 sec: 41504.5, 300 sec: 41987.1). Total num frames: 922353664. Throughput: 0: 41964.9. Samples: 922512720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) +[2024-06-18 04:15:41,997][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 04:15:43,887][12883] Updated weights for policy 0, policy_version 56301 (0.0028) +[2024-06-18 04:15:44,318][12862] Signal inference workers to stop experience collection... (13300 times) +[2024-06-18 04:15:44,319][12862] Signal inference workers to resume experience collection... (13300 times) +[2024-06-18 04:15:44,349][12883] InferenceWorker_p0-w0: stopping experience collection (13300 times) +[2024-06-18 04:15:44,349][12883] InferenceWorker_p0-w0: resuming experience collection (13300 times) +[2024-06-18 04:15:46,996][12645] Fps is (10 sec: 40950.5, 60 sec: 41504.5, 300 sec: 41820.5). Total num frames: 922550272. 
Throughput: 0: 41814.2. Samples: 922635880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 04:15:46,997][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 04:15:47,659][12883] Updated weights for policy 0, policy_version 56311 (0.0041) +[2024-06-18 04:15:51,682][12883] Updated weights for policy 0, policy_version 56321 (0.0027) +[2024-06-18 04:15:51,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 922763264. Throughput: 0: 41768.1. Samples: 922887220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 04:15:51,994][12645] Avg episode reward: [(0, '0.207')] +[2024-06-18 04:15:55,533][12883] Updated weights for policy 0, policy_version 56331 (0.0040) +[2024-06-18 04:15:56,994][12645] Fps is (10 sec: 42608.5, 60 sec: 41508.8, 300 sec: 41987.5). Total num frames: 922976256. Throughput: 0: 41711.6. Samples: 923136260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 04:15:56,994][12645] Avg episode reward: [(0, '0.157')] +[2024-06-18 04:15:59,344][12883] Updated weights for policy 0, policy_version 56341 (0.0037) +[2024-06-18 04:16:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41820.8). Total num frames: 923172864. Throughput: 0: 41631.5. Samples: 923260540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 04:16:01,994][12645] Avg episode reward: [(0, '0.178')] +[2024-06-18 04:16:03,523][12883] Updated weights for policy 0, policy_version 56351 (0.0022) +[2024-06-18 04:16:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 923402240. Throughput: 0: 41805.7. Samples: 923511860. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 04:16:06,994][12645] Avg episode reward: [(0, '0.167')] +[2024-06-18 04:16:07,056][12883] Updated weights for policy 0, policy_version 56361 (0.0031) +[2024-06-18 04:16:11,428][12883] Updated weights for policy 0, policy_version 56371 (0.0034) +[2024-06-18 04:16:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41931.9). Total num frames: 923582464. Throughput: 0: 41800.0. Samples: 923766260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 04:16:11,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 04:16:14,914][12883] Updated weights for policy 0, policy_version 56381 (0.0029) +[2024-06-18 04:16:16,996][12645] Fps is (10 sec: 40950.6, 60 sec: 41504.4, 300 sec: 41876.0). Total num frames: 923811840. Throughput: 0: 41735.6. Samples: 923890060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 04:16:16,997][12645] Avg episode reward: [(0, '0.185')] +[2024-06-18 04:16:19,307][12883] Updated weights for policy 0, policy_version 56391 (0.0032) +[2024-06-18 04:16:21,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 924041216. Throughput: 0: 41870.2. Samples: 924145700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 04:16:21,995][12645] Avg episode reward: [(0, '0.140')] +[2024-06-18 04:16:22,534][12883] Updated weights for policy 0, policy_version 56401 (0.0029) +[2024-06-18 04:16:26,996][12645] Fps is (10 sec: 40960.4, 60 sec: 41504.5, 300 sec: 41932.5). Total num frames: 924221440. Throughput: 0: 41954.7. Samples: 924400680. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 04:16:26,996][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 04:16:27,243][12883] Updated weights for policy 0, policy_version 56411 (0.0036) +[2024-06-18 04:16:30,316][12883] Updated weights for policy 0, policy_version 56421 (0.0038) +[2024-06-18 04:16:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 924434432. Throughput: 0: 41920.4. Samples: 924522200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 04:16:31,994][12645] Avg episode reward: [(0, '0.117')] +[2024-06-18 04:16:34,942][12883] Updated weights for policy 0, policy_version 56431 (0.0033) +[2024-06-18 04:16:36,996][12645] Fps is (10 sec: 42598.4, 60 sec: 41777.6, 300 sec: 41931.6). Total num frames: 924647424. Throughput: 0: 41898.4. Samples: 924772740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 04:16:36,997][12645] Avg episode reward: [(0, '0.159')] +[2024-06-18 04:16:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000056436_924647424.pth... +[2024-06-18 04:16:37,088][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000055825_914636800.pth +[2024-06-18 04:16:38,188][12883] Updated weights for policy 0, policy_version 56441 (0.0039) +[2024-06-18 04:16:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41507.8, 300 sec: 41876.5). Total num frames: 924844032. Throughput: 0: 41931.5. Samples: 925023180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 04:16:41,994][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 04:16:42,938][12883] Updated weights for policy 0, policy_version 56451 (0.0023) +[2024-06-18 04:16:46,561][12883] Updated weights for policy 0, policy_version 56461 (0.0036) +[2024-06-18 04:16:46,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42053.8, 300 sec: 41876.4). Total num frames: 925073408. Throughput: 0: 41755.5. Samples: 925139540. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 04:16:46,994][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 04:16:50,825][12883] Updated weights for policy 0, policy_version 56471 (0.0036) +[2024-06-18 04:16:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 925286400. Throughput: 0: 41964.1. Samples: 925400240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 04:16:51,994][12645] Avg episode reward: [(0, '0.142')] +[2024-06-18 04:16:54,359][12883] Updated weights for policy 0, policy_version 56481 (0.0027) +[2024-06-18 04:16:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41821.2). Total num frames: 925466624. Throughput: 0: 41776.9. Samples: 925646220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:16:56,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 04:16:58,583][12883] Updated weights for policy 0, policy_version 56491 (0.0033) +[2024-06-18 04:17:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 41821.5). Total num frames: 925696000. Throughput: 0: 41743.1. Samples: 925768400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:17:01,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 04:17:02,344][12883] Updated weights for policy 0, policy_version 56501 (0.0038) +[2024-06-18 04:17:06,235][12883] Updated weights for policy 0, policy_version 56511 (0.0034) +[2024-06-18 04:17:06,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 925925376. Throughput: 0: 41791.6. Samples: 926026320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:17:06,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 04:17:10,292][12883] Updated weights for policy 0, policy_version 56521 (0.0040) +[2024-06-18 04:17:11,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42050.7, 300 sec: 41820.5). Total num frames: 926105600. Throughput: 0: 41572.9. Samples: 926271460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:17:11,996][12645] Avg episode reward: [(0, '0.150')] +[2024-06-18 04:17:13,699][12862] Signal inference workers to stop experience collection... (13350 times) +[2024-06-18 04:17:13,748][12883] InferenceWorker_p0-w0: stopping experience collection (13350 times) +[2024-06-18 04:17:13,752][12862] Signal inference workers to resume experience collection... (13350 times) +[2024-06-18 04:17:13,764][12883] InferenceWorker_p0-w0: resuming experience collection (13350 times) +[2024-06-18 04:17:14,061][12883] Updated weights for policy 0, policy_version 56531 (0.0027) +[2024-06-18 04:17:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42054.0, 300 sec: 41820.8). Total num frames: 926334976. Throughput: 0: 41599.5. Samples: 926394180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:17:16,994][12645] Avg episode reward: [(0, '0.132')] +[2024-06-18 04:17:18,183][12883] Updated weights for policy 0, policy_version 56541 (0.0039) +[2024-06-18 04:17:21,993][12645] Fps is (10 sec: 39331.1, 60 sec: 40960.1, 300 sec: 41709.8). Total num frames: 926498816. Throughput: 0: 41536.9. Samples: 926641800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:17:21,994][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 04:17:22,645][12883] Updated weights for policy 0, policy_version 56551 (0.0032) +[2024-06-18 04:17:25,955][12883] Updated weights for policy 0, policy_version 56561 (0.0031) +[2024-06-18 04:17:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41780.7, 300 sec: 41765.3). Total num frames: 926728192. Throughput: 0: 41538.1. Samples: 926892400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:17:26,994][12645] Avg episode reward: [(0, '0.252')] +[2024-06-18 04:17:30,281][12883] Updated weights for policy 0, policy_version 56571 (0.0044) +[2024-06-18 04:17:31,996][12645] Fps is (10 sec: 45864.1, 60 sec: 42050.6, 300 sec: 41820.5). Total num frames: 926957568. 
Throughput: 0: 41834.8. Samples: 927022200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:17:31,996][12645] Avg episode reward: [(0, '0.214')] +[2024-06-18 04:17:33,810][12883] Updated weights for policy 0, policy_version 56581 (0.0028) +[2024-06-18 04:17:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41507.7, 300 sec: 41765.3). Total num frames: 927137792. Throughput: 0: 41590.2. Samples: 927271800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:17:36,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 04:17:37,845][12883] Updated weights for policy 0, policy_version 56591 (0.0040) +[2024-06-18 04:17:41,651][12883] Updated weights for policy 0, policy_version 56601 (0.0034) +[2024-06-18 04:17:41,994][12645] Fps is (10 sec: 39330.7, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 927350784. Throughput: 0: 41706.3. Samples: 927523000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:17:41,994][12645] Avg episode reward: [(0, '0.180')] +[2024-06-18 04:17:45,782][12883] Updated weights for policy 0, policy_version 56611 (0.0039) +[2024-06-18 04:17:46,996][12645] Fps is (10 sec: 44226.7, 60 sec: 41777.7, 300 sec: 41765.0). Total num frames: 927580160. Throughput: 0: 41747.7. Samples: 927647140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:17:46,996][12645] Avg episode reward: [(0, '0.157')] +[2024-06-18 04:17:49,501][12883] Updated weights for policy 0, policy_version 56621 (0.0032) +[2024-06-18 04:17:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 927776768. Throughput: 0: 41660.0. Samples: 927901020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:17:51,994][12645] Avg episode reward: [(0, '0.047')] +[2024-06-18 04:17:53,348][12883] Updated weights for policy 0, policy_version 56631 (0.0029) +[2024-06-18 04:17:56,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 927989760. 
Throughput: 0: 41676.3. Samples: 928146800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:17:56,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 04:17:57,478][12883] Updated weights for policy 0, policy_version 56641 (0.0038) +[2024-06-18 04:18:01,159][12883] Updated weights for policy 0, policy_version 56651 (0.0033) +[2024-06-18 04:18:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 41765.7). Total num frames: 928202752. Throughput: 0: 41869.4. Samples: 928278300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:18:01,994][12645] Avg episode reward: [(0, '0.157')] +[2024-06-18 04:18:05,325][12883] Updated weights for policy 0, policy_version 56661 (0.0037) +[2024-06-18 04:18:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 928415744. Throughput: 0: 42121.7. Samples: 928537280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 04:18:06,994][12645] Avg episode reward: [(0, '0.155')] +[2024-06-18 04:18:08,971][12883] Updated weights for policy 0, policy_version 56671 (0.0035) +[2024-06-18 04:18:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42327.0, 300 sec: 41820.8). Total num frames: 928645120. Throughput: 0: 41868.1. Samples: 928776460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 04:18:11,994][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 04:18:13,581][12883] Updated weights for policy 0, policy_version 56681 (0.0045) +[2024-06-18 04:18:16,856][12883] Updated weights for policy 0, policy_version 56691 (0.0033) +[2024-06-18 04:18:16,996][12645] Fps is (10 sec: 40950.4, 60 sec: 41504.5, 300 sec: 41709.5). Total num frames: 928825344. Throughput: 0: 41872.9. Samples: 928906480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 04:18:16,996][12645] Avg episode reward: [(0, '0.095')] +[2024-06-18 04:18:21,179][12883] Updated weights for policy 0, policy_version 56701 (0.0027) +[2024-06-18 04:18:21,994][12645] Fps is (10 sec: 37682.9, 60 sec: 42052.1, 300 sec: 41709.8). Total num frames: 929021952. Throughput: 0: 41974.2. Samples: 929160640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 04:18:21,994][12645] Avg episode reward: [(0, '0.094')] +[2024-06-18 04:18:24,401][12883] Updated weights for policy 0, policy_version 56711 (0.0030) +[2024-06-18 04:18:26,994][12645] Fps is (10 sec: 45886.0, 60 sec: 42598.5, 300 sec: 41876.7). Total num frames: 929284096. Throughput: 0: 41911.2. Samples: 929409000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 04:18:26,994][12645] Avg episode reward: [(0, '0.094')] +[2024-06-18 04:18:28,827][12883] Updated weights for policy 0, policy_version 56721 (0.0025) +[2024-06-18 04:18:31,930][12883] Updated weights for policy 0, policy_version 56731 (0.0024) +[2024-06-18 04:18:31,994][12645] Fps is (10 sec: 45873.5, 60 sec: 42053.6, 300 sec: 41820.8). Total num frames: 929480704. Throughput: 0: 42159.1. Samples: 929544220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 04:18:31,995][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 04:18:32,424][12862] Signal inference workers to stop experience collection... (13400 times) +[2024-06-18 04:18:32,424][12862] Signal inference workers to resume experience collection... (13400 times) +[2024-06-18 04:18:32,438][12883] InferenceWorker_p0-w0: stopping experience collection (13400 times) +[2024-06-18 04:18:32,462][12883] InferenceWorker_p0-w0: resuming experience collection (13400 times) +[2024-06-18 04:18:36,688][12883] Updated weights for policy 0, policy_version 56741 (0.0032) +[2024-06-18 04:18:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 929660928. 
Throughput: 0: 41995.6. Samples: 929790820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 04:18:36,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 04:18:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000056742_929660928.pth... +[2024-06-18 04:18:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000056130_919633920.pth +[2024-06-18 04:18:39,999][12883] Updated weights for policy 0, policy_version 56751 (0.0032) +[2024-06-18 04:18:41,994][12645] Fps is (10 sec: 40961.4, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 929890304. Throughput: 0: 42090.2. Samples: 930040860. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 04:18:41,994][12645] Avg episode reward: [(0, '0.105')] +[2024-06-18 04:18:44,521][12883] Updated weights for policy 0, policy_version 56761 (0.0043) +[2024-06-18 04:18:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42053.9, 300 sec: 41821.2). Total num frames: 930103296. Throughput: 0: 42136.9. Samples: 930174460. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 04:18:46,994][12645] Avg episode reward: [(0, '0.139')] +[2024-06-18 04:18:47,645][12883] Updated weights for policy 0, policy_version 56771 (0.0034) +[2024-06-18 04:18:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 930283520. Throughput: 0: 41884.4. Samples: 930422080. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 04:18:51,994][12645] Avg episode reward: [(0, '0.106')] +[2024-06-18 04:18:52,022][12883] Updated weights for policy 0, policy_version 56781 (0.0045) +[2024-06-18 04:18:55,415][12883] Updated weights for policy 0, policy_version 56791 (0.0038) +[2024-06-18 04:18:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 930529280. Throughput: 0: 42142.2. Samples: 930672860. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 04:18:56,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 04:18:59,991][12883] Updated weights for policy 0, policy_version 56801 (0.0026) +[2024-06-18 04:19:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 930709504. Throughput: 0: 42309.6. Samples: 930810320. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 04:19:01,994][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 04:19:03,396][12883] Updated weights for policy 0, policy_version 56811 (0.0044) +[2024-06-18 04:19:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 930938880. Throughput: 0: 42085.3. Samples: 931054480. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 04:19:06,994][12645] Avg episode reward: [(0, '0.178')] +[2024-06-18 04:19:07,844][12883] Updated weights for policy 0, policy_version 56821 (0.0029) +[2024-06-18 04:19:11,386][12883] Updated weights for policy 0, policy_version 56831 (0.0043) +[2024-06-18 04:19:11,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 931168256. Throughput: 0: 42039.0. Samples: 931300760. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) +[2024-06-18 04:19:11,994][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 04:19:15,447][12883] Updated weights for policy 0, policy_version 56841 (0.0032) +[2024-06-18 04:19:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42053.8, 300 sec: 41876.4). Total num frames: 931348480. Throughput: 0: 41881.2. Samples: 931428860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 04:19:16,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 04:19:19,081][12883] Updated weights for policy 0, policy_version 56851 (0.0033) +[2024-06-18 04:19:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 41820.8). Total num frames: 931561472. Throughput: 0: 41919.5. Samples: 931677200. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 04:19:21,994][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 04:19:23,405][12883] Updated weights for policy 0, policy_version 56861 (0.0028) +[2024-06-18 04:19:26,926][12883] Updated weights for policy 0, policy_version 56871 (0.0038) +[2024-06-18 04:19:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 931774464. Throughput: 0: 42112.1. Samples: 931935900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 04:19:26,994][12645] Avg episode reward: [(0, '0.151')] +[2024-06-18 04:19:31,114][12883] Updated weights for policy 0, policy_version 56881 (0.0038) +[2024-06-18 04:19:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41233.3, 300 sec: 41765.3). Total num frames: 931954688. Throughput: 0: 41755.4. Samples: 932053460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 04:19:31,994][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 04:19:34,606][12883] Updated weights for policy 0, policy_version 56891 (0.0037) +[2024-06-18 04:19:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 932200448. Throughput: 0: 41764.9. Samples: 932301500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 04:19:36,994][12645] Avg episode reward: [(0, '0.155')] +[2024-06-18 04:19:38,874][12883] Updated weights for policy 0, policy_version 56901 (0.0034) +[2024-06-18 04:19:41,994][12645] Fps is (10 sec: 44237.5, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 932397056. Throughput: 0: 41925.9. Samples: 932559520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 04:19:41,994][12645] Avg episode reward: [(0, '0.241')] +[2024-06-18 04:19:42,349][12883] Updated weights for policy 0, policy_version 56911 (0.0033) +[2024-06-18 04:19:43,430][12862] Signal inference workers to stop experience collection... 
(13450 times) +[2024-06-18 04:19:43,460][12883] InferenceWorker_p0-w0: stopping experience collection (13450 times) +[2024-06-18 04:19:43,487][12862] Signal inference workers to resume experience collection... (13450 times) +[2024-06-18 04:19:43,488][12883] InferenceWorker_p0-w0: resuming experience collection (13450 times) +[2024-06-18 04:19:46,577][12883] Updated weights for policy 0, policy_version 56921 (0.0029) +[2024-06-18 04:19:46,996][12645] Fps is (10 sec: 39310.5, 60 sec: 41504.1, 300 sec: 41876.0). Total num frames: 932593664. Throughput: 0: 41510.8. Samples: 932678420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 04:19:47,005][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 04:19:50,122][12883] Updated weights for policy 0, policy_version 56931 (0.0033) +[2024-06-18 04:19:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41821.4). Total num frames: 932823040. Throughput: 0: 41688.9. Samples: 932930480. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) +[2024-06-18 04:19:51,994][12645] Avg episode reward: [(0, '0.156')] +[2024-06-18 04:19:54,285][12883] Updated weights for policy 0, policy_version 56941 (0.0029) +[2024-06-18 04:19:56,994][12645] Fps is (10 sec: 40971.4, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 933003264. Throughput: 0: 41772.9. Samples: 933180540. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) +[2024-06-18 04:19:56,994][12645] Avg episode reward: [(0, '0.150')] +[2024-06-18 04:19:58,595][12883] Updated weights for policy 0, policy_version 56951 (0.0038) +[2024-06-18 04:20:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 933232640. Throughput: 0: 41683.2. Samples: 933304600. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) +[2024-06-18 04:20:01,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 04:20:02,127][12883] Updated weights for policy 0, policy_version 56961 (0.0033) +[2024-06-18 04:20:06,353][12883] Updated weights for policy 0, policy_version 56971 (0.0027) +[2024-06-18 04:20:06,996][12645] Fps is (10 sec: 42589.1, 60 sec: 41504.6, 300 sec: 41765.0). Total num frames: 933429248. Throughput: 0: 41584.2. Samples: 933548580. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) +[2024-06-18 04:20:06,997][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 04:20:09,971][12883] Updated weights for policy 0, policy_version 56981 (0.0027) +[2024-06-18 04:20:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 933642240. Throughput: 0: 41572.3. Samples: 933806660. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) +[2024-06-18 04:20:11,994][12645] Avg episode reward: [(0, '0.138')] +[2024-06-18 04:20:14,066][12883] Updated weights for policy 0, policy_version 56991 (0.0038) +[2024-06-18 04:20:16,994][12645] Fps is (10 sec: 42607.9, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 933855232. Throughput: 0: 41725.0. Samples: 933931080. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) +[2024-06-18 04:20:16,994][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 04:20:17,696][12883] Updated weights for policy 0, policy_version 57001 (0.0033) +[2024-06-18 04:20:21,874][12883] Updated weights for policy 0, policy_version 57011 (0.0034) +[2024-06-18 04:20:21,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 934068224. Throughput: 0: 41907.2. Samples: 934187320. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) +[2024-06-18 04:20:21,994][12645] Avg episode reward: [(0, '0.189')] +[2024-06-18 04:20:25,408][12883] Updated weights for policy 0, policy_version 57021 (0.0039) +[2024-06-18 04:20:26,996][12645] Fps is (10 sec: 42588.8, 60 sec: 41777.6, 300 sec: 41820.5). Total num frames: 934281216. Throughput: 0: 41711.6. Samples: 934436640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 04:20:26,996][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 04:20:29,586][12883] Updated weights for policy 0, policy_version 57031 (0.0037) +[2024-06-18 04:20:31,996][12645] Fps is (10 sec: 42588.3, 60 sec: 42323.8, 300 sec: 41876.1). Total num frames: 934494208. Throughput: 0: 41883.6. Samples: 934563160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 04:20:31,997][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 04:20:33,567][12883] Updated weights for policy 0, policy_version 57041 (0.0037) +[2024-06-18 04:20:36,994][12645] Fps is (10 sec: 40969.2, 60 sec: 41506.1, 300 sec: 41821.2). Total num frames: 934690816. Throughput: 0: 41848.0. Samples: 934813640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 04:20:36,994][12645] Avg episode reward: [(0, '0.097')] +[2024-06-18 04:20:37,027][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057049_934690816.pth... +[2024-06-18 04:20:37,094][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000056436_924647424.pth +[2024-06-18 04:20:37,625][12883] Updated weights for policy 0, policy_version 57051 (0.0031) +[2024-06-18 04:20:41,198][12883] Updated weights for policy 0, policy_version 57061 (0.0032) +[2024-06-18 04:20:41,994][12645] Fps is (10 sec: 39330.7, 60 sec: 41506.1, 300 sec: 41821.2). Total num frames: 934887424. Throughput: 0: 41840.5. Samples: 935063360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 04:20:41,994][12645] Avg episode reward: [(0, '0.088')] +[2024-06-18 04:20:45,741][12883] Updated weights for policy 0, policy_version 57071 (0.0032) +[2024-06-18 04:20:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42054.3, 300 sec: 41876.4). Total num frames: 935116800. Throughput: 0: 41961.4. Samples: 935192860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 04:20:46,994][12645] Avg episode reward: [(0, '0.179')] +[2024-06-18 04:20:48,685][12883] Updated weights for policy 0, policy_version 57081 (0.0033) +[2024-06-18 04:20:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 935329792. Throughput: 0: 42109.3. Samples: 935443400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 04:20:51,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 04:20:53,351][12883] Updated weights for policy 0, policy_version 57091 (0.0028) +[2024-06-18 04:20:56,376][12883] Updated weights for policy 0, policy_version 57101 (0.0034) +[2024-06-18 04:20:56,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 935542784. Throughput: 0: 41864.4. Samples: 935690560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 04:20:56,994][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 04:21:00,906][12883] Updated weights for policy 0, policy_version 57111 (0.0026) +[2024-06-18 04:21:01,996][12645] Fps is (10 sec: 40950.4, 60 sec: 41777.6, 300 sec: 41820.5). Total num frames: 935739392. Throughput: 0: 42030.8. Samples: 935822560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:21:01,996][12645] Avg episode reward: [(0, '0.062')] +[2024-06-18 04:21:04,292][12883] Updated weights for policy 0, policy_version 57121 (0.0037) +[2024-06-18 04:21:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42053.9, 300 sec: 41931.9). Total num frames: 935952384. Throughput: 0: 42034.2. Samples: 936078860. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:21:06,994][12645] Avg episode reward: [(0, '0.106')] +[2024-06-18 04:21:08,924][12883] Updated weights for policy 0, policy_version 57131 (0.0037) +[2024-06-18 04:21:11,994][12645] Fps is (10 sec: 44247.3, 60 sec: 42325.4, 300 sec: 41932.3). Total num frames: 936181760. Throughput: 0: 42069.3. Samples: 936329660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:21:11,994][12645] Avg episode reward: [(0, '0.229')] +[2024-06-18 04:21:12,090][12883] Updated weights for policy 0, policy_version 57141 (0.0035) +[2024-06-18 04:21:13,357][12862] Signal inference workers to stop experience collection... (13500 times) +[2024-06-18 04:21:13,357][12862] Signal inference workers to resume experience collection... (13500 times) +[2024-06-18 04:21:13,366][12883] InferenceWorker_p0-w0: stopping experience collection (13500 times) +[2024-06-18 04:21:13,378][12883] InferenceWorker_p0-w0: resuming experience collection (13500 times) +[2024-06-18 04:21:16,662][12883] Updated weights for policy 0, policy_version 57151 (0.0023) +[2024-06-18 04:21:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 936361984. Throughput: 0: 42198.2. Samples: 936461980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:21:16,994][12645] Avg episode reward: [(0, '0.229')] +[2024-06-18 04:21:19,766][12883] Updated weights for policy 0, policy_version 57161 (0.0027) +[2024-06-18 04:21:21,996][12645] Fps is (10 sec: 40951.9, 60 sec: 42050.9, 300 sec: 41932.0). Total num frames: 936591360. Throughput: 0: 42250.7. Samples: 936715000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:21:21,996][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 04:21:24,235][12883] Updated weights for policy 0, policy_version 57171 (0.0037) +[2024-06-18 04:21:26,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42326.9, 300 sec: 41987.5). Total num frames: 936820736. 
Throughput: 0: 42238.2. Samples: 936964080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:21:26,994][12645] Avg episode reward: [(0, '0.144')] +[2024-06-18 04:21:27,734][12883] Updated weights for policy 0, policy_version 57181 (0.0039) +[2024-06-18 04:21:31,994][12645] Fps is (10 sec: 40967.4, 60 sec: 41780.7, 300 sec: 41876.7). Total num frames: 937000960. Throughput: 0: 42135.9. Samples: 937088980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:21:31,994][12645] Avg episode reward: [(0, '0.059')] +[2024-06-18 04:21:32,482][12883] Updated weights for policy 0, policy_version 57191 (0.0031) +[2024-06-18 04:21:35,426][12883] Updated weights for policy 0, policy_version 57201 (0.0041) +[2024-06-18 04:21:36,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42596.8, 300 sec: 42042.7). Total num frames: 937246720. Throughput: 0: 42200.0. Samples: 937342500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 04:21:36,997][12645] Avg episode reward: [(0, '0.168')] +[2024-06-18 04:21:39,982][12883] Updated weights for policy 0, policy_version 57211 (0.0034) +[2024-06-18 04:21:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 937410560. Throughput: 0: 42419.2. Samples: 937599420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 04:21:41,994][12645] Avg episode reward: [(0, '0.209')] +[2024-06-18 04:21:43,234][12883] Updated weights for policy 0, policy_version 57221 (0.0051) +[2024-06-18 04:21:46,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 937656320. Throughput: 0: 42181.3. Samples: 937720620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 04:21:46,994][12645] Avg episode reward: [(0, '0.115')] +[2024-06-18 04:21:47,552][12883] Updated weights for policy 0, policy_version 57231 (0.0036) +[2024-06-18 04:21:50,953][12883] Updated weights for policy 0, policy_version 57241 (0.0024) +[2024-06-18 04:21:51,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 937885696. Throughput: 0: 42211.0. Samples: 937978360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 04:21:51,994][12645] Avg episode reward: [(0, '0.040')] +[2024-06-18 04:21:55,132][12883] Updated weights for policy 0, policy_version 57251 (0.0036) +[2024-06-18 04:21:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 938065920. Throughput: 0: 42399.4. Samples: 938237640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 04:21:56,994][12645] Avg episode reward: [(0, '0.044')] +[2024-06-18 04:21:58,611][12883] Updated weights for policy 0, policy_version 57261 (0.0027) +[2024-06-18 04:22:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42600.0, 300 sec: 41931.9). Total num frames: 938295296. Throughput: 0: 42123.9. Samples: 938357560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 04:22:01,994][12645] Avg episode reward: [(0, '0.074')] +[2024-06-18 04:22:02,663][12883] Updated weights for policy 0, policy_version 57271 (0.0045) +[2024-06-18 04:22:06,278][12883] Updated weights for policy 0, policy_version 57281 (0.0032) +[2024-06-18 04:22:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42043.3). Total num frames: 938508288. Throughput: 0: 42236.0. Samples: 938615540. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 04:22:07,003][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 04:22:10,990][12883] Updated weights for policy 0, policy_version 57291 (0.0039) +[2024-06-18 04:22:11,996][12645] Fps is (10 sec: 39313.0, 60 sec: 41777.5, 300 sec: 41876.1). Total num frames: 938688512. Throughput: 0: 42255.2. Samples: 938865660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 04:22:11,996][12645] Avg episode reward: [(0, '0.140')] +[2024-06-18 04:22:14,110][12883] Updated weights for policy 0, policy_version 57301 (0.0032) +[2024-06-18 04:22:16,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42596.8, 300 sec: 42098.2). Total num frames: 938917888. Throughput: 0: 42169.9. Samples: 938986720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 04:22:16,997][12645] Avg episode reward: [(0, '0.140')] +[2024-06-18 04:22:18,606][12883] Updated weights for policy 0, policy_version 57311 (0.0029) +[2024-06-18 04:22:20,344][12862] Signal inference workers to stop experience collection... (13550 times) +[2024-06-18 04:22:20,344][12862] Signal inference workers to resume experience collection... (13550 times) +[2024-06-18 04:22:20,386][12883] InferenceWorker_p0-w0: stopping experience collection (13550 times) +[2024-06-18 04:22:20,386][12883] InferenceWorker_p0-w0: resuming experience collection (13550 times) +[2024-06-18 04:22:21,993][12645] Fps is (10 sec: 42608.7, 60 sec: 42053.7, 300 sec: 41987.5). Total num frames: 939114496. Throughput: 0: 42215.6. Samples: 939242100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 04:22:21,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 04:22:22,184][12883] Updated weights for policy 0, policy_version 57321 (0.0024) +[2024-06-18 04:22:26,205][12883] Updated weights for policy 0, policy_version 57331 (0.0034) +[2024-06-18 04:22:26,994][12645] Fps is (10 sec: 40969.3, 60 sec: 41779.2, 300 sec: 41932.3). Total num frames: 939327488. 
Throughput: 0: 42057.3. Samples: 939492000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 04:22:26,994][12645] Avg episode reward: [(0, '0.183')] +[2024-06-18 04:22:29,874][12883] Updated weights for policy 0, policy_version 57341 (0.0025) +[2024-06-18 04:22:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 939540480. Throughput: 0: 42123.0. Samples: 939616160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 04:22:31,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 04:22:33,835][12883] Updated weights for policy 0, policy_version 57351 (0.0031) +[2024-06-18 04:22:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41507.8, 300 sec: 41987.5). Total num frames: 939737088. Throughput: 0: 42166.8. Samples: 939875860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 04:22:36,994][12645] Avg episode reward: [(0, '0.183')] +[2024-06-18 04:22:37,201][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057359_939769856.pth... +[2024-06-18 04:22:37,245][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000056742_929660928.pth +[2024-06-18 04:22:38,107][12883] Updated weights for policy 0, policy_version 57361 (0.0038) +[2024-06-18 04:22:41,506][12883] Updated weights for policy 0, policy_version 57371 (0.0036) +[2024-06-18 04:22:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42043.3). Total num frames: 939982848. Throughput: 0: 41779.6. Samples: 940117720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 04:22:41,994][12645] Avg episode reward: [(0, '0.212')] +[2024-06-18 04:22:45,811][12883] Updated weights for policy 0, policy_version 57381 (0.0033) +[2024-06-18 04:22:46,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 940179456. Throughput: 0: 42092.4. Samples: 940251720. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 04:22:46,994][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 04:22:49,123][12883] Updated weights for policy 0, policy_version 57391 (0.0050) +[2024-06-18 04:22:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 940376064. Throughput: 0: 41995.2. Samples: 940505320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 04:22:51,994][12645] Avg episode reward: [(0, '0.142')] +[2024-06-18 04:22:53,505][12883] Updated weights for policy 0, policy_version 57401 (0.0037) +[2024-06-18 04:22:56,850][12883] Updated weights for policy 0, policy_version 57411 (0.0021) +[2024-06-18 04:22:56,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 940621824. Throughput: 0: 42024.3. Samples: 940756660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 04:22:56,994][12645] Avg episode reward: [(0, '0.142')] +[2024-06-18 04:23:01,214][12883] Updated weights for policy 0, policy_version 57421 (0.0026) +[2024-06-18 04:23:01,993][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.4, 300 sec: 41987.5). Total num frames: 940802048. Throughput: 0: 42244.1. Samples: 940887600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 04:23:01,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 04:23:04,909][12883] Updated weights for policy 0, policy_version 57431 (0.0038) +[2024-06-18 04:23:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 941015040. Throughput: 0: 42123.0. Samples: 941137640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 04:23:06,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 04:23:08,873][12883] Updated weights for policy 0, policy_version 57441 (0.0045) +[2024-06-18 04:23:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42600.1, 300 sec: 42098.9). Total num frames: 941244416. Throughput: 0: 42323.6. Samples: 941396560. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 04:23:11,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 04:23:12,395][12883] Updated weights for policy 0, policy_version 57451 (0.0036) +[2024-06-18 04:23:16,715][12883] Updated weights for policy 0, policy_version 57461 (0.0027) +[2024-06-18 04:23:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42053.9, 300 sec: 42098.6). Total num frames: 941441024. Throughput: 0: 42272.0. Samples: 941518400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 04:23:16,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 04:23:20,342][12883] Updated weights for policy 0, policy_version 57471 (0.0042) +[2024-06-18 04:23:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 941670400. Throughput: 0: 42162.2. Samples: 941773160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 04:23:21,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 04:23:21,995][12862] Saving new best policy, reward=0.493! +[2024-06-18 04:23:24,387][12883] Updated weights for policy 0, policy_version 57481 (0.0033) +[2024-06-18 04:23:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 941850624. Throughput: 0: 42566.7. Samples: 942033220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:23:26,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 04:23:28,090][12883] Updated weights for policy 0, policy_version 57491 (0.0035) +[2024-06-18 04:23:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 942080000. Throughput: 0: 42318.5. Samples: 942156040. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:23:31,994][12645] Avg episode reward: [(0, '0.186')] +[2024-06-18 04:23:32,193][12883] Updated weights for policy 0, policy_version 57501 (0.0026) +[2024-06-18 04:23:35,802][12883] Updated weights for policy 0, policy_version 57511 (0.0037) +[2024-06-18 04:23:36,896][12862] Signal inference workers to stop experience collection... (13600 times) +[2024-06-18 04:23:36,946][12883] InferenceWorker_p0-w0: stopping experience collection (13600 times) +[2024-06-18 04:23:36,949][12862] Signal inference workers to resume experience collection... (13600 times) +[2024-06-18 04:23:36,958][12883] InferenceWorker_p0-w0: resuming experience collection (13600 times) +[2024-06-18 04:23:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42098.6). Total num frames: 942309376. Throughput: 0: 42399.0. Samples: 942413280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:23:36,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 04:23:39,797][12883] Updated weights for policy 0, policy_version 57521 (0.0038) +[2024-06-18 04:23:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 942505984. Throughput: 0: 42516.1. Samples: 942669880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:23:41,994][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 04:23:43,515][12883] Updated weights for policy 0, policy_version 57531 (0.0030) +[2024-06-18 04:23:46,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42323.8, 300 sec: 42153.8). Total num frames: 942718976. Throughput: 0: 42310.6. Samples: 942791680. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:23:46,997][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 04:23:47,392][12883] Updated weights for policy 0, policy_version 57541 (0.0036) +[2024-06-18 04:23:51,207][12883] Updated weights for policy 0, policy_version 57551 (0.0033) +[2024-06-18 04:23:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 942931968. Throughput: 0: 42474.3. Samples: 943048980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:23:51,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 04:23:55,483][12883] Updated weights for policy 0, policy_version 57561 (0.0033) +[2024-06-18 04:23:56,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 943144960. Throughput: 0: 42380.4. Samples: 943303680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:23:56,994][12645] Avg episode reward: [(0, '0.157')] +[2024-06-18 04:23:59,029][12883] Updated weights for policy 0, policy_version 57571 (0.0027) +[2024-06-18 04:24:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42098.6). Total num frames: 943357952. Throughput: 0: 42498.3. Samples: 943430820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 04:24:01,994][12645] Avg episode reward: [(0, '0.133')] +[2024-06-18 04:24:02,943][12883] Updated weights for policy 0, policy_version 57581 (0.0029) +[2024-06-18 04:24:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 943554560. Throughput: 0: 42394.2. Samples: 943680900. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 04:24:06,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 04:24:07,391][12883] Updated weights for policy 0, policy_version 57591 (0.0038) +[2024-06-18 04:24:11,437][12883] Updated weights for policy 0, policy_version 57601 (0.0035) +[2024-06-18 04:24:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 943783936. Throughput: 0: 42230.1. Samples: 943933580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 04:24:11,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 04:24:14,951][12883] Updated weights for policy 0, policy_version 57611 (0.0035) +[2024-06-18 04:24:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 943996928. Throughput: 0: 42265.2. Samples: 944057980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 04:24:16,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 04:24:18,979][12883] Updated weights for policy 0, policy_version 57621 (0.0030) +[2024-06-18 04:24:21,996][12645] Fps is (10 sec: 40951.3, 60 sec: 42050.7, 300 sec: 42098.2). Total num frames: 944193536. Throughput: 0: 42260.6. Samples: 944315100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 04:24:21,996][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 04:24:22,570][12883] Updated weights for policy 0, policy_version 57631 (0.0041) +[2024-06-18 04:24:26,565][12883] Updated weights for policy 0, policy_version 57641 (0.0035) +[2024-06-18 04:24:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 944406528. Throughput: 0: 42124.9. Samples: 944565500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 04:24:26,994][12645] Avg episode reward: [(0, '0.114')] +[2024-06-18 04:24:30,645][12883] Updated weights for policy 0, policy_version 57651 (0.0036) +[2024-06-18 04:24:31,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 944619520. Throughput: 0: 42160.8. Samples: 944688820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 04:24:31,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 04:24:34,356][12883] Updated weights for policy 0, policy_version 57661 (0.0044) +[2024-06-18 04:24:36,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42050.7, 300 sec: 42153.8). Total num frames: 944832512. Throughput: 0: 42060.9. Samples: 944941820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:24:36,997][12645] Avg episode reward: [(0, '0.268')] +[2024-06-18 04:24:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057668_944832512.pth... +[2024-06-18 04:24:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057049_934690816.pth +[2024-06-18 04:24:38,319][12883] Updated weights for policy 0, policy_version 57671 (0.0028) +[2024-06-18 04:24:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42154.5). Total num frames: 945029120. Throughput: 0: 42025.0. Samples: 945194800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:24:41,994][12645] Avg episode reward: [(0, '0.041')] +[2024-06-18 04:24:42,019][12883] Updated weights for policy 0, policy_version 57681 (0.0043) +[2024-06-18 04:24:46,354][12883] Updated weights for policy 0, policy_version 57691 (0.0038) +[2024-06-18 04:24:46,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42053.9, 300 sec: 42098.6). Total num frames: 945242112. Throughput: 0: 41964.4. Samples: 945319220. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:24:46,994][12645] Avg episode reward: [(0, '0.142')] +[2024-06-18 04:24:49,930][12883] Updated weights for policy 0, policy_version 57701 (0.0036) +[2024-06-18 04:24:51,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 945455104. Throughput: 0: 42012.9. Samples: 945571480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:24:51,994][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 04:24:54,101][12883] Updated weights for policy 0, policy_version 57711 (0.0024) +[2024-06-18 04:24:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 945668096. Throughput: 0: 42028.5. Samples: 945824860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:24:56,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 04:24:57,624][12883] Updated weights for policy 0, policy_version 57721 (0.0032) +[2024-06-18 04:25:01,710][12883] Updated weights for policy 0, policy_version 57731 (0.0033) +[2024-06-18 04:25:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42154.4). Total num frames: 945864704. Throughput: 0: 42036.8. Samples: 945949640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:25:01,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 04:25:05,418][12883] Updated weights for policy 0, policy_version 57741 (0.0028) +[2024-06-18 04:25:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 946110464. Throughput: 0: 41930.1. Samples: 946201860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:25:06,994][12645] Avg episode reward: [(0, '0.086')] +[2024-06-18 04:25:09,414][12883] Updated weights for policy 0, policy_version 57751 (0.0035) +[2024-06-18 04:25:11,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 946290688. Throughput: 0: 41975.6. Samples: 946454400. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:25:11,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 04:25:13,451][12883] Updated weights for policy 0, policy_version 57761 (0.0032) +[2024-06-18 04:25:13,637][12862] Signal inference workers to stop experience collection... (13650 times) +[2024-06-18 04:25:13,660][12883] InferenceWorker_p0-w0: stopping experience collection (13650 times) +[2024-06-18 04:25:13,698][12862] Signal inference workers to resume experience collection... (13650 times) +[2024-06-18 04:25:13,698][12883] InferenceWorker_p0-w0: resuming experience collection (13650 times) +[2024-06-18 04:25:16,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41506.2, 300 sec: 42098.5). Total num frames: 946487296. Throughput: 0: 41837.4. Samples: 946571500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:25:16,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 04:25:17,378][12883] Updated weights for policy 0, policy_version 57771 (0.0042) +[2024-06-18 04:25:21,105][12883] Updated weights for policy 0, policy_version 57781 (0.0032) +[2024-06-18 04:25:21,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42600.0, 300 sec: 42265.5). Total num frames: 946749440. Throughput: 0: 42062.2. Samples: 946834520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:25:21,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 04:25:25,072][12883] Updated weights for policy 0, policy_version 57791 (0.0028) +[2024-06-18 04:25:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42154.4). Total num frames: 946929664. Throughput: 0: 41972.8. Samples: 947083580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:25:26,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 04:25:29,065][12883] Updated weights for policy 0, policy_version 57801 (0.0039) +[2024-06-18 04:25:31,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 947126272. 
Throughput: 0: 41881.4. Samples: 947203880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:25:31,994][12645] Avg episode reward: [(0, '0.205')] +[2024-06-18 04:25:33,326][12883] Updated weights for policy 0, policy_version 57811 (0.0026) +[2024-06-18 04:25:36,644][12883] Updated weights for policy 0, policy_version 57821 (0.0029) +[2024-06-18 04:25:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41780.8, 300 sec: 42209.6). Total num frames: 947339264. Throughput: 0: 41881.8. Samples: 947456160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:25:36,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 04:25:41,149][12883] Updated weights for policy 0, policy_version 57831 (0.0025) +[2024-06-18 04:25:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 947552256. Throughput: 0: 41972.0. Samples: 947713600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:25:41,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 04:25:44,603][12883] Updated weights for policy 0, policy_version 57841 (0.0031) +[2024-06-18 04:25:46,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42050.7, 300 sec: 42153.7). Total num frames: 947765248. Throughput: 0: 41903.7. Samples: 947835400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:25:46,997][12645] Avg episode reward: [(0, '0.152')] +[2024-06-18 04:25:49,042][12883] Updated weights for policy 0, policy_version 57851 (0.0046) +[2024-06-18 04:25:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 947978240. Throughput: 0: 42077.8. Samples: 948095360. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 04:25:51,994][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 04:25:52,146][12883] Updated weights for policy 0, policy_version 57861 (0.0041) +[2024-06-18 04:25:56,649][12883] Updated weights for policy 0, policy_version 57871 (0.0036) +[2024-06-18 04:25:56,994][12645] Fps is (10 sec: 40969.8, 60 sec: 41779.3, 300 sec: 42154.4). Total num frames: 948174848. Throughput: 0: 42207.6. Samples: 948353740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 04:25:56,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 04:25:59,791][12883] Updated weights for policy 0, policy_version 57881 (0.0047) +[2024-06-18 04:26:01,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42265.1). Total num frames: 948420608. Throughput: 0: 42325.7. Samples: 948476160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 04:26:01,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 04:26:04,256][12883] Updated weights for policy 0, policy_version 57891 (0.0036) +[2024-06-18 04:26:06,996][12645] Fps is (10 sec: 42588.3, 60 sec: 41504.6, 300 sec: 42098.2). Total num frames: 948600832. Throughput: 0: 42148.0. Samples: 948731280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 04:26:06,997][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 04:26:07,618][12883] Updated weights for policy 0, policy_version 57901 (0.0045) +[2024-06-18 04:26:11,996][12645] Fps is (10 sec: 37674.8, 60 sec: 41777.6, 300 sec: 42153.8). Total num frames: 948797440. Throughput: 0: 42292.0. Samples: 948986820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 04:26:11,997][12645] Avg episode reward: [(0, '0.081')] +[2024-06-18 04:26:12,246][12883] Updated weights for policy 0, policy_version 57911 (0.0032) +[2024-06-18 04:26:15,412][12883] Updated weights for policy 0, policy_version 57921 (0.0038) +[2024-06-18 04:26:16,994][12645] Fps is (10 sec: 47524.2, 60 sec: 43144.5, 300 sec: 42321.0). Total num frames: 949075968. Throughput: 0: 42344.3. Samples: 949109380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 04:26:16,994][12645] Avg episode reward: [(0, '0.216')] +[2024-06-18 04:26:20,050][12883] Updated weights for policy 0, policy_version 57931 (0.0040) +[2024-06-18 04:26:21,994][12645] Fps is (10 sec: 40969.6, 60 sec: 40960.0, 300 sec: 41987.5). Total num frames: 949207040. Throughput: 0: 42232.9. Samples: 949356640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 04:26:21,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 04:26:23,228][12883] Updated weights for policy 0, policy_version 57941 (0.0036) +[2024-06-18 04:26:26,994][12645] Fps is (10 sec: 34406.6, 60 sec: 41506.1, 300 sec: 42098.6). Total num frames: 949420032. Throughput: 0: 42024.9. Samples: 949604720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:26:26,994][12645] Avg episode reward: [(0, '0.092')] +[2024-06-18 04:26:27,956][12883] Updated weights for policy 0, policy_version 57951 (0.0037) +[2024-06-18 04:26:31,118][12883] Updated weights for policy 0, policy_version 57961 (0.0027) +[2024-06-18 04:26:31,800][12862] Signal inference workers to stop experience collection... (13700 times) +[2024-06-18 04:26:31,831][12883] InferenceWorker_p0-w0: stopping experience collection (13700 times) +[2024-06-18 04:26:31,854][12862] Signal inference workers to resume experience collection... 
(13700 times) +[2024-06-18 04:26:31,855][12883] InferenceWorker_p0-w0: resuming experience collection (13700 times) +[2024-06-18 04:26:31,994][12645] Fps is (10 sec: 47513.8, 60 sec: 42598.4, 300 sec: 42154.4). Total num frames: 949682176. Throughput: 0: 42173.3. Samples: 949733100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:26:31,994][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 04:26:35,800][12883] Updated weights for policy 0, policy_version 57971 (0.0041) +[2024-06-18 04:26:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 949829632. Throughput: 0: 42026.7. Samples: 949986560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:26:36,994][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 04:26:37,040][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057974_949846016.pth... +[2024-06-18 04:26:37,103][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057359_939769856.pth +[2024-06-18 04:26:39,052][12883] Updated weights for policy 0, policy_version 57981 (0.0034) +[2024-06-18 04:26:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 950075392. Throughput: 0: 41576.9. Samples: 950224700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:26:41,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 04:26:44,061][12883] Updated weights for policy 0, policy_version 57991 (0.0034) +[2024-06-18 04:26:46,691][12883] Updated weights for policy 0, policy_version 58001 (0.0026) +[2024-06-18 04:26:46,994][12645] Fps is (10 sec: 47513.2, 60 sec: 42326.9, 300 sec: 42098.6). Total num frames: 950304768. Throughput: 0: 41945.8. Samples: 950363720. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:26:46,994][12645] Avg episode reward: [(0, '0.181')] +[2024-06-18 04:26:51,592][12883] Updated weights for policy 0, policy_version 58011 (0.0032) +[2024-06-18 04:26:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 950452224. Throughput: 0: 41705.3. Samples: 950607920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:26:51,994][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 04:26:54,386][12883] Updated weights for policy 0, policy_version 58021 (0.0029) +[2024-06-18 04:26:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 950714368. Throughput: 0: 41555.9. Samples: 950856740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:26:56,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 04:26:59,017][12883] Updated weights for policy 0, policy_version 58031 (0.0026) +[2024-06-18 04:27:01,994][12645] Fps is (10 sec: 45875.4, 60 sec: 41506.3, 300 sec: 42043.0). Total num frames: 950910976. Throughput: 0: 41853.9. Samples: 950992800. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-18 04:27:01,994][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 04:27:02,166][12883] Updated weights for policy 0, policy_version 58041 (0.0048) +[2024-06-18 04:27:06,603][12883] Updated weights for policy 0, policy_version 58051 (0.0049) +[2024-06-18 04:27:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41780.8, 300 sec: 42098.9). Total num frames: 951107584. Throughput: 0: 41714.7. Samples: 951233800. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-18 04:27:06,994][12645] Avg episode reward: [(0, '0.115')] +[2024-06-18 04:27:10,245][12883] Updated weights for policy 0, policy_version 58061 (0.0033) +[2024-06-18 04:27:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42327.0, 300 sec: 42098.9). Total num frames: 951336960. Throughput: 0: 41863.6. Samples: 951488580. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-18 04:27:11,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 04:27:13,984][12883] Updated weights for policy 0, policy_version 58071 (0.0033) +[2024-06-18 04:27:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40960.1, 300 sec: 42098.5). Total num frames: 951533568. Throughput: 0: 41854.2. Samples: 951616540. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-18 04:27:16,994][12645] Avg episode reward: [(0, '0.085')] +[2024-06-18 04:27:17,888][12883] Updated weights for policy 0, policy_version 58081 (0.0042) +[2024-06-18 04:27:21,656][12883] Updated weights for policy 0, policy_version 58091 (0.0030) +[2024-06-18 04:27:21,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42596.8, 300 sec: 42153.8). Total num frames: 951762944. Throughput: 0: 41804.1. Samples: 951867840. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-18 04:27:21,996][12645] Avg episode reward: [(0, '0.086')] +[2024-06-18 04:27:25,670][12883] Updated weights for policy 0, policy_version 58101 (0.0044) +[2024-06-18 04:27:26,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42596.8, 300 sec: 42153.8). Total num frames: 951975936. Throughput: 0: 42045.4. Samples: 952116840. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-18 04:27:26,996][12645] Avg episode reward: [(0, '0.128')] +[2024-06-18 04:27:29,579][12883] Updated weights for policy 0, policy_version 58111 (0.0036) +[2024-06-18 04:27:31,996][12645] Fps is (10 sec: 37683.3, 60 sec: 40958.4, 300 sec: 42042.7). Total num frames: 952139776. Throughput: 0: 41805.1. Samples: 952245040. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) +[2024-06-18 04:27:31,996][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 04:27:33,241][12883] Updated weights for policy 0, policy_version 58121 (0.0029) +[2024-06-18 04:27:36,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42871.4, 300 sec: 42098.5). Total num frames: 952401920. Throughput: 0: 42012.3. Samples: 952498480. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 04:27:36,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 04:27:38,111][12883] Updated weights for policy 0, policy_version 58131 (0.0040) +[2024-06-18 04:27:41,133][12883] Updated weights for policy 0, policy_version 58141 (0.0030) +[2024-06-18 04:27:41,995][12645] Fps is (10 sec: 47518.4, 60 sec: 42324.4, 300 sec: 42153.9). Total num frames: 952614912. Throughput: 0: 42065.9. Samples: 952749760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 04:27:41,995][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 04:27:45,801][12883] Updated weights for policy 0, policy_version 58151 (0.0040) +[2024-06-18 04:27:46,994][12645] Fps is (10 sec: 37683.8, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 952778752. Throughput: 0: 41886.6. Samples: 952877700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 04:27:46,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 04:27:48,930][12883] Updated weights for policy 0, policy_version 58161 (0.0032) +[2024-06-18 04:27:51,996][12645] Fps is (10 sec: 40955.7, 60 sec: 42869.8, 300 sec: 42042.7). Total num frames: 953024512. Throughput: 0: 42119.6. Samples: 953129280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 04:27:51,997][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 04:27:53,711][12883] Updated weights for policy 0, policy_version 58171 (0.0031) +[2024-06-18 04:27:55,750][12862] Signal inference workers to stop experience collection... (13750 times) +[2024-06-18 04:27:55,785][12883] InferenceWorker_p0-w0: stopping experience collection (13750 times) +[2024-06-18 04:27:55,805][12862] Signal inference workers to resume experience collection... 
(13750 times) +[2024-06-18 04:27:55,808][12883] InferenceWorker_p0-w0: resuming experience collection (13750 times) +[2024-06-18 04:27:56,765][12883] Updated weights for policy 0, policy_version 58181 (0.0033) +[2024-06-18 04:27:56,996][12645] Fps is (10 sec: 45864.3, 60 sec: 42050.6, 300 sec: 42153.7). Total num frames: 953237504. Throughput: 0: 42008.9. Samples: 953379080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 04:27:56,997][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 04:28:01,479][12883] Updated weights for policy 0, policy_version 58191 (0.0039) +[2024-06-18 04:28:01,994][12645] Fps is (10 sec: 37691.4, 60 sec: 41506.0, 300 sec: 41987.5). Total num frames: 953401344. Throughput: 0: 41999.8. Samples: 953506540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 04:28:01,994][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 04:28:04,574][12883] Updated weights for policy 0, policy_version 58201 (0.0042) +[2024-06-18 04:28:06,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 953663488. Throughput: 0: 42054.0. Samples: 953760180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 04:28:06,994][12645] Avg episode reward: [(0, '0.055')] +[2024-06-18 04:28:09,287][12883] Updated weights for policy 0, policy_version 58211 (0.0029) +[2024-06-18 04:28:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 953860096. Throughput: 0: 42236.3. Samples: 954017380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 04:28:11,994][12645] Avg episode reward: [(0, '0.216')] +[2024-06-18 04:28:12,475][12883] Updated weights for policy 0, policy_version 58221 (0.0036) +[2024-06-18 04:28:16,994][12645] Fps is (10 sec: 37683.8, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 954040320. Throughput: 0: 42179.9. Samples: 954143040. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 04:28:16,994][12645] Avg episode reward: [(0, '0.061')] +[2024-06-18 04:28:17,331][12883] Updated weights for policy 0, policy_version 58231 (0.0027) +[2024-06-18 04:28:20,281][12883] Updated weights for policy 0, policy_version 58241 (0.0054) +[2024-06-18 04:28:22,000][12645] Fps is (10 sec: 44209.4, 60 sec: 42322.5, 300 sec: 42208.7). Total num frames: 954302464. Throughput: 0: 42095.1. Samples: 954393020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 04:28:22,001][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 04:28:25,156][12883] Updated weights for policy 0, policy_version 58251 (0.0048) +[2024-06-18 04:28:26,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42053.8, 300 sec: 42098.5). Total num frames: 954499072. Throughput: 0: 42129.6. Samples: 954645540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 04:28:26,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 04:28:28,034][12883] Updated weights for policy 0, policy_version 58261 (0.0041) +[2024-06-18 04:28:31,994][12645] Fps is (10 sec: 39346.6, 60 sec: 42600.0, 300 sec: 41987.5). Total num frames: 954695680. Throughput: 0: 41934.7. Samples: 954764760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 04:28:31,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 04:28:33,004][12883] Updated weights for policy 0, policy_version 58271 (0.0044) +[2024-06-18 04:28:35,610][12883] Updated weights for policy 0, policy_version 58281 (0.0033) +[2024-06-18 04:28:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42098.5). Total num frames: 954925056. Throughput: 0: 42029.3. Samples: 955020500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 04:28:36,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 04:28:37,138][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000058285_954941440.pth... 
+[2024-06-18 04:28:37,193][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057668_944832512.pth +[2024-06-18 04:28:40,517][12883] Updated weights for policy 0, policy_version 58291 (0.0032) +[2024-06-18 04:28:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41507.0, 300 sec: 41987.8). Total num frames: 955105280. Throughput: 0: 42331.1. Samples: 955283880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 04:28:41,994][12645] Avg episode reward: [(0, '0.209')] +[2024-06-18 04:28:43,285][12883] Updated weights for policy 0, policy_version 58301 (0.0035) +[2024-06-18 04:28:46,996][12645] Fps is (10 sec: 37674.5, 60 sec: 42050.6, 300 sec: 41931.6). Total num frames: 955301888. Throughput: 0: 42076.2. Samples: 955400060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 04:28:46,996][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 04:28:48,033][12883] Updated weights for policy 0, policy_version 58311 (0.0032) +[2024-06-18 04:28:51,554][12883] Updated weights for policy 0, policy_version 58321 (0.0035) +[2024-06-18 04:28:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42053.9, 300 sec: 42043.0). Total num frames: 955547648. Throughput: 0: 42120.2. Samples: 955655580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 04:28:51,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 04:28:55,901][12883] Updated weights for policy 0, policy_version 58331 (0.0029) +[2024-06-18 04:28:56,994][12645] Fps is (10 sec: 42608.5, 60 sec: 41507.8, 300 sec: 41931.9). Total num frames: 955727872. Throughput: 0: 42108.2. Samples: 955912240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 04:28:56,994][12645] Avg episode reward: [(0, '0.170')] +[2024-06-18 04:28:59,212][12883] Updated weights for policy 0, policy_version 58341 (0.0039) +[2024-06-18 04:29:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 955957248. Throughput: 0: 42037.7. Samples: 956034740. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 04:29:01,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 04:29:03,339][12883] Updated weights for policy 0, policy_version 58351 (0.0039) +[2024-06-18 04:29:06,990][12883] Updated weights for policy 0, policy_version 58361 (0.0023) +[2024-06-18 04:29:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 956186624. Throughput: 0: 42227.8. Samples: 956293000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 04:29:06,994][12645] Avg episode reward: [(0, '0.236')] +[2024-06-18 04:29:10,729][12862] Signal inference workers to stop experience collection... (13800 times) +[2024-06-18 04:29:10,729][12862] Signal inference workers to resume experience collection... (13800 times) +[2024-06-18 04:29:10,760][12883] InferenceWorker_p0-w0: stopping experience collection (13800 times) +[2024-06-18 04:29:10,764][12883] InferenceWorker_p0-w0: resuming experience collection (13800 times) +[2024-06-18 04:29:11,055][12883] Updated weights for policy 0, policy_version 58371 (0.0030) +[2024-06-18 04:29:11,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.5, 300 sec: 42043.0). Total num frames: 956399616. Throughput: 0: 42297.0. Samples: 956548900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 04:29:11,994][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 04:29:14,527][12883] Updated weights for policy 0, policy_version 58381 (0.0038) +[2024-06-18 04:29:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42043.3). Total num frames: 956596224. Throughput: 0: 42330.2. Samples: 956669620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 04:29:16,994][12645] Avg episode reward: [(0, '0.167')] +[2024-06-18 04:29:19,190][12883] Updated weights for policy 0, policy_version 58391 (0.0034) +[2024-06-18 04:29:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41783.6, 300 sec: 42043.0). Total num frames: 956809216. 
Throughput: 0: 42291.6. Samples: 956923620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 04:29:21,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 04:29:22,161][12883] Updated weights for policy 0, policy_version 58401 (0.0040) +[2024-06-18 04:29:26,740][12883] Updated weights for policy 0, policy_version 58411 (0.0030) +[2024-06-18 04:29:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 957005824. Throughput: 0: 42233.3. Samples: 957184380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 04:29:26,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 04:29:30,213][12883] Updated weights for policy 0, policy_version 58421 (0.0043) +[2024-06-18 04:29:31,996][12645] Fps is (10 sec: 44226.6, 60 sec: 42596.8, 300 sec: 42098.6). Total num frames: 957251584. Throughput: 0: 42240.5. Samples: 957300880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 04:29:31,996][12645] Avg episode reward: [(0, '0.224')] +[2024-06-18 04:29:34,327][12883] Updated weights for policy 0, policy_version 58431 (0.0034) +[2024-06-18 04:29:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 957448192. Throughput: 0: 42264.3. Samples: 957557480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 04:29:36,994][12645] Avg episode reward: [(0, '0.131')] +[2024-06-18 04:29:37,942][12883] Updated weights for policy 0, policy_version 58441 (0.0041) +[2024-06-18 04:29:41,923][12883] Updated weights for policy 0, policy_version 58451 (0.0038) +[2024-06-18 04:29:41,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 957661184. Throughput: 0: 42356.7. Samples: 957818300. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 04:29:41,994][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 04:29:45,534][12883] Updated weights for policy 0, policy_version 58461 (0.0045) +[2024-06-18 04:29:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42873.1, 300 sec: 42098.6). Total num frames: 957874176. Throughput: 0: 42430.7. Samples: 957944120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 04:29:46,994][12645] Avg episode reward: [(0, '0.268')] +[2024-06-18 04:29:49,474][12883] Updated weights for policy 0, policy_version 58471 (0.0024) +[2024-06-18 04:29:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 958087168. Throughput: 0: 42423.9. Samples: 958202080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 04:29:51,994][12645] Avg episode reward: [(0, '0.180')] +[2024-06-18 04:29:53,410][12883] Updated weights for policy 0, policy_version 58481 (0.0039) +[2024-06-18 04:29:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 958300160. Throughput: 0: 42401.3. Samples: 958456960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 04:29:56,994][12645] Avg episode reward: [(0, '0.105')] +[2024-06-18 04:29:57,312][12883] Updated weights for policy 0, policy_version 58491 (0.0035) +[2024-06-18 04:30:00,966][12883] Updated weights for policy 0, policy_version 58501 (0.0030) +[2024-06-18 04:30:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 958496768. Throughput: 0: 42475.1. Samples: 958581000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:30:01,994][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 04:30:05,171][12883] Updated weights for policy 0, policy_version 58511 (0.0022) +[2024-06-18 04:30:06,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42323.7, 300 sec: 42153.8). Total num frames: 958726144. Throughput: 0: 42408.4. Samples: 958832100. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:30:06,996][12645] Avg episode reward: [(0, '0.050')] +[2024-06-18 04:30:08,631][12883] Updated weights for policy 0, policy_version 58521 (0.0038) +[2024-06-18 04:30:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 958922752. Throughput: 0: 42316.1. Samples: 959088600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:30:11,994][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 04:30:13,005][12883] Updated weights for policy 0, policy_version 58531 (0.0028) +[2024-06-18 04:30:16,584][12883] Updated weights for policy 0, policy_version 58541 (0.0038) +[2024-06-18 04:30:16,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 959135744. Throughput: 0: 42435.0. Samples: 959210360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:30:16,994][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 04:30:20,595][12883] Updated weights for policy 0, policy_version 58551 (0.0027) +[2024-06-18 04:30:21,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42323.7, 300 sec: 42098.2). Total num frames: 959348736. Throughput: 0: 42353.1. Samples: 959463460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:30:21,996][12645] Avg episode reward: [(0, '0.168')] +[2024-06-18 04:30:24,290][12883] Updated weights for policy 0, policy_version 58561 (0.0026) +[2024-06-18 04:30:26,995][12645] Fps is (10 sec: 40953.9, 60 sec: 42324.3, 300 sec: 42098.3). Total num frames: 959545344. Throughput: 0: 42343.1. Samples: 959723800. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:30:26,996][12645] Avg episode reward: [(0, '0.156')] +[2024-06-18 04:30:28,438][12883] Updated weights for policy 0, policy_version 58571 (0.0045) +[2024-06-18 04:30:31,942][12883] Updated weights for policy 0, policy_version 58581 (0.0051) +[2024-06-18 04:30:31,994][12645] Fps is (10 sec: 44245.9, 60 sec: 42326.8, 300 sec: 42209.6). Total num frames: 959791104. Throughput: 0: 42209.7. Samples: 959843560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:30:31,995][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 04:30:36,158][12883] Updated weights for policy 0, policy_version 58591 (0.0042) +[2024-06-18 04:30:36,996][12645] Fps is (10 sec: 42595.4, 60 sec: 42050.7, 300 sec: 42098.2). Total num frames: 959971328. Throughput: 0: 42177.5. Samples: 960100160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 04:30:36,996][12645] Avg episode reward: [(0, '0.159')] +[2024-06-18 04:30:37,028][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000058593_959987712.pth... +[2024-06-18 04:30:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057974_949846016.pth +[2024-06-18 04:30:39,583][12883] Updated weights for policy 0, policy_version 58601 (0.0037) +[2024-06-18 04:30:41,994][12645] Fps is (10 sec: 37684.0, 60 sec: 41779.3, 300 sec: 42043.4). Total num frames: 960167936. Throughput: 0: 42246.3. Samples: 960358040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 04:30:41,994][12645] Avg episode reward: [(0, '0.120')] +[2024-06-18 04:30:42,400][12862] Signal inference workers to stop experience collection... (13850 times) +[2024-06-18 04:30:42,400][12862] Signal inference workers to resume experience collection... 
(13850 times) +[2024-06-18 04:30:42,420][12883] InferenceWorker_p0-w0: stopping experience collection (13850 times) +[2024-06-18 04:30:42,420][12883] InferenceWorker_p0-w0: resuming experience collection (13850 times) +[2024-06-18 04:30:44,425][12883] Updated weights for policy 0, policy_version 58611 (0.0029) +[2024-06-18 04:30:46,996][12645] Fps is (10 sec: 44236.6, 60 sec: 42323.7, 300 sec: 42153.8). Total num frames: 960413696. Throughput: 0: 42286.8. Samples: 960484000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 04:30:46,996][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 04:30:47,441][12883] Updated weights for policy 0, policy_version 58621 (0.0036) +[2024-06-18 04:30:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 960593920. Throughput: 0: 42244.3. Samples: 960733000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 04:30:51,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 04:30:52,007][12883] Updated weights for policy 0, policy_version 58631 (0.0034) +[2024-06-18 04:30:55,059][12883] Updated weights for policy 0, policy_version 58641 (0.0022) +[2024-06-18 04:30:56,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 960823296. Throughput: 0: 42114.2. Samples: 960983740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 04:30:56,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 04:30:59,792][12883] Updated weights for policy 0, policy_version 58651 (0.0038) +[2024-06-18 04:31:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42154.4). Total num frames: 961036288. Throughput: 0: 42211.6. Samples: 961109880. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 04:31:01,994][12645] Avg episode reward: [(0, '0.150')] +[2024-06-18 04:31:02,889][12883] Updated weights for policy 0, policy_version 58661 (0.0032) +[2024-06-18 04:31:06,996][12645] Fps is (10 sec: 40950.6, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 961232896. Throughput: 0: 42235.9. Samples: 961364080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 04:31:06,997][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 04:31:07,447][12883] Updated weights for policy 0, policy_version 58671 (0.0028) +[2024-06-18 04:31:10,732][12883] Updated weights for policy 0, policy_version 58681 (0.0032) +[2024-06-18 04:31:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 961462272. Throughput: 0: 41838.3. Samples: 961606460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 04:31:11,994][12645] Avg episode reward: [(0, '0.154')] +[2024-06-18 04:31:15,131][12883] Updated weights for policy 0, policy_version 58691 (0.0035) +[2024-06-18 04:31:16,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 961658880. Throughput: 0: 42176.1. Samples: 961741480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) +[2024-06-18 04:31:16,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 04:31:18,721][12883] Updated weights for policy 0, policy_version 58701 (0.0030) +[2024-06-18 04:31:22,000][12645] Fps is (10 sec: 40934.3, 60 sec: 42049.4, 300 sec: 42208.7). Total num frames: 961871872. Throughput: 0: 42065.1. Samples: 961993260. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) +[2024-06-18 04:31:22,001][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 04:31:23,087][12883] Updated weights for policy 0, policy_version 58711 (0.0030) +[2024-06-18 04:31:26,575][12883] Updated weights for policy 0, policy_version 58721 (0.0035) +[2024-06-18 04:31:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42326.3, 300 sec: 42043.0). Total num frames: 962084864. Throughput: 0: 41899.4. Samples: 962243520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) +[2024-06-18 04:31:26,994][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 04:31:30,746][12883] Updated weights for policy 0, policy_version 58731 (0.0027) +[2024-06-18 04:31:31,994][12645] Fps is (10 sec: 42625.3, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 962297856. Throughput: 0: 42031.0. Samples: 962375300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) +[2024-06-18 04:31:31,994][12645] Avg episode reward: [(0, '0.185')] +[2024-06-18 04:31:34,714][12883] Updated weights for policy 0, policy_version 58741 (0.0031) +[2024-06-18 04:31:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42599.9, 300 sec: 42209.6). Total num frames: 962527232. Throughput: 0: 42042.1. Samples: 962624900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) +[2024-06-18 04:31:36,994][12645] Avg episode reward: [(0, '0.146')] +[2024-06-18 04:31:38,359][12883] Updated weights for policy 0, policy_version 58751 (0.0022) +[2024-06-18 04:31:41,996][12645] Fps is (10 sec: 39312.7, 60 sec: 42050.6, 300 sec: 41987.2). Total num frames: 962691072. Throughput: 0: 42203.2. Samples: 962882980. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) +[2024-06-18 04:31:41,996][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 04:31:42,417][12883] Updated weights for policy 0, policy_version 58761 (0.0037) +[2024-06-18 04:31:46,093][12883] Updated weights for policy 0, policy_version 58771 (0.0032) +[2024-06-18 04:31:46,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42053.8, 300 sec: 42320.7). Total num frames: 962936832. Throughput: 0: 42057.3. Samples: 963002460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) +[2024-06-18 04:31:46,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 04:31:50,202][12883] Updated weights for policy 0, policy_version 58781 (0.0032) +[2024-06-18 04:31:52,000][12645] Fps is (10 sec: 45856.8, 60 sec: 42594.0, 300 sec: 42153.2). Total num frames: 963149824. Throughput: 0: 42144.7. Samples: 963260760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:31:52,000][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 04:31:53,757][12883] Updated weights for policy 0, policy_version 58791 (0.0036) +[2024-06-18 04:31:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 963346432. Throughput: 0: 42495.2. Samples: 963518740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:31:56,994][12645] Avg episode reward: [(0, '0.132')] +[2024-06-18 04:31:58,143][12883] Updated weights for policy 0, policy_version 58801 (0.0029) +[2024-06-18 04:31:58,559][12862] Signal inference workers to stop experience collection... (13900 times) +[2024-06-18 04:31:58,616][12883] InferenceWorker_p0-w0: stopping experience collection (13900 times) +[2024-06-18 04:31:58,616][12862] Signal inference workers to resume experience collection... 
(13900 times) +[2024-06-18 04:31:58,634][12883] InferenceWorker_p0-w0: resuming experience collection (13900 times) +[2024-06-18 04:32:01,401][12883] Updated weights for policy 0, policy_version 58811 (0.0043) +[2024-06-18 04:32:01,994][12645] Fps is (10 sec: 42624.8, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 963575808. Throughput: 0: 42094.2. Samples: 963635720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:32:01,994][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 04:32:05,924][12883] Updated weights for policy 0, policy_version 58821 (0.0029) +[2024-06-18 04:32:06,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42598.4, 300 sec: 42209.3). Total num frames: 963788800. Throughput: 0: 42158.8. Samples: 963890240. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:32:06,997][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 04:32:09,228][12883] Updated weights for policy 0, policy_version 58831 (0.0043) +[2024-06-18 04:32:11,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 963952640. Throughput: 0: 42208.4. Samples: 964142900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:32:11,994][12645] Avg episode reward: [(0, '0.111')] +[2024-06-18 04:32:13,541][12883] Updated weights for policy 0, policy_version 58841 (0.0033) +[2024-06-18 04:32:16,887][12883] Updated weights for policy 0, policy_version 58851 (0.0042) +[2024-06-18 04:32:16,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42598.4, 300 sec: 42209.9). Total num frames: 964214784. Throughput: 0: 41899.1. Samples: 964260760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:32:16,994][12645] Avg episode reward: [(0, '0.182')] +[2024-06-18 04:32:21,237][12883] Updated weights for policy 0, policy_version 58861 (0.0036) +[2024-06-18 04:32:21,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42329.8, 300 sec: 42154.4). Total num frames: 964411392. Throughput: 0: 42188.1. Samples: 964523360. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:32:21,994][12645] Avg episode reward: [(0, '0.131')] +[2024-06-18 04:32:24,416][12883] Updated weights for policy 0, policy_version 58871 (0.0031) +[2024-06-18 04:32:26,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 42209.9). Total num frames: 964591616. Throughput: 0: 41953.6. Samples: 964770800. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:32:26,994][12645] Avg episode reward: [(0, '0.129')] +[2024-06-18 04:32:29,010][12883] Updated weights for policy 0, policy_version 58881 (0.0032) +[2024-06-18 04:32:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 964820992. Throughput: 0: 42016.5. Samples: 964893200. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) +[2024-06-18 04:32:31,994][12645] Avg episode reward: [(0, '0.117')] +[2024-06-18 04:32:32,701][12883] Updated weights for policy 0, policy_version 58891 (0.0039) +[2024-06-18 04:32:36,854][12883] Updated weights for policy 0, policy_version 58901 (0.0037) +[2024-06-18 04:32:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41779.3, 300 sec: 42098.7). Total num frames: 965033984. Throughput: 0: 42022.8. Samples: 965151520. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) +[2024-06-18 04:32:36,994][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 04:32:37,027][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000058901_965033984.pth... +[2024-06-18 04:32:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000058285_954941440.pth +[2024-06-18 04:32:40,558][12883] Updated weights for policy 0, policy_version 58911 (0.0052) +[2024-06-18 04:32:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42326.9, 300 sec: 42209.6). Total num frames: 965230592. Throughput: 0: 41754.2. Samples: 965397680. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) +[2024-06-18 04:32:41,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 04:32:44,558][12883] Updated weights for policy 0, policy_version 58921 (0.0038) +[2024-06-18 04:32:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42154.4). Total num frames: 965459968. Throughput: 0: 41780.5. Samples: 965515840. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) +[2024-06-18 04:32:46,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 04:32:48,462][12883] Updated weights for policy 0, policy_version 58931 (0.0040) +[2024-06-18 04:32:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41510.5, 300 sec: 42043.3). Total num frames: 965640192. Throughput: 0: 41862.2. Samples: 965773940. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) +[2024-06-18 04:32:51,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 04:32:52,767][12883] Updated weights for policy 0, policy_version 58941 (0.0026) +[2024-06-18 04:32:56,561][12883] Updated weights for policy 0, policy_version 58951 (0.0037) +[2024-06-18 04:32:56,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 965853184. Throughput: 0: 41636.5. Samples: 966016540. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) +[2024-06-18 04:32:56,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 04:33:00,442][12883] Updated weights for policy 0, policy_version 58961 (0.0037) +[2024-06-18 04:33:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 966066176. Throughput: 0: 41873.9. Samples: 966145080. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) +[2024-06-18 04:33:01,994][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 04:33:04,535][12883] Updated weights for policy 0, policy_version 58971 (0.0036) +[2024-06-18 04:33:06,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41234.7, 300 sec: 42043.0). Total num frames: 966262784. Throughput: 0: 41530.2. Samples: 966392220. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 04:33:06,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 04:33:08,437][12883] Updated weights for policy 0, policy_version 58981 (0.0028) +[2024-06-18 04:33:11,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42323.9, 300 sec: 42209.3). Total num frames: 966492160. Throughput: 0: 41592.7. Samples: 966642560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 04:33:11,996][12645] Avg episode reward: [(0, '0.124')] +[2024-06-18 04:33:12,126][12883] Updated weights for policy 0, policy_version 58991 (0.0026) +[2024-06-18 04:33:15,997][12883] Updated weights for policy 0, policy_version 59001 (0.0027) +[2024-06-18 04:33:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 41932.8). Total num frames: 966672384. Throughput: 0: 41789.3. Samples: 966773720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 04:33:16,994][12645] Avg episode reward: [(0, '0.150')] +[2024-06-18 04:33:17,000][12862] Signal inference workers to stop experience collection... (13950 times) +[2024-06-18 04:33:17,010][12883] InferenceWorker_p0-w0: stopping experience collection (13950 times) +[2024-06-18 04:33:17,055][12862] Signal inference workers to resume experience collection... (13950 times) +[2024-06-18 04:33:17,056][12883] InferenceWorker_p0-w0: resuming experience collection (13950 times) +[2024-06-18 04:33:19,851][12883] Updated weights for policy 0, policy_version 59011 (0.0042) +[2024-06-18 04:33:21,994][12645] Fps is (10 sec: 37691.5, 60 sec: 40960.0, 300 sec: 41931.9). Total num frames: 966868992. Throughput: 0: 41519.1. Samples: 967019880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 04:33:21,994][12645] Avg episode reward: [(0, '0.183')] +[2024-06-18 04:33:23,722][12883] Updated weights for policy 0, policy_version 59021 (0.0034) +[2024-06-18 04:33:26,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42050.7, 300 sec: 42098.2). Total num frames: 967114752. 
Throughput: 0: 41616.6. Samples: 967270520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 04:33:26,996][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 04:33:28,037][12883] Updated weights for policy 0, policy_version 59031 (0.0035) +[2024-06-18 04:33:31,810][12883] Updated weights for policy 0, policy_version 59041 (0.0029) +[2024-06-18 04:33:31,994][12645] Fps is (10 sec: 45874.6, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 967327744. Throughput: 0: 41971.0. Samples: 967404540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 04:33:31,994][12645] Avg episode reward: [(0, '0.214')] +[2024-06-18 04:33:35,555][12883] Updated weights for policy 0, policy_version 59051 (0.0042) +[2024-06-18 04:33:36,996][12645] Fps is (10 sec: 40960.2, 60 sec: 41504.5, 300 sec: 42098.2). Total num frames: 967524352. Throughput: 0: 41688.5. Samples: 967650020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 04:33:36,996][12645] Avg episode reward: [(0, '0.214')] +[2024-06-18 04:33:39,527][12883] Updated weights for policy 0, policy_version 59061 (0.0029) +[2024-06-18 04:33:42,000][12645] Fps is (10 sec: 42572.0, 60 sec: 42047.9, 300 sec: 42209.1). Total num frames: 967753728. Throughput: 0: 41913.3. Samples: 967902900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 04:33:42,001][12645] Avg episode reward: [(0, '0.213')] +[2024-06-18 04:33:43,462][12883] Updated weights for policy 0, policy_version 59071 (0.0046) +[2024-06-18 04:33:46,994][12645] Fps is (10 sec: 44246.4, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 967966720. Throughput: 0: 41919.8. Samples: 968031480. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 04:33:46,994][12645] Avg episode reward: [(0, '0.145')] +[2024-06-18 04:33:47,160][12883] Updated weights for policy 0, policy_version 59081 (0.0052) +[2024-06-18 04:33:51,606][12883] Updated weights for policy 0, policy_version 59091 (0.0037) +[2024-06-18 04:33:51,994][12645] Fps is (10 sec: 40985.4, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 968163328. Throughput: 0: 42015.9. Samples: 968282940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 04:33:51,994][12645] Avg episode reward: [(0, '0.149')] +[2024-06-18 04:33:55,012][12883] Updated weights for policy 0, policy_version 59101 (0.0036) +[2024-06-18 04:33:56,996][12645] Fps is (10 sec: 42590.4, 60 sec: 42324.0, 300 sec: 42153.8). Total num frames: 968392704. Throughput: 0: 41839.3. Samples: 968525320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 04:33:56,996][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 04:33:59,517][12883] Updated weights for policy 0, policy_version 59111 (0.0044) +[2024-06-18 04:34:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 968572928. Throughput: 0: 41861.8. Samples: 968657500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 04:34:01,994][12645] Avg episode reward: [(0, '0.178')] +[2024-06-18 04:34:02,789][12883] Updated weights for policy 0, policy_version 59121 (0.0027) +[2024-06-18 04:34:07,000][12645] Fps is (10 sec: 39304.6, 60 sec: 42047.9, 300 sec: 41986.6). Total num frames: 968785920. Throughput: 0: 41980.8. Samples: 968909280. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 04:34:07,000][12645] Avg episode reward: [(0, '0.156')] +[2024-06-18 04:34:07,304][12883] Updated weights for policy 0, policy_version 59131 (0.0026) +[2024-06-18 04:34:10,690][12883] Updated weights for policy 0, policy_version 59141 (0.0029) +[2024-06-18 04:34:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42053.7, 300 sec: 42098.5). Total num frames: 969015296. Throughput: 0: 41749.6. Samples: 969149160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 04:34:11,995][12645] Avg episode reward: [(0, '0.197')] +[2024-06-18 04:34:15,462][12883] Updated weights for policy 0, policy_version 59151 (0.0042) +[2024-06-18 04:34:17,000][12645] Fps is (10 sec: 40960.0, 60 sec: 42047.9, 300 sec: 41986.6). Total num frames: 969195520. Throughput: 0: 41757.8. Samples: 969283900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 04:34:17,000][12645] Avg episode reward: [(0, '0.116')] +[2024-06-18 04:34:18,605][12883] Updated weights for policy 0, policy_version 59161 (0.0025) +[2024-06-18 04:34:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 969408512. Throughput: 0: 41733.6. Samples: 969527940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) +[2024-06-18 04:34:21,994][12645] Avg episode reward: [(0, '0.160')] +[2024-06-18 04:34:23,351][12883] Updated weights for policy 0, policy_version 59171 (0.0024) +[2024-06-18 04:34:25,321][12862] Signal inference workers to stop experience collection... (14000 times) +[2024-06-18 04:34:25,321][12862] Signal inference workers to resume experience collection... 
(14000 times) +[2024-06-18 04:34:25,362][12883] InferenceWorker_p0-w0: stopping experience collection (14000 times) +[2024-06-18 04:34:25,362][12883] InferenceWorker_p0-w0: resuming experience collection (14000 times) +[2024-06-18 04:34:26,537][12883] Updated weights for policy 0, policy_version 59181 (0.0040) +[2024-06-18 04:34:26,994][12645] Fps is (10 sec: 44264.3, 60 sec: 42053.9, 300 sec: 41987.8). Total num frames: 969637888. Throughput: 0: 41633.3. Samples: 969776140. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) +[2024-06-18 04:34:26,994][12645] Avg episode reward: [(0, '0.267')] +[2024-06-18 04:34:31,187][12883] Updated weights for policy 0, policy_version 59191 (0.0033) +[2024-06-18 04:34:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 969818112. Throughput: 0: 41682.6. Samples: 969907200. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) +[2024-06-18 04:34:31,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 04:34:34,375][12883] Updated weights for policy 0, policy_version 59201 (0.0026) +[2024-06-18 04:34:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42053.8, 300 sec: 41987.5). Total num frames: 970047488. Throughput: 0: 41374.7. Samples: 970144800. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) +[2024-06-18 04:34:36,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 04:34:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000059207_970047488.pth... +[2024-06-18 04:34:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000058593_959987712.pth +[2024-06-18 04:34:39,302][12883] Updated weights for policy 0, policy_version 59211 (0.0030) +[2024-06-18 04:34:41,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41510.5, 300 sec: 41931.9). Total num frames: 970244096. Throughput: 0: 41632.0. Samples: 970398680. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) +[2024-06-18 04:34:41,994][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 04:34:42,606][12883] Updated weights for policy 0, policy_version 59221 (0.0029) +[2024-06-18 04:34:46,994][12645] Fps is (10 sec: 37683.6, 60 sec: 40960.1, 300 sec: 41820.9). Total num frames: 970424320. Throughput: 0: 41386.7. Samples: 970519900. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) +[2024-06-18 04:34:46,994][12645] Avg episode reward: [(0, '0.075')] +[2024-06-18 04:34:47,707][12883] Updated weights for policy 0, policy_version 59231 (0.0037) +[2024-06-18 04:34:50,404][12883] Updated weights for policy 0, policy_version 59241 (0.0036) +[2024-06-18 04:34:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 970686464. Throughput: 0: 41343.6. Samples: 970769480. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) +[2024-06-18 04:34:51,994][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 04:34:55,399][12883] Updated weights for policy 0, policy_version 59251 (0.0029) +[2024-06-18 04:34:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40961.4, 300 sec: 41876.4). Total num frames: 970850304. Throughput: 0: 41739.2. Samples: 971027420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:34:56,994][12645] Avg episode reward: [(0, '0.105')] +[2024-06-18 04:34:58,097][12883] Updated weights for policy 0, policy_version 59261 (0.0036) +[2024-06-18 04:35:01,994][12645] Fps is (10 sec: 37682.9, 60 sec: 41506.1, 300 sec: 41821.2). Total num frames: 971063296. Throughput: 0: 41347.5. Samples: 971144280. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:35:01,994][12645] Avg episode reward: [(0, '0.056')] +[2024-06-18 04:35:03,280][12883] Updated weights for policy 0, policy_version 59271 (0.0031) +[2024-06-18 04:35:06,091][12883] Updated weights for policy 0, policy_version 59281 (0.0044) +[2024-06-18 04:35:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41783.6, 300 sec: 41931.9). Total num frames: 971292672. Throughput: 0: 41561.8. Samples: 971398220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:35:06,994][12645] Avg episode reward: [(0, '0.065')] +[2024-06-18 04:35:10,948][12883] Updated weights for policy 0, policy_version 59291 (0.0028) +[2024-06-18 04:35:11,995][12645] Fps is (10 sec: 40954.3, 60 sec: 40959.1, 300 sec: 41820.7). Total num frames: 971472896. Throughput: 0: 41616.0. Samples: 971648920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:35:11,995][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 04:35:13,967][12883] Updated weights for policy 0, policy_version 59301 (0.0041) +[2024-06-18 04:35:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41783.6, 300 sec: 41876.7). Total num frames: 971702272. Throughput: 0: 41446.8. Samples: 971772300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:35:16,994][12645] Avg episode reward: [(0, '0.217')] +[2024-06-18 04:35:18,681][12883] Updated weights for policy 0, policy_version 59311 (0.0038) +[2024-06-18 04:35:21,994][12645] Fps is (10 sec: 42604.4, 60 sec: 41506.1, 300 sec: 41876.6). Total num frames: 971898880. Throughput: 0: 41867.2. Samples: 972028820. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:35:21,994][12645] Avg episode reward: [(0, '0.268')] +[2024-06-18 04:35:22,084][12883] Updated weights for policy 0, policy_version 59321 (0.0043) +[2024-06-18 04:35:26,373][12883] Updated weights for policy 0, policy_version 59331 (0.0036) +[2024-06-18 04:35:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.2, 300 sec: 41765.3). Total num frames: 972111872. Throughput: 0: 41858.8. Samples: 972282320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:35:26,994][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 04:35:29,785][12883] Updated weights for policy 0, policy_version 59341 (0.0027) +[2024-06-18 04:35:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.4, 300 sec: 41932.2). Total num frames: 972341248. Throughput: 0: 41860.8. Samples: 972403640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) +[2024-06-18 04:35:31,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 04:35:33,963][12883] Updated weights for policy 0, policy_version 59351 (0.0034) +[2024-06-18 04:35:36,354][12862] Signal inference workers to stop experience collection... (14050 times) +[2024-06-18 04:35:36,393][12883] InferenceWorker_p0-w0: stopping experience collection (14050 times) +[2024-06-18 04:35:36,401][12862] Signal inference workers to resume experience collection... (14050 times) +[2024-06-18 04:35:36,414][12883] InferenceWorker_p0-w0: resuming experience collection (14050 times) +[2024-06-18 04:35:36,996][12645] Fps is (10 sec: 44226.5, 60 sec: 41777.7, 300 sec: 41987.1). Total num frames: 972554240. Throughput: 0: 41950.3. Samples: 972657340. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) +[2024-06-18 04:35:36,996][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 04:35:37,458][12883] Updated weights for policy 0, policy_version 59361 (0.0032) +[2024-06-18 04:35:41,989][12883] Updated weights for policy 0, policy_version 59371 (0.0037) +[2024-06-18 04:35:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41765.6). Total num frames: 972734464. Throughput: 0: 41856.9. Samples: 972910980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) +[2024-06-18 04:35:41,994][12645] Avg episode reward: [(0, '0.098')] +[2024-06-18 04:35:45,198][12883] Updated weights for policy 0, policy_version 59381 (0.0032) +[2024-06-18 04:35:46,994][12645] Fps is (10 sec: 42607.5, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 972980224. Throughput: 0: 41935.5. Samples: 973031380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) +[2024-06-18 04:35:46,994][12645] Avg episode reward: [(0, '0.129')] +[2024-06-18 04:35:49,627][12883] Updated weights for policy 0, policy_version 59391 (0.0023) +[2024-06-18 04:35:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 973176832. Throughput: 0: 42030.7. Samples: 973289600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) +[2024-06-18 04:35:51,994][12645] Avg episode reward: [(0, '0.101')] +[2024-06-18 04:35:53,117][12883] Updated weights for policy 0, policy_version 59401 (0.0031) +[2024-06-18 04:35:56,994][12645] Fps is (10 sec: 39319.7, 60 sec: 42051.8, 300 sec: 41820.8). Total num frames: 973373440. Throughput: 0: 41999.9. Samples: 973538880. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) +[2024-06-18 04:35:56,995][12645] Avg episode reward: [(0, '0.085')] +[2024-06-18 04:35:57,299][12883] Updated weights for policy 0, policy_version 59411 (0.0027) +[2024-06-18 04:36:00,866][12883] Updated weights for policy 0, policy_version 59421 (0.0034) +[2024-06-18 04:36:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 41932.3). Total num frames: 973602816. Throughput: 0: 42076.0. Samples: 973665720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) +[2024-06-18 04:36:01,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 04:36:05,228][12883] Updated weights for policy 0, policy_version 59431 (0.0044) +[2024-06-18 04:36:06,994][12645] Fps is (10 sec: 40962.1, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 973783040. Throughput: 0: 41890.6. Samples: 973913900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) +[2024-06-18 04:36:06,994][12645] Avg episode reward: [(0, '0.152')] +[2024-06-18 04:36:08,511][12883] Updated weights for policy 0, policy_version 59441 (0.0033) +[2024-06-18 04:36:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42053.2, 300 sec: 41820.9). Total num frames: 973996032. Throughput: 0: 41945.2. Samples: 974169860. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) +[2024-06-18 04:36:11,994][12645] Avg episode reward: [(0, '0.120')] +[2024-06-18 04:36:12,847][12883] Updated weights for policy 0, policy_version 59451 (0.0030) +[2024-06-18 04:36:16,262][12883] Updated weights for policy 0, policy_version 59461 (0.0033) +[2024-06-18 04:36:16,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 41932.8). Total num frames: 974241792. Throughput: 0: 42104.4. Samples: 974298340. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) +[2024-06-18 04:36:16,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 04:36:20,862][12883] Updated weights for policy 0, policy_version 59471 (0.0029) +[2024-06-18 04:36:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 974422016. Throughput: 0: 42164.4. Samples: 974554640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) +[2024-06-18 04:36:21,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 04:36:23,991][12883] Updated weights for policy 0, policy_version 59481 (0.0031) +[2024-06-18 04:36:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 974651392. Throughput: 0: 42029.7. Samples: 974802320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) +[2024-06-18 04:36:26,994][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 04:36:28,586][12883] Updated weights for policy 0, policy_version 59491 (0.0042) +[2024-06-18 04:36:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 974848000. Throughput: 0: 42129.5. Samples: 974927200. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) +[2024-06-18 04:36:31,994][12645] Avg episode reward: [(0, '0.098')] +[2024-06-18 04:36:32,017][12883] Updated weights for policy 0, policy_version 59501 (0.0050) +[2024-06-18 04:36:32,714][12862] Signal inference workers to stop experience collection... (14100 times) +[2024-06-18 04:36:32,714][12862] Signal inference workers to resume experience collection... (14100 times) +[2024-06-18 04:36:32,728][12883] InferenceWorker_p0-w0: stopping experience collection (14100 times) +[2024-06-18 04:36:32,728][12883] InferenceWorker_p0-w0: resuming experience collection (14100 times) +[2024-06-18 04:36:36,206][12883] Updated weights for policy 0, policy_version 59511 (0.0036) +[2024-06-18 04:36:36,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41507.7, 300 sec: 41876.7). Total num frames: 975044608. 
Throughput: 0: 41953.8. Samples: 975177520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) +[2024-06-18 04:36:36,994][12645] Avg episode reward: [(0, '0.133')] +[2024-06-18 04:36:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000059512_975044608.pth... +[2024-06-18 04:36:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000058901_965033984.pth +[2024-06-18 04:36:39,744][12883] Updated weights for policy 0, policy_version 59521 (0.0038) +[2024-06-18 04:36:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 975273984. Throughput: 0: 41930.3. Samples: 975425720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) +[2024-06-18 04:36:41,994][12645] Avg episode reward: [(0, '0.123')] +[2024-06-18 04:36:43,921][12883] Updated weights for policy 0, policy_version 59531 (0.0035) +[2024-06-18 04:36:46,996][12645] Fps is (10 sec: 42588.4, 60 sec: 41504.6, 300 sec: 41765.9). Total num frames: 975470592. Throughput: 0: 42023.6. Samples: 975556880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:36:46,997][12645] Avg episode reward: [(0, '0.093')] +[2024-06-18 04:36:47,297][12883] Updated weights for policy 0, policy_version 59541 (0.0033) +[2024-06-18 04:36:51,530][12883] Updated weights for policy 0, policy_version 59551 (0.0028) +[2024-06-18 04:36:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.1, 300 sec: 41876.4). Total num frames: 975699968. Throughput: 0: 42243.1. Samples: 975814840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:36:51,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 04:36:55,498][12883] Updated weights for policy 0, policy_version 59561 (0.0031) +[2024-06-18 04:36:56,994][12645] Fps is (10 sec: 44247.0, 60 sec: 42325.8, 300 sec: 41820.9). Total num frames: 975912960. Throughput: 0: 41823.1. Samples: 976051900. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:36:56,994][12645] Avg episode reward: [(0, '0.337')] +[2024-06-18 04:36:59,717][12883] Updated weights for policy 0, policy_version 59571 (0.0031) +[2024-06-18 04:37:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 41710.1). Total num frames: 976093184. Throughput: 0: 41960.9. Samples: 976186580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:37:01,994][12645] Avg episode reward: [(0, '0.099')] +[2024-06-18 04:37:03,206][12883] Updated weights for policy 0, policy_version 59581 (0.0039) +[2024-06-18 04:37:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 976306176. Throughput: 0: 41858.7. Samples: 976438280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:37:06,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 04:37:07,307][12883] Updated weights for policy 0, policy_version 59591 (0.0031) +[2024-06-18 04:37:10,860][12883] Updated weights for policy 0, policy_version 59601 (0.0041) +[2024-06-18 04:37:11,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 41876.4). Total num frames: 976568320. Throughput: 0: 41793.3. Samples: 976683020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:37:11,994][12645] Avg episode reward: [(0, '0.278')] +[2024-06-18 04:37:15,280][12883] Updated weights for policy 0, policy_version 59611 (0.0036) +[2024-06-18 04:37:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 976732160. Throughput: 0: 42187.1. Samples: 976825620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:37:16,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 04:37:18,434][12883] Updated weights for policy 0, policy_version 59621 (0.0033) +[2024-06-18 04:37:21,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 976945152. Throughput: 0: 42216.0. Samples: 977077240. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 04:37:21,994][12645] Avg episode reward: [(0, '0.276')] +[2024-06-18 04:37:22,825][12883] Updated weights for policy 0, policy_version 59631 (0.0028) +[2024-06-18 04:37:26,148][12883] Updated weights for policy 0, policy_version 59641 (0.0044) +[2024-06-18 04:37:26,994][12645] Fps is (10 sec: 47511.7, 60 sec: 42598.1, 300 sec: 41987.4). Total num frames: 977207296. Throughput: 0: 42158.8. Samples: 977322880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 04:37:26,995][12645] Avg episode reward: [(0, '0.156')] +[2024-06-18 04:37:30,890][12883] Updated weights for policy 0, policy_version 59651 (0.0044) +[2024-06-18 04:37:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 977354752. Throughput: 0: 42218.2. Samples: 977456600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 04:37:31,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 04:37:33,883][12883] Updated weights for policy 0, policy_version 59661 (0.0028) +[2024-06-18 04:37:36,994][12645] Fps is (10 sec: 37684.4, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 977584128. Throughput: 0: 41931.6. Samples: 977701760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 04:37:36,994][12645] Avg episode reward: [(0, '0.059')] +[2024-06-18 04:37:38,907][12883] Updated weights for policy 0, policy_version 59671 (0.0024) +[2024-06-18 04:37:41,642][12883] Updated weights for policy 0, policy_version 59681 (0.0039) +[2024-06-18 04:37:41,994][12645] Fps is (10 sec: 47513.1, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 977829888. Throughput: 0: 42236.9. Samples: 977952560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 04:37:41,994][12645] Avg episode reward: [(0, '0.050')] +[2024-06-18 04:37:46,958][12883] Updated weights for policy 0, policy_version 59691 (0.0042) +[2024-06-18 04:37:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41780.8, 300 sec: 41820.8). Total num frames: 977977344. Throughput: 0: 42095.5. Samples: 978080880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 04:37:46,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 04:37:48,853][12862] Signal inference workers to stop experience collection... (14150 times) +[2024-06-18 04:37:48,853][12862] Signal inference workers to resume experience collection... (14150 times) +[2024-06-18 04:37:48,894][12883] InferenceWorker_p0-w0: stopping experience collection (14150 times) +[2024-06-18 04:37:48,894][12883] InferenceWorker_p0-w0: resuming experience collection (14150 times) +[2024-06-18 04:37:49,482][12883] Updated weights for policy 0, policy_version 59701 (0.0027) +[2024-06-18 04:37:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 978223104. Throughput: 0: 41899.8. Samples: 978323780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 04:37:51,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 04:37:54,584][12883] Updated weights for policy 0, policy_version 59711 (0.0035) +[2024-06-18 04:37:56,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 978436096. Throughput: 0: 42363.7. Samples: 978589380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 04:37:56,994][12645] Avg episode reward: [(0, '0.136')] +[2024-06-18 04:37:57,269][12883] Updated weights for policy 0, policy_version 59721 (0.0034) +[2024-06-18 04:38:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 978616320. Throughput: 0: 41837.3. Samples: 978708300. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 04:38:01,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 04:38:02,085][12883] Updated weights for policy 0, policy_version 59731 (0.0048) +[2024-06-18 04:38:05,010][12883] Updated weights for policy 0, policy_version 59741 (0.0037) +[2024-06-18 04:38:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 41932.2). Total num frames: 978862080. Throughput: 0: 41802.7. Samples: 978958360. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 04:38:06,994][12645] Avg episode reward: [(0, '0.215')] +[2024-06-18 04:38:09,663][12883] Updated weights for policy 0, policy_version 59751 (0.0031) +[2024-06-18 04:38:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 979042304. Throughput: 0: 42289.2. Samples: 979225880. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 04:38:11,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 04:38:12,691][12883] Updated weights for policy 0, policy_version 59761 (0.0028) +[2024-06-18 04:38:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 979255296. Throughput: 0: 41946.2. Samples: 979344180. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 04:38:16,994][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 04:38:17,799][12883] Updated weights for policy 0, policy_version 59771 (0.0028) +[2024-06-18 04:38:20,681][12883] Updated weights for policy 0, policy_version 59781 (0.0037) +[2024-06-18 04:38:21,994][12645] Fps is (10 sec: 45873.4, 60 sec: 42598.0, 300 sec: 41987.7). Total num frames: 979501056. Throughput: 0: 42181.0. Samples: 979599920. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 04:38:21,995][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 04:38:25,424][12883] Updated weights for policy 0, policy_version 59791 (0.0040) +[2024-06-18 04:38:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.3, 300 sec: 41876.4). Total num frames: 979681280. Throughput: 0: 42416.5. Samples: 979861300. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 04:38:26,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 04:38:28,328][12883] Updated weights for policy 0, policy_version 59801 (0.0034) +[2024-06-18 04:38:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.0, 300 sec: 41932.2). Total num frames: 979894272. Throughput: 0: 42272.1. Samples: 979983140. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 04:38:31,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 04:38:33,102][12883] Updated weights for policy 0, policy_version 59811 (0.0035) +[2024-06-18 04:38:36,183][12883] Updated weights for policy 0, policy_version 59821 (0.0035) +[2024-06-18 04:38:36,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42043.9). Total num frames: 980156416. Throughput: 0: 42665.8. Samples: 980243740. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) +[2024-06-18 04:38:36,994][12645] Avg episode reward: [(0, '0.324')] +[2024-06-18 04:38:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000059824_980156416.pth... +[2024-06-18 04:38:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000059207_970047488.pth +[2024-06-18 04:38:40,948][12883] Updated weights for policy 0, policy_version 59831 (0.0040) +[2024-06-18 04:38:41,996][12645] Fps is (10 sec: 44228.5, 60 sec: 41777.6, 300 sec: 41931.6). Total num frames: 980336640. Throughput: 0: 42357.3. Samples: 980495560. 
Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) +[2024-06-18 04:38:41,996][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 04:38:43,925][12883] Updated weights for policy 0, policy_version 59841 (0.0043) +[2024-06-18 04:38:46,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 41932.0). Total num frames: 980533248. Throughput: 0: 42426.3. Samples: 980617480. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) +[2024-06-18 04:38:46,994][12645] Avg episode reward: [(0, '0.209')] +[2024-06-18 04:38:48,565][12883] Updated weights for policy 0, policy_version 59851 (0.0036) +[2024-06-18 04:38:51,669][12883] Updated weights for policy 0, policy_version 59861 (0.0024) +[2024-06-18 04:38:51,994][12645] Fps is (10 sec: 42608.5, 60 sec: 42325.5, 300 sec: 41932.2). Total num frames: 980762624. Throughput: 0: 42636.5. Samples: 980877000. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) +[2024-06-18 04:38:51,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 04:38:56,291][12883] Updated weights for policy 0, policy_version 59871 (0.0031) +[2024-06-18 04:38:56,318][12862] Signal inference workers to stop experience collection... (14200 times) +[2024-06-18 04:38:56,323][12862] Signal inference workers to resume experience collection... (14200 times) +[2024-06-18 04:38:56,346][12883] InferenceWorker_p0-w0: stopping experience collection (14200 times) +[2024-06-18 04:38:56,347][12883] InferenceWorker_p0-w0: resuming experience collection (14200 times) +[2024-06-18 04:38:56,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42050.7, 300 sec: 41987.2). Total num frames: 980959232. Throughput: 0: 42477.1. Samples: 981137440. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) +[2024-06-18 04:38:56,996][12645] Avg episode reward: [(0, '0.164')] +[2024-06-18 04:38:59,218][12883] Updated weights for policy 0, policy_version 59881 (0.0040) +[2024-06-18 04:39:01,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42043.9). Total num frames: 981188608. 
Throughput: 0: 42610.6. Samples: 981261660. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) +[2024-06-18 04:39:01,994][12645] Avg episode reward: [(0, '0.164')] +[2024-06-18 04:39:03,905][12883] Updated weights for policy 0, policy_version 59891 (0.0049) +[2024-06-18 04:39:06,994][12645] Fps is (10 sec: 44246.3, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 981401600. Throughput: 0: 42598.6. Samples: 981516840. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) +[2024-06-18 04:39:06,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 04:39:07,490][12883] Updated weights for policy 0, policy_version 59901 (0.0037) +[2024-06-18 04:39:11,764][12883] Updated weights for policy 0, policy_version 59911 (0.0036) +[2024-06-18 04:39:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42043.9). Total num frames: 981598208. Throughput: 0: 42462.7. Samples: 981772120. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) +[2024-06-18 04:39:11,994][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 04:39:15,076][12883] Updated weights for policy 0, policy_version 59921 (0.0048) +[2024-06-18 04:39:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42154.1). Total num frames: 981843968. Throughput: 0: 42413.7. Samples: 981891740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:39:16,994][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 04:39:19,394][12883] Updated weights for policy 0, policy_version 59931 (0.0037) +[2024-06-18 04:39:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.5, 300 sec: 41987.5). Total num frames: 982024192. Throughput: 0: 42268.9. Samples: 982145840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:39:21,994][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 04:39:23,025][12883] Updated weights for policy 0, policy_version 59941 (0.0025) +[2024-06-18 04:39:26,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 982220800. 
Throughput: 0: 42306.1. Samples: 982399240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:39:26,994][12645] Avg episode reward: [(0, '0.131')] +[2024-06-18 04:39:27,059][12883] Updated weights for policy 0, policy_version 59951 (0.0040) +[2024-06-18 04:39:30,897][12883] Updated weights for policy 0, policy_version 59961 (0.0032) +[2024-06-18 04:39:31,994][12645] Fps is (10 sec: 45875.9, 60 sec: 43144.9, 300 sec: 42154.1). Total num frames: 982482944. Throughput: 0: 42409.8. Samples: 982525920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:39:31,994][12645] Avg episode reward: [(0, '0.143')] +[2024-06-18 04:39:34,604][12883] Updated weights for policy 0, policy_version 59971 (0.0032) +[2024-06-18 04:39:36,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 982663168. Throughput: 0: 42328.8. Samples: 982781800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:39:36,994][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 04:39:38,592][12883] Updated weights for policy 0, policy_version 59981 (0.0043) +[2024-06-18 04:39:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42327.0, 300 sec: 42209.6). Total num frames: 982876160. Throughput: 0: 42107.9. Samples: 983032200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:39:41,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 04:39:42,099][12883] Updated weights for policy 0, policy_version 59991 (0.0038) +[2024-06-18 04:39:46,147][12883] Updated weights for policy 0, policy_version 60001 (0.0035) +[2024-06-18 04:39:46,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42596.7, 300 sec: 42042.7). Total num frames: 983089152. Throughput: 0: 42185.0. Samples: 983160080. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:39:46,996][12645] Avg episode reward: [(0, '0.219')] +[2024-06-18 04:39:50,030][12883] Updated weights for policy 0, policy_version 60011 (0.0027) +[2024-06-18 04:39:51,994][12645] Fps is (10 sec: 40956.6, 60 sec: 42051.7, 300 sec: 42154.0). Total num frames: 983285760. Throughput: 0: 42127.4. Samples: 983412600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:39:51,995][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 04:39:53,785][12883] Updated weights for policy 0, policy_version 60021 (0.0036) +[2024-06-18 04:39:56,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42327.0, 300 sec: 42154.1). Total num frames: 983498752. Throughput: 0: 41992.5. Samples: 983661780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:39:56,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 04:39:57,797][12883] Updated weights for policy 0, policy_version 60031 (0.0029) +[2024-06-18 04:40:01,894][12883] Updated weights for policy 0, policy_version 60041 (0.0039) +[2024-06-18 04:40:01,994][12645] Fps is (10 sec: 42601.3, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 983711744. Throughput: 0: 42212.9. Samples: 983791320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:40:01,994][12645] Avg episode reward: [(0, '0.138')] +[2024-06-18 04:40:05,331][12883] Updated weights for policy 0, policy_version 60051 (0.0027) +[2024-06-18 04:40:06,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41779.2, 300 sec: 42154.3). Total num frames: 983908352. Throughput: 0: 42002.6. Samples: 984035960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:40:06,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 04:40:09,550][12883] Updated weights for policy 0, policy_version 60061 (0.0035) +[2024-06-18 04:40:11,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42323.7, 300 sec: 42153.8). Total num frames: 984137728. Throughput: 0: 42105.5. Samples: 984294080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:40:11,996][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 04:40:12,953][12883] Updated weights for policy 0, policy_version 60071 (0.0053) +[2024-06-18 04:40:16,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 984334336. Throughput: 0: 42179.6. Samples: 984424000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:40:16,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 04:40:17,403][12883] Updated weights for policy 0, policy_version 60081 (0.0026) +[2024-06-18 04:40:20,575][12883] Updated weights for policy 0, policy_version 60091 (0.0033) +[2024-06-18 04:40:21,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 984547328. Throughput: 0: 41885.2. Samples: 984666640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:40:21,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 04:40:25,187][12883] Updated weights for policy 0, policy_version 60101 (0.0036) +[2024-06-18 04:40:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 984776704. Throughput: 0: 42031.0. Samples: 984923600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 04:40:26,994][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 04:40:28,783][12883] Updated weights for policy 0, policy_version 60111 (0.0037) +[2024-06-18 04:40:31,996][12645] Fps is (10 sec: 42589.1, 60 sec: 41504.5, 300 sec: 42098.5). Total num frames: 984973312. Throughput: 0: 41969.3. Samples: 985048700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:40:31,996][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 04:40:33,004][12883] Updated weights for policy 0, policy_version 60121 (0.0033) +[2024-06-18 04:40:33,010][12862] Signal inference workers to stop experience collection... 
(14250 times) +[2024-06-18 04:40:33,011][12862] Signal inference workers to resume experience collection... (14250 times) +[2024-06-18 04:40:33,025][12883] InferenceWorker_p0-w0: stopping experience collection (14250 times) +[2024-06-18 04:40:33,025][12883] InferenceWorker_p0-w0: resuming experience collection (14250 times) +[2024-06-18 04:40:36,617][12883] Updated weights for policy 0, policy_version 60131 (0.0041) +[2024-06-18 04:40:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 985186304. Throughput: 0: 41930.5. Samples: 985299440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:40:36,994][12645] Avg episode reward: [(0, '0.060')] +[2024-06-18 04:40:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000060131_985186304.pth... +[2024-06-18 04:40:37,094][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000059512_975044608.pth +[2024-06-18 04:40:40,821][12883] Updated weights for policy 0, policy_version 60141 (0.0044) +[2024-06-18 04:40:41,996][12645] Fps is (10 sec: 40960.0, 60 sec: 41777.5, 300 sec: 42042.7). Total num frames: 985382912. Throughput: 0: 42020.5. Samples: 985552800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:40:41,997][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 04:40:44,574][12883] Updated weights for policy 0, policy_version 60151 (0.0035) +[2024-06-18 04:40:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41780.8, 300 sec: 42098.5). Total num frames: 985595904. Throughput: 0: 41928.5. Samples: 985678100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:40:46,994][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 04:40:48,509][12883] Updated weights for policy 0, policy_version 60161 (0.0048) +[2024-06-18 04:40:51,994][12645] Fps is (10 sec: 44247.3, 60 sec: 42325.9, 300 sec: 42209.7). Total num frames: 985825280. Throughput: 0: 42168.6. Samples: 985933540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:40:51,994][12645] Avg episode reward: [(0, '0.160')] +[2024-06-18 04:40:52,206][12883] Updated weights for policy 0, policy_version 60171 (0.0038) +[2024-06-18 04:40:56,219][12883] Updated weights for policy 0, policy_version 60181 (0.0033) +[2024-06-18 04:40:56,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42050.6, 300 sec: 42098.2). Total num frames: 986021888. Throughput: 0: 42020.4. Samples: 986185000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:40:56,997][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 04:41:00,198][12883] Updated weights for policy 0, policy_version 60191 (0.0033) +[2024-06-18 04:41:01,996][12645] Fps is (10 sec: 40950.4, 60 sec: 42050.7, 300 sec: 42209.3). Total num frames: 986234880. Throughput: 0: 41900.9. Samples: 986309640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:41:01,996][12645] Avg episode reward: [(0, '0.259')] +[2024-06-18 04:41:03,842][12883] Updated weights for policy 0, policy_version 60201 (0.0027) +[2024-06-18 04:41:06,996][12645] Fps is (10 sec: 44237.0, 60 sec: 42596.9, 300 sec: 42264.8). Total num frames: 986464256. Throughput: 0: 42234.9. Samples: 986567300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 04:41:06,996][12645] Avg episode reward: [(0, '0.160')] +[2024-06-18 04:41:07,781][12883] Updated weights for policy 0, policy_version 60211 (0.0041) +[2024-06-18 04:41:11,475][12883] Updated weights for policy 0, policy_version 60221 (0.0023) +[2024-06-18 04:41:11,996][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42098.2). Total num frames: 986660864. Throughput: 0: 42119.7. Samples: 986819080. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 04:41:11,996][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 04:41:15,598][12883] Updated weights for policy 0, policy_version 60231 (0.0038) +[2024-06-18 04:41:16,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 986873856. Throughput: 0: 42250.6. Samples: 986949880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 04:41:16,994][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 04:41:19,110][12883] Updated weights for policy 0, policy_version 60241 (0.0035) +[2024-06-18 04:41:21,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 987070464. Throughput: 0: 42190.7. Samples: 987198020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 04:41:21,994][12645] Avg episode reward: [(0, '0.190')] +[2024-06-18 04:41:23,311][12883] Updated weights for policy 0, policy_version 60251 (0.0036) +[2024-06-18 04:41:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 987299840. Throughput: 0: 42310.1. Samples: 987456660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 04:41:26,995][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 04:41:27,307][12883] Updated weights for policy 0, policy_version 60261 (0.0035) +[2024-06-18 04:41:31,005][12883] Updated weights for policy 0, policy_version 60271 (0.0036) +[2024-06-18 04:41:31,996][12645] Fps is (10 sec: 45864.5, 60 sec: 42598.4, 300 sec: 42320.4). Total num frames: 987529216. Throughput: 0: 42445.4. Samples: 987588240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 04:41:31,996][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 04:41:34,858][12883] Updated weights for policy 0, policy_version 60281 (0.0034) +[2024-06-18 04:41:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 987709440. Throughput: 0: 42279.6. Samples: 987836120. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 04:41:36,994][12645] Avg episode reward: [(0, '0.156')] +[2024-06-18 04:41:38,599][12883] Updated weights for policy 0, policy_version 60291 (0.0041) +[2024-06-18 04:41:41,996][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 987922432. Throughput: 0: 42471.2. Samples: 988096200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 04:41:41,996][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 04:41:42,640][12883] Updated weights for policy 0, policy_version 60301 (0.0025) +[2024-06-18 04:41:46,467][12883] Updated weights for policy 0, policy_version 60311 (0.0035) +[2024-06-18 04:41:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 988151808. Throughput: 0: 42431.9. Samples: 988218980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 04:41:46,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 04:41:50,250][12883] Updated weights for policy 0, policy_version 60321 (0.0030) +[2024-06-18 04:41:51,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 988364800. Throughput: 0: 42341.6. Samples: 988472580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 04:41:51,994][12645] Avg episode reward: [(0, '0.265')] +[2024-06-18 04:41:54,103][12883] Updated weights for policy 0, policy_version 60331 (0.0038) +[2024-06-18 04:41:56,993][12645] Fps is (10 sec: 40960.6, 60 sec: 42327.0, 300 sec: 42265.2). Total num frames: 988561408. Throughput: 0: 42503.1. Samples: 988731620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 04:41:56,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 04:41:58,055][12883] Updated weights for policy 0, policy_version 60341 (0.0038) +[2024-06-18 04:42:01,886][12883] Updated weights for policy 0, policy_version 60351 (0.0036) +[2024-06-18 04:42:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42600.0, 300 sec: 42320.7). Total num frames: 988790784. Throughput: 0: 42284.0. Samples: 988852660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 04:42:01,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 04:42:05,639][12883] Updated weights for policy 0, policy_version 60361 (0.0038) +[2024-06-18 04:42:06,996][12645] Fps is (10 sec: 42587.8, 60 sec: 42052.2, 300 sec: 42098.2). Total num frames: 988987392. Throughput: 0: 42374.2. Samples: 989104960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 04:42:06,997][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 04:42:10,212][12883] Updated weights for policy 0, policy_version 60371 (0.0036) +[2024-06-18 04:42:11,358][12862] Signal inference workers to stop experience collection... (14300 times) +[2024-06-18 04:42:11,359][12862] Signal inference workers to resume experience collection... (14300 times) +[2024-06-18 04:42:11,387][12883] InferenceWorker_p0-w0: stopping experience collection (14300 times) +[2024-06-18 04:42:11,387][12883] InferenceWorker_p0-w0: resuming experience collection (14300 times) +[2024-06-18 04:42:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42053.8, 300 sec: 42209.6). Total num frames: 989184000. Throughput: 0: 42230.7. Samples: 989357040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 04:42:11,994][12645] Avg episode reward: [(0, '0.135')] +[2024-06-18 04:42:13,229][12883] Updated weights for policy 0, policy_version 60381 (0.0043) +[2024-06-18 04:42:16,994][12645] Fps is (10 sec: 40969.8, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 989396992. 
Throughput: 0: 42076.8. Samples: 989481600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 04:42:16,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 04:42:17,847][12883] Updated weights for policy 0, policy_version 60391 (0.0035) +[2024-06-18 04:42:21,105][12883] Updated weights for policy 0, policy_version 60401 (0.0038) +[2024-06-18 04:42:22,000][12645] Fps is (10 sec: 45846.6, 60 sec: 42866.9, 300 sec: 42153.2). Total num frames: 989642752. Throughput: 0: 42155.3. Samples: 989733380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:42:22,000][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 04:42:25,789][12883] Updated weights for policy 0, policy_version 60411 (0.0026) +[2024-06-18 04:42:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 989806592. Throughput: 0: 42144.0. Samples: 989992580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:42:26,994][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 04:42:28,840][12883] Updated weights for policy 0, policy_version 60421 (0.0037) +[2024-06-18 04:42:31,994][12645] Fps is (10 sec: 37706.8, 60 sec: 41507.7, 300 sec: 42154.1). Total num frames: 990019584. Throughput: 0: 41928.0. Samples: 990105740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:42:31,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 04:42:33,607][12883] Updated weights for policy 0, policy_version 60431 (0.0042) +[2024-06-18 04:42:36,588][12883] Updated weights for policy 0, policy_version 60441 (0.0039) +[2024-06-18 04:42:36,994][12645] Fps is (10 sec: 47512.8, 60 sec: 42871.3, 300 sec: 42209.6). Total num frames: 990281728. Throughput: 0: 41954.2. Samples: 990360520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:42:36,995][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 04:42:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000060442_990281728.pth... 
+[2024-06-18 04:42:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000059824_980156416.pth +[2024-06-18 04:42:41,502][12883] Updated weights for policy 0, policy_version 60451 (0.0037) +[2024-06-18 04:42:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41780.8, 300 sec: 42209.6). Total num frames: 990429184. Throughput: 0: 41808.3. Samples: 990613000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:42:41,994][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 04:42:44,503][12883] Updated weights for policy 0, policy_version 60461 (0.0034) +[2024-06-18 04:42:46,997][12645] Fps is (10 sec: 37671.1, 60 sec: 41776.9, 300 sec: 42153.6). Total num frames: 990658560. Throughput: 0: 41718.3. Samples: 990730120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:42:46,998][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 04:42:49,358][12883] Updated weights for policy 0, policy_version 60471 (0.0039) +[2024-06-18 04:42:51,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 990887936. Throughput: 0: 41947.0. Samples: 990992480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:42:51,994][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 04:42:52,252][12883] Updated weights for policy 0, policy_version 60481 (0.0024) +[2024-06-18 04:42:56,994][12645] Fps is (10 sec: 39334.4, 60 sec: 41506.0, 300 sec: 42154.1). Total num frames: 991051776. Throughput: 0: 41994.7. Samples: 991246800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 04:42:56,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 04:42:57,303][12883] Updated weights for policy 0, policy_version 60491 (0.0022) +[2024-06-18 04:43:00,147][12883] Updated weights for policy 0, policy_version 60501 (0.0032) +[2024-06-18 04:43:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 991297536. Throughput: 0: 41893.3. Samples: 991366800. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 04:43:01,994][12645] Avg episode reward: [(0, '0.180')] +[2024-06-18 04:43:05,017][12883] Updated weights for policy 0, policy_version 60511 (0.0036) +[2024-06-18 04:43:06,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42053.9, 300 sec: 42265.2). Total num frames: 991510528. Throughput: 0: 42026.8. Samples: 991624320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 04:43:06,994][12645] Avg episode reward: [(0, '0.213')] +[2024-06-18 04:43:07,933][12883] Updated weights for policy 0, policy_version 60521 (0.0031) +[2024-06-18 04:43:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 991707136. Throughput: 0: 41767.4. Samples: 991872120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 04:43:11,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 04:43:12,745][12883] Updated weights for policy 0, policy_version 60531 (0.0034) +[2024-06-18 04:43:15,956][12883] Updated weights for policy 0, policy_version 60541 (0.0030) +[2024-06-18 04:43:16,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 991936512. Throughput: 0: 41999.5. Samples: 991995720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 04:43:16,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 04:43:20,819][12883] Updated weights for policy 0, policy_version 60551 (0.0047) +[2024-06-18 04:43:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41510.4, 300 sec: 42209.6). Total num frames: 992133120. Throughput: 0: 42020.0. Samples: 992251420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 04:43:21,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 04:43:23,775][12883] Updated weights for policy 0, policy_version 60561 (0.0028) +[2024-06-18 04:43:24,767][12862] Signal inference workers to stop experience collection... 
(14350 times) +[2024-06-18 04:43:24,800][12883] InferenceWorker_p0-w0: stopping experience collection (14350 times) +[2024-06-18 04:43:24,825][12862] Signal inference workers to resume experience collection... (14350 times) +[2024-06-18 04:43:24,826][12883] InferenceWorker_p0-w0: resuming experience collection (14350 times) +[2024-06-18 04:43:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42209.7). Total num frames: 992346112. Throughput: 0: 41833.2. Samples: 992495500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 04:43:26,994][12645] Avg episode reward: [(0, '0.073')] +[2024-06-18 04:43:28,650][12883] Updated weights for policy 0, policy_version 60571 (0.0036) +[2024-06-18 04:43:31,955][12883] Updated weights for policy 0, policy_version 60581 (0.0043) +[2024-06-18 04:43:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 992559104. Throughput: 0: 42025.7. Samples: 992621140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 04:43:31,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 04:43:36,621][12883] Updated weights for policy 0, policy_version 60591 (0.0042) +[2024-06-18 04:43:36,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 42043.3). Total num frames: 992739328. Throughput: 0: 41691.9. Samples: 992868620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:43:36,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 04:43:39,658][12883] Updated weights for policy 0, policy_version 60601 (0.0029) +[2024-06-18 04:43:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 992952320. Throughput: 0: 41640.5. Samples: 993120620. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:43:41,994][12645] Avg episode reward: [(0, '0.213')] +[2024-06-18 04:43:44,180][12883] Updated weights for policy 0, policy_version 60611 (0.0045) +[2024-06-18 04:43:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41781.5, 300 sec: 42043.0). Total num frames: 993165312. Throughput: 0: 41876.5. Samples: 993251240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:43:46,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 04:43:47,450][12883] Updated weights for policy 0, policy_version 60621 (0.0042) +[2024-06-18 04:43:51,812][12883] Updated weights for policy 0, policy_version 60631 (0.0035) +[2024-06-18 04:43:51,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41506.1, 300 sec: 42098.9). Total num frames: 993378304. Throughput: 0: 41692.8. Samples: 993500500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:43:51,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 04:43:55,616][12883] Updated weights for policy 0, policy_version 60641 (0.0041) +[2024-06-18 04:43:56,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42323.7, 300 sec: 42042.7). Total num frames: 993591296. Throughput: 0: 41709.5. Samples: 993749140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:43:57,004][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 04:43:59,676][12883] Updated weights for policy 0, policy_version 60651 (0.0036) +[2024-06-18 04:44:01,996][12645] Fps is (10 sec: 40951.0, 60 sec: 41504.6, 300 sec: 41987.2). Total num frames: 993787904. Throughput: 0: 41850.9. Samples: 993879100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:44:01,997][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 04:44:03,256][12883] Updated weights for policy 0, policy_version 60661 (0.0025) +[2024-06-18 04:44:06,994][12645] Fps is (10 sec: 42607.9, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 994017280. Throughput: 0: 41738.3. Samples: 994129640. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 04:44:06,994][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 04:44:07,277][12883] Updated weights for policy 0, policy_version 60671 (0.0023) +[2024-06-18 04:44:11,057][12883] Updated weights for policy 0, policy_version 60681 (0.0039) +[2024-06-18 04:44:11,994][12645] Fps is (10 sec: 44247.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 994230272. Throughput: 0: 41880.5. Samples: 994380120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) +[2024-06-18 04:44:11,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 04:44:15,039][12883] Updated weights for policy 0, policy_version 60691 (0.0028) +[2024-06-18 04:44:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 994426880. Throughput: 0: 41771.2. Samples: 994500840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) +[2024-06-18 04:44:16,994][12645] Avg episode reward: [(0, '0.296')] +[2024-06-18 04:44:18,959][12883] Updated weights for policy 0, policy_version 60701 (0.0033) +[2024-06-18 04:44:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 994639872. Throughput: 0: 41953.5. Samples: 994756520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) +[2024-06-18 04:44:21,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 04:44:22,753][12883] Updated weights for policy 0, policy_version 60711 (0.0030) +[2024-06-18 04:44:26,608][12883] Updated weights for policy 0, policy_version 60721 (0.0039) +[2024-06-18 04:44:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 994852864. Throughput: 0: 41859.6. Samples: 995004300. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) +[2024-06-18 04:44:26,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 04:44:30,523][12883] Updated weights for policy 0, policy_version 60731 (0.0036) +[2024-06-18 04:44:31,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 995049472. Throughput: 0: 41909.7. Samples: 995137180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) +[2024-06-18 04:44:31,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 04:44:34,271][12883] Updated weights for policy 0, policy_version 60741 (0.0038) +[2024-06-18 04:44:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 995246080. Throughput: 0: 41924.6. Samples: 995387100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) +[2024-06-18 04:44:36,994][12645] Avg episode reward: [(0, '0.039')] +[2024-06-18 04:44:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000060746_995262464.pth... +[2024-06-18 04:44:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000060131_985186304.pth +[2024-06-18 04:44:38,381][12883] Updated weights for policy 0, policy_version 60751 (0.0052) +[2024-06-18 04:44:41,993][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42043.4). Total num frames: 995491840. Throughput: 0: 41970.2. Samples: 995637700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) +[2024-06-18 04:44:41,994][12645] Avg episode reward: [(0, '0.155')] +[2024-06-18 04:44:42,355][12883] Updated weights for policy 0, policy_version 60761 (0.0034) +[2024-06-18 04:44:46,137][12883] Updated weights for policy 0, policy_version 60771 (0.0035) +[2024-06-18 04:44:46,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42098.7). Total num frames: 995704832. Throughput: 0: 41971.9. Samples: 995767740. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) +[2024-06-18 04:44:46,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 04:44:49,094][12862] Signal inference workers to stop experience collection... (14400 times) +[2024-06-18 04:44:49,094][12862] Signal inference workers to resume experience collection... (14400 times) +[2024-06-18 04:44:49,109][12883] InferenceWorker_p0-w0: stopping experience collection (14400 times) +[2024-06-18 04:44:49,110][12883] InferenceWorker_p0-w0: resuming experience collection (14400 times) +[2024-06-18 04:44:50,130][12883] Updated weights for policy 0, policy_version 60781 (0.0042) +[2024-06-18 04:44:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 995885056. Throughput: 0: 41845.9. Samples: 996012700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 04:44:51,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 04:44:54,016][12883] Updated weights for policy 0, policy_version 60791 (0.0032) +[2024-06-18 04:44:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42053.9, 300 sec: 42043.0). Total num frames: 996114432. Throughput: 0: 41916.9. Samples: 996266380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 04:44:56,994][12645] Avg episode reward: [(0, '0.098')] +[2024-06-18 04:44:57,710][12883] Updated weights for policy 0, policy_version 60801 (0.0033) +[2024-06-18 04:45:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42053.9, 300 sec: 42043.0). Total num frames: 996311040. Throughput: 0: 42064.4. Samples: 996393740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 04:45:01,994][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 04:45:02,225][12883] Updated weights for policy 0, policy_version 60811 (0.0034) +[2024-06-18 04:45:05,526][12883] Updated weights for policy 0, policy_version 60821 (0.0031) +[2024-06-18 04:45:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 41987.8). Total num frames: 996524032. 
Throughput: 0: 41714.6. Samples: 996633680. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 04:45:06,994][12645] Avg episode reward: [(0, '0.181')] +[2024-06-18 04:45:09,906][12883] Updated weights for policy 0, policy_version 60831 (0.0027) +[2024-06-18 04:45:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 996737024. Throughput: 0: 42033.3. Samples: 996895800. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 04:45:11,994][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 04:45:13,625][12883] Updated weights for policy 0, policy_version 60841 (0.0039) +[2024-06-18 04:45:16,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42323.7, 300 sec: 42098.2). Total num frames: 996966400. Throughput: 0: 41923.7. Samples: 997023840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 04:45:16,997][12645] Avg episode reward: [(0, '0.088')] +[2024-06-18 04:45:17,740][12883] Updated weights for policy 0, policy_version 60851 (0.0027) +[2024-06-18 04:45:21,357][12883] Updated weights for policy 0, policy_version 60861 (0.0041) +[2024-06-18 04:45:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 997146624. Throughput: 0: 41975.4. Samples: 997276000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 04:45:21,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 04:45:25,540][12883] Updated weights for policy 0, policy_version 60871 (0.0036) +[2024-06-18 04:45:26,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42052.3, 300 sec: 42043.3). Total num frames: 997376000. Throughput: 0: 41953.3. Samples: 997525600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:45:26,994][12645] Avg episode reward: [(0, '0.236')] +[2024-06-18 04:45:29,300][12883] Updated weights for policy 0, policy_version 60881 (0.0043) +[2024-06-18 04:45:31,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 997588992. 
Throughput: 0: 41888.5. Samples: 997652720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:45:31,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 04:45:33,200][12883] Updated weights for policy 0, policy_version 60891 (0.0040) +[2024-06-18 04:45:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42043.3). Total num frames: 997785600. Throughput: 0: 42269.8. Samples: 997914840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:45:36,994][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 04:45:37,039][12883] Updated weights for policy 0, policy_version 60901 (0.0042) +[2024-06-18 04:45:41,015][12883] Updated weights for policy 0, policy_version 60911 (0.0028) +[2024-06-18 04:45:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 997998592. Throughput: 0: 42131.9. Samples: 998162320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:45:41,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 04:45:45,037][12883] Updated weights for policy 0, policy_version 60921 (0.0042) +[2024-06-18 04:45:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 998211584. Throughput: 0: 42098.2. Samples: 998288160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:45:46,994][12645] Avg episode reward: [(0, '0.155')] +[2024-06-18 04:45:48,429][12883] Updated weights for policy 0, policy_version 60931 (0.0040) +[2024-06-18 04:45:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42043.4). Total num frames: 998424576. Throughput: 0: 42544.0. Samples: 998548160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:45:51,994][12645] Avg episode reward: [(0, '0.033')] +[2024-06-18 04:45:52,883][12883] Updated weights for policy 0, policy_version 60941 (0.0029) +[2024-06-18 04:45:56,006][12883] Updated weights for policy 0, policy_version 60951 (0.0041) +[2024-06-18 04:45:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42043.3). Total num frames: 998637568. Throughput: 0: 42226.7. Samples: 998796000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:45:56,994][12645] Avg episode reward: [(0, '0.278')] +[2024-06-18 04:46:00,656][12883] Updated weights for policy 0, policy_version 60961 (0.0047) +[2024-06-18 04:46:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41987.8). Total num frames: 998850560. Throughput: 0: 42115.9. Samples: 998918960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:46:01,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 04:46:03,943][12883] Updated weights for policy 0, policy_version 60971 (0.0052) +[2024-06-18 04:46:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41987.8). Total num frames: 999047168. Throughput: 0: 42165.4. Samples: 999173440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 04:46:06,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 04:46:08,434][12883] Updated weights for policy 0, policy_version 60981 (0.0032) +[2024-06-18 04:46:11,971][12883] Updated weights for policy 0, policy_version 60991 (0.0034) +[2024-06-18 04:46:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 999276544. Throughput: 0: 42116.7. Samples: 999420860. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 04:46:11,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 04:46:16,221][12883] Updated weights for policy 0, policy_version 61001 (0.0036) +[2024-06-18 04:46:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41780.8, 300 sec: 42043.0). Total num frames: 999473152. Throughput: 0: 42258.2. Samples: 999554340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 04:46:16,994][12645] Avg episode reward: [(0, '0.048')] +[2024-06-18 04:46:17,146][12862] Signal inference workers to stop experience collection... (14450 times) +[2024-06-18 04:46:17,146][12862] Signal inference workers to resume experience collection... (14450 times) +[2024-06-18 04:46:17,159][12883] InferenceWorker_p0-w0: stopping experience collection (14450 times) +[2024-06-18 04:46:17,159][12883] InferenceWorker_p0-w0: resuming experience collection (14450 times) +[2024-06-18 04:46:19,582][12883] Updated weights for policy 0, policy_version 61011 (0.0035) +[2024-06-18 04:46:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 999669760. Throughput: 0: 41945.3. Samples: 999802380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 04:46:21,994][12645] Avg episode reward: [(0, '0.076')] +[2024-06-18 04:46:24,313][12883] Updated weights for policy 0, policy_version 61021 (0.0035) +[2024-06-18 04:46:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41932.3). Total num frames: 999899136. Throughput: 0: 41993.4. Samples: 1000052020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 04:46:26,994][12645] Avg episode reward: [(0, '0.241')] +[2024-06-18 04:46:27,555][12883] Updated weights for policy 0, policy_version 61031 (0.0037) +[2024-06-18 04:46:31,797][12883] Updated weights for policy 0, policy_version 61041 (0.0039) +[2024-06-18 04:46:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 41987.4). Total num frames: 1000095744. 
Throughput: 0: 42160.0. Samples: 1000185360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 04:46:31,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 04:46:35,182][12883] Updated weights for policy 0, policy_version 61051 (0.0029) +[2024-06-18 04:46:36,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41779.1, 300 sec: 41932.2). Total num frames: 1000292352. Throughput: 0: 41849.2. Samples: 1000431380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 04:46:36,994][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 04:46:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061053_1000292352.pth... +[2024-06-18 04:46:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000060442_990281728.pth +[2024-06-18 04:46:39,416][12883] Updated weights for policy 0, policy_version 61061 (0.0032) +[2024-06-18 04:46:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 1000538112. Throughput: 0: 41934.7. Samples: 1000683060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 04:46:41,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 04:46:42,874][12883] Updated weights for policy 0, policy_version 61071 (0.0033) +[2024-06-18 04:46:46,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42050.7, 300 sec: 41931.6). Total num frames: 1000734720. Throughput: 0: 42253.0. Samples: 1000820440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:46:46,997][12645] Avg episode reward: [(0, '0.042')] +[2024-06-18 04:46:47,253][12883] Updated weights for policy 0, policy_version 61081 (0.0037) +[2024-06-18 04:46:50,657][12883] Updated weights for policy 0, policy_version 61091 (0.0032) +[2024-06-18 04:46:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.1, 300 sec: 41987.4). Total num frames: 1000947712. Throughput: 0: 42093.7. Samples: 1001067660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:46:51,994][12645] Avg episode reward: [(0, '0.070')] +[2024-06-18 04:46:55,020][12883] Updated weights for policy 0, policy_version 61101 (0.0044) +[2024-06-18 04:46:56,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 1001177088. Throughput: 0: 41972.5. Samples: 1001309620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:46:56,994][12645] Avg episode reward: [(0, '0.287')] +[2024-06-18 04:46:59,136][12883] Updated weights for policy 0, policy_version 61111 (0.0044) +[2024-06-18 04:47:01,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41506.2, 300 sec: 41876.7). Total num frames: 1001340928. Throughput: 0: 41959.6. Samples: 1001442520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:47:01,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 04:47:02,839][12883] Updated weights for policy 0, policy_version 61121 (0.0029) +[2024-06-18 04:47:06,715][12883] Updated weights for policy 0, policy_version 61131 (0.0033) +[2024-06-18 04:47:06,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 1001570304. Throughput: 0: 41848.0. Samples: 1001685540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:47:06,994][12645] Avg episode reward: [(0, '0.189')] +[2024-06-18 04:47:10,643][12883] Updated weights for policy 0, policy_version 61141 (0.0033) +[2024-06-18 04:47:11,708][12862] Signal inference workers to stop experience collection... (14500 times) +[2024-06-18 04:47:11,708][12862] Signal inference workers to resume experience collection... (14500 times) +[2024-06-18 04:47:11,726][12883] InferenceWorker_p0-w0: stopping experience collection (14500 times) +[2024-06-18 04:47:11,726][12883] InferenceWorker_p0-w0: resuming experience collection (14500 times) +[2024-06-18 04:47:11,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 1001799680. 
Throughput: 0: 42058.2. Samples: 1001944640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:47:11,994][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 04:47:14,298][12883] Updated weights for policy 0, policy_version 61151 (0.0030) +[2024-06-18 04:47:16,995][12645] Fps is (10 sec: 39314.7, 60 sec: 41504.9, 300 sec: 41766.0). Total num frames: 1001963520. Throughput: 0: 41858.9. Samples: 1002069080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 04:47:16,996][12645] Avg episode reward: [(0, '0.093')] +[2024-06-18 04:47:18,575][12883] Updated weights for policy 0, policy_version 61161 (0.0031) +[2024-06-18 04:47:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 1002209280. Throughput: 0: 41876.4. Samples: 1002315820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) +[2024-06-18 04:47:21,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 04:47:22,339][12883] Updated weights for policy 0, policy_version 61171 (0.0038) +[2024-06-18 04:47:26,255][12883] Updated weights for policy 0, policy_version 61181 (0.0030) +[2024-06-18 04:47:26,994][12645] Fps is (10 sec: 45883.3, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 1002422272. Throughput: 0: 42168.9. Samples: 1002580660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) +[2024-06-18 04:47:26,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 04:47:29,875][12883] Updated weights for policy 0, policy_version 61191 (0.0027) +[2024-06-18 04:47:31,996][12645] Fps is (10 sec: 39314.4, 60 sec: 41777.9, 300 sec: 41765.1). Total num frames: 1002602496. Throughput: 0: 41742.1. Samples: 1002698820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) +[2024-06-18 04:47:31,996][12645] Avg episode reward: [(0, '0.203')] +[2024-06-18 04:47:34,317][12883] Updated weights for policy 0, policy_version 61201 (0.0048) +[2024-06-18 04:47:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 1002864640. 
Throughput: 0: 41776.6. Samples: 1002947600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) +[2024-06-18 04:47:36,994][12645] Avg episode reward: [(0, '0.035')] +[2024-06-18 04:47:37,462][12883] Updated weights for policy 0, policy_version 61211 (0.0039) +[2024-06-18 04:47:41,994][12645] Fps is (10 sec: 42606.8, 60 sec: 41506.2, 300 sec: 41932.4). Total num frames: 1003028480. Throughput: 0: 42091.3. Samples: 1003203720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) +[2024-06-18 04:47:41,994][12645] Avg episode reward: [(0, '0.091')] +[2024-06-18 04:47:42,123][12883] Updated weights for policy 0, policy_version 61221 (0.0038) +[2024-06-18 04:47:45,720][12883] Updated weights for policy 0, policy_version 61231 (0.0041) +[2024-06-18 04:47:46,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41780.9, 300 sec: 41876.4). Total num frames: 1003241472. Throughput: 0: 41651.1. Samples: 1003316820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) +[2024-06-18 04:47:46,994][12645] Avg episode reward: [(0, '0.074')] +[2024-06-18 04:47:49,836][12883] Updated weights for policy 0, policy_version 61241 (0.0032) +[2024-06-18 04:47:51,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1003487232. Throughput: 0: 42050.2. Samples: 1003577800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) +[2024-06-18 04:47:51,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 04:47:53,308][12883] Updated weights for policy 0, policy_version 61251 (0.0034) +[2024-06-18 04:47:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 1003667456. Throughput: 0: 41996.0. Samples: 1003834460. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) +[2024-06-18 04:47:56,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 04:47:57,602][12883] Updated weights for policy 0, policy_version 61261 (0.0030) +[2024-06-18 04:48:00,875][12883] Updated weights for policy 0, policy_version 61271 (0.0042) +[2024-06-18 04:48:01,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 1003880448. Throughput: 0: 41829.6. Samples: 1003951340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 04:48:01,994][12645] Avg episode reward: [(0, '0.130')] +[2024-06-18 04:48:05,415][12883] Updated weights for policy 0, policy_version 61281 (0.0035) +[2024-06-18 04:48:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.1, 300 sec: 41987.5). Total num frames: 1004093440. Throughput: 0: 42088.8. Samples: 1004209820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 04:48:06,994][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 04:48:08,452][12883] Updated weights for policy 0, policy_version 61291 (0.0040) +[2024-06-18 04:48:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 1004290048. Throughput: 0: 41785.8. Samples: 1004461020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 04:48:11,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 04:48:13,147][12883] Updated weights for policy 0, policy_version 61301 (0.0032) +[2024-06-18 04:48:16,530][12883] Updated weights for policy 0, policy_version 61311 (0.0042) +[2024-06-18 04:48:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42599.6, 300 sec: 41987.5). Total num frames: 1004519424. Throughput: 0: 41924.9. Samples: 1004585360. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 04:48:16,994][12645] Avg episode reward: [(0, '0.150')] +[2024-06-18 04:48:21,057][12883] Updated weights for policy 0, policy_version 61321 (0.0039) +[2024-06-18 04:48:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 1004716032. Throughput: 0: 42001.2. Samples: 1004837660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 04:48:21,994][12645] Avg episode reward: [(0, '0.125')] +[2024-06-18 04:48:24,133][12883] Updated weights for policy 0, policy_version 61331 (0.0035) +[2024-06-18 04:48:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 1004912640. Throughput: 0: 42158.7. Samples: 1005100860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 04:48:26,994][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 04:48:28,705][12883] Updated weights for policy 0, policy_version 61341 (0.0048) +[2024-06-18 04:48:31,866][12883] Updated weights for policy 0, policy_version 61351 (0.0034) +[2024-06-18 04:48:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42872.8, 300 sec: 42154.1). Total num frames: 1005174784. Throughput: 0: 42312.8. Samples: 1005220900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 04:48:31,994][12645] Avg episode reward: [(0, '0.157')] +[2024-06-18 04:48:36,504][12883] Updated weights for policy 0, policy_version 61361 (0.0037) +[2024-06-18 04:48:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41233.0, 300 sec: 41987.5). Total num frames: 1005338624. Throughput: 0: 42100.4. Samples: 1005472320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 04:48:36,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 04:48:37,040][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061362_1005355008.pth... +[2024-06-18 04:48:37,095][12862] Signal inference workers to stop experience collection... 
(14550 times) +[2024-06-18 04:48:37,096][12862] Signal inference workers to resume experience collection... (14550 times) +[2024-06-18 04:48:37,106][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000060746_995262464.pth +[2024-06-18 04:48:37,109][12883] InferenceWorker_p0-w0: stopping experience collection (14550 times) +[2024-06-18 04:48:37,110][12883] InferenceWorker_p0-w0: resuming experience collection (14550 times) +[2024-06-18 04:48:39,991][12883] Updated weights for policy 0, policy_version 61371 (0.0043) +[2024-06-18 04:48:41,994][12645] Fps is (10 sec: 36045.1, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 1005535232. Throughput: 0: 42021.9. Samples: 1005725440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 04:48:41,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 04:48:44,472][12883] Updated weights for policy 0, policy_version 61381 (0.0028) +[2024-06-18 04:48:46,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42323.7, 300 sec: 42042.7). Total num frames: 1005780992. Throughput: 0: 42264.1. Samples: 1005853320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 04:48:46,997][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 04:48:47,695][12883] Updated weights for policy 0, policy_version 61391 (0.0026) +[2024-06-18 04:48:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41506.1, 300 sec: 41987.8). Total num frames: 1005977600. Throughput: 0: 42202.4. Samples: 1006108920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 04:48:51,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 04:48:52,226][12883] Updated weights for policy 0, policy_version 61401 (0.0029) +[2024-06-18 04:48:55,383][12883] Updated weights for policy 0, policy_version 61411 (0.0034) +[2024-06-18 04:48:56,996][12645] Fps is (10 sec: 40960.0, 60 sec: 42050.7, 300 sec: 42043.0). Total num frames: 1006190592. Throughput: 0: 42264.0. Samples: 1006363000. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 04:48:56,997][12645] Avg episode reward: [(0, '0.118')] +[2024-06-18 04:48:59,951][12883] Updated weights for policy 0, policy_version 61421 (0.0035) +[2024-06-18 04:49:01,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 1006436352. Throughput: 0: 42301.8. Samples: 1006488940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 04:49:01,994][12645] Avg episode reward: [(0, '0.197')] +[2024-06-18 04:49:03,057][12883] Updated weights for policy 0, policy_version 61431 (0.0035) +[2024-06-18 04:49:06,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 1006616576. Throughput: 0: 42321.3. Samples: 1006742120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 04:49:06,995][12645] Avg episode reward: [(0, '0.171')] +[2024-06-18 04:49:07,599][12883] Updated weights for policy 0, policy_version 61441 (0.0029) +[2024-06-18 04:49:11,088][12883] Updated weights for policy 0, policy_version 61451 (0.0038) +[2024-06-18 04:49:11,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 1006813184. Throughput: 0: 42067.9. Samples: 1006993920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 04:49:11,994][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 04:49:15,345][12883] Updated weights for policy 0, policy_version 61461 (0.0034) +[2024-06-18 04:49:16,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42323.7, 300 sec: 42098.2). Total num frames: 1007058944. Throughput: 0: 42292.0. Samples: 1007124140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:49:16,997][12645] Avg episode reward: [(0, '0.197')] +[2024-06-18 04:49:19,090][12883] Updated weights for policy 0, policy_version 61471 (0.0043) +[2024-06-18 04:49:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 1007255552. Throughput: 0: 42359.0. Samples: 1007378480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:49:21,994][12645] Avg episode reward: [(0, '0.259')] +[2024-06-18 04:49:22,856][12883] Updated weights for policy 0, policy_version 61481 (0.0035) +[2024-06-18 04:49:26,650][12883] Updated weights for policy 0, policy_version 61491 (0.0036) +[2024-06-18 04:49:26,994][12645] Fps is (10 sec: 40969.6, 60 sec: 42598.3, 300 sec: 42098.6). Total num frames: 1007468544. Throughput: 0: 42273.7. Samples: 1007627760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:49:26,994][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 04:49:30,570][12883] Updated weights for policy 0, policy_version 61501 (0.0029) +[2024-06-18 04:49:31,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1007681536. Throughput: 0: 42349.3. Samples: 1007758940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:49:31,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 04:49:34,363][12883] Updated weights for policy 0, policy_version 61511 (0.0036) +[2024-06-18 04:49:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 41987.4). Total num frames: 1007878144. Throughput: 0: 42255.4. Samples: 1008010420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:49:36,994][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 04:49:38,250][12883] Updated weights for policy 0, policy_version 61521 (0.0030) +[2024-06-18 04:49:41,996][12645] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42042.7). Total num frames: 1008107520. Throughput: 0: 42197.8. Samples: 1008261900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:49:41,996][12645] Avg episode reward: [(0, '0.144')] +[2024-06-18 04:49:42,161][12883] Updated weights for policy 0, policy_version 61531 (0.0041) +[2024-06-18 04:49:45,974][12883] Updated weights for policy 0, policy_version 61541 (0.0029) +[2024-06-18 04:49:46,994][12645] Fps is (10 sec: 44237.8, 60 sec: 42327.0, 300 sec: 42154.1). Total num frames: 1008320512. Throughput: 0: 42296.9. Samples: 1008392300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:49:46,994][12645] Avg episode reward: [(0, '0.203')] +[2024-06-18 04:49:50,034][12883] Updated weights for policy 0, policy_version 61551 (0.0041) +[2024-06-18 04:49:51,998][12645] Fps is (10 sec: 40949.9, 60 sec: 42322.0, 300 sec: 42042.3). Total num frames: 1008517120. Throughput: 0: 42241.4. Samples: 1008643180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 04:49:51,999][12645] Avg episode reward: [(0, '0.185')] +[2024-06-18 04:49:53,961][12883] Updated weights for policy 0, policy_version 61561 (0.0036) +[2024-06-18 04:49:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42154.1). Total num frames: 1008746496. Throughput: 0: 42205.3. Samples: 1008893160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 04:49:56,994][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 04:49:57,841][12883] Updated weights for policy 0, policy_version 61571 (0.0044) +[2024-06-18 04:50:01,731][12883] Updated weights for policy 0, policy_version 61581 (0.0039) +[2024-06-18 04:50:01,994][12645] Fps is (10 sec: 42618.5, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 1008943104. Throughput: 0: 42129.7. Samples: 1009019880. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 04:50:01,994][12645] Avg episode reward: [(0, '0.313')] +[2024-06-18 04:50:05,792][12883] Updated weights for policy 0, policy_version 61591 (0.0032) +[2024-06-18 04:50:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 1009156096. Throughput: 0: 42076.5. Samples: 1009271920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 04:50:06,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 04:50:09,254][12883] Updated weights for policy 0, policy_version 61601 (0.0035) +[2024-06-18 04:50:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 41987.8). Total num frames: 1009352704. Throughput: 0: 42119.0. Samples: 1009523120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 04:50:11,994][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 04:50:13,507][12883] Updated weights for policy 0, policy_version 61611 (0.0036) +[2024-06-18 04:50:16,610][12862] Signal inference workers to stop experience collection... (14600 times) +[2024-06-18 04:50:16,645][12883] InferenceWorker_p0-w0: stopping experience collection (14600 times) +[2024-06-18 04:50:16,672][12862] Signal inference workers to resume experience collection... (14600 times) +[2024-06-18 04:50:16,673][12883] InferenceWorker_p0-w0: resuming experience collection (14600 times) +[2024-06-18 04:50:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42053.9, 300 sec: 42154.1). Total num frames: 1009582080. Throughput: 0: 41941.2. Samples: 1009646300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 04:50:16,994][12645] Avg episode reward: [(0, '0.151')] +[2024-06-18 04:50:17,334][12883] Updated weights for policy 0, policy_version 61621 (0.0032) +[2024-06-18 04:50:21,483][12883] Updated weights for policy 0, policy_version 61631 (0.0034) +[2024-06-18 04:50:21,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 1009778688. 
Throughput: 0: 42027.3. Samples: 1009901640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 04:50:21,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 04:50:25,111][12883] Updated weights for policy 0, policy_version 61641 (0.0037) +[2024-06-18 04:50:26,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42050.7, 300 sec: 42042.7). Total num frames: 1009991680. Throughput: 0: 41863.1. Samples: 1010145740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 04:50:26,996][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 04:50:29,412][12883] Updated weights for policy 0, policy_version 61651 (0.0049) +[2024-06-18 04:50:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1010221056. Throughput: 0: 41871.5. Samples: 1010276520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 04:50:31,994][12645] Avg episode reward: [(0, '0.207')] +[2024-06-18 04:50:32,707][12883] Updated weights for policy 0, policy_version 61661 (0.0032) +[2024-06-18 04:50:36,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 1010401280. Throughput: 0: 41903.5. Samples: 1010528640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 04:50:36,994][12645] Avg episode reward: [(0, '0.142')] +[2024-06-18 04:50:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061670_1010401280.pth... +[2024-06-18 04:50:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061053_1000292352.pth +[2024-06-18 04:50:37,238][12883] Updated weights for policy 0, policy_version 61671 (0.0041) +[2024-06-18 04:50:40,557][12883] Updated weights for policy 0, policy_version 61681 (0.0041) +[2024-06-18 04:50:41,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42052.3, 300 sec: 42098.2). Total num frames: 1010630656. Throughput: 0: 41818.4. Samples: 1010775080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 04:50:41,996][12645] Avg episode reward: [(0, '0.485')] +[2024-06-18 04:50:45,227][12883] Updated weights for policy 0, policy_version 61691 (0.0036) +[2024-06-18 04:50:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 1010843648. Throughput: 0: 41926.1. Samples: 1010906560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 04:50:46,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 04:50:48,434][12883] Updated weights for policy 0, policy_version 61701 (0.0041) +[2024-06-18 04:50:51,994][12645] Fps is (10 sec: 37691.3, 60 sec: 41509.4, 300 sec: 41931.9). Total num frames: 1011007488. Throughput: 0: 41828.9. Samples: 1011154220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 04:50:51,994][12645] Avg episode reward: [(0, '0.253')] +[2024-06-18 04:50:52,859][12883] Updated weights for policy 0, policy_version 61711 (0.0034) +[2024-06-18 04:50:56,254][12883] Updated weights for policy 0, policy_version 61721 (0.0035) +[2024-06-18 04:50:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 1011236864. Throughput: 0: 41835.7. Samples: 1011405720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 04:50:56,994][12645] Avg episode reward: [(0, '0.085')] +[2024-06-18 04:51:00,411][12883] Updated weights for policy 0, policy_version 61731 (0.0031) +[2024-06-18 04:51:01,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 1011466240. Throughput: 0: 42132.9. Samples: 1011542280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 04:51:01,994][12645] Avg episode reward: [(0, '0.111')] +[2024-06-18 04:51:04,010][12883] Updated weights for policy 0, policy_version 61741 (0.0030) +[2024-06-18 04:51:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 1011646464. Throughput: 0: 41853.2. Samples: 1011785040. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 04:51:06,994][12645] Avg episode reward: [(0, '0.203')] +[2024-06-18 04:51:08,446][12883] Updated weights for policy 0, policy_version 61751 (0.0033) +[2024-06-18 04:51:11,805][12883] Updated weights for policy 0, policy_version 61761 (0.0038) +[2024-06-18 04:51:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42098.5). Total num frames: 1011892224. Throughput: 0: 41951.9. Samples: 1012033480. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) +[2024-06-18 04:51:11,994][12645] Avg episode reward: [(0, '0.280')] +[2024-06-18 04:51:16,387][12883] Updated weights for policy 0, policy_version 61771 (0.0030) +[2024-06-18 04:51:16,996][12645] Fps is (10 sec: 42589.0, 60 sec: 41504.6, 300 sec: 42042.7). Total num frames: 1012072448. Throughput: 0: 41957.9. Samples: 1012164720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) +[2024-06-18 04:51:16,996][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 04:51:19,972][12883] Updated weights for policy 0, policy_version 61781 (0.0044) +[2024-06-18 04:51:22,001][12645] Fps is (10 sec: 40928.7, 60 sec: 42046.9, 300 sec: 42041.9). Total num frames: 1012301824. Throughput: 0: 41927.6. Samples: 1012415700. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) +[2024-06-18 04:51:22,002][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 04:51:23,987][12883] Updated weights for policy 0, policy_version 61791 (0.0033) +[2024-06-18 04:51:26,994][12645] Fps is (10 sec: 45884.5, 60 sec: 42326.7, 300 sec: 42154.1). Total num frames: 1012531200. Throughput: 0: 41877.8. Samples: 1012659500. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) +[2024-06-18 04:51:26,995][12645] Avg episode reward: [(0, '0.248')] +[2024-06-18 04:51:27,603][12883] Updated weights for policy 0, policy_version 61801 (0.0032) +[2024-06-18 04:51:31,831][12883] Updated weights for policy 0, policy_version 61811 (0.0031) +[2024-06-18 04:51:31,994][12645] Fps is (10 sec: 40991.5, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 1012711424. Throughput: 0: 42051.7. Samples: 1012798880. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) +[2024-06-18 04:51:31,994][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 04:51:35,229][12883] Updated weights for policy 0, policy_version 61821 (0.0053) +[2024-06-18 04:51:36,994][12645] Fps is (10 sec: 40961.2, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 1012940800. Throughput: 0: 42202.3. Samples: 1013053320. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) +[2024-06-18 04:51:36,994][12645] Avg episode reward: [(0, '0.157')] +[2024-06-18 04:51:38,488][12862] Signal inference workers to stop experience collection... (14650 times) +[2024-06-18 04:51:38,488][12862] Signal inference workers to resume experience collection... (14650 times) +[2024-06-18 04:51:38,535][12883] InferenceWorker_p0-w0: stopping experience collection (14650 times) +[2024-06-18 04:51:38,535][12883] InferenceWorker_p0-w0: resuming experience collection (14650 times) +[2024-06-18 04:51:39,430][12883] Updated weights for policy 0, policy_version 61831 (0.0026) +[2024-06-18 04:51:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42327.0, 300 sec: 42154.4). Total num frames: 1013170176. Throughput: 0: 42082.8. Samples: 1013299440. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) +[2024-06-18 04:51:41,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 04:51:42,937][12883] Updated weights for policy 0, policy_version 61841 (0.0031) +[2024-06-18 04:51:46,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 1013350400. 
Throughput: 0: 41878.1. Samples: 1013426800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:51:46,994][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 04:51:47,053][12883] Updated weights for policy 0, policy_version 61851 (0.0046) +[2024-06-18 04:51:51,035][12883] Updated weights for policy 0, policy_version 61861 (0.0044) +[2024-06-18 04:51:51,994][12645] Fps is (10 sec: 36044.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 1013530624. Throughput: 0: 41952.5. Samples: 1013672900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:51:51,994][12645] Avg episode reward: [(0, '0.314')] +[2024-06-18 04:51:54,820][12883] Updated weights for policy 0, policy_version 61871 (0.0042) +[2024-06-18 04:51:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1013776384. Throughput: 0: 42124.4. Samples: 1013929080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:51:56,994][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 04:51:59,075][12883] Updated weights for policy 0, policy_version 61881 (0.0039) +[2024-06-18 04:52:01,994][12645] Fps is (10 sec: 44234.0, 60 sec: 41778.8, 300 sec: 42042.9). Total num frames: 1013972992. Throughput: 0: 42106.9. Samples: 1014059460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:52:01,995][12645] Avg episode reward: [(0, '0.170')] +[2024-06-18 04:52:02,549][12883] Updated weights for policy 0, policy_version 61891 (0.0033) +[2024-06-18 04:52:06,929][12883] Updated weights for policy 0, policy_version 61901 (0.0038) +[2024-06-18 04:52:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 1014185984. Throughput: 0: 41897.4. Samples: 1014300760. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:52:06,994][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 04:52:10,358][12883] Updated weights for policy 0, policy_version 61911 (0.0039) +[2024-06-18 04:52:11,994][12645] Fps is (10 sec: 42601.5, 60 sec: 41779.2, 300 sec: 42154.3). Total num frames: 1014398976. Throughput: 0: 42129.7. Samples: 1014555320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:52:11,994][12645] Avg episode reward: [(0, '0.165')] +[2024-06-18 04:52:14,735][12883] Updated weights for policy 0, policy_version 61921 (0.0047) +[2024-06-18 04:52:16,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42052.3, 300 sec: 41987.2). Total num frames: 1014595584. Throughput: 0: 41901.4. Samples: 1014684540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:52:16,996][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 04:52:17,963][12883] Updated weights for policy 0, policy_version 61931 (0.0029) +[2024-06-18 04:52:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41511.4, 300 sec: 41931.9). Total num frames: 1014792192. Throughput: 0: 41752.4. Samples: 1014932180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 04:52:21,994][12645] Avg episode reward: [(0, '0.186')] +[2024-06-18 04:52:22,555][12883] Updated weights for policy 0, policy_version 61941 (0.0030) +[2024-06-18 04:52:26,094][12883] Updated weights for policy 0, policy_version 61951 (0.0044) +[2024-06-18 04:52:26,994][12645] Fps is (10 sec: 44246.0, 60 sec: 41779.3, 300 sec: 42154.3). Total num frames: 1015037952. Throughput: 0: 41713.6. Samples: 1015176560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:52:26,994][12645] Avg episode reward: [(0, '0.061')] +[2024-06-18 04:52:30,517][12883] Updated weights for policy 0, policy_version 61961 (0.0033) +[2024-06-18 04:52:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 1015218176. Throughput: 0: 41877.4. Samples: 1015311280. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:52:31,994][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 04:52:32,883][12862] Signal inference workers to stop experience collection... (14700 times) +[2024-06-18 04:52:32,883][12862] Signal inference workers to resume experience collection... (14700 times) +[2024-06-18 04:52:32,893][12883] InferenceWorker_p0-w0: stopping experience collection (14700 times) +[2024-06-18 04:52:32,893][12883] InferenceWorker_p0-w0: resuming experience collection (14700 times) +[2024-06-18 04:52:33,801][12883] Updated weights for policy 0, policy_version 61971 (0.0023) +[2024-06-18 04:52:36,995][12645] Fps is (10 sec: 39318.4, 60 sec: 41505.4, 300 sec: 42042.9). Total num frames: 1015431168. Throughput: 0: 41653.8. Samples: 1015547360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:52:36,995][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 04:52:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061977_1015431168.pth... +[2024-06-18 04:52:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061362_1005355008.pth +[2024-06-18 04:52:38,384][12883] Updated weights for policy 0, policy_version 61981 (0.0041) +[2024-06-18 04:52:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 42043.0). Total num frames: 1015644160. Throughput: 0: 41541.7. Samples: 1015798460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:52:41,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 04:52:42,068][12883] Updated weights for policy 0, policy_version 61991 (0.0042) +[2024-06-18 04:52:46,261][12883] Updated weights for policy 0, policy_version 62001 (0.0042) +[2024-06-18 04:52:46,994][12645] Fps is (10 sec: 40963.7, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 1015840768. Throughput: 0: 41544.1. Samples: 1015928920. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:52:46,994][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 04:52:49,701][12883] Updated weights for policy 0, policy_version 62011 (0.0040) +[2024-06-18 04:52:51,995][12645] Fps is (10 sec: 40953.0, 60 sec: 42051.1, 300 sec: 41987.2). Total num frames: 1016053760. Throughput: 0: 41635.2. Samples: 1016174420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:52:51,996][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 04:52:54,109][12883] Updated weights for policy 0, policy_version 62021 (0.0036) +[2024-06-18 04:52:56,996][12645] Fps is (10 sec: 44226.9, 60 sec: 41777.6, 300 sec: 42042.7). Total num frames: 1016283136. Throughput: 0: 41632.9. Samples: 1016428900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:52:56,997][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 04:52:57,370][12883] Updated weights for policy 0, policy_version 62031 (0.0049) +[2024-06-18 04:53:01,860][12883] Updated weights for policy 0, policy_version 62041 (0.0044) +[2024-06-18 04:53:01,994][12645] Fps is (10 sec: 42605.8, 60 sec: 41779.6, 300 sec: 41987.5). Total num frames: 1016479744. Throughput: 0: 41573.6. Samples: 1016555260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:53:01,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 04:53:05,165][12883] Updated weights for policy 0, policy_version 62051 (0.0030) +[2024-06-18 04:53:06,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 1016709120. Throughput: 0: 41701.6. Samples: 1016808760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 04:53:06,994][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 04:53:09,696][12883] Updated weights for policy 0, policy_version 62061 (0.0043) +[2024-06-18 04:53:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1016905728. Throughput: 0: 41917.5. Samples: 1017062840. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 04:53:11,994][12645] Avg episode reward: [(0, '0.190')] +[2024-06-18 04:53:13,240][12883] Updated weights for policy 0, policy_version 62071 (0.0033) +[2024-06-18 04:53:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41780.6, 300 sec: 41987.5). Total num frames: 1017102336. Throughput: 0: 41584.3. Samples: 1017182580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 04:53:16,994][12645] Avg episode reward: [(0, '0.104')] +[2024-06-18 04:53:18,106][12883] Updated weights for policy 0, policy_version 62081 (0.0041) +[2024-06-18 04:53:21,083][12883] Updated weights for policy 0, policy_version 62091 (0.0025) +[2024-06-18 04:53:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 1017331712. Throughput: 0: 41868.9. Samples: 1017431420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 04:53:21,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 04:53:25,947][12883] Updated weights for policy 0, policy_version 62101 (0.0033) +[2024-06-18 04:53:26,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41233.2, 300 sec: 41820.9). Total num frames: 1017511936. Throughput: 0: 42152.1. Samples: 1017695300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 04:53:26,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 04:53:28,846][12883] Updated weights for policy 0, policy_version 62111 (0.0035) +[2024-06-18 04:53:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 1017741312. Throughput: 0: 41883.2. Samples: 1017813660. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 04:53:31,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 04:53:33,628][12883] Updated weights for policy 0, policy_version 62121 (0.0033) +[2024-06-18 04:53:36,646][12883] Updated weights for policy 0, policy_version 62131 (0.0034) +[2024-06-18 04:53:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42326.0, 300 sec: 42154.1). Total num frames: 1017970688. Throughput: 0: 42048.3. Samples: 1018066520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 04:53:36,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 04:53:41,721][12883] Updated weights for policy 0, policy_version 62141 (0.0039) +[2024-06-18 04:53:41,996][12645] Fps is (10 sec: 39312.7, 60 sec: 41504.6, 300 sec: 41876.4). Total num frames: 1018134528. Throughput: 0: 42312.9. Samples: 1018332980. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) +[2024-06-18 04:53:41,997][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 04:53:44,409][12883] Updated weights for policy 0, policy_version 62151 (0.0042) +[2024-06-18 04:53:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 1018380288. Throughput: 0: 41995.9. Samples: 1018445080. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) +[2024-06-18 04:53:46,994][12645] Avg episode reward: [(0, '0.082')] +[2024-06-18 04:53:49,648][12883] Updated weights for policy 0, policy_version 62161 (0.0047) +[2024-06-18 04:53:51,994][12645] Fps is (10 sec: 45885.1, 60 sec: 42326.5, 300 sec: 42043.3). Total num frames: 1018593280. Throughput: 0: 41934.3. Samples: 1018695800. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) +[2024-06-18 04:53:51,994][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 04:53:52,555][12883] Updated weights for policy 0, policy_version 62171 (0.0040) +[2024-06-18 04:53:56,994][12645] Fps is (10 sec: 37683.8, 60 sec: 41234.6, 300 sec: 41765.3). Total num frames: 1018757120. Throughput: 0: 41942.6. Samples: 1018950260. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) +[2024-06-18 04:53:56,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 04:53:57,346][12883] Updated weights for policy 0, policy_version 62181 (0.0037) +[2024-06-18 04:53:57,520][12862] Signal inference workers to stop experience collection... (14750 times) +[2024-06-18 04:53:57,574][12862] Signal inference workers to resume experience collection... (14750 times) +[2024-06-18 04:53:57,575][12883] InferenceWorker_p0-w0: stopping experience collection (14750 times) +[2024-06-18 04:53:57,594][12883] InferenceWorker_p0-w0: resuming experience collection (14750 times) +[2024-06-18 04:54:00,094][12883] Updated weights for policy 0, policy_version 62191 (0.0037) +[2024-06-18 04:54:01,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42323.7, 300 sec: 42042.7). Total num frames: 1019019264. Throughput: 0: 41923.8. Samples: 1019069240. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) +[2024-06-18 04:54:01,996][12645] Avg episode reward: [(0, '0.170')] +[2024-06-18 04:54:04,903][12883] Updated weights for policy 0, policy_version 62201 (0.0044) +[2024-06-18 04:54:06,994][12645] Fps is (10 sec: 45875.1, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 1019215872. Throughput: 0: 42096.9. Samples: 1019325780. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) +[2024-06-18 04:54:06,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 04:54:07,891][12883] Updated weights for policy 0, policy_version 62211 (0.0025) +[2024-06-18 04:54:11,994][12645] Fps is (10 sec: 37691.4, 60 sec: 41506.0, 300 sec: 41821.2). Total num frames: 1019396096. Throughput: 0: 41902.5. Samples: 1019580920. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) +[2024-06-18 04:54:11,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 04:54:12,657][12883] Updated weights for policy 0, policy_version 62221 (0.0052) +[2024-06-18 04:54:15,692][12883] Updated weights for policy 0, policy_version 62231 (0.0042) +[2024-06-18 04:54:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 1019658240. Throughput: 0: 41958.2. Samples: 1019701780. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) +[2024-06-18 04:54:16,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 04:54:20,177][12883] Updated weights for policy 0, policy_version 62241 (0.0033) +[2024-06-18 04:54:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 1019838464. Throughput: 0: 42000.9. Samples: 1019956560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:54:21,994][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 04:54:23,488][12883] Updated weights for policy 0, policy_version 62251 (0.0039) +[2024-06-18 04:54:26,996][12645] Fps is (10 sec: 37674.7, 60 sec: 42050.6, 300 sec: 41876.1). Total num frames: 1020035072. Throughput: 0: 41630.2. Samples: 1020206340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:54:26,996][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 04:54:27,926][12883] Updated weights for policy 0, policy_version 62261 (0.0041) +[2024-06-18 04:54:31,343][12883] Updated weights for policy 0, policy_version 62271 (0.0036) +[2024-06-18 04:54:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 1020264448. Throughput: 0: 41857.0. Samples: 1020328640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:54:31,995][12645] Avg episode reward: [(0, '0.517')] +[2024-06-18 04:54:31,995][12862] Saving new best policy, reward=0.517! 
+[2024-06-18 04:54:35,901][12883] Updated weights for policy 0, policy_version 62281 (0.0039) +[2024-06-18 04:54:36,994][12645] Fps is (10 sec: 42607.8, 60 sec: 41506.1, 300 sec: 41876.7). Total num frames: 1020461056. Throughput: 0: 41906.2. Samples: 1020581580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:54:36,994][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 04:54:37,028][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000062285_1020477440.pth... +[2024-06-18 04:54:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061670_1010401280.pth +[2024-06-18 04:54:39,220][12883] Updated weights for policy 0, policy_version 62291 (0.0041) +[2024-06-18 04:54:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42327.0, 300 sec: 41876.4). Total num frames: 1020674048. Throughput: 0: 41878.8. Samples: 1020834800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:54:41,994][12645] Avg episode reward: [(0, '0.132')] +[2024-06-18 04:54:43,565][12883] Updated weights for policy 0, policy_version 62301 (0.0030) +[2024-06-18 04:54:46,996][12645] Fps is (10 sec: 42589.1, 60 sec: 41777.8, 300 sec: 41932.3). Total num frames: 1020887040. Throughput: 0: 41907.6. Samples: 1020955080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:54:46,996][12645] Avg episode reward: [(0, '0.047')] +[2024-06-18 04:54:47,181][12883] Updated weights for policy 0, policy_version 62311 (0.0048) +[2024-06-18 04:54:51,383][12883] Updated weights for policy 0, policy_version 62321 (0.0033) +[2024-06-18 04:54:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 1021083648. Throughput: 0: 41889.8. Samples: 1021210820. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:54:51,994][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 04:54:55,059][12883] Updated weights for policy 0, policy_version 62331 (0.0033) +[2024-06-18 04:54:56,993][12645] Fps is (10 sec: 39330.9, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 1021280256. Throughput: 0: 41746.9. Samples: 1021459520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 04:54:56,994][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 04:54:59,469][12883] Updated weights for policy 0, policy_version 62341 (0.0027) +[2024-06-18 04:55:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41507.7, 300 sec: 41876.4). Total num frames: 1021509632. Throughput: 0: 41877.3. Samples: 1021586260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 04:55:01,994][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 04:55:02,942][12883] Updated weights for policy 0, policy_version 62351 (0.0040) +[2024-06-18 04:55:06,994][12645] Fps is (10 sec: 42597.4, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 1021706240. Throughput: 0: 41855.0. Samples: 1021840040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 04:55:06,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 04:55:07,011][12883] Updated weights for policy 0, policy_version 62361 (0.0042) +[2024-06-18 04:55:10,670][12883] Updated weights for policy 0, policy_version 62371 (0.0042) +[2024-06-18 04:55:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 1021935616. Throughput: 0: 41752.3. Samples: 1022085100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 04:55:11,994][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 04:55:14,806][12883] Updated weights for policy 0, policy_version 62381 (0.0051) +[2024-06-18 04:55:16,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 1022132224. Throughput: 0: 41925.5. Samples: 1022215280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 04:55:16,994][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 04:55:18,527][12883] Updated weights for policy 0, policy_version 62391 (0.0039) +[2024-06-18 04:55:21,993][12645] Fps is (10 sec: 37683.8, 60 sec: 41233.1, 300 sec: 41765.7). Total num frames: 1022312448. Throughput: 0: 41894.8. Samples: 1022466840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 04:55:21,994][12645] Avg episode reward: [(0, '0.268')] +[2024-06-18 04:55:22,632][12883] Updated weights for policy 0, policy_version 62401 (0.0022) +[2024-06-18 04:55:25,137][12862] Signal inference workers to stop experience collection... (14800 times) +[2024-06-18 04:55:25,174][12883] InferenceWorker_p0-w0: stopping experience collection (14800 times) +[2024-06-18 04:55:25,183][12862] Signal inference workers to resume experience collection... (14800 times) +[2024-06-18 04:55:25,194][12883] InferenceWorker_p0-w0: resuming experience collection (14800 times) +[2024-06-18 04:55:26,623][12883] Updated weights for policy 0, policy_version 62411 (0.0033) +[2024-06-18 04:55:26,996][12645] Fps is (10 sec: 40950.4, 60 sec: 41779.2, 300 sec: 41765.0). Total num frames: 1022541824. Throughput: 0: 41820.4. Samples: 1022716820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 04:55:26,997][12645] Avg episode reward: [(0, '0.155')] +[2024-06-18 04:55:30,498][12883] Updated weights for policy 0, policy_version 62421 (0.0037) +[2024-06-18 04:55:31,994][12645] Fps is (10 sec: 45874.7, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 1022771200. Throughput: 0: 42001.7. Samples: 1022845060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 04:55:31,994][12645] Avg episode reward: [(0, '0.154')] +[2024-06-18 04:55:34,495][12883] Updated weights for policy 0, policy_version 62431 (0.0044) +[2024-06-18 04:55:36,994][12645] Fps is (10 sec: 42608.4, 60 sec: 41779.3, 300 sec: 41821.2). Total num frames: 1022967808. 
Throughput: 0: 41792.5. Samples: 1023091480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:55:36,994][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 04:55:38,308][12883] Updated weights for policy 0, policy_version 62441 (0.0041) +[2024-06-18 04:55:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 1023180800. Throughput: 0: 41926.1. Samples: 1023346200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:55:41,994][12645] Avg episode reward: [(0, '0.187')] +[2024-06-18 04:55:42,058][12883] Updated weights for policy 0, policy_version 62451 (0.0031) +[2024-06-18 04:55:46,233][12883] Updated weights for policy 0, policy_version 62461 (0.0044) +[2024-06-18 04:55:46,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42053.8, 300 sec: 42043.0). Total num frames: 1023410176. Throughput: 0: 41940.5. Samples: 1023473580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:55:46,994][12645] Avg episode reward: [(0, '0.135')] +[2024-06-18 04:55:49,752][12883] Updated weights for policy 0, policy_version 62471 (0.0035) +[2024-06-18 04:55:51,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42050.7, 300 sec: 41931.6). Total num frames: 1023606784. Throughput: 0: 41751.3. Samples: 1023718940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:55:51,997][12645] Avg episode reward: [(0, '0.071')] +[2024-06-18 04:55:53,914][12883] Updated weights for policy 0, policy_version 62481 (0.0037) +[2024-06-18 04:55:56,994][12645] Fps is (10 sec: 37683.6, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 1023787008. Throughput: 0: 41990.8. Samples: 1023974680. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:55:56,994][12645] Avg episode reward: [(0, '0.252')] +[2024-06-18 04:55:57,632][12883] Updated weights for policy 0, policy_version 62491 (0.0024) +[2024-06-18 04:56:01,490][12883] Updated weights for policy 0, policy_version 62501 (0.0038) +[2024-06-18 04:56:01,994][12645] Fps is (10 sec: 40969.2, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 1024016384. Throughput: 0: 41723.5. Samples: 1024092840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:56:01,994][12645] Avg episode reward: [(0, '0.236')] +[2024-06-18 04:56:05,554][12883] Updated weights for policy 0, policy_version 62511 (0.0026) +[2024-06-18 04:56:06,996][12645] Fps is (10 sec: 44228.1, 60 sec: 42051.0, 300 sec: 41820.6). Total num frames: 1024229376. Throughput: 0: 41744.3. Samples: 1024345420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:56:06,996][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 04:56:09,271][12883] Updated weights for policy 0, policy_version 62521 (0.0037) +[2024-06-18 04:56:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41876.7). Total num frames: 1024425984. Throughput: 0: 41818.0. Samples: 1024598540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 04:56:11,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 04:56:13,372][12883] Updated weights for policy 0, policy_version 62531 (0.0052) +[2024-06-18 04:56:16,996][12645] Fps is (10 sec: 42597.1, 60 sec: 42050.7, 300 sec: 41877.2). Total num frames: 1024655360. Throughput: 0: 41672.6. Samples: 1024720420. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 04:56:16,996][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 04:56:17,217][12883] Updated weights for policy 0, policy_version 62541 (0.0033) +[2024-06-18 04:56:21,232][12883] Updated weights for policy 0, policy_version 62551 (0.0039) +[2024-06-18 04:56:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 1024835584. Throughput: 0: 41744.0. Samples: 1024969960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 04:56:21,994][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 04:56:25,079][12883] Updated weights for policy 0, policy_version 62561 (0.0041) +[2024-06-18 04:56:26,996][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 41876.1). Total num frames: 1025064960. Throughput: 0: 41750.3. Samples: 1025225060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 04:56:26,997][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 04:56:29,126][12883] Updated weights for policy 0, policy_version 62571 (0.0029) +[2024-06-18 04:56:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 1025294336. Throughput: 0: 41727.6. Samples: 1025351320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 04:56:31,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 04:56:32,900][12883] Updated weights for policy 0, policy_version 62581 (0.0033) +[2024-06-18 04:56:36,839][12883] Updated weights for policy 0, policy_version 62591 (0.0034) +[2024-06-18 04:56:36,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42052.1, 300 sec: 41765.3). Total num frames: 1025490944. Throughput: 0: 41937.1. Samples: 1025606020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 04:56:36,994][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 04:56:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000062591_1025490944.pth... 
+[2024-06-18 04:56:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061977_1015431168.pth +[2024-06-18 04:56:40,566][12883] Updated weights for policy 0, policy_version 62601 (0.0040) +[2024-06-18 04:56:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 1025687552. Throughput: 0: 41630.6. Samples: 1025848060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 04:56:41,994][12645] Avg episode reward: [(0, '0.250')] +[2024-06-18 04:56:44,702][12883] Updated weights for policy 0, policy_version 62611 (0.0038) +[2024-06-18 04:56:46,996][12645] Fps is (10 sec: 40951.2, 60 sec: 41504.6, 300 sec: 41931.6). Total num frames: 1025900544. Throughput: 0: 41844.2. Samples: 1025975920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 04:56:46,996][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 04:56:48,290][12883] Updated weights for policy 0, policy_version 62621 (0.0040) +[2024-06-18 04:56:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41507.7, 300 sec: 41765.3). Total num frames: 1026097152. Throughput: 0: 41897.7. Samples: 1026230740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 04:56:51,994][12645] Avg episode reward: [(0, '0.164')] +[2024-06-18 04:56:52,507][12883] Updated weights for policy 0, policy_version 62631 (0.0026) +[2024-06-18 04:56:56,203][12883] Updated weights for policy 0, policy_version 62641 (0.0032) +[2024-06-18 04:56:56,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42325.3, 300 sec: 41876.5). Total num frames: 1026326528. Throughput: 0: 41720.1. Samples: 1026475940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:56:56,994][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 04:57:00,430][12883] Updated weights for policy 0, policy_version 62651 (0.0039) +[2024-06-18 04:57:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 1026539520. Throughput: 0: 42017.2. 
Samples: 1026611100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:57:01,994][12645] Avg episode reward: [(0, '0.265')] +[2024-06-18 04:57:04,267][12883] Updated weights for policy 0, policy_version 62661 (0.0029) +[2024-06-18 04:57:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41780.5, 300 sec: 41820.8). Total num frames: 1026736128. Throughput: 0: 41888.9. Samples: 1026854960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:57:06,994][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 04:57:08,467][12883] Updated weights for policy 0, policy_version 62671 (0.0049) +[2024-06-18 04:57:11,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42050.7, 300 sec: 41876.4). Total num frames: 1026949120. Throughput: 0: 41831.6. Samples: 1027107480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:57:11,997][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 04:57:12,187][12883] Updated weights for policy 0, policy_version 62681 (0.0037) +[2024-06-18 04:57:15,252][12862] Signal inference workers to stop experience collection... (14850 times) +[2024-06-18 04:57:15,252][12862] Signal inference workers to resume experience collection... (14850 times) +[2024-06-18 04:57:15,291][12883] InferenceWorker_p0-w0: stopping experience collection (14850 times) +[2024-06-18 04:57:15,292][12883] InferenceWorker_p0-w0: resuming experience collection (14850 times) +[2024-06-18 04:57:16,146][12883] Updated weights for policy 0, policy_version 62691 (0.0037) +[2024-06-18 04:57:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41507.7, 300 sec: 41876.4). Total num frames: 1027145728. Throughput: 0: 41904.9. Samples: 1027237040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:57:16,994][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 04:57:20,234][12883] Updated weights for policy 0, policy_version 62701 (0.0038) +[2024-06-18 04:57:21,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42052.2, 300 sec: 41765.3). 
Total num frames: 1027358720. Throughput: 0: 41889.8. Samples: 1027491060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:57:21,994][12645] Avg episode reward: [(0, '0.117')] +[2024-06-18 04:57:24,122][12883] Updated weights for policy 0, policy_version 62711 (0.0042) +[2024-06-18 04:57:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42053.8, 300 sec: 41931.9). Total num frames: 1027588096. Throughput: 0: 42018.6. Samples: 1027738900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:57:26,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 04:57:27,923][12883] Updated weights for policy 0, policy_version 62721 (0.0028) +[2024-06-18 04:57:31,832][12883] Updated weights for policy 0, policy_version 62731 (0.0026) +[2024-06-18 04:57:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41876.5). Total num frames: 1027784704. Throughput: 0: 42062.1. Samples: 1027868620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 04:57:31,994][12645] Avg episode reward: [(0, '0.186')] +[2024-06-18 04:57:36,017][12883] Updated weights for policy 0, policy_version 62741 (0.0047) +[2024-06-18 04:57:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 1027997696. Throughput: 0: 42028.5. Samples: 1028122020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:57:36,994][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 04:57:39,731][12883] Updated weights for policy 0, policy_version 62751 (0.0029) +[2024-06-18 04:57:41,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42050.6, 300 sec: 41931.6). Total num frames: 1028210688. Throughput: 0: 42001.8. Samples: 1028366120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:57:41,997][12645] Avg episode reward: [(0, '0.267')] +[2024-06-18 04:57:43,792][12883] Updated weights for policy 0, policy_version 62761 (0.0044) +[2024-06-18 04:57:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42053.8, 300 sec: 41932.2). 
Total num frames: 1028423680. Throughput: 0: 41957.7. Samples: 1028499200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:57:46,994][12645] Avg episode reward: [(0, '0.207')] +[2024-06-18 04:57:47,463][12883] Updated weights for policy 0, policy_version 62771 (0.0034) +[2024-06-18 04:57:51,601][12883] Updated weights for policy 0, policy_version 62781 (0.0035) +[2024-06-18 04:57:51,994][12645] Fps is (10 sec: 39331.1, 60 sec: 41779.3, 300 sec: 41765.7). Total num frames: 1028603904. Throughput: 0: 41993.0. Samples: 1028744640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:57:51,994][12645] Avg episode reward: [(0, '0.154')] +[2024-06-18 04:57:55,183][12883] Updated weights for policy 0, policy_version 62791 (0.0025) +[2024-06-18 04:57:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 1028849664. Throughput: 0: 41962.5. Samples: 1028995700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:57:56,994][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 04:57:59,166][12883] Updated weights for policy 0, policy_version 62801 (0.0026) +[2024-06-18 04:58:01,994][12645] Fps is (10 sec: 44236.1, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 1029046272. Throughput: 0: 42096.9. Samples: 1029131400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:58:01,994][12645] Avg episode reward: [(0, '0.213')] +[2024-06-18 04:58:02,899][12883] Updated weights for policy 0, policy_version 62811 (0.0032) +[2024-06-18 04:58:06,587][12883] Updated weights for policy 0, policy_version 62821 (0.0035) +[2024-06-18 04:58:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 1029259264. Throughput: 0: 41950.7. Samples: 1029378840. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 04:58:06,995][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 04:58:10,507][12883] Updated weights for policy 0, policy_version 62831 (0.0035) +[2024-06-18 04:58:11,996][12645] Fps is (10 sec: 44228.0, 60 sec: 42325.5, 300 sec: 41987.2). Total num frames: 1029488640. Throughput: 0: 42068.8. Samples: 1029632080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 04:58:11,996][12645] Avg episode reward: [(0, '0.118')] +[2024-06-18 04:58:14,179][12883] Updated weights for policy 0, policy_version 62841 (0.0032) +[2024-06-18 04:58:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 1029685248. Throughput: 0: 42003.0. Samples: 1029758760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 04:58:16,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 04:58:18,457][12883] Updated weights for policy 0, policy_version 62851 (0.0039) +[2024-06-18 04:58:21,943][12883] Updated weights for policy 0, policy_version 62861 (0.0028) +[2024-06-18 04:58:21,994][12645] Fps is (10 sec: 42607.1, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 1029914624. Throughput: 0: 42109.8. Samples: 1030016960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 04:58:21,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 04:58:26,339][12883] Updated weights for policy 0, policy_version 62871 (0.0036) +[2024-06-18 04:58:27,000][12645] Fps is (10 sec: 44210.0, 60 sec: 42321.0, 300 sec: 41986.6). Total num frames: 1030127616. Throughput: 0: 42261.7. Samples: 1030268060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 04:58:27,000][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 04:58:28,961][12862] Signal inference workers to stop experience collection... (14900 times) +[2024-06-18 04:58:28,962][12862] Signal inference workers to resume experience collection... 
(14900 times) +[2024-06-18 04:58:28,981][12883] InferenceWorker_p0-w0: stopping experience collection (14900 times) +[2024-06-18 04:58:28,981][12883] InferenceWorker_p0-w0: resuming experience collection (14900 times) +[2024-06-18 04:58:29,839][12883] Updated weights for policy 0, policy_version 62881 (0.0032) +[2024-06-18 04:58:31,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42323.8, 300 sec: 41876.1). Total num frames: 1030324224. Throughput: 0: 42135.7. Samples: 1030395400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 04:58:31,996][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 04:58:33,836][12883] Updated weights for policy 0, policy_version 62891 (0.0038) +[2024-06-18 04:58:36,994][12645] Fps is (10 sec: 40985.4, 60 sec: 42325.3, 300 sec: 42043.3). Total num frames: 1030537216. Throughput: 0: 42232.3. Samples: 1030645100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 04:58:36,994][12645] Avg episode reward: [(0, '0.219')] +[2024-06-18 04:58:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000062899_1030537216.pth... +[2024-06-18 04:58:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000062285_1020477440.pth +[2024-06-18 04:58:37,562][12883] Updated weights for policy 0, policy_version 62901 (0.0028) +[2024-06-18 04:58:41,422][12883] Updated weights for policy 0, policy_version 62911 (0.0041) +[2024-06-18 04:58:41,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42327.0, 300 sec: 41932.0). Total num frames: 1030750208. Throughput: 0: 42202.3. Samples: 1030894800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 04:58:41,994][12645] Avg episode reward: [(0, '0.084')] +[2024-06-18 04:58:45,332][12883] Updated weights for policy 0, policy_version 62921 (0.0034) +[2024-06-18 04:58:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 1030963200. Throughput: 0: 42038.7. Samples: 1031023140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 04:58:46,994][12645] Avg episode reward: [(0, '0.164')] +[2024-06-18 04:58:49,308][12883] Updated weights for policy 0, policy_version 62931 (0.0049) +[2024-06-18 04:58:51,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.2, 300 sec: 42043.0). Total num frames: 1031159808. Throughput: 0: 42120.8. Samples: 1031274280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 04:58:51,994][12645] Avg episode reward: [(0, '0.114')] +[2024-06-18 04:58:53,151][12883] Updated weights for policy 0, policy_version 62941 (0.0044) +[2024-06-18 04:58:56,975][12883] Updated weights for policy 0, policy_version 62951 (0.0031) +[2024-06-18 04:58:56,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42323.8, 300 sec: 41931.9). Total num frames: 1031389184. Throughput: 0: 42122.4. Samples: 1031527600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 04:58:56,996][12645] Avg episode reward: [(0, '0.129')] +[2024-06-18 04:59:00,949][12883] Updated weights for policy 0, policy_version 62961 (0.0036) +[2024-06-18 04:59:01,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 1031569408. Throughput: 0: 42010.0. Samples: 1031649200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 04:59:01,994][12645] Avg episode reward: [(0, '0.159')] +[2024-06-18 04:59:04,702][12883] Updated weights for policy 0, policy_version 62971 (0.0042) +[2024-06-18 04:59:06,994][12645] Fps is (10 sec: 40969.5, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 1031798784. Throughput: 0: 41939.6. Samples: 1031904240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 04:59:06,994][12645] Avg episode reward: [(0, '0.159')] +[2024-06-18 04:59:08,580][12883] Updated weights for policy 0, policy_version 62981 (0.0035) +[2024-06-18 04:59:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41780.6, 300 sec: 41820.9). Total num frames: 1031995392. Throughput: 0: 42008.9. Samples: 1032158200. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 04:59:11,994][12645] Avg episode reward: [(0, '0.103')] +[2024-06-18 04:59:13,070][12883] Updated weights for policy 0, policy_version 62991 (0.0033) +[2024-06-18 04:59:16,811][12883] Updated weights for policy 0, policy_version 63001 (0.0033) +[2024-06-18 04:59:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 1032208384. Throughput: 0: 41925.2. Samples: 1032281940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 04:59:16,994][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 04:59:20,763][12883] Updated weights for policy 0, policy_version 63011 (0.0036) +[2024-06-18 04:59:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 41932.3). Total num frames: 1032404992. Throughput: 0: 41987.2. Samples: 1032534520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 04:59:21,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 04:59:24,636][12883] Updated weights for policy 0, policy_version 63021 (0.0039) +[2024-06-18 04:59:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41510.4, 300 sec: 41876.4). Total num frames: 1032617984. Throughput: 0: 42088.8. Samples: 1032788800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 04:59:26,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 04:59:28,433][12883] Updated weights for policy 0, policy_version 63031 (0.0037) +[2024-06-18 04:59:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42053.9, 300 sec: 41987.5). Total num frames: 1032847360. Throughput: 0: 42061.3. Samples: 1032915900. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:59:31,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 04:59:32,260][12883] Updated weights for policy 0, policy_version 63041 (0.0042) +[2024-06-18 04:59:35,997][12883] Updated weights for policy 0, policy_version 63051 (0.0038) +[2024-06-18 04:59:36,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42050.7, 300 sec: 41987.1). Total num frames: 1033060352. Throughput: 0: 42038.4. Samples: 1033166100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:59:36,996][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 04:59:39,907][12883] Updated weights for policy 0, policy_version 63061 (0.0029) +[2024-06-18 04:59:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41932.3). Total num frames: 1033256960. Throughput: 0: 42154.2. Samples: 1033424440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:59:41,994][12645] Avg episode reward: [(0, '0.179')] +[2024-06-18 04:59:43,971][12883] Updated weights for policy 0, policy_version 63071 (0.0047) +[2024-06-18 04:59:46,996][12645] Fps is (10 sec: 40960.0, 60 sec: 41777.6, 300 sec: 41987.2). Total num frames: 1033469952. Throughput: 0: 42122.3. Samples: 1033544800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:59:46,996][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 04:59:47,867][12883] Updated weights for policy 0, policy_version 63081 (0.0034) +[2024-06-18 04:59:51,883][12883] Updated weights for policy 0, policy_version 63091 (0.0045) +[2024-06-18 04:59:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 1033682944. Throughput: 0: 42101.7. Samples: 1033798820. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:59:51,994][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 04:59:55,515][12883] Updated weights for policy 0, policy_version 63101 (0.0031) +[2024-06-18 04:59:56,994][12645] Fps is (10 sec: 39330.8, 60 sec: 41234.7, 300 sec: 41876.4). Total num frames: 1033863168. Throughput: 0: 42199.2. Samples: 1034057160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 04:59:56,994][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 04:59:59,567][12883] Updated weights for policy 0, policy_version 63111 (0.0037) +[2024-06-18 05:00:02,000][12645] Fps is (10 sec: 44209.2, 60 sec: 42593.9, 300 sec: 42097.7). Total num frames: 1034125312. Throughput: 0: 42048.4. Samples: 1034174380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 05:00:02,000][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 05:00:03,772][12883] Updated weights for policy 0, policy_version 63121 (0.0044) +[2024-06-18 05:00:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 1034289152. Throughput: 0: 42019.9. Samples: 1034425420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 05:00:06,994][12645] Avg episode reward: [(0, '0.229')] +[2024-06-18 05:00:07,575][12883] Updated weights for policy 0, policy_version 63131 (0.0047) +[2024-06-18 05:00:11,641][12883] Updated weights for policy 0, policy_version 63141 (0.0036) +[2024-06-18 05:00:11,994][12645] Fps is (10 sec: 37706.2, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 1034502144. Throughput: 0: 41930.6. Samples: 1034675680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 05:00:11,994][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 05:00:15,472][12883] Updated weights for policy 0, policy_version 63151 (0.0038) +[2024-06-18 05:00:16,996][12645] Fps is (10 sec: 45864.7, 60 sec: 42323.7, 300 sec: 42153.7). Total num frames: 1034747904. Throughput: 0: 41932.1. Samples: 1034802940. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 05:00:16,997][12645] Avg episode reward: [(0, '0.136')] +[2024-06-18 05:00:17,545][12862] Signal inference workers to stop experience collection... (14950 times) +[2024-06-18 05:00:17,546][12862] Signal inference workers to resume experience collection... (14950 times) +[2024-06-18 05:00:17,562][12883] InferenceWorker_p0-w0: stopping experience collection (14950 times) +[2024-06-18 05:00:17,562][12883] InferenceWorker_p0-w0: resuming experience collection (14950 times) +[2024-06-18 05:00:19,581][12883] Updated weights for policy 0, policy_version 63161 (0.0027) +[2024-06-18 05:00:21,994][12645] Fps is (10 sec: 40960.9, 60 sec: 41779.2, 300 sec: 41932.3). Total num frames: 1034911744. Throughput: 0: 41914.6. Samples: 1035052160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 05:00:21,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 05:00:23,169][12883] Updated weights for policy 0, policy_version 63171 (0.0053) +[2024-06-18 05:00:26,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 1035141120. Throughput: 0: 41693.2. Samples: 1035300640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 05:00:26,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 05:00:27,323][12883] Updated weights for policy 0, policy_version 63181 (0.0030) +[2024-06-18 05:00:31,191][12883] Updated weights for policy 0, policy_version 63191 (0.0027) +[2024-06-18 05:00:31,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 1035370496. Throughput: 0: 41922.1. Samples: 1035431200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 05:00:31,994][12645] Avg episode reward: [(0, '0.234')] +[2024-06-18 05:00:35,158][12883] Updated weights for policy 0, policy_version 63201 (0.0033) +[2024-06-18 05:00:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41780.8, 300 sec: 41987.5). Total num frames: 1035567104. 
Throughput: 0: 41984.4. Samples: 1035688120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 05:00:36,994][12645] Avg episode reward: [(0, '0.412')] +[2024-06-18 05:00:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000063206_1035567104.pth... +[2024-06-18 05:00:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000062591_1025490944.pth +[2024-06-18 05:00:38,963][12883] Updated weights for policy 0, policy_version 63211 (0.0036) +[2024-06-18 05:00:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 1035780096. Throughput: 0: 41659.4. Samples: 1035931840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 05:00:41,994][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 05:00:42,805][12883] Updated weights for policy 0, policy_version 63221 (0.0040) +[2024-06-18 05:00:46,811][12883] Updated weights for policy 0, policy_version 63231 (0.0036) +[2024-06-18 05:00:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41780.8, 300 sec: 41932.3). Total num frames: 1035976704. Throughput: 0: 41822.7. Samples: 1036056140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:00:46,994][12645] Avg episode reward: [(0, '0.097')] +[2024-06-18 05:00:50,866][12883] Updated weights for policy 0, policy_version 63241 (0.0031) +[2024-06-18 05:00:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1036206080. Throughput: 0: 42030.6. Samples: 1036316800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:00:51,994][12645] Avg episode reward: [(0, '0.095')] +[2024-06-18 05:00:54,551][12883] Updated weights for policy 0, policy_version 63251 (0.0032) +[2024-06-18 05:00:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 1036419072. Throughput: 0: 41883.2. Samples: 1036560420. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:00:56,994][12645] Avg episode reward: [(0, '0.214')] +[2024-06-18 05:00:58,528][12883] Updated weights for policy 0, policy_version 63261 (0.0041) +[2024-06-18 05:01:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41237.3, 300 sec: 41932.2). Total num frames: 1036599296. Throughput: 0: 41994.9. Samples: 1036692620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:01:01,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 05:01:02,421][12883] Updated weights for policy 0, policy_version 63271 (0.0031) +[2024-06-18 05:01:06,236][12883] Updated weights for policy 0, policy_version 63281 (0.0031) +[2024-06-18 05:01:06,996][12645] Fps is (10 sec: 37674.8, 60 sec: 41777.6, 300 sec: 41931.6). Total num frames: 1036795904. Throughput: 0: 41909.4. Samples: 1036938180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:01:06,996][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 05:01:10,151][12883] Updated weights for policy 0, policy_version 63291 (0.0047) +[2024-06-18 05:01:11,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.5, 300 sec: 41987.8). Total num frames: 1037041664. Throughput: 0: 41950.4. Samples: 1037188400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:01:11,994][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 05:01:14,262][12883] Updated weights for policy 0, policy_version 63301 (0.0037) +[2024-06-18 05:01:17,000][12645] Fps is (10 sec: 44218.8, 60 sec: 41503.3, 300 sec: 42042.1). Total num frames: 1037238272. Throughput: 0: 41929.2. Samples: 1037318280. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:01:17,001][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 05:01:17,622][12883] Updated weights for policy 0, policy_version 63311 (0.0027) +[2024-06-18 05:01:21,866][12883] Updated weights for policy 0, policy_version 63321 (0.0038) +[2024-06-18 05:01:21,996][12645] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 41987.5). Total num frames: 1037451264. Throughput: 0: 41734.3. Samples: 1037566260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:01:21,997][12645] Avg episode reward: [(0, '0.265')] +[2024-06-18 05:01:25,687][12883] Updated weights for policy 0, policy_version 63331 (0.0030) +[2024-06-18 05:01:26,994][12645] Fps is (10 sec: 42625.1, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 1037664256. Throughput: 0: 41767.5. Samples: 1037811380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 05:01:26,994][12645] Avg episode reward: [(0, '0.079')] +[2024-06-18 05:01:29,593][12883] Updated weights for policy 0, policy_version 63341 (0.0032) +[2024-06-18 05:01:31,994][12645] Fps is (10 sec: 40969.6, 60 sec: 41506.2, 300 sec: 41932.0). Total num frames: 1037860864. Throughput: 0: 41809.3. Samples: 1037937560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 05:01:31,994][12645] Avg episode reward: [(0, '0.171')] +[2024-06-18 05:01:33,491][12883] Updated weights for policy 0, policy_version 63351 (0.0043) +[2024-06-18 05:01:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1038073856. Throughput: 0: 41593.5. Samples: 1038188500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 05:01:36,994][12645] Avg episode reward: [(0, '0.101')] +[2024-06-18 05:01:37,482][12883] Updated weights for policy 0, policy_version 63361 (0.0038) +[2024-06-18 05:01:41,205][12883] Updated weights for policy 0, policy_version 63371 (0.0039) +[2024-06-18 05:01:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 41987.8). Total num frames: 1038286848. Throughput: 0: 41780.1. Samples: 1038440520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 05:01:41,994][12645] Avg episode reward: [(0, '0.093')] +[2024-06-18 05:01:45,342][12883] Updated weights for policy 0, policy_version 63381 (0.0035) +[2024-06-18 05:01:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1038483456. Throughput: 0: 41726.9. Samples: 1038570320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 05:01:46,994][12645] Avg episode reward: [(0, '0.154')] +[2024-06-18 05:01:49,121][12883] Updated weights for policy 0, policy_version 63391 (0.0042) +[2024-06-18 05:01:51,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1038712832. Throughput: 0: 41822.0. Samples: 1038820080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 05:01:51,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 05:01:53,301][12883] Updated weights for policy 0, policy_version 63401 (0.0044) +[2024-06-18 05:01:56,994][12645] Fps is (10 sec: 44235.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1038925824. Throughput: 0: 41775.8. Samples: 1039068320. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 05:01:56,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 05:01:56,995][12883] Updated weights for policy 0, policy_version 63411 (0.0036) +[2024-06-18 05:02:01,103][12883] Updated weights for policy 0, policy_version 63421 (0.0024) +[2024-06-18 05:02:01,994][12645] Fps is (10 sec: 39322.3, 60 sec: 41779.4, 300 sec: 41931.9). Total num frames: 1039106048. Throughput: 0: 41777.1. Samples: 1039197980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 05:02:01,994][12645] Avg episode reward: [(0, '0.135')] +[2024-06-18 05:02:02,619][12862] Signal inference workers to stop experience collection... (15000 times) +[2024-06-18 05:02:02,620][12862] Signal inference workers to resume experience collection... (15000 times) +[2024-06-18 05:02:02,654][12883] InferenceWorker_p0-w0: stopping experience collection (15000 times) +[2024-06-18 05:02:02,654][12883] InferenceWorker_p0-w0: resuming experience collection (15000 times) +[2024-06-18 05:02:05,307][12883] Updated weights for policy 0, policy_version 63431 (0.0033) +[2024-06-18 05:02:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42326.9, 300 sec: 41987.8). Total num frames: 1039335424. Throughput: 0: 41639.4. Samples: 1039439940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 05:02:06,994][12645] Avg episode reward: [(0, '0.218')] +[2024-06-18 05:02:08,998][12883] Updated weights for policy 0, policy_version 63441 (0.0032) +[2024-06-18 05:02:11,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 1039532032. Throughput: 0: 41760.5. Samples: 1039690600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 05:02:11,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 05:02:12,950][12883] Updated weights for policy 0, policy_version 63451 (0.0038) +[2024-06-18 05:02:16,805][12883] Updated weights for policy 0, policy_version 63461 (0.0036) +[2024-06-18 05:02:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41783.6, 300 sec: 41987.5). Total num frames: 1039745024. Throughput: 0: 41749.7. Samples: 1039816300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 05:02:17,007][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 05:02:20,592][12883] Updated weights for policy 0, policy_version 63471 (0.0030) +[2024-06-18 05:02:21,996][12645] Fps is (10 sec: 42589.0, 60 sec: 41779.2, 300 sec: 41931.6). Total num frames: 1039958016. Throughput: 0: 41751.2. Samples: 1040067400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 05:02:21,996][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 05:02:25,099][12883] Updated weights for policy 0, policy_version 63481 (0.0045) +[2024-06-18 05:02:26,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41506.3, 300 sec: 41932.0). Total num frames: 1040154624. Throughput: 0: 41737.8. Samples: 1040318720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 05:02:26,994][12645] Avg episode reward: [(0, '0.133')] +[2024-06-18 05:02:28,258][12883] Updated weights for policy 0, policy_version 63491 (0.0028) +[2024-06-18 05:02:31,994][12645] Fps is (10 sec: 39330.6, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 1040351232. Throughput: 0: 41598.6. Samples: 1040442260. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 05:02:31,994][12645] Avg episode reward: [(0, '0.183')] +[2024-06-18 05:02:33,075][12883] Updated weights for policy 0, policy_version 63501 (0.0037) +[2024-06-18 05:02:36,184][12883] Updated weights for policy 0, policy_version 63511 (0.0037) +[2024-06-18 05:02:37,000][12645] Fps is (10 sec: 44208.7, 60 sec: 42047.8, 300 sec: 41986.9). Total num frames: 1040596992. Throughput: 0: 41535.2. Samples: 1040689420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 05:02:37,001][12645] Avg episode reward: [(0, '0.133')] +[2024-06-18 05:02:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000063513_1040596992.pth... +[2024-06-18 05:02:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000062899_1030537216.pth +[2024-06-18 05:02:40,790][12883] Updated weights for policy 0, policy_version 63521 (0.0025) +[2024-06-18 05:02:42,000][12645] Fps is (10 sec: 42571.6, 60 sec: 41501.8, 300 sec: 41875.5). Total num frames: 1040777216. Throughput: 0: 41826.3. Samples: 1040950760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 05:02:42,000][12645] Avg episode reward: [(0, '0.152')] +[2024-06-18 05:02:43,938][12883] Updated weights for policy 0, policy_version 63531 (0.0038) +[2024-06-18 05:02:46,993][12645] Fps is (10 sec: 37707.3, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 1040973824. Throughput: 0: 41488.9. Samples: 1041064980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:02:46,994][12645] Avg episode reward: [(0, '0.117')] +[2024-06-18 05:02:48,357][12883] Updated weights for policy 0, policy_version 63541 (0.0038) +[2024-06-18 05:02:51,623][12883] Updated weights for policy 0, policy_version 63551 (0.0028) +[2024-06-18 05:02:51,994][12645] Fps is (10 sec: 45903.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 1041235968. Throughput: 0: 41849.4. Samples: 1041323160. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:02:51,994][12645] Avg episode reward: [(0, '0.151')] +[2024-06-18 05:02:56,716][12883] Updated weights for policy 0, policy_version 63561 (0.0034) +[2024-06-18 05:02:56,996][12645] Fps is (10 sec: 40950.1, 60 sec: 40958.5, 300 sec: 41820.5). Total num frames: 1041383424. Throughput: 0: 42007.7. Samples: 1041581040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:02:56,997][12645] Avg episode reward: [(0, '0.155')] +[2024-06-18 05:02:59,425][12883] Updated weights for policy 0, policy_version 63571 (0.0025) +[2024-06-18 05:03:01,994][12645] Fps is (10 sec: 37683.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 1041612800. Throughput: 0: 41733.4. Samples: 1041694300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:03:01,994][12645] Avg episode reward: [(0, '0.088')] +[2024-06-18 05:03:04,467][12883] Updated weights for policy 0, policy_version 63581 (0.0029) +[2024-06-18 05:03:06,994][12645] Fps is (10 sec: 45885.7, 60 sec: 41779.3, 300 sec: 41876.7). Total num frames: 1041842176. Throughput: 0: 41855.0. Samples: 1041950780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:03:06,994][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 05:03:07,512][12883] Updated weights for policy 0, policy_version 63591 (0.0042) +[2024-06-18 05:03:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 1042022400. Throughput: 0: 41826.2. Samples: 1042200900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:03:11,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 05:03:12,284][12883] Updated weights for policy 0, policy_version 63601 (0.0027) +[2024-06-18 05:03:15,209][12883] Updated weights for policy 0, policy_version 63611 (0.0040) +[2024-06-18 05:03:16,995][12645] Fps is (10 sec: 40955.9, 60 sec: 41778.5, 300 sec: 41820.7). Total num frames: 1042251776. Throughput: 0: 41812.8. Samples: 1042323880. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:03:16,995][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 05:03:18,091][12862] Signal inference workers to stop experience collection... (15050 times) +[2024-06-18 05:03:18,091][12862] Signal inference workers to resume experience collection... (15050 times) +[2024-06-18 05:03:18,141][12883] InferenceWorker_p0-w0: stopping experience collection (15050 times) +[2024-06-18 05:03:18,141][12883] InferenceWorker_p0-w0: resuming experience collection (15050 times) +[2024-06-18 05:03:20,096][12883] Updated weights for policy 0, policy_version 63621 (0.0038) +[2024-06-18 05:03:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41780.7, 300 sec: 41821.7). Total num frames: 1042464768. Throughput: 0: 42003.6. Samples: 1042579320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:03:21,994][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 05:03:23,159][12883] Updated weights for policy 0, policy_version 63631 (0.0035) +[2024-06-18 05:03:26,994][12645] Fps is (10 sec: 42602.8, 60 sec: 42052.2, 300 sec: 41876.7). Total num frames: 1042677760. Throughput: 0: 41811.6. Samples: 1042832020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 05:03:26,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 05:03:27,686][12883] Updated weights for policy 0, policy_version 63641 (0.0035) +[2024-06-18 05:03:30,699][12883] Updated weights for policy 0, policy_version 63651 (0.0037) +[2024-06-18 05:03:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 41931.9). Total num frames: 1042907136. Throughput: 0: 42052.7. Samples: 1042957360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 05:03:31,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 05:03:35,508][12883] Updated weights for policy 0, policy_version 63661 (0.0045) +[2024-06-18 05:03:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41783.6, 300 sec: 41876.4). Total num frames: 1043103744. 
Throughput: 0: 41880.5. Samples: 1043207780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 05:03:36,994][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 05:03:38,744][12883] Updated weights for policy 0, policy_version 63671 (0.0031) +[2024-06-18 05:03:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42056.7, 300 sec: 41820.9). Total num frames: 1043300352. Throughput: 0: 41639.1. Samples: 1043454700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 05:03:41,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 05:03:43,301][12883] Updated weights for policy 0, policy_version 63681 (0.0037) +[2024-06-18 05:03:46,575][12883] Updated weights for policy 0, policy_version 63691 (0.0048) +[2024-06-18 05:03:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 1043513344. Throughput: 0: 41921.2. Samples: 1043580760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 05:03:46,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 05:03:51,461][12883] Updated weights for policy 0, policy_version 63701 (0.0030) +[2024-06-18 05:03:51,996][12645] Fps is (10 sec: 42588.4, 60 sec: 41504.6, 300 sec: 41820.9). Total num frames: 1043726336. Throughput: 0: 41932.1. Samples: 1043837820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 05:03:51,996][12645] Avg episode reward: [(0, '0.165')] +[2024-06-18 05:03:54,067][12883] Updated weights for policy 0, policy_version 63711 (0.0028) +[2024-06-18 05:03:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42327.0, 300 sec: 41876.4). Total num frames: 1043922944. Throughput: 0: 41875.6. Samples: 1044085300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 05:03:56,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 05:03:59,060][12883] Updated weights for policy 0, policy_version 63721 (0.0035) +[2024-06-18 05:04:01,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 1044152320. 
Throughput: 0: 42016.5. Samples: 1044214580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 05:04:01,994][12645] Avg episode reward: [(0, '0.314')] +[2024-06-18 05:04:02,170][12883] Updated weights for policy 0, policy_version 63731 (0.0039) +[2024-06-18 05:04:06,642][12883] Updated weights for policy 0, policy_version 63741 (0.0034) +[2024-06-18 05:04:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 1044348928. Throughput: 0: 42061.4. Samples: 1044472080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 05:04:06,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 05:04:09,796][12883] Updated weights for policy 0, policy_version 63751 (0.0048) +[2024-06-18 05:04:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 1044561920. Throughput: 0: 41929.7. Samples: 1044718860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 05:04:11,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 05:04:14,291][12883] Updated weights for policy 0, policy_version 63761 (0.0037) +[2024-06-18 05:04:16,995][12645] Fps is (10 sec: 44230.5, 60 sec: 42325.1, 300 sec: 41987.3). Total num frames: 1044791296. Throughput: 0: 42049.4. Samples: 1044849640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 05:04:16,996][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 05:04:17,539][12883] Updated weights for policy 0, policy_version 63771 (0.0026) +[2024-06-18 05:04:21,994][12645] Fps is (10 sec: 40958.9, 60 sec: 41779.0, 300 sec: 41876.4). Total num frames: 1044971520. Throughput: 0: 42107.8. Samples: 1045102640. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 05:04:21,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 05:04:22,054][12883] Updated weights for policy 0, policy_version 63781 (0.0034) +[2024-06-18 05:04:25,138][12883] Updated weights for policy 0, policy_version 63791 (0.0040) +[2024-06-18 05:04:26,994][12645] Fps is (10 sec: 40965.0, 60 sec: 42052.1, 300 sec: 41876.4). Total num frames: 1045200896. Throughput: 0: 42242.5. Samples: 1045355620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 05:04:26,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 05:04:29,987][12883] Updated weights for policy 0, policy_version 63801 (0.0029) +[2024-06-18 05:04:31,994][12645] Fps is (10 sec: 44237.7, 60 sec: 41779.2, 300 sec: 41876.7). Total num frames: 1045413888. Throughput: 0: 42378.2. Samples: 1045487780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 05:04:31,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 05:04:32,839][12883] Updated weights for policy 0, policy_version 63811 (0.0040) +[2024-06-18 05:04:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 1045610496. Throughput: 0: 42302.6. Samples: 1045741340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 05:04:36,994][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 05:04:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000063819_1045610496.pth... +[2024-06-18 05:04:37,094][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000063206_1035567104.pth +[2024-06-18 05:04:37,750][12883] Updated weights for policy 0, policy_version 63821 (0.0032) +[2024-06-18 05:04:40,639][12883] Updated weights for policy 0, policy_version 63831 (0.0028) +[2024-06-18 05:04:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 41876.7). Total num frames: 1045823488. Throughput: 0: 42206.6. Samples: 1045984600. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 05:04:41,994][12645] Avg episode reward: [(0, '0.168')] +[2024-06-18 05:04:43,249][12862] Signal inference workers to stop experience collection... (15100 times) +[2024-06-18 05:04:43,298][12883] InferenceWorker_p0-w0: stopping experience collection (15100 times) +[2024-06-18 05:04:43,360][12862] Signal inference workers to resume experience collection... (15100 times) +[2024-06-18 05:04:43,361][12883] InferenceWorker_p0-w0: resuming experience collection (15100 times) +[2024-06-18 05:04:45,526][12883] Updated weights for policy 0, policy_version 63841 (0.0028) +[2024-06-18 05:04:47,000][12645] Fps is (10 sec: 42571.7, 60 sec: 42048.0, 300 sec: 41875.5). Total num frames: 1046036480. Throughput: 0: 42210.1. Samples: 1046114300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 05:04:47,000][12645] Avg episode reward: [(0, '0.076')] +[2024-06-18 05:04:48,521][12883] Updated weights for policy 0, policy_version 63851 (0.0043) +[2024-06-18 05:04:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42053.8, 300 sec: 41987.4). Total num frames: 1046249472. Throughput: 0: 42199.9. Samples: 1046371080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 05:04:51,994][12645] Avg episode reward: [(0, '0.123')] +[2024-06-18 05:04:53,399][12883] Updated weights for policy 0, policy_version 63861 (0.0025) +[2024-06-18 05:04:56,338][12883] Updated weights for policy 0, policy_version 63871 (0.0039) +[2024-06-18 05:04:56,994][12645] Fps is (10 sec: 44263.7, 60 sec: 42598.3, 300 sec: 41877.3). Total num frames: 1046478848. Throughput: 0: 42003.8. Samples: 1046609040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 05:04:56,994][12645] Avg episode reward: [(0, '0.123')] +[2024-06-18 05:05:01,058][12883] Updated weights for policy 0, policy_version 63881 (0.0041) +[2024-06-18 05:05:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 1046659072. 
Throughput: 0: 42086.5. Samples: 1046743480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 05:05:01,994][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 05:05:03,994][12883] Updated weights for policy 0, policy_version 63891 (0.0033) +[2024-06-18 05:05:06,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 1046855680. Throughput: 0: 42055.3. Samples: 1046995120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 05:05:06,994][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 05:05:08,731][12883] Updated weights for policy 0, policy_version 63901 (0.0029) +[2024-06-18 05:05:11,888][12883] Updated weights for policy 0, policy_version 63911 (0.0027) +[2024-06-18 05:05:11,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42596.8, 300 sec: 41931.9). Total num frames: 1047117824. Throughput: 0: 41899.3. Samples: 1047241180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 05:05:11,997][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 05:05:16,652][12883] Updated weights for policy 0, policy_version 63921 (0.0048) +[2024-06-18 05:05:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41780.1, 300 sec: 41987.5). Total num frames: 1047298048. Throughput: 0: 42017.8. Samples: 1047378580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 05:05:16,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 05:05:19,693][12883] Updated weights for policy 0, policy_version 63931 (0.0030) +[2024-06-18 05:05:21,994][12645] Fps is (10 sec: 39330.2, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 1047511040. Throughput: 0: 41963.8. Samples: 1047629720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 05:05:21,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 05:05:24,318][12883] Updated weights for policy 0, policy_version 63941 (0.0028) +[2024-06-18 05:05:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 41931.9). Total num frames: 1047740416. 
Throughput: 0: 42128.0. Samples: 1047880360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 05:05:26,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 05:05:27,839][12883] Updated weights for policy 0, policy_version 63951 (0.0041) +[2024-06-18 05:05:31,928][12883] Updated weights for policy 0, policy_version 63961 (0.0041) +[2024-06-18 05:05:31,996][12645] Fps is (10 sec: 42589.5, 60 sec: 42050.8, 300 sec: 41931.6). Total num frames: 1047937024. Throughput: 0: 42179.8. Samples: 1048012220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 05:05:32,008][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 05:05:35,460][12883] Updated weights for policy 0, policy_version 63971 (0.0026) +[2024-06-18 05:05:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 1048133632. Throughput: 0: 41935.6. Samples: 1048258180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 05:05:36,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 05:05:39,649][12883] Updated weights for policy 0, policy_version 63981 (0.0028) +[2024-06-18 05:05:41,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 1048379392. Throughput: 0: 42177.9. Samples: 1048507040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 05:05:41,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 05:05:43,221][12883] Updated weights for policy 0, policy_version 63991 (0.0035) +[2024-06-18 05:05:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42056.6, 300 sec: 41876.4). Total num frames: 1048559616. Throughput: 0: 42301.8. Samples: 1048647060. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 05:05:46,994][12645] Avg episode reward: [(0, '0.130')] +[2024-06-18 05:05:47,540][12883] Updated weights for policy 0, policy_version 64001 (0.0028) +[2024-06-18 05:05:51,025][12883] Updated weights for policy 0, policy_version 64011 (0.0038) +[2024-06-18 05:05:51,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 1048756224. Throughput: 0: 42280.9. Samples: 1048897760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 05:05:51,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 05:05:55,060][12883] Updated weights for policy 0, policy_version 64021 (0.0037) +[2024-06-18 05:05:56,997][12645] Fps is (10 sec: 47499.9, 60 sec: 42596.4, 300 sec: 42153.7). Total num frames: 1049034752. Throughput: 0: 42297.2. Samples: 1049144580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 05:05:56,997][12645] Avg episode reward: [(0, '0.219')] +[2024-06-18 05:05:59,122][12883] Updated weights for policy 0, policy_version 64031 (0.0030) +[2024-06-18 05:06:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41987.8). Total num frames: 1049182208. Throughput: 0: 42264.4. Samples: 1049280480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 05:06:01,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 05:06:02,864][12883] Updated weights for policy 0, policy_version 64041 (0.0043) +[2024-06-18 05:06:06,935][12883] Updated weights for policy 0, policy_version 64051 (0.0036) +[2024-06-18 05:06:06,996][12645] Fps is (10 sec: 37685.6, 60 sec: 42596.8, 300 sec: 41931.6). Total num frames: 1049411584. Throughput: 0: 42140.7. Samples: 1049526140. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) +[2024-06-18 05:06:06,997][12645] Avg episode reward: [(0, '0.319')] +[2024-06-18 05:06:09,179][12862] Signal inference workers to stop experience collection... 
(15150 times) +[2024-06-18 05:06:09,180][12862] Signal inference workers to resume experience collection... (15150 times) +[2024-06-18 05:06:09,233][12883] InferenceWorker_p0-w0: stopping experience collection (15150 times) +[2024-06-18 05:06:09,233][12883] InferenceWorker_p0-w0: resuming experience collection (15150 times) +[2024-06-18 05:06:10,691][12883] Updated weights for policy 0, policy_version 64061 (0.0035) +[2024-06-18 05:06:11,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42326.9, 300 sec: 42099.5). Total num frames: 1049657344. Throughput: 0: 42164.4. Samples: 1049777760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) +[2024-06-18 05:06:11,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 05:06:14,978][12883] Updated weights for policy 0, policy_version 64071 (0.0038) +[2024-06-18 05:06:16,994][12645] Fps is (10 sec: 39330.7, 60 sec: 41779.3, 300 sec: 41876.7). Total num frames: 1049804800. Throughput: 0: 42126.5. Samples: 1049907820. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) +[2024-06-18 05:06:16,994][12645] Avg episode reward: [(0, '0.319')] +[2024-06-18 05:06:18,203][12883] Updated weights for policy 0, policy_version 64081 (0.0026) +[2024-06-18 05:06:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 1050034176. Throughput: 0: 42179.6. Samples: 1050156260. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) +[2024-06-18 05:06:21,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 05:06:22,531][12883] Updated weights for policy 0, policy_version 64091 (0.0025) +[2024-06-18 05:06:25,869][12883] Updated weights for policy 0, policy_version 64101 (0.0036) +[2024-06-18 05:06:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 1050263552. Throughput: 0: 42270.2. Samples: 1050409200. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) +[2024-06-18 05:06:26,994][12645] Avg episode reward: [(0, '0.234')] +[2024-06-18 05:06:30,767][12883] Updated weights for policy 0, policy_version 64111 (0.0036) +[2024-06-18 05:06:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41507.7, 300 sec: 41876.4). Total num frames: 1050427392. Throughput: 0: 42127.6. Samples: 1050542800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) +[2024-06-18 05:06:31,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 05:06:33,491][12883] Updated weights for policy 0, policy_version 64121 (0.0031) +[2024-06-18 05:06:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 1050673152. Throughput: 0: 42131.9. Samples: 1050793700. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) +[2024-06-18 05:06:37,003][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 05:06:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000064128_1050673152.pth... +[2024-06-18 05:06:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000063513_1040596992.pth +[2024-06-18 05:06:38,494][12883] Updated weights for policy 0, policy_version 64131 (0.0037) +[2024-06-18 05:06:41,409][12883] Updated weights for policy 0, policy_version 64141 (0.0041) +[2024-06-18 05:06:41,994][12645] Fps is (10 sec: 47513.0, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1050902528. Throughput: 0: 42230.2. Samples: 1051044820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:06:41,994][12645] Avg episode reward: [(0, '0.056')] +[2024-06-18 05:06:46,825][12883] Updated weights for policy 0, policy_version 64151 (0.0027) +[2024-06-18 05:06:46,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 1051049984. Throughput: 0: 42060.9. Samples: 1051173220. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:06:46,994][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 05:06:49,095][12883] Updated weights for policy 0, policy_version 64161 (0.0034) +[2024-06-18 05:06:51,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42869.9, 300 sec: 42042.7). Total num frames: 1051328512. Throughput: 0: 42057.4. Samples: 1051418720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:06:51,996][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 05:06:54,482][12883] Updated weights for policy 0, policy_version 64171 (0.0031) +[2024-06-18 05:06:56,792][12883] Updated weights for policy 0, policy_version 64181 (0.0028) +[2024-06-18 05:06:56,994][12645] Fps is (10 sec: 49152.7, 60 sec: 41781.3, 300 sec: 42154.1). Total num frames: 1051541504. Throughput: 0: 42296.1. Samples: 1051681080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:06:56,994][12645] Avg episode reward: [(0, '0.229')] +[2024-06-18 05:07:01,994][12645] Fps is (10 sec: 36052.7, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 1051688960. Throughput: 0: 42158.1. Samples: 1051804940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:07:01,994][12645] Avg episode reward: [(0, '0.147')] +[2024-06-18 05:07:02,043][12883] Updated weights for policy 0, policy_version 64191 (0.0036) +[2024-06-18 05:07:04,786][12883] Updated weights for policy 0, policy_version 64201 (0.0041) +[2024-06-18 05:07:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42600.0, 300 sec: 42154.1). Total num frames: 1051967488. Throughput: 0: 42189.7. Samples: 1052054800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:07:06,995][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 05:07:09,997][12883] Updated weights for policy 0, policy_version 64211 (0.0032) +[2024-06-18 05:07:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 1052131328. Throughput: 0: 42295.6. Samples: 1052312500. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:07:11,994][12645] Avg episode reward: [(0, '0.186')] +[2024-06-18 05:07:12,011][12862] Signal inference workers to stop experience collection... (15200 times) +[2024-06-18 05:07:12,011][12862] Signal inference workers to resume experience collection... (15200 times) +[2024-06-18 05:07:12,056][12883] InferenceWorker_p0-w0: stopping experience collection (15200 times) +[2024-06-18 05:07:12,060][12883] InferenceWorker_p0-w0: resuming experience collection (15200 times) +[2024-06-18 05:07:12,672][12883] Updated weights for policy 0, policy_version 64221 (0.0032) +[2024-06-18 05:07:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 41987.8). Total num frames: 1052344320. Throughput: 0: 41905.3. Samples: 1052428540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 05:07:16,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 05:07:17,866][12883] Updated weights for policy 0, policy_version 64231 (0.0034) +[2024-06-18 05:07:20,584][12883] Updated weights for policy 0, policy_version 64241 (0.0036) +[2024-06-18 05:07:22,000][12645] Fps is (10 sec: 47483.8, 60 sec: 42867.0, 300 sec: 42208.7). Total num frames: 1052606464. Throughput: 0: 41998.2. Samples: 1052683880. Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) +[2024-06-18 05:07:22,000][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 05:07:25,517][12883] Updated weights for policy 0, policy_version 64251 (0.0035) +[2024-06-18 05:07:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 1052770304. Throughput: 0: 42241.3. Samples: 1052945680. Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) +[2024-06-18 05:07:26,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 05:07:28,443][12883] Updated weights for policy 0, policy_version 64261 (0.0031) +[2024-06-18 05:07:31,994][12645] Fps is (10 sec: 37707.1, 60 sec: 42598.4, 300 sec: 41988.4). Total num frames: 1052983296. 
Throughput: 0: 41917.4. Samples: 1053059500. Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) +[2024-06-18 05:07:31,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 05:07:33,018][12883] Updated weights for policy 0, policy_version 64271 (0.0044) +[2024-06-18 05:07:36,106][12883] Updated weights for policy 0, policy_version 64281 (0.0030) +[2024-06-18 05:07:36,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42266.1). Total num frames: 1053245440. Throughput: 0: 42348.3. Samples: 1053324300. Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) +[2024-06-18 05:07:36,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 05:07:40,629][12883] Updated weights for policy 0, policy_version 64291 (0.0036) +[2024-06-18 05:07:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41233.2, 300 sec: 42043.0). Total num frames: 1053376512. Throughput: 0: 42136.8. Samples: 1053577240. Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) +[2024-06-18 05:07:41,994][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 05:07:43,998][12883] Updated weights for policy 0, policy_version 64301 (0.0035) +[2024-06-18 05:07:46,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42871.5, 300 sec: 41987.5). Total num frames: 1053622272. Throughput: 0: 41929.9. Samples: 1053691780. Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) +[2024-06-18 05:07:46,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 05:07:48,206][12883] Updated weights for policy 0, policy_version 64311 (0.0035) +[2024-06-18 05:07:51,673][12883] Updated weights for policy 0, policy_version 64321 (0.0045) +[2024-06-18 05:07:51,994][12645] Fps is (10 sec: 49151.5, 60 sec: 42326.8, 300 sec: 42321.0). Total num frames: 1053868032. Throughput: 0: 42371.1. Samples: 1053961500. 
Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) +[2024-06-18 05:07:51,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 05:07:55,726][12883] Updated weights for policy 0, policy_version 64331 (0.0032) +[2024-06-18 05:07:56,996][12645] Fps is (10 sec: 40950.4, 60 sec: 41504.5, 300 sec: 42098.2). Total num frames: 1054031872. Throughput: 0: 42207.6. Samples: 1054211940. Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) +[2024-06-18 05:07:56,997][12645] Avg episode reward: [(0, '0.253')] +[2024-06-18 05:07:59,341][12883] Updated weights for policy 0, policy_version 64341 (0.0032) +[2024-06-18 05:08:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42154.1). Total num frames: 1054277632. Throughput: 0: 42231.9. Samples: 1054328980. Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) +[2024-06-18 05:08:01,995][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 05:08:03,586][12883] Updated weights for policy 0, policy_version 64351 (0.0038) +[2024-06-18 05:08:06,994][12645] Fps is (10 sec: 44246.5, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 1054474240. Throughput: 0: 42504.9. Samples: 1054596340. Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) +[2024-06-18 05:08:06,994][12645] Avg episode reward: [(0, '0.128')] +[2024-06-18 05:08:07,129][12883] Updated weights for policy 0, policy_version 64361 (0.0029) +[2024-06-18 05:08:11,555][12883] Updated weights for policy 0, policy_version 64371 (0.0025) +[2024-06-18 05:08:11,994][12645] Fps is (10 sec: 39322.5, 60 sec: 42325.4, 300 sec: 42098.7). Total num frames: 1054670848. Throughput: 0: 42205.5. Samples: 1054844920. Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) +[2024-06-18 05:08:11,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 05:08:14,010][12862] Signal inference workers to stop experience collection... 
(15250 times) +[2024-06-18 05:08:14,063][12883] InferenceWorker_p0-w0: stopping experience collection (15250 times) +[2024-06-18 05:08:14,064][12862] Signal inference workers to resume experience collection... (15250 times) +[2024-06-18 05:08:14,079][12883] InferenceWorker_p0-w0: resuming experience collection (15250 times) +[2024-06-18 05:08:14,695][12883] Updated weights for policy 0, policy_version 64381 (0.0030) +[2024-06-18 05:08:17,000][12645] Fps is (10 sec: 42572.2, 60 sec: 42593.9, 300 sec: 42153.2). Total num frames: 1054900224. Throughput: 0: 42299.4. Samples: 1054963240. Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) +[2024-06-18 05:08:17,001][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 05:08:19,357][12883] Updated weights for policy 0, policy_version 64391 (0.0040) +[2024-06-18 05:08:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41510.4, 300 sec: 42098.5). Total num frames: 1055096832. Throughput: 0: 42302.2. Samples: 1055227900. Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) +[2024-06-18 05:08:21,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 05:08:22,569][12883] Updated weights for policy 0, policy_version 64401 (0.0039) +[2024-06-18 05:08:26,994][12645] Fps is (10 sec: 39346.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 1055293440. Throughput: 0: 42172.0. Samples: 1055474980. Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) +[2024-06-18 05:08:26,994][12645] Avg episode reward: [(0, '0.601')] +[2024-06-18 05:08:27,088][12862] Saving new best policy, reward=0.601! +[2024-06-18 05:08:27,090][12883] Updated weights for policy 0, policy_version 64411 (0.0035) +[2024-06-18 05:08:30,351][12883] Updated weights for policy 0, policy_version 64421 (0.0037) +[2024-06-18 05:08:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 1055539200. Throughput: 0: 42317.2. Samples: 1055596060. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) +[2024-06-18 05:08:31,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 05:08:34,764][12883] Updated weights for policy 0, policy_version 64431 (0.0026) +[2024-06-18 05:08:36,996][12645] Fps is (10 sec: 44226.8, 60 sec: 41504.6, 300 sec: 42153.8). Total num frames: 1055735808. Throughput: 0: 42029.1. Samples: 1055852900. Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) +[2024-06-18 05:08:36,996][12645] Avg episode reward: [(0, '0.412')] +[2024-06-18 05:08:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000064437_1055735808.pth... +[2024-06-18 05:08:37,056][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000063819_1045610496.pth +[2024-06-18 05:08:38,081][12883] Updated weights for policy 0, policy_version 64441 (0.0034) +[2024-06-18 05:08:41,994][12645] Fps is (10 sec: 40958.8, 60 sec: 42871.2, 300 sec: 42154.0). Total num frames: 1055948800. Throughput: 0: 41983.2. Samples: 1056101100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 05:08:41,995][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 05:08:42,409][12883] Updated weights for policy 0, policy_version 64451 (0.0039) +[2024-06-18 05:08:45,869][12883] Updated weights for policy 0, policy_version 64461 (0.0041) +[2024-06-18 05:08:46,994][12645] Fps is (10 sec: 39330.4, 60 sec: 41779.1, 300 sec: 42043.3). Total num frames: 1056129024. Throughput: 0: 42248.5. Samples: 1056230160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 05:08:46,994][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 05:08:50,063][12883] Updated weights for policy 0, policy_version 64471 (0.0036) +[2024-06-18 05:08:51,996][12645] Fps is (10 sec: 42590.4, 60 sec: 41777.7, 300 sec: 42209.3). Total num frames: 1056374784. Throughput: 0: 41713.6. Samples: 1056473540. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 05:08:51,996][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 05:08:53,930][12883] Updated weights for policy 0, policy_version 64481 (0.0032) +[2024-06-18 05:08:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42327.0, 300 sec: 42098.5). Total num frames: 1056571392. Throughput: 0: 41945.2. Samples: 1056732460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 05:08:56,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 05:08:57,671][12883] Updated weights for policy 0, policy_version 64491 (0.0025) +[2024-06-18 05:09:01,952][12883] Updated weights for policy 0, policy_version 64501 (0.0035) +[2024-06-18 05:09:01,994][12645] Fps is (10 sec: 40969.3, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1056784384. Throughput: 0: 41972.1. Samples: 1056851720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 05:09:01,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 05:09:05,532][12883] Updated weights for policy 0, policy_version 64511 (0.0031) +[2024-06-18 05:09:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 1056980992. Throughput: 0: 41644.9. Samples: 1057101920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 05:09:06,994][12645] Avg episode reward: [(0, '0.178')] +[2024-06-18 05:09:09,555][12883] Updated weights for policy 0, policy_version 64521 (0.0030) +[2024-06-18 05:09:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 41987.7). Total num frames: 1057177600. Throughput: 0: 41983.1. Samples: 1057364220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 05:09:11,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 05:09:13,121][12883] Updated weights for policy 0, policy_version 64531 (0.0026) +[2024-06-18 05:09:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42056.6, 300 sec: 42209.7). Total num frames: 1057423360. Throughput: 0: 41937.4. Samples: 1057483240. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 05:09:16,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 05:09:17,430][12883] Updated weights for policy 0, policy_version 64541 (0.0035) +[2024-06-18 05:09:21,328][12883] Updated weights for policy 0, policy_version 64551 (0.0040) +[2024-06-18 05:09:21,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1057636352. Throughput: 0: 41770.2. Samples: 1057732460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) +[2024-06-18 05:09:21,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 05:09:24,945][12883] Updated weights for policy 0, policy_version 64561 (0.0032) +[2024-06-18 05:09:26,996][12645] Fps is (10 sec: 37675.0, 60 sec: 41777.7, 300 sec: 41987.2). Total num frames: 1057800192. Throughput: 0: 42132.0. Samples: 1057997120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) +[2024-06-18 05:09:26,996][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 05:09:28,185][12862] Signal inference workers to stop experience collection... (15300 times) +[2024-06-18 05:09:28,186][12862] Signal inference workers to resume experience collection... (15300 times) +[2024-06-18 05:09:28,202][12883] InferenceWorker_p0-w0: stopping experience collection (15300 times) +[2024-06-18 05:09:28,202][12883] InferenceWorker_p0-w0: resuming experience collection (15300 times) +[2024-06-18 05:09:29,258][12883] Updated weights for policy 0, policy_version 64571 (0.0039) +[2024-06-18 05:09:31,996][12645] Fps is (10 sec: 40950.5, 60 sec: 41777.7, 300 sec: 42153.8). Total num frames: 1058045952. Throughput: 0: 41821.0. Samples: 1058112200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) +[2024-06-18 05:09:31,996][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 05:09:32,740][12883] Updated weights for policy 0, policy_version 64581 (0.0034) +[2024-06-18 05:09:36,994][12645] Fps is (10 sec: 44246.1, 60 sec: 41780.7, 300 sec: 42098.5). Total num frames: 1058242560. 
Throughput: 0: 42108.2. Samples: 1058368320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) +[2024-06-18 05:09:36,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 05:09:37,299][12883] Updated weights for policy 0, policy_version 64591 (0.0034) +[2024-06-18 05:09:40,938][12883] Updated weights for policy 0, policy_version 64601 (0.0038) +[2024-06-18 05:09:41,994][12645] Fps is (10 sec: 37691.7, 60 sec: 41233.3, 300 sec: 41988.4). Total num frames: 1058422784. Throughput: 0: 41771.5. Samples: 1058612180. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) +[2024-06-18 05:09:41,994][12645] Avg episode reward: [(0, '0.239')] +[2024-06-18 05:09:45,267][12883] Updated weights for policy 0, policy_version 64611 (0.0033) +[2024-06-18 05:09:46,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 1058684928. Throughput: 0: 41879.2. Samples: 1058736280. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) +[2024-06-18 05:09:46,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 05:09:48,659][12883] Updated weights for policy 0, policy_version 64621 (0.0031) +[2024-06-18 05:09:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41234.7, 300 sec: 41932.0). Total num frames: 1058848768. Throughput: 0: 42137.9. Samples: 1058998120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) +[2024-06-18 05:09:51,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 05:09:52,931][12883] Updated weights for policy 0, policy_version 64631 (0.0029) +[2024-06-18 05:09:56,316][12883] Updated weights for policy 0, policy_version 64641 (0.0033) +[2024-06-18 05:09:56,994][12645] Fps is (10 sec: 39320.8, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 1059078144. Throughput: 0: 41768.8. Samples: 1059243820. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) +[2024-06-18 05:09:56,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 05:10:00,613][12883] Updated weights for policy 0, policy_version 64651 (0.0024) +[2024-06-18 05:10:01,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1059323904. Throughput: 0: 42150.4. Samples: 1059380000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 05:10:01,994][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 05:10:04,068][12883] Updated weights for policy 0, policy_version 64661 (0.0036) +[2024-06-18 05:10:06,995][12645] Fps is (10 sec: 40953.0, 60 sec: 41778.0, 300 sec: 41932.0). Total num frames: 1059487744. Throughput: 0: 42122.7. Samples: 1059628060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 05:10:06,996][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 05:10:08,342][12883] Updated weights for policy 0, policy_version 64671 (0.0036) +[2024-06-18 05:10:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 1059717120. Throughput: 0: 41669.6. Samples: 1059872160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 05:10:11,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 05:10:12,149][12883] Updated weights for policy 0, policy_version 64681 (0.0028) +[2024-06-18 05:10:16,160][12883] Updated weights for policy 0, policy_version 64691 (0.0025) +[2024-06-18 05:10:16,994][12645] Fps is (10 sec: 44244.8, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 1059930112. Throughput: 0: 42112.8. Samples: 1060007180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 05:10:16,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 05:10:19,847][12883] Updated weights for policy 0, policy_version 64701 (0.0040) +[2024-06-18 05:10:21,995][12645] Fps is (10 sec: 39314.4, 60 sec: 41231.8, 300 sec: 41931.7). Total num frames: 1060110336. Throughput: 0: 41939.3. Samples: 1060255660. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 05:10:21,996][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 05:10:23,938][12883] Updated weights for policy 0, policy_version 64711 (0.0030) +[2024-06-18 05:10:26,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42873.0, 300 sec: 42154.4). Total num frames: 1060372480. Throughput: 0: 42143.0. Samples: 1060508620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 05:10:26,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 05:10:27,475][12883] Updated weights for policy 0, policy_version 64721 (0.0041) +[2024-06-18 05:10:31,700][12883] Updated weights for policy 0, policy_version 64731 (0.0040) +[2024-06-18 05:10:31,994][12645] Fps is (10 sec: 45883.8, 60 sec: 42053.9, 300 sec: 42154.1). Total num frames: 1060569088. Throughput: 0: 42270.6. Samples: 1060638460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 05:10:31,994][12645] Avg episode reward: [(0, '0.287')] +[2024-06-18 05:10:35,051][12883] Updated weights for policy 0, policy_version 64741 (0.0033) +[2024-06-18 05:10:36,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 1060749312. Throughput: 0: 42072.3. Samples: 1060891380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) +[2024-06-18 05:10:36,994][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 05:10:37,131][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000064744_1060765696.pth... +[2024-06-18 05:10:37,195][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000064128_1050673152.pth +[2024-06-18 05:10:39,339][12883] Updated weights for policy 0, policy_version 64751 (0.0029) +[2024-06-18 05:10:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 1060995072. Throughput: 0: 42154.8. Samples: 1061140780. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) +[2024-06-18 05:10:41,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 05:10:43,068][12883] Updated weights for policy 0, policy_version 64761 (0.0033) +[2024-06-18 05:10:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 1061191680. Throughput: 0: 42156.8. Samples: 1061277060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) +[2024-06-18 05:10:46,994][12645] Avg episode reward: [(0, '0.132')] +[2024-06-18 05:10:47,008][12883] Updated weights for policy 0, policy_version 64771 (0.0037) +[2024-06-18 05:10:50,540][12883] Updated weights for policy 0, policy_version 64781 (0.0045) +[2024-06-18 05:10:51,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 41821.3). Total num frames: 1061371904. Throughput: 0: 42025.7. Samples: 1061519140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) +[2024-06-18 05:10:51,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 05:10:52,345][12862] Signal inference workers to stop experience collection... (15350 times) +[2024-06-18 05:10:52,368][12883] InferenceWorker_p0-w0: stopping experience collection (15350 times) +[2024-06-18 05:10:52,404][12862] Signal inference workers to resume experience collection... (15350 times) +[2024-06-18 05:10:52,405][12883] InferenceWorker_p0-w0: resuming experience collection (15350 times) +[2024-06-18 05:10:54,634][12883] Updated weights for policy 0, policy_version 64791 (0.0034) +[2024-06-18 05:10:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1061617664. Throughput: 0: 42381.4. Samples: 1061779320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) +[2024-06-18 05:10:56,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 05:10:58,978][12883] Updated weights for policy 0, policy_version 64801 (0.0048) +[2024-06-18 05:11:01,996][12645] Fps is (10 sec: 47502.8, 60 sec: 42050.6, 300 sec: 42154.1). Total num frames: 1061847040. 
Throughput: 0: 42316.5. Samples: 1061911520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) +[2024-06-18 05:11:01,996][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 05:11:02,358][12883] Updated weights for policy 0, policy_version 64811 (0.0032) +[2024-06-18 05:11:06,525][12883] Updated weights for policy 0, policy_version 64821 (0.0026) +[2024-06-18 05:11:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42326.6, 300 sec: 41931.9). Total num frames: 1062027264. Throughput: 0: 42295.9. Samples: 1062158900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) +[2024-06-18 05:11:06,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 05:11:10,162][12883] Updated weights for policy 0, policy_version 64831 (0.0043) +[2024-06-18 05:11:11,994][12645] Fps is (10 sec: 40969.6, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1062256640. Throughput: 0: 42374.4. Samples: 1062415460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) +[2024-06-18 05:11:11,994][12645] Avg episode reward: [(0, '0.168')] +[2024-06-18 05:11:14,023][12883] Updated weights for policy 0, policy_version 64841 (0.0022) +[2024-06-18 05:11:16,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 1062469632. Throughput: 0: 42431.9. Samples: 1062547900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) +[2024-06-18 05:11:16,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 05:11:17,846][12883] Updated weights for policy 0, policy_version 64851 (0.0054) +[2024-06-18 05:11:21,890][12883] Updated weights for policy 0, policy_version 64861 (0.0029) +[2024-06-18 05:11:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42872.7, 300 sec: 42098.5). Total num frames: 1062682624. Throughput: 0: 42344.0. Samples: 1062796860. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) +[2024-06-18 05:11:21,994][12645] Avg episode reward: [(0, '0.313')] +[2024-06-18 05:11:25,301][12883] Updated weights for policy 0, policy_version 64871 (0.0033) +[2024-06-18 05:11:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1062879232. Throughput: 0: 42404.4. Samples: 1063048980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) +[2024-06-18 05:11:26,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 05:11:29,608][12883] Updated weights for policy 0, policy_version 64881 (0.0040) +[2024-06-18 05:11:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 1063092224. Throughput: 0: 42242.2. Samples: 1063177960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) +[2024-06-18 05:11:31,994][12645] Avg episode reward: [(0, '0.268')] +[2024-06-18 05:11:33,012][12883] Updated weights for policy 0, policy_version 64891 (0.0038) +[2024-06-18 05:11:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42098.6). Total num frames: 1063321600. Throughput: 0: 42584.9. Samples: 1063435460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) +[2024-06-18 05:11:36,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 05:11:37,353][12883] Updated weights for policy 0, policy_version 64901 (0.0038) +[2024-06-18 05:11:40,659][12883] Updated weights for policy 0, policy_version 64911 (0.0031) +[2024-06-18 05:11:41,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1063534592. Throughput: 0: 42297.5. Samples: 1063682720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) +[2024-06-18 05:11:41,994][12645] Avg episode reward: [(0, '0.180')] +[2024-06-18 05:11:45,434][12883] Updated weights for policy 0, policy_version 64921 (0.0030) +[2024-06-18 05:11:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42098.9). Total num frames: 1063747584. Throughput: 0: 42283.0. Samples: 1063814160. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) +[2024-06-18 05:11:46,994][12645] Avg episode reward: [(0, '0.187')] +[2024-06-18 05:11:48,479][12883] Updated weights for policy 0, policy_version 64931 (0.0035) +[2024-06-18 05:11:51,995][12645] Fps is (10 sec: 40953.8, 60 sec: 42870.2, 300 sec: 42042.8). Total num frames: 1063944192. Throughput: 0: 42429.9. Samples: 1064068320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) +[2024-06-18 05:11:51,996][12645] Avg episode reward: [(0, '0.070')] +[2024-06-18 05:11:53,078][12883] Updated weights for policy 0, policy_version 64941 (0.0030) +[2024-06-18 05:11:53,871][12862] Signal inference workers to stop experience collection... (15400 times) +[2024-06-18 05:11:53,891][12883] InferenceWorker_p0-w0: stopping experience collection (15400 times) +[2024-06-18 05:11:53,926][12862] Signal inference workers to resume experience collection... (15400 times) +[2024-06-18 05:11:53,929][12883] InferenceWorker_p0-w0: resuming experience collection (15400 times) +[2024-06-18 05:11:56,139][12883] Updated weights for policy 0, policy_version 64951 (0.0040) +[2024-06-18 05:11:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1064157184. Throughput: 0: 42332.0. Samples: 1064320400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) +[2024-06-18 05:11:56,994][12645] Avg episode reward: [(0, '0.108')] +[2024-06-18 05:12:00,833][12883] Updated weights for policy 0, policy_version 64961 (0.0037) +[2024-06-18 05:12:02,000][12645] Fps is (10 sec: 44216.8, 60 sec: 42322.5, 300 sec: 42097.7). Total num frames: 1064386560. Throughput: 0: 42330.2. Samples: 1064453020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 05:12:02,000][12645] Avg episode reward: [(0, '0.118')] +[2024-06-18 05:12:04,123][12883] Updated weights for policy 0, policy_version 64971 (0.0029) +[2024-06-18 05:12:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 1064583168. 
Throughput: 0: 42401.4. Samples: 1064704920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 05:12:06,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 05:12:08,442][12883] Updated weights for policy 0, policy_version 64981 (0.0033) +[2024-06-18 05:12:11,718][12883] Updated weights for policy 0, policy_version 64991 (0.0032) +[2024-06-18 05:12:11,994][12645] Fps is (10 sec: 42624.5, 60 sec: 42598.3, 300 sec: 42265.1). Total num frames: 1064812544. Throughput: 0: 42346.6. Samples: 1064954580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 05:12:11,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 05:12:16,352][12883] Updated weights for policy 0, policy_version 65001 (0.0050) +[2024-06-18 05:12:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 41988.4). Total num frames: 1064992768. Throughput: 0: 42434.7. Samples: 1065087520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 05:12:16,994][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 05:12:19,274][12883] Updated weights for policy 0, policy_version 65011 (0.0030) +[2024-06-18 05:12:21,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1065205760. Throughput: 0: 42358.7. Samples: 1065341600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 05:12:21,994][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 05:12:24,075][12883] Updated weights for policy 0, policy_version 65021 (0.0029) +[2024-06-18 05:12:26,766][12883] Updated weights for policy 0, policy_version 65031 (0.0044) +[2024-06-18 05:12:26,994][12645] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42320.7). Total num frames: 1065467904. Throughput: 0: 42418.0. Samples: 1065591520. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 05:12:26,994][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 05:12:31,679][12883] Updated weights for policy 0, policy_version 65041 (0.0036) +[2024-06-18 05:12:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42098.6). Total num frames: 1065664512. Throughput: 0: 42617.4. Samples: 1065731940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 05:12:31,994][12645] Avg episode reward: [(0, '0.155')] +[2024-06-18 05:12:34,329][12883] Updated weights for policy 0, policy_version 65051 (0.0035) +[2024-06-18 05:12:36,995][12645] Fps is (10 sec: 39316.3, 60 sec: 42324.4, 300 sec: 42320.5). Total num frames: 1065861120. Throughput: 0: 42567.4. Samples: 1065983840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 05:12:36,995][12645] Avg episode reward: [(0, '0.076')] +[2024-06-18 05:12:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065055_1065861120.pth... +[2024-06-18 05:12:37,070][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000064437_1055735808.pth +[2024-06-18 05:12:39,286][12883] Updated weights for policy 0, policy_version 65061 (0.0031) +[2024-06-18 05:12:41,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1066106880. Throughput: 0: 42426.9. Samples: 1066229620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 05:12:41,994][12645] Avg episode reward: [(0, '0.147')] +[2024-06-18 05:12:42,150][12883] Updated weights for policy 0, policy_version 65071 (0.0037) +[2024-06-18 05:12:46,996][12645] Fps is (10 sec: 40956.2, 60 sec: 42050.7, 300 sec: 42042.7). Total num frames: 1066270720. Throughput: 0: 42622.8. Samples: 1066370880. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 05:12:46,997][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 05:12:47,213][12883] Updated weights for policy 0, policy_version 65081 (0.0048) +[2024-06-18 05:12:50,041][12883] Updated weights for policy 0, policy_version 65091 (0.0033) +[2024-06-18 05:12:51,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42599.6, 300 sec: 42265.5). Total num frames: 1066500096. Throughput: 0: 42479.1. Samples: 1066616480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 05:12:51,994][12645] Avg episode reward: [(0, '0.241')] +[2024-06-18 05:12:54,718][12883] Updated weights for policy 0, policy_version 65101 (0.0039) +[2024-06-18 05:12:56,994][12645] Fps is (10 sec: 49163.5, 60 sec: 43417.6, 300 sec: 42320.7). Total num frames: 1066762240. Throughput: 0: 42498.8. Samples: 1066867020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 05:12:56,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 05:12:58,152][12883] Updated weights for policy 0, policy_version 65111 (0.0026) +[2024-06-18 05:13:01,994][12645] Fps is (10 sec: 39322.1, 60 sec: 41783.6, 300 sec: 42098.6). Total num frames: 1066893312. Throughput: 0: 42504.0. Samples: 1067000200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 05:13:01,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 05:13:02,341][12883] Updated weights for policy 0, policy_version 65121 (0.0024) +[2024-06-18 05:13:05,837][12883] Updated weights for policy 0, policy_version 65131 (0.0033) +[2024-06-18 05:13:06,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 1067139072. Throughput: 0: 42463.2. Samples: 1067252440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 05:13:06,994][12645] Avg episode reward: [(0, '0.438')] +[2024-06-18 05:13:09,406][12862] Signal inference workers to stop experience collection... 
(15450 times) +[2024-06-18 05:13:09,452][12883] InferenceWorker_p0-w0: stopping experience collection (15450 times) +[2024-06-18 05:13:09,516][12862] Signal inference workers to resume experience collection... (15450 times) +[2024-06-18 05:13:09,516][12883] InferenceWorker_p0-w0: resuming experience collection (15450 times) +[2024-06-18 05:13:10,267][12883] Updated weights for policy 0, policy_version 65141 (0.0027) +[2024-06-18 05:13:11,994][12645] Fps is (10 sec: 49151.6, 60 sec: 42871.6, 300 sec: 42321.6). Total num frames: 1067384832. Throughput: 0: 42544.5. Samples: 1067506020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 05:13:11,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 05:13:13,469][12883] Updated weights for policy 0, policy_version 65151 (0.0037) +[2024-06-18 05:13:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 1067548672. Throughput: 0: 42445.8. Samples: 1067642000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 05:13:16,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 05:13:17,732][12883] Updated weights for policy 0, policy_version 65161 (0.0036) +[2024-06-18 05:13:20,857][12883] Updated weights for policy 0, policy_version 65171 (0.0027) +[2024-06-18 05:13:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42376.2). Total num frames: 1067794432. Throughput: 0: 42449.3. Samples: 1067894000. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) +[2024-06-18 05:13:21,994][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 05:13:25,194][12883] Updated weights for policy 0, policy_version 65181 (0.0041) +[2024-06-18 05:13:26,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1068007424. Throughput: 0: 42809.9. Samples: 1068156060. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) +[2024-06-18 05:13:26,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 05:13:28,294][12883] Updated weights for policy 0, policy_version 65191 (0.0040) +[2024-06-18 05:13:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42265.5). Total num frames: 1068204032. Throughput: 0: 42427.5. Samples: 1068280020. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) +[2024-06-18 05:13:31,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 05:13:32,986][12883] Updated weights for policy 0, policy_version 65201 (0.0033) +[2024-06-18 05:13:36,319][12883] Updated weights for policy 0, policy_version 65211 (0.0035) +[2024-06-18 05:13:36,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42870.8, 300 sec: 42320.4). Total num frames: 1068433408. Throughput: 0: 42569.9. Samples: 1068532220. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) +[2024-06-18 05:13:36,996][12645] Avg episode reward: [(0, '0.178')] +[2024-06-18 05:13:40,602][12883] Updated weights for policy 0, policy_version 65221 (0.0031) +[2024-06-18 05:13:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 1068630016. Throughput: 0: 42781.4. Samples: 1068792180. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) +[2024-06-18 05:13:41,994][12645] Avg episode reward: [(0, '0.187')] +[2024-06-18 05:13:44,029][12883] Updated weights for policy 0, policy_version 65231 (0.0038) +[2024-06-18 05:13:46,994][12645] Fps is (10 sec: 39330.2, 60 sec: 42600.0, 300 sec: 42209.9). Total num frames: 1068826624. Throughput: 0: 42606.9. Samples: 1068917520. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) +[2024-06-18 05:13:46,994][12645] Avg episode reward: [(0, '0.243')] +[2024-06-18 05:13:48,159][12883] Updated weights for policy 0, policy_version 65241 (0.0033) +[2024-06-18 05:13:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1069056000. Throughput: 0: 42573.7. Samples: 1069168260. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) +[2024-06-18 05:13:51,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 05:13:52,083][12883] Updated weights for policy 0, policy_version 65251 (0.0037) +[2024-06-18 05:13:55,737][12883] Updated weights for policy 0, policy_version 65261 (0.0033) +[2024-06-18 05:13:56,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42052.1, 300 sec: 42376.2). Total num frames: 1069285376. Throughput: 0: 42569.1. Samples: 1069421640. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) +[2024-06-18 05:13:56,994][12645] Avg episode reward: [(0, '0.168')] +[2024-06-18 05:14:00,275][12883] Updated weights for policy 0, policy_version 65271 (0.0030) +[2024-06-18 05:14:01,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1069449216. Throughput: 0: 42334.6. Samples: 1069547060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 05:14:01,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 05:14:03,467][12883] Updated weights for policy 0, policy_version 65281 (0.0027) +[2024-06-18 05:14:06,994][12645] Fps is (10 sec: 40961.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1069694976. Throughput: 0: 42381.0. Samples: 1069801140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 05:14:06,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 05:14:08,107][12883] Updated weights for policy 0, policy_version 65291 (0.0037) +[2024-06-18 05:14:11,581][12883] Updated weights for policy 0, policy_version 65301 (0.0030) +[2024-06-18 05:14:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1069907968. Throughput: 0: 42088.9. Samples: 1070050060. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 05:14:11,994][12645] Avg episode reward: [(0, '0.139')] +[2024-06-18 05:14:15,735][12883] Updated weights for policy 0, policy_version 65311 (0.0047) +[2024-06-18 05:14:16,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 1070088192. Throughput: 0: 42141.8. Samples: 1070176400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 05:14:16,994][12645] Avg episode reward: [(0, '0.106')] +[2024-06-18 05:14:19,536][12883] Updated weights for policy 0, policy_version 65321 (0.0033) +[2024-06-18 05:14:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42432.1). Total num frames: 1070317568. Throughput: 0: 42152.0. Samples: 1070428960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 05:14:21,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 05:14:23,623][12883] Updated weights for policy 0, policy_version 65331 (0.0037) +[2024-06-18 05:14:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42321.0). Total num frames: 1070530560. Throughput: 0: 42047.1. Samples: 1070684300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 05:14:26,994][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 05:14:27,149][12883] Updated weights for policy 0, policy_version 65341 (0.0035) +[2024-06-18 05:14:31,274][12883] Updated weights for policy 0, policy_version 65351 (0.0035) +[2024-06-18 05:14:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 1070710784. Throughput: 0: 41961.1. Samples: 1070805760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 05:14:31,994][12645] Avg episode reward: [(0, '0.229')] +[2024-06-18 05:14:34,995][12883] Updated weights for policy 0, policy_version 65361 (0.0030) +[2024-06-18 05:14:36,994][12645] Fps is (10 sec: 42595.3, 60 sec: 42053.4, 300 sec: 42487.2). Total num frames: 1070956544. Throughput: 0: 42063.4. Samples: 1071061140. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 05:14:36,995][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 05:14:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065366_1070956544.pth... +[2024-06-18 05:14:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000064744_1060765696.pth +[2024-06-18 05:14:38,882][12883] Updated weights for policy 0, policy_version 65371 (0.0038) +[2024-06-18 05:14:41,996][12645] Fps is (10 sec: 44226.3, 60 sec: 42050.6, 300 sec: 42264.8). Total num frames: 1071153152. Throughput: 0: 41973.2. Samples: 1071310520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 05:14:41,997][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 05:14:43,175][12883] Updated weights for policy 0, policy_version 65381 (0.0032) +[2024-06-18 05:14:44,392][12862] Signal inference workers to stop experience collection... (15500 times) +[2024-06-18 05:14:44,392][12862] Signal inference workers to resume experience collection... (15500 times) +[2024-06-18 05:14:44,426][12883] InferenceWorker_p0-w0: stopping experience collection (15500 times) +[2024-06-18 05:14:44,427][12883] InferenceWorker_p0-w0: resuming experience collection (15500 times) +[2024-06-18 05:14:46,629][12883] Updated weights for policy 0, policy_version 65391 (0.0037) +[2024-06-18 05:14:46,994][12645] Fps is (10 sec: 40963.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1071366144. Throughput: 0: 41913.7. Samples: 1071433180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 05:14:46,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 05:14:51,140][12883] Updated weights for policy 0, policy_version 65401 (0.0035) +[2024-06-18 05:14:51,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1071579136. Throughput: 0: 42071.5. Samples: 1071694360. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 05:14:51,994][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 05:14:54,601][12883] Updated weights for policy 0, policy_version 65411 (0.0031) +[2024-06-18 05:14:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41779.3, 300 sec: 42265.1). Total num frames: 1071792128. Throughput: 0: 42160.9. Samples: 1071947300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 05:14:56,994][12645] Avg episode reward: [(0, '0.241')] +[2024-06-18 05:14:58,851][12883] Updated weights for policy 0, policy_version 65421 (0.0029) +[2024-06-18 05:15:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42432.0). Total num frames: 1072005120. Throughput: 0: 42234.1. Samples: 1072076940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 05:15:01,994][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 05:15:02,291][12883] Updated weights for policy 0, policy_version 65431 (0.0029) +[2024-06-18 05:15:06,551][12883] Updated weights for policy 0, policy_version 65441 (0.0031) +[2024-06-18 05:15:07,000][12645] Fps is (10 sec: 40934.4, 60 sec: 41774.8, 300 sec: 42319.8). Total num frames: 1072201728. Throughput: 0: 42186.1. Samples: 1072327600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 05:15:07,000][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 05:15:09,929][12883] Updated weights for policy 0, policy_version 65451 (0.0039) +[2024-06-18 05:15:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1072431104. Throughput: 0: 42136.4. Samples: 1072580440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 05:15:11,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 05:15:14,436][12883] Updated weights for policy 0, policy_version 65461 (0.0024) +[2024-06-18 05:15:16,994][12645] Fps is (10 sec: 44264.5, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 1072644096. Throughput: 0: 42311.0. Samples: 1072709760. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 05:15:16,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 05:15:17,617][12883] Updated weights for policy 0, policy_version 65471 (0.0028) +[2024-06-18 05:15:21,996][12645] Fps is (10 sec: 39313.0, 60 sec: 41777.6, 300 sec: 42209.3). Total num frames: 1072824320. Throughput: 0: 42160.8. Samples: 1072958440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 05:15:21,996][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 05:15:22,151][12883] Updated weights for policy 0, policy_version 65481 (0.0035) +[2024-06-18 05:15:25,500][12883] Updated weights for policy 0, policy_version 65491 (0.0028) +[2024-06-18 05:15:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1073053696. Throughput: 0: 42344.8. Samples: 1073215940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 05:15:26,994][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 05:15:29,915][12883] Updated weights for policy 0, policy_version 65501 (0.0036) +[2024-06-18 05:15:31,994][12645] Fps is (10 sec: 45885.5, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1073283072. Throughput: 0: 42545.3. Samples: 1073347720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 05:15:31,994][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 05:15:33,250][12883] Updated weights for policy 0, policy_version 65511 (0.0028) +[2024-06-18 05:15:36,994][12645] Fps is (10 sec: 39318.6, 60 sec: 41506.1, 300 sec: 42209.5). Total num frames: 1073446912. Throughput: 0: 42230.4. Samples: 1073594760. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 05:15:36,995][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 05:15:37,437][12883] Updated weights for policy 0, policy_version 65521 (0.0043) +[2024-06-18 05:15:40,984][12883] Updated weights for policy 0, policy_version 65531 (0.0037) +[2024-06-18 05:15:41,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42325.4, 300 sec: 42375.9). Total num frames: 1073692672. Throughput: 0: 42273.0. Samples: 1073849680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 05:15:41,996][12645] Avg episode reward: [(0, '0.131')] +[2024-06-18 05:15:45,080][12883] Updated weights for policy 0, policy_version 65541 (0.0048) +[2024-06-18 05:15:46,994][12645] Fps is (10 sec: 45878.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1073905664. Throughput: 0: 42405.0. Samples: 1073985160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 05:15:46,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 05:15:48,598][12883] Updated weights for policy 0, policy_version 65551 (0.0040) +[2024-06-18 05:15:52,000][12645] Fps is (10 sec: 39305.8, 60 sec: 41774.8, 300 sec: 42264.3). Total num frames: 1074085888. Throughput: 0: 42325.8. Samples: 1074232260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 05:15:52,000][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 05:15:52,742][12883] Updated weights for policy 0, policy_version 65561 (0.0032) +[2024-06-18 05:15:56,211][12883] Updated weights for policy 0, policy_version 65571 (0.0046) +[2024-06-18 05:15:56,999][12645] Fps is (10 sec: 42575.7, 60 sec: 42321.6, 300 sec: 42320.3). Total num frames: 1074331648. Throughput: 0: 42290.1. Samples: 1074483720. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 05:15:56,999][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 05:16:00,295][12883] Updated weights for policy 0, policy_version 65581 (0.0026) +[2024-06-18 05:16:01,996][12645] Fps is (10 sec: 45893.5, 60 sec: 42323.8, 300 sec: 42431.5). Total num frames: 1074544640. Throughput: 0: 42441.4. Samples: 1074619720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:16:01,997][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 05:16:02,659][12862] Signal inference workers to stop experience collection... (15550 times) +[2024-06-18 05:16:02,659][12862] Signal inference workers to resume experience collection... (15550 times) +[2024-06-18 05:16:02,702][12883] InferenceWorker_p0-w0: stopping experience collection (15550 times) +[2024-06-18 05:16:02,702][12883] InferenceWorker_p0-w0: resuming experience collection (15550 times) +[2024-06-18 05:16:03,675][12883] Updated weights for policy 0, policy_version 65591 (0.0030) +[2024-06-18 05:16:06,994][12645] Fps is (10 sec: 39342.3, 60 sec: 42056.6, 300 sec: 42265.1). Total num frames: 1074724864. Throughput: 0: 42358.0. Samples: 1074864460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:16:06,994][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 05:16:07,949][12883] Updated weights for policy 0, policy_version 65601 (0.0036) +[2024-06-18 05:16:11,846][12883] Updated weights for policy 0, policy_version 65611 (0.0029) +[2024-06-18 05:16:11,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1074970624. Throughput: 0: 42272.8. Samples: 1075118220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:16:11,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 05:16:15,659][12883] Updated weights for policy 0, policy_version 65621 (0.0038) +[2024-06-18 05:16:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1075183616. 
Throughput: 0: 42357.7. Samples: 1075253820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:16:16,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 05:16:19,414][12883] Updated weights for policy 0, policy_version 65631 (0.0031) +[2024-06-18 05:16:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42599.9, 300 sec: 42376.2). Total num frames: 1075380224. Throughput: 0: 42292.2. Samples: 1075497880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:16:21,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 05:16:23,566][12883] Updated weights for policy 0, policy_version 65641 (0.0027) +[2024-06-18 05:16:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1075609600. Throughput: 0: 42257.1. Samples: 1075751160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:16:26,994][12645] Avg episode reward: [(0, '0.401')] +[2024-06-18 05:16:27,307][12883] Updated weights for policy 0, policy_version 65651 (0.0032) +[2024-06-18 05:16:31,195][12883] Updated weights for policy 0, policy_version 65661 (0.0035) +[2024-06-18 05:16:31,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42052.1, 300 sec: 42320.7). Total num frames: 1075806208. Throughput: 0: 42154.8. Samples: 1075882140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:16:31,994][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 05:16:35,035][12883] Updated weights for policy 0, policy_version 65671 (0.0035) +[2024-06-18 05:16:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43145.0, 300 sec: 42376.3). Total num frames: 1076035584. Throughput: 0: 42315.1. Samples: 1076136180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:16:36,994][12645] Avg episode reward: [(0, '0.234')] +[2024-06-18 05:16:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065676_1076035584.pth... 
+[2024-06-18 05:16:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065055_1065861120.pth +[2024-06-18 05:16:39,074][12883] Updated weights for policy 0, policy_version 65681 (0.0029) +[2024-06-18 05:16:41,994][12645] Fps is (10 sec: 44238.0, 60 sec: 42600.0, 300 sec: 42376.2). Total num frames: 1076248576. Throughput: 0: 42413.5. Samples: 1076392100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:16:41,994][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 05:16:42,497][12883] Updated weights for policy 0, policy_version 65691 (0.0047) +[2024-06-18 05:16:46,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42050.7, 300 sec: 42320.6). Total num frames: 1076428800. Throughput: 0: 42257.3. Samples: 1076521300. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) +[2024-06-18 05:16:46,997][12645] Avg episode reward: [(0, '0.451')] +[2024-06-18 05:16:47,280][12883] Updated weights for policy 0, policy_version 65701 (0.0027) +[2024-06-18 05:16:50,390][12883] Updated weights for policy 0, policy_version 65711 (0.0032) +[2024-06-18 05:16:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42602.8, 300 sec: 42320.7). Total num frames: 1076641792. Throughput: 0: 42477.7. Samples: 1076775960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) +[2024-06-18 05:16:51,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 05:16:54,833][12883] Updated weights for policy 0, policy_version 65721 (0.0040) +[2024-06-18 05:16:56,994][12645] Fps is (10 sec: 45885.7, 60 sec: 42602.2, 300 sec: 42377.1). Total num frames: 1076887552. Throughput: 0: 42477.8. Samples: 1077029720. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) +[2024-06-18 05:16:56,994][12645] Avg episode reward: [(0, '0.486')] +[2024-06-18 05:16:58,247][12883] Updated weights for policy 0, policy_version 65731 (0.0042) +[2024-06-18 05:17:01,996][12645] Fps is (10 sec: 42589.4, 60 sec: 42052.3, 300 sec: 42320.4). Total num frames: 1077067776. Throughput: 0: 42287.8. 
Samples: 1077156860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) +[2024-06-18 05:17:01,996][12645] Avg episode reward: [(0, '0.189')] +[2024-06-18 05:17:02,393][12883] Updated weights for policy 0, policy_version 65741 (0.0046) +[2024-06-18 05:17:05,966][12883] Updated weights for policy 0, policy_version 65751 (0.0036) +[2024-06-18 05:17:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 1077297152. Throughput: 0: 42376.4. Samples: 1077404820. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) +[2024-06-18 05:17:06,994][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 05:17:10,262][12883] Updated weights for policy 0, policy_version 65761 (0.0036) +[2024-06-18 05:17:11,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1077510144. Throughput: 0: 42371.3. Samples: 1077657860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) +[2024-06-18 05:17:11,994][12645] Avg episode reward: [(0, '0.089')] +[2024-06-18 05:17:13,688][12883] Updated weights for policy 0, policy_version 65771 (0.0037) +[2024-06-18 05:17:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42376.2). Total num frames: 1077706752. Throughput: 0: 42281.6. Samples: 1077784800. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) +[2024-06-18 05:17:16,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 05:17:17,899][12883] Updated weights for policy 0, policy_version 65781 (0.0031) +[2024-06-18 05:17:21,487][12883] Updated weights for policy 0, policy_version 65791 (0.0028) +[2024-06-18 05:17:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1077919744. Throughput: 0: 42300.1. Samples: 1078039680. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) +[2024-06-18 05:17:21,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 05:17:25,574][12883] Updated weights for policy 0, policy_version 65801 (0.0030) +[2024-06-18 05:17:26,588][12862] Signal inference workers to stop experience collection... (15600 times) +[2024-06-18 05:17:26,589][12862] Signal inference workers to resume experience collection... (15600 times) +[2024-06-18 05:17:26,600][12883] InferenceWorker_p0-w0: stopping experience collection (15600 times) +[2024-06-18 05:17:26,600][12883] InferenceWorker_p0-w0: resuming experience collection (15600 times) +[2024-06-18 05:17:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1078149120. Throughput: 0: 42288.9. Samples: 1078295100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:17:26,994][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 05:17:29,118][12883] Updated weights for policy 0, policy_version 65811 (0.0033) +[2024-06-18 05:17:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.4, 300 sec: 42265.4). Total num frames: 1078329344. Throughput: 0: 42266.5. Samples: 1078423200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:17:31,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 05:17:33,255][12883] Updated weights for policy 0, policy_version 65821 (0.0040) +[2024-06-18 05:17:36,818][12883] Updated weights for policy 0, policy_version 65831 (0.0040) +[2024-06-18 05:17:36,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1078575104. Throughput: 0: 42207.9. Samples: 1078675320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:17:36,995][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 05:17:40,963][12883] Updated weights for policy 0, policy_version 65841 (0.0044) +[2024-06-18 05:17:41,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 42432.1). Total num frames: 1078788096. 
Throughput: 0: 42167.6. Samples: 1078927260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:17:41,994][12645] Avg episode reward: [(0, '0.107')] +[2024-06-18 05:17:44,733][12883] Updated weights for policy 0, policy_version 65851 (0.0032) +[2024-06-18 05:17:46,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42326.9, 300 sec: 42265.2). Total num frames: 1078968320. Throughput: 0: 42197.2. Samples: 1079055640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:17:46,999][12645] Avg episode reward: [(0, '0.083')] +[2024-06-18 05:17:49,046][12883] Updated weights for policy 0, policy_version 65861 (0.0037) +[2024-06-18 05:17:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.6, 300 sec: 42154.1). Total num frames: 1079197696. Throughput: 0: 42273.1. Samples: 1079307100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:17:51,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 05:17:52,383][12883] Updated weights for policy 0, policy_version 65871 (0.0028) +[2024-06-18 05:17:56,873][12883] Updated weights for policy 0, policy_version 65881 (0.0040) +[2024-06-18 05:17:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1079394304. Throughput: 0: 42343.0. Samples: 1079563300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:17:56,994][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 05:17:59,977][12883] Updated weights for policy 0, policy_version 65891 (0.0035) +[2024-06-18 05:18:01,994][12645] Fps is (10 sec: 40958.1, 60 sec: 42326.6, 300 sec: 42265.1). Total num frames: 1079607296. Throughput: 0: 42211.6. Samples: 1079684340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 05:18:01,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 05:18:04,639][12883] Updated weights for policy 0, policy_version 65901 (0.0038) +[2024-06-18 05:18:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 1079803904. 
Throughput: 0: 42156.3. Samples: 1079936720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 05:18:06,994][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 05:18:07,805][12883] Updated weights for policy 0, policy_version 65911 (0.0026) +[2024-06-18 05:18:11,993][12645] Fps is (10 sec: 40962.1, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 1080016896. Throughput: 0: 42121.9. Samples: 1080190580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 05:18:11,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 05:18:12,284][12883] Updated weights for policy 0, policy_version 65921 (0.0033) +[2024-06-18 05:18:15,971][12883] Updated weights for policy 0, policy_version 65931 (0.0033) +[2024-06-18 05:18:16,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1080262656. Throughput: 0: 42193.4. Samples: 1080321900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 05:18:16,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 05:18:19,845][12883] Updated weights for policy 0, policy_version 65941 (0.0040) +[2024-06-18 05:18:21,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1080442880. Throughput: 0: 42169.0. Samples: 1080572920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 05:18:21,994][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 05:18:23,693][12883] Updated weights for policy 0, policy_version 65951 (0.0035) +[2024-06-18 05:18:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1080655872. Throughput: 0: 42226.7. Samples: 1080827460. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 05:18:26,994][12645] Avg episode reward: [(0, '0.217')] +[2024-06-18 05:18:27,498][12883] Updated weights for policy 0, policy_version 65961 (0.0033) +[2024-06-18 05:18:31,540][12883] Updated weights for policy 0, policy_version 65971 (0.0037) +[2024-06-18 05:18:31,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42154.4). Total num frames: 1080868864. Throughput: 0: 42180.5. Samples: 1080953760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 05:18:31,994][12645] Avg episode reward: [(0, '0.068')] +[2024-06-18 05:18:35,483][12883] Updated weights for policy 0, policy_version 65981 (0.0034) +[2024-06-18 05:18:36,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 1081065472. Throughput: 0: 42107.8. Samples: 1081201960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 05:18:36,994][12645] Avg episode reward: [(0, '0.068')] +[2024-06-18 05:18:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065983_1081065472.pth... +[2024-06-18 05:18:37,085][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065366_1070956544.pth +[2024-06-18 05:18:39,372][12883] Updated weights for policy 0, policy_version 65991 (0.0036) +[2024-06-18 05:18:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 1081294848. Throughput: 0: 42063.7. Samples: 1081456160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 05:18:41,994][12645] Avg episode reward: [(0, '0.068')] +[2024-06-18 05:18:42,970][12883] Updated weights for policy 0, policy_version 66001 (0.0035) +[2024-06-18 05:18:45,978][12862] Signal inference workers to stop experience collection... 
(15650 times) +[2024-06-18 05:18:46,024][12883] InferenceWorker_p0-w0: stopping experience collection (15650 times) +[2024-06-18 05:18:46,038][12862] Signal inference workers to resume experience collection... (15650 times) +[2024-06-18 05:18:46,048][12883] InferenceWorker_p0-w0: resuming experience collection (15650 times) +[2024-06-18 05:18:46,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1081507840. Throughput: 0: 42305.3. Samples: 1081588060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) +[2024-06-18 05:18:46,994][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 05:18:47,099][12883] Updated weights for policy 0, policy_version 66011 (0.0026) +[2024-06-18 05:18:51,068][12883] Updated weights for policy 0, policy_version 66021 (0.0035) +[2024-06-18 05:18:51,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42050.6, 300 sec: 42153.8). Total num frames: 1081720832. Throughput: 0: 42250.0. Samples: 1081838060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) +[2024-06-18 05:18:51,996][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 05:18:54,813][12883] Updated weights for policy 0, policy_version 66031 (0.0033) +[2024-06-18 05:18:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1081933824. Throughput: 0: 42167.9. Samples: 1082088140. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) +[2024-06-18 05:18:56,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 05:18:59,267][12883] Updated weights for policy 0, policy_version 66041 (0.0039) +[2024-06-18 05:19:01,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42325.7, 300 sec: 42209.6). Total num frames: 1082146816. Throughput: 0: 42100.6. Samples: 1082216420. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) +[2024-06-18 05:19:01,994][12645] Avg episode reward: [(0, '0.380')] +[2024-06-18 05:19:02,565][12883] Updated weights for policy 0, policy_version 66051 (0.0033) +[2024-06-18 05:19:06,954][12883] Updated weights for policy 0, policy_version 66061 (0.0039) +[2024-06-18 05:19:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1082343424. Throughput: 0: 42051.2. Samples: 1082465220. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) +[2024-06-18 05:19:06,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 05:19:10,418][12883] Updated weights for policy 0, policy_version 66071 (0.0026) +[2024-06-18 05:19:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.2, 300 sec: 42320.7). Total num frames: 1082572800. Throughput: 0: 42076.7. Samples: 1082720920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) +[2024-06-18 05:19:11,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 05:19:14,617][12883] Updated weights for policy 0, policy_version 66081 (0.0028) +[2024-06-18 05:19:16,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 1082753024. Throughput: 0: 42114.9. Samples: 1082848940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) +[2024-06-18 05:19:16,994][12645] Avg episode reward: [(0, '0.117')] +[2024-06-18 05:19:17,970][12883] Updated weights for policy 0, policy_version 66091 (0.0036) +[2024-06-18 05:19:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1082982400. Throughput: 0: 42295.2. Samples: 1083105240. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) +[2024-06-18 05:19:21,994][12645] Avg episode reward: [(0, '0.164')] +[2024-06-18 05:19:22,194][12883] Updated weights for policy 0, policy_version 66101 (0.0036) +[2024-06-18 05:19:25,779][12883] Updated weights for policy 0, policy_version 66111 (0.0039) +[2024-06-18 05:19:26,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1083195392. Throughput: 0: 42143.6. Samples: 1083352620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:19:26,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 05:19:29,936][12883] Updated weights for policy 0, policy_version 66121 (0.0050) +[2024-06-18 05:19:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42209.7). Total num frames: 1083408384. Throughput: 0: 42095.1. Samples: 1083482340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:19:31,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 05:19:33,840][12883] Updated weights for policy 0, policy_version 66131 (0.0032) +[2024-06-18 05:19:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42210.0). Total num frames: 1083604992. Throughput: 0: 42023.0. Samples: 1083729000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:19:36,994][12645] Avg episode reward: [(0, '0.235')] +[2024-06-18 05:19:37,654][12883] Updated weights for policy 0, policy_version 66141 (0.0050) +[2024-06-18 05:19:41,662][12883] Updated weights for policy 0, policy_version 66151 (0.0036) +[2024-06-18 05:19:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1083817984. Throughput: 0: 42122.6. Samples: 1083983660. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:19:41,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 05:19:45,587][12883] Updated weights for policy 0, policy_version 66161 (0.0025) +[2024-06-18 05:19:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.1, 300 sec: 42209.6). Total num frames: 1084030976. Throughput: 0: 42030.0. Samples: 1084107780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:19:46,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 05:19:49,393][12883] Updated weights for policy 0, policy_version 66171 (0.0028) +[2024-06-18 05:19:52,000][12645] Fps is (10 sec: 44209.5, 60 sec: 42322.5, 300 sec: 42264.3). Total num frames: 1084260352. Throughput: 0: 42171.0. Samples: 1084363180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:19:52,000][12645] Avg episode reward: [(0, '0.274')] +[2024-06-18 05:19:53,326][12883] Updated weights for policy 0, policy_version 66181 (0.0025) +[2024-06-18 05:19:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1084456960. Throughput: 0: 42120.8. Samples: 1084616360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:19:56,994][12645] Avg episode reward: [(0, '0.115')] +[2024-06-18 05:19:57,202][12883] Updated weights for policy 0, policy_version 66191 (0.0031) +[2024-06-18 05:20:01,254][12883] Updated weights for policy 0, policy_version 66201 (0.0040) +[2024-06-18 05:20:01,994][12645] Fps is (10 sec: 40984.2, 60 sec: 42051.9, 300 sec: 42266.0). Total num frames: 1084669952. Throughput: 0: 41956.2. Samples: 1084736980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:20:01,995][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 05:20:04,951][12883] Updated weights for policy 0, policy_version 66211 (0.0035) +[2024-06-18 05:20:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1084882944. Throughput: 0: 41995.9. Samples: 1084995060. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:20:06,994][12645] Avg episode reward: [(0, '0.111')] +[2024-06-18 05:20:08,979][12883] Updated weights for policy 0, policy_version 66221 (0.0047) +[2024-06-18 05:20:11,994][12645] Fps is (10 sec: 40961.7, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1085079552. Throughput: 0: 42085.3. Samples: 1085246460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:20:11,994][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 05:20:13,000][12883] Updated weights for policy 0, policy_version 66231 (0.0035) +[2024-06-18 05:20:16,482][12883] Updated weights for policy 0, policy_version 66241 (0.0035) +[2024-06-18 05:20:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42265.5). Total num frames: 1085292544. Throughput: 0: 42067.6. Samples: 1085375380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:20:16,994][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 05:20:18,627][12862] Signal inference workers to stop experience collection... (15700 times) +[2024-06-18 05:20:18,627][12862] Signal inference workers to resume experience collection... (15700 times) +[2024-06-18 05:20:18,659][12883] InferenceWorker_p0-w0: stopping experience collection (15700 times) +[2024-06-18 05:20:18,659][12883] InferenceWorker_p0-w0: resuming experience collection (15700 times) +[2024-06-18 05:20:20,749][12883] Updated weights for policy 0, policy_version 66251 (0.0034) +[2024-06-18 05:20:21,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 1085521920. Throughput: 0: 42238.6. Samples: 1085629740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:20:21,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 05:20:24,713][12883] Updated weights for policy 0, policy_version 66261 (0.0031) +[2024-06-18 05:20:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1085734912. 
Throughput: 0: 41972.5. Samples: 1085872420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:20:26,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 05:20:28,315][12883] Updated weights for policy 0, policy_version 66271 (0.0034) +[2024-06-18 05:20:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42320.8). Total num frames: 1085931520. Throughput: 0: 42179.2. Samples: 1086005840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:20:31,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 05:20:32,744][12883] Updated weights for policy 0, policy_version 66281 (0.0041) +[2024-06-18 05:20:36,203][12883] Updated weights for policy 0, policy_version 66291 (0.0027) +[2024-06-18 05:20:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42209.9). Total num frames: 1086144512. Throughput: 0: 42115.6. Samples: 1086258120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:20:36,994][12645] Avg episode reward: [(0, '0.224')] +[2024-06-18 05:20:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000066293_1086144512.pth... +[2024-06-18 05:20:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065676_1076035584.pth +[2024-06-18 05:20:40,403][12883] Updated weights for policy 0, policy_version 66301 (0.0037) +[2024-06-18 05:20:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1086357504. Throughput: 0: 41991.2. Samples: 1086505960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:20:41,994][12645] Avg episode reward: [(0, '0.224')] +[2024-06-18 05:20:43,990][12883] Updated weights for policy 0, policy_version 66311 (0.0025) +[2024-06-18 05:20:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42266.0). Total num frames: 1086554112. Throughput: 0: 42129.6. Samples: 1086632800. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:20:46,995][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 05:20:48,184][12883] Updated weights for policy 0, policy_version 66321 (0.0036) +[2024-06-18 05:20:51,783][12883] Updated weights for policy 0, policy_version 66331 (0.0031) +[2024-06-18 05:20:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41783.4, 300 sec: 42154.8). Total num frames: 1086767104. Throughput: 0: 42026.5. Samples: 1086886260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:20:51,995][12645] Avg episode reward: [(0, '0.160')] +[2024-06-18 05:20:56,028][12883] Updated weights for policy 0, policy_version 66341 (0.0025) +[2024-06-18 05:20:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42154.4). Total num frames: 1086980096. Throughput: 0: 42139.5. Samples: 1087142740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:20:56,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 05:20:59,567][12883] Updated weights for policy 0, policy_version 66351 (0.0034) +[2024-06-18 05:21:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 1087209472. Throughput: 0: 41940.8. Samples: 1087262720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:21:01,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 05:21:03,741][12883] Updated weights for policy 0, policy_version 66361 (0.0036) +[2024-06-18 05:21:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1087406080. Throughput: 0: 41882.3. Samples: 1087514440. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:21:06,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 05:21:07,187][12883] Updated weights for policy 0, policy_version 66371 (0.0034) +[2024-06-18 05:21:11,817][12883] Updated weights for policy 0, policy_version 66381 (0.0047) +[2024-06-18 05:21:11,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 1087602688. Throughput: 0: 42311.1. Samples: 1087776420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:21:11,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 05:21:14,750][12883] Updated weights for policy 0, policy_version 66391 (0.0033) +[2024-06-18 05:21:16,994][12645] Fps is (10 sec: 44234.3, 60 sec: 42597.9, 300 sec: 42265.1). Total num frames: 1087848448. Throughput: 0: 41974.5. Samples: 1087894720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:21:16,995][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 05:21:19,551][12883] Updated weights for policy 0, policy_version 66401 (0.0029) +[2024-06-18 05:21:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 1088028672. Throughput: 0: 41998.3. Samples: 1088148040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:21:21,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 05:21:22,490][12883] Updated weights for policy 0, policy_version 66411 (0.0029) +[2024-06-18 05:21:26,994][12645] Fps is (10 sec: 37685.6, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 1088225280. Throughput: 0: 42109.9. Samples: 1088400900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:21:26,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 05:21:27,431][12883] Updated weights for policy 0, policy_version 66421 (0.0028) +[2024-06-18 05:21:28,976][12862] Signal inference workers to stop experience collection... 
(15750 times) +[2024-06-18 05:21:28,977][12862] Signal inference workers to resume experience collection... (15750 times) +[2024-06-18 05:21:28,991][12883] InferenceWorker_p0-w0: stopping experience collection (15750 times) +[2024-06-18 05:21:28,992][12883] InferenceWorker_p0-w0: resuming experience collection (15750 times) +[2024-06-18 05:21:30,125][12883] Updated weights for policy 0, policy_version 66431 (0.0030) +[2024-06-18 05:21:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1088471040. Throughput: 0: 42034.7. Samples: 1088524360. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 05:21:31,994][12645] Avg episode reward: [(0, '0.214')] +[2024-06-18 05:21:35,142][12883] Updated weights for policy 0, policy_version 66441 (0.0042) +[2024-06-18 05:21:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 1088667648. Throughput: 0: 42187.3. Samples: 1088784680. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 05:21:36,994][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 05:21:37,959][12883] Updated weights for policy 0, policy_version 66451 (0.0033) +[2024-06-18 05:21:41,996][12645] Fps is (10 sec: 39313.7, 60 sec: 41777.8, 300 sec: 42154.1). Total num frames: 1088864256. Throughput: 0: 41978.5. Samples: 1089031860. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 05:21:41,996][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 05:21:42,992][12883] Updated weights for policy 0, policy_version 66461 (0.0032) +[2024-06-18 05:21:45,571][12883] Updated weights for policy 0, policy_version 66471 (0.0038) +[2024-06-18 05:21:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1089077248. Throughput: 0: 42090.6. Samples: 1089156800. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 05:21:46,995][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 05:21:50,813][12883] Updated weights for policy 0, policy_version 66481 (0.0036) +[2024-06-18 05:21:51,994][12645] Fps is (10 sec: 42607.5, 60 sec: 42052.5, 300 sec: 42043.0). Total num frames: 1089290240. Throughput: 0: 42137.5. Samples: 1089410620. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 05:21:51,994][12645] Avg episode reward: [(0, '0.219')] +[2024-06-18 05:21:53,275][12883] Updated weights for policy 0, policy_version 66491 (0.0031) +[2024-06-18 05:21:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42154.4). Total num frames: 1089503232. Throughput: 0: 41920.8. Samples: 1089662860. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 05:21:56,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 05:21:58,519][12883] Updated weights for policy 0, policy_version 66501 (0.0029) +[2024-06-18 05:22:01,487][12883] Updated weights for policy 0, policy_version 66511 (0.0029) +[2024-06-18 05:22:01,996][12645] Fps is (10 sec: 42588.4, 60 sec: 41777.7, 300 sec: 42098.2). Total num frames: 1089716224. Throughput: 0: 42058.0. Samples: 1089787400. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 05:22:01,997][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 05:22:06,119][12883] Updated weights for policy 0, policy_version 66521 (0.0029) +[2024-06-18 05:22:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41987.4). Total num frames: 1089896448. Throughput: 0: 41983.4. Samples: 1090037300. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 05:22:06,995][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 05:22:09,372][12883] Updated weights for policy 0, policy_version 66531 (0.0028) +[2024-06-18 05:22:11,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1090125824. Throughput: 0: 41869.7. Samples: 1090285040. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:22:11,994][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 05:22:13,846][12883] Updated weights for policy 0, policy_version 66541 (0.0038) +[2024-06-18 05:22:16,994][12645] Fps is (10 sec: 45875.7, 60 sec: 41779.6, 300 sec: 42154.1). Total num frames: 1090355200. Throughput: 0: 42006.7. Samples: 1090414660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:22:16,995][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 05:22:17,231][12883] Updated weights for policy 0, policy_version 66551 (0.0040) +[2024-06-18 05:22:21,457][12883] Updated weights for policy 0, policy_version 66561 (0.0035) +[2024-06-18 05:22:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1090535424. Throughput: 0: 41788.9. Samples: 1090665180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:22:21,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 05:22:24,950][12883] Updated weights for policy 0, policy_version 66571 (0.0030) +[2024-06-18 05:22:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1090748416. Throughput: 0: 41844.5. Samples: 1090914780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:22:26,994][12645] Avg episode reward: [(0, '0.128')] +[2024-06-18 05:22:29,170][12883] Updated weights for policy 0, policy_version 66581 (0.0041) +[2024-06-18 05:22:31,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 1090977792. Throughput: 0: 41976.5. Samples: 1091045740. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:22:31,994][12645] Avg episode reward: [(0, '0.128')] +[2024-06-18 05:22:32,727][12883] Updated weights for policy 0, policy_version 66591 (0.0033) +[2024-06-18 05:22:36,835][12883] Updated weights for policy 0, policy_version 66601 (0.0033) +[2024-06-18 05:22:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 1091190784. Throughput: 0: 42027.7. Samples: 1091301880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:22:37,000][12645] Avg episode reward: [(0, '0.186')] +[2024-06-18 05:22:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000066601_1091190784.pth... +[2024-06-18 05:22:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065983_1081065472.pth +[2024-06-18 05:22:40,666][12883] Updated weights for policy 0, policy_version 66611 (0.0021) +[2024-06-18 05:22:41,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42325.2, 300 sec: 42153.8). Total num frames: 1091403776. Throughput: 0: 41896.2. Samples: 1091548280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:22:41,997][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 05:22:44,507][12883] Updated weights for policy 0, policy_version 66621 (0.0032) +[2024-06-18 05:22:46,996][12645] Fps is (10 sec: 39313.5, 60 sec: 41777.7, 300 sec: 41987.1). Total num frames: 1091584000. Throughput: 0: 41965.3. Samples: 1091675840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:22:46,997][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 05:22:48,593][12883] Updated weights for policy 0, policy_version 66631 (0.0028) +[2024-06-18 05:22:51,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1091813376. Throughput: 0: 42028.1. Samples: 1091928560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 05:22:51,994][12645] Avg episode reward: [(0, '0.160')] +[2024-06-18 05:22:52,848][12883] Updated weights for policy 0, policy_version 66641 (0.0033) +[2024-06-18 05:22:56,250][12883] Updated weights for policy 0, policy_version 66651 (0.0032) +[2024-06-18 05:22:56,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1092042752. Throughput: 0: 42090.7. Samples: 1092179120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 05:22:56,999][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 05:23:00,607][12883] Updated weights for policy 0, policy_version 66661 (0.0033) +[2024-06-18 05:23:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42053.8, 300 sec: 42154.1). Total num frames: 1092239360. Throughput: 0: 42093.7. Samples: 1092308880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 05:23:01,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 05:23:04,125][12883] Updated weights for policy 0, policy_version 66671 (0.0039) +[2024-06-18 05:23:06,732][12862] Signal inference workers to stop experience collection... (15800 times) +[2024-06-18 05:23:06,763][12883] InferenceWorker_p0-w0: stopping experience collection (15800 times) +[2024-06-18 05:23:06,792][12862] Signal inference workers to resume experience collection... (15800 times) +[2024-06-18 05:23:06,793][12883] InferenceWorker_p0-w0: resuming experience collection (15800 times) +[2024-06-18 05:23:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 1092452352. Throughput: 0: 42040.0. Samples: 1092556980. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 05:23:06,994][12645] Avg episode reward: [(0, '0.248')] +[2024-06-18 05:23:08,321][12883] Updated weights for policy 0, policy_version 66681 (0.0042) +[2024-06-18 05:23:11,959][12883] Updated weights for policy 0, policy_version 66691 (0.0039) +[2024-06-18 05:23:11,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 1092665344. Throughput: 0: 42341.4. Samples: 1092820140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 05:23:11,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 05:23:15,984][12883] Updated weights for policy 0, policy_version 66701 (0.0039) +[2024-06-18 05:23:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 1092861952. Throughput: 0: 42132.0. Samples: 1092941680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 05:23:16,995][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 05:23:19,713][12883] Updated weights for policy 0, policy_version 66711 (0.0036) +[2024-06-18 05:23:21,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 1093074944. Throughput: 0: 42039.2. Samples: 1093193640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 05:23:21,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 05:23:23,611][12883] Updated weights for policy 0, policy_version 66721 (0.0032) +[2024-06-18 05:23:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 1093287936. Throughput: 0: 42329.2. Samples: 1093453000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 05:23:26,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 05:23:27,597][12883] Updated weights for policy 0, policy_version 66731 (0.0031) +[2024-06-18 05:23:31,202][12883] Updated weights for policy 0, policy_version 66741 (0.0031) +[2024-06-18 05:23:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1093500928. Throughput: 0: 42219.5. Samples: 1093575620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:23:31,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 05:23:35,303][12883] Updated weights for policy 0, policy_version 66751 (0.0026) +[2024-06-18 05:23:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.4, 300 sec: 42098.5). Total num frames: 1093713920. Throughput: 0: 42197.4. Samples: 1093827440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:23:36,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 05:23:39,443][12883] Updated weights for policy 0, policy_version 66761 (0.0040) +[2024-06-18 05:23:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42053.9, 300 sec: 42098.5). Total num frames: 1093926912. Throughput: 0: 42269.0. Samples: 1094081220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:23:41,994][12645] Avg episode reward: [(0, '0.129')] +[2024-06-18 05:23:42,879][12883] Updated weights for policy 0, policy_version 66771 (0.0026) +[2024-06-18 05:23:46,995][12645] Fps is (10 sec: 40956.6, 60 sec: 42326.3, 300 sec: 42043.2). Total num frames: 1094123520. Throughput: 0: 42178.0. Samples: 1094206920. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:23:46,995][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 05:23:47,672][12883] Updated weights for policy 0, policy_version 66781 (0.0027) +[2024-06-18 05:23:50,654][12883] Updated weights for policy 0, policy_version 66791 (0.0028) +[2024-06-18 05:23:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 1094336512. Throughput: 0: 42215.1. Samples: 1094456660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:23:51,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 05:23:55,213][12883] Updated weights for policy 0, policy_version 66801 (0.0043) +[2024-06-18 05:23:56,994][12645] Fps is (10 sec: 42601.6, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 1094549504. Throughput: 0: 42132.3. Samples: 1094716100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:23:56,994][12645] Avg episode reward: [(0, '0.128')] +[2024-06-18 05:23:58,449][12883] Updated weights for policy 0, policy_version 66811 (0.0031) +[2024-06-18 05:24:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 1094746112. Throughput: 0: 42113.8. Samples: 1094836800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:24:01,994][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 05:24:02,645][12883] Updated weights for policy 0, policy_version 66821 (0.0034) +[2024-06-18 05:24:06,117][12883] Updated weights for policy 0, policy_version 66831 (0.0031) +[2024-06-18 05:24:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 1094991872. Throughput: 0: 42089.8. Samples: 1095087680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:24:06,994][12645] Avg episode reward: [(0, '0.197')] +[2024-06-18 05:24:10,478][12883] Updated weights for policy 0, policy_version 66841 (0.0034) +[2024-06-18 05:24:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 1095172096. Throughput: 0: 42153.8. Samples: 1095349920. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) +[2024-06-18 05:24:11,994][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 05:24:13,910][12883] Updated weights for policy 0, policy_version 66851 (0.0036) +[2024-06-18 05:24:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 1095401472. Throughput: 0: 41998.2. Samples: 1095465540. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) +[2024-06-18 05:24:16,994][12645] Avg episode reward: [(0, '0.413')] +[2024-06-18 05:24:18,689][12883] Updated weights for policy 0, policy_version 66861 (0.0037) +[2024-06-18 05:24:21,913][12883] Updated weights for policy 0, policy_version 66871 (0.0035) +[2024-06-18 05:24:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 1095614464. Throughput: 0: 41994.7. Samples: 1095717200. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) +[2024-06-18 05:24:21,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 05:24:26,430][12883] Updated weights for policy 0, policy_version 66881 (0.0029) +[2024-06-18 05:24:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 1095811072. Throughput: 0: 42171.9. Samples: 1095978960. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) +[2024-06-18 05:24:26,994][12645] Avg episode reward: [(0, '0.182')] +[2024-06-18 05:24:29,474][12883] Updated weights for policy 0, policy_version 66891 (0.0053) +[2024-06-18 05:24:32,000][12645] Fps is (10 sec: 42571.7, 60 sec: 42321.0, 300 sec: 42153.2). Total num frames: 1096040448. Throughput: 0: 42021.2. Samples: 1096098100. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) +[2024-06-18 05:24:32,000][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 05:24:34,014][12883] Updated weights for policy 0, policy_version 66901 (0.0040) +[2024-06-18 05:24:34,280][12862] Signal inference workers to stop experience collection... (15850 times) +[2024-06-18 05:24:34,280][12862] Signal inference workers to resume experience collection... (15850 times) +[2024-06-18 05:24:34,296][12883] InferenceWorker_p0-w0: stopping experience collection (15850 times) +[2024-06-18 05:24:34,296][12883] InferenceWorker_p0-w0: resuming experience collection (15850 times) +[2024-06-18 05:24:36,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42050.7, 300 sec: 42098.2). Total num frames: 1096237056. Throughput: 0: 42174.3. Samples: 1096354600. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) +[2024-06-18 05:24:36,997][12645] Avg episode reward: [(0, '0.229')] +[2024-06-18 05:24:37,053][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000066910_1096253440.pth... +[2024-06-18 05:24:37,118][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000066293_1086144512.pth +[2024-06-18 05:24:37,470][12883] Updated weights for policy 0, policy_version 66911 (0.0023) +[2024-06-18 05:24:41,659][12883] Updated weights for policy 0, policy_version 66921 (0.0026) +[2024-06-18 05:24:41,994][12645] Fps is (10 sec: 40985.8, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 1096450048. Throughput: 0: 42144.6. Samples: 1096612600. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) +[2024-06-18 05:24:41,994][12645] Avg episode reward: [(0, '0.229')] +[2024-06-18 05:24:45,217][12883] Updated weights for policy 0, policy_version 66931 (0.0031) +[2024-06-18 05:24:46,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42325.9, 300 sec: 42043.9). Total num frames: 1096663040. Throughput: 0: 42241.0. Samples: 1096737640. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) +[2024-06-18 05:24:46,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 05:24:49,453][12883] Updated weights for policy 0, policy_version 66941 (0.0025) +[2024-06-18 05:24:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 1096859648. Throughput: 0: 42113.8. Samples: 1096982800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) +[2024-06-18 05:24:51,994][12645] Avg episode reward: [(0, '0.108')] +[2024-06-18 05:24:53,062][12883] Updated weights for policy 0, policy_version 66951 (0.0028) +[2024-06-18 05:24:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42043.1). Total num frames: 1097072640. Throughput: 0: 41968.9. Samples: 1097238520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 05:24:56,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 05:24:57,305][12883] Updated weights for policy 0, policy_version 66961 (0.0040) +[2024-06-18 05:25:00,877][12883] Updated weights for policy 0, policy_version 66971 (0.0031) +[2024-06-18 05:25:02,000][12645] Fps is (10 sec: 45846.7, 60 sec: 42867.1, 300 sec: 42153.2). Total num frames: 1097318400. Throughput: 0: 41956.4. Samples: 1097353840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 05:25:02,000][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 05:25:05,372][12883] Updated weights for policy 0, policy_version 66981 (0.0042) +[2024-06-18 05:25:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 1097498624. Throughput: 0: 42079.9. Samples: 1097610800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 05:25:06,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 05:25:08,608][12883] Updated weights for policy 0, policy_version 66991 (0.0039) +[2024-06-18 05:25:11,994][12645] Fps is (10 sec: 36067.5, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 1097678848. Throughput: 0: 42072.1. Samples: 1097872200. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 05:25:11,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 05:25:13,014][12883] Updated weights for policy 0, policy_version 67001 (0.0025) +[2024-06-18 05:25:16,499][12883] Updated weights for policy 0, policy_version 67011 (0.0041) +[2024-06-18 05:25:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 1097924608. Throughput: 0: 41964.5. Samples: 1097986240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 05:25:16,994][12645] Avg episode reward: [(0, '0.265')] +[2024-06-18 05:25:20,623][12883] Updated weights for policy 0, policy_version 67021 (0.0029) +[2024-06-18 05:25:21,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 1098137600. Throughput: 0: 41896.4. Samples: 1098239840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 05:25:21,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 05:25:24,254][12883] Updated weights for policy 0, policy_version 67031 (0.0029) +[2024-06-18 05:25:26,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 1098317824. Throughput: 0: 41975.6. Samples: 1098501500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 05:25:26,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 05:25:28,303][12883] Updated weights for policy 0, policy_version 67041 (0.0028) +[2024-06-18 05:25:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41783.6, 300 sec: 42043.0). Total num frames: 1098547200. Throughput: 0: 41809.8. Samples: 1098619080. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 05:25:31,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 05:25:32,097][12883] Updated weights for policy 0, policy_version 67051 (0.0038) +[2024-06-18 05:25:36,369][12883] Updated weights for policy 0, policy_version 67061 (0.0047) +[2024-06-18 05:25:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42053.9, 300 sec: 42043.0). Total num frames: 1098760192. Throughput: 0: 41936.4. Samples: 1098869940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 05:25:36,994][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 05:25:39,791][12883] Updated weights for policy 0, policy_version 67071 (0.0036) +[2024-06-18 05:25:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 1098956800. Throughput: 0: 41830.2. Samples: 1099120880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 05:25:41,994][12645] Avg episode reward: [(0, '0.076')] +[2024-06-18 05:25:44,062][12883] Updated weights for policy 0, policy_version 67081 (0.0039) +[2024-06-18 05:25:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 1099169792. Throughput: 0: 42104.1. Samples: 1099248260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 05:25:46,994][12645] Avg episode reward: [(0, '0.088')] +[2024-06-18 05:25:47,696][12883] Updated weights for policy 0, policy_version 67091 (0.0042) +[2024-06-18 05:25:51,864][12883] Updated weights for policy 0, policy_version 67101 (0.0034) +[2024-06-18 05:25:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 1099382784. Throughput: 0: 41947.1. Samples: 1099498420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 05:25:51,994][12645] Avg episode reward: [(0, '0.171')] +[2024-06-18 05:25:52,976][12862] Signal inference workers to stop experience collection... 
(15900 times) +[2024-06-18 05:25:53,001][12883] InferenceWorker_p0-w0: stopping experience collection (15900 times) +[2024-06-18 05:25:53,040][12862] Signal inference workers to resume experience collection... (15900 times) +[2024-06-18 05:25:53,041][12883] InferenceWorker_p0-w0: resuming experience collection (15900 times) +[2024-06-18 05:25:55,791][12883] Updated weights for policy 0, policy_version 67111 (0.0031) +[2024-06-18 05:25:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 1099595776. Throughput: 0: 41762.1. Samples: 1099751500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 05:25:56,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 05:26:00,120][12883] Updated weights for policy 0, policy_version 67121 (0.0043) +[2024-06-18 05:26:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41237.3, 300 sec: 41987.5). Total num frames: 1099792384. Throughput: 0: 41928.8. Samples: 1099873040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 05:26:01,994][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 05:26:03,768][12883] Updated weights for policy 0, policy_version 67131 (0.0036) +[2024-06-18 05:26:07,000][12645] Fps is (10 sec: 40934.5, 60 sec: 41774.9, 300 sec: 42042.1). Total num frames: 1100005376. Throughput: 0: 41892.0. Samples: 1100125240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 05:26:07,000][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 05:26:07,824][12883] Updated weights for policy 0, policy_version 67141 (0.0035) +[2024-06-18 05:26:11,861][12883] Updated weights for policy 0, policy_version 67151 (0.0039) +[2024-06-18 05:26:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.1, 300 sec: 41876.5). Total num frames: 1100201984. Throughput: 0: 41760.2. Samples: 1100380720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 05:26:11,994][12645] Avg episode reward: [(0, '0.276')] +[2024-06-18 05:26:15,534][12883] Updated weights for policy 0, policy_version 67161 (0.0033) +[2024-06-18 05:26:16,996][12645] Fps is (10 sec: 42615.5, 60 sec: 41777.6, 300 sec: 42042.7). Total num frames: 1100431360. Throughput: 0: 41839.6. Samples: 1100501960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:26:16,997][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 05:26:19,666][12883] Updated weights for policy 0, policy_version 67171 (0.0042) +[2024-06-18 05:26:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 1100644352. Throughput: 0: 42018.1. Samples: 1100760760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:26:21,995][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 05:26:23,321][12883] Updated weights for policy 0, policy_version 67181 (0.0038) +[2024-06-18 05:26:26,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 1100840960. Throughput: 0: 41905.3. Samples: 1101006620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:26:26,994][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 05:26:27,580][12883] Updated weights for policy 0, policy_version 67191 (0.0033) +[2024-06-18 05:26:31,070][12883] Updated weights for policy 0, policy_version 67201 (0.0025) +[2024-06-18 05:26:31,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1101053952. Throughput: 0: 41770.7. Samples: 1101127940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:26:31,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 05:26:35,315][12883] Updated weights for policy 0, policy_version 67211 (0.0034) +[2024-06-18 05:26:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42098.8). Total num frames: 1101283328. Throughput: 0: 42093.5. Samples: 1101392620. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:26:36,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 05:26:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000067217_1101283328.pth... +[2024-06-18 05:26:37,088][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000066601_1091190784.pth +[2024-06-18 05:26:38,710][12883] Updated weights for policy 0, policy_version 67221 (0.0044) +[2024-06-18 05:26:41,994][12645] Fps is (10 sec: 40957.6, 60 sec: 41778.9, 300 sec: 41987.4). Total num frames: 1101463552. Throughput: 0: 41840.5. Samples: 1101634340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:26:41,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 05:26:42,912][12883] Updated weights for policy 0, policy_version 67231 (0.0042) +[2024-06-18 05:26:46,464][12883] Updated weights for policy 0, policy_version 67241 (0.0033) +[2024-06-18 05:26:46,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 1101692928. Throughput: 0: 41856.9. Samples: 1101756600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:26:46,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 05:26:50,652][12883] Updated weights for policy 0, policy_version 67251 (0.0041) +[2024-06-18 05:26:51,994][12645] Fps is (10 sec: 40961.9, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 1101873152. Throughput: 0: 41922.7. Samples: 1102011500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:26:51,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 05:26:54,303][12883] Updated weights for policy 0, policy_version 67261 (0.0026) +[2024-06-18 05:26:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41987.8). Total num frames: 1102102528. Throughput: 0: 41709.8. Samples: 1102257660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 05:26:57,003][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 05:26:58,651][12883] Updated weights for policy 0, policy_version 67271 (0.0042) +[2024-06-18 05:27:01,871][12883] Updated weights for policy 0, policy_version 67281 (0.0040) +[2024-06-18 05:27:01,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1102331904. Throughput: 0: 41977.3. Samples: 1102390840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:27:01,994][12645] Avg episode reward: [(0, '0.131')] +[2024-06-18 05:27:06,238][12883] Updated weights for policy 0, policy_version 67291 (0.0038) +[2024-06-18 05:27:06,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42056.7, 300 sec: 42043.0). Total num frames: 1102528512. Throughput: 0: 41882.4. Samples: 1102645460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:27:06,994][12645] Avg episode reward: [(0, '0.131')] +[2024-06-18 05:27:09,698][12883] Updated weights for policy 0, policy_version 67301 (0.0034) +[2024-06-18 05:27:11,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 1102741504. Throughput: 0: 41923.0. Samples: 1102893160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:27:11,995][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 05:27:13,959][12883] Updated weights for policy 0, policy_version 67311 (0.0042) +[2024-06-18 05:27:16,994][12645] Fps is (10 sec: 40956.6, 60 sec: 41780.2, 300 sec: 42042.9). Total num frames: 1102938112. Throughput: 0: 42157.0. Samples: 1103025040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:27:16,995][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 05:27:17,335][12862] Signal inference workers to stop experience collection... 
(15950 times) +[2024-06-18 05:27:17,386][12883] InferenceWorker_p0-w0: stopping experience collection (15950 times) +[2024-06-18 05:27:17,393][12862] Signal inference workers to resume experience collection... (15950 times) +[2024-06-18 05:27:17,402][12883] InferenceWorker_p0-w0: resuming experience collection (15950 times) +[2024-06-18 05:27:17,684][12883] Updated weights for policy 0, policy_version 67321 (0.0039) +[2024-06-18 05:27:21,700][12883] Updated weights for policy 0, policy_version 67331 (0.0030) +[2024-06-18 05:27:21,994][12645] Fps is (10 sec: 40960.9, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 1103151104. Throughput: 0: 41743.6. Samples: 1103271080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:27:21,994][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 05:27:25,737][12883] Updated weights for policy 0, policy_version 67341 (0.0031) +[2024-06-18 05:27:26,994][12645] Fps is (10 sec: 42600.9, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 1103364096. Throughput: 0: 41959.4. Samples: 1103522500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:27:26,994][12645] Avg episode reward: [(0, '0.151')] +[2024-06-18 05:27:29,356][12883] Updated weights for policy 0, policy_version 67351 (0.0037) +[2024-06-18 05:27:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41932.0). Total num frames: 1103560704. Throughput: 0: 41960.6. Samples: 1103644820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:27:31,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 05:27:33,364][12883] Updated weights for policy 0, policy_version 67361 (0.0038) +[2024-06-18 05:27:36,996][12645] Fps is (10 sec: 42589.5, 60 sec: 41777.6, 300 sec: 41987.5). Total num frames: 1103790080. Throughput: 0: 41984.6. Samples: 1103900900. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 05:27:37,005][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 05:27:37,308][12883] Updated weights for policy 0, policy_version 67371 (0.0033) +[2024-06-18 05:27:41,155][12883] Updated weights for policy 0, policy_version 67381 (0.0030) +[2024-06-18 05:27:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.6, 300 sec: 42098.9). Total num frames: 1104003072. Throughput: 0: 42016.0. Samples: 1104148380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:27:41,994][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 05:27:44,985][12883] Updated weights for policy 0, policy_version 67391 (0.0031) +[2024-06-18 05:27:46,994][12645] Fps is (10 sec: 40968.6, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1104199680. Throughput: 0: 41921.6. Samples: 1104277320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:27:46,994][12645] Avg episode reward: [(0, '0.091')] +[2024-06-18 05:27:49,043][12883] Updated weights for policy 0, policy_version 67401 (0.0031) +[2024-06-18 05:27:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 1104412672. Throughput: 0: 41815.0. Samples: 1104527140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:27:51,994][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 05:27:52,818][12883] Updated weights for policy 0, policy_version 67411 (0.0025) +[2024-06-18 05:27:56,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 1104625664. Throughput: 0: 41953.6. Samples: 1104781060. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:27:56,994][12645] Avg episode reward: [(0, '0.259')] +[2024-06-18 05:27:57,001][12883] Updated weights for policy 0, policy_version 67421 (0.0042) +[2024-06-18 05:28:00,709][12883] Updated weights for policy 0, policy_version 67431 (0.0045) +[2024-06-18 05:28:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 1104838656. Throughput: 0: 41903.3. Samples: 1104910660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:28:01,994][12645] Avg episode reward: [(0, '0.278')] +[2024-06-18 05:28:04,548][12883] Updated weights for policy 0, policy_version 67441 (0.0038) +[2024-06-18 05:28:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 1105035264. Throughput: 0: 41967.0. Samples: 1105159600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:28:06,998][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 05:28:08,513][12883] Updated weights for policy 0, policy_version 67451 (0.0025) +[2024-06-18 05:28:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 1105264640. Throughput: 0: 42073.0. Samples: 1105415780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:28:11,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 05:28:12,554][12883] Updated weights for policy 0, policy_version 67461 (0.0038) +[2024-06-18 05:28:16,251][12883] Updated weights for policy 0, policy_version 67471 (0.0027) +[2024-06-18 05:28:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.8, 300 sec: 42043.0). Total num frames: 1105477632. Throughput: 0: 42137.3. Samples: 1105541000. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:28:16,994][12645] Avg episode reward: [(0, '0.136')] +[2024-06-18 05:28:20,109][12883] Updated weights for policy 0, policy_version 67481 (0.0044) +[2024-06-18 05:28:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 1105690624. Throughput: 0: 42156.8. Samples: 1105797860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 05:28:21,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 05:28:23,971][12883] Updated weights for policy 0, policy_version 67491 (0.0032) +[2024-06-18 05:28:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 1105887232. Throughput: 0: 42297.4. Samples: 1106051760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 05:28:26,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 05:28:27,681][12883] Updated weights for policy 0, policy_version 67501 (0.0047) +[2024-06-18 05:28:31,718][12883] Updated weights for policy 0, policy_version 67511 (0.0043) +[2024-06-18 05:28:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 1106100224. Throughput: 0: 42215.7. Samples: 1106177020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 05:28:31,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 05:28:35,374][12883] Updated weights for policy 0, policy_version 67521 (0.0030) +[2024-06-18 05:28:36,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42053.7, 300 sec: 41987.4). Total num frames: 1106313216. Throughput: 0: 42344.3. Samples: 1106432640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 05:28:36,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 05:28:37,007][12862] Signal inference workers to stop experience collection... (16000 times) +[2024-06-18 05:28:37,012][12862] Signal inference workers to resume experience collection... 
(16000 times) +[2024-06-18 05:28:37,053][12883] InferenceWorker_p0-w0: stopping experience collection (16000 times) +[2024-06-18 05:28:37,053][12883] InferenceWorker_p0-w0: resuming experience collection (16000 times) +[2024-06-18 05:28:37,142][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000067525_1106329600.pth... +[2024-06-18 05:28:37,196][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000066910_1096253440.pth +[2024-06-18 05:28:39,407][12883] Updated weights for policy 0, policy_version 67531 (0.0037) +[2024-06-18 05:28:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42098.7). Total num frames: 1106542592. Throughput: 0: 42366.6. Samples: 1106687560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 05:28:41,994][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 05:28:43,191][12883] Updated weights for policy 0, policy_version 67541 (0.0033) +[2024-06-18 05:28:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 1106739200. Throughput: 0: 42388.9. Samples: 1106818160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 05:28:46,994][12645] Avg episode reward: [(0, '0.130')] +[2024-06-18 05:28:47,227][12883] Updated weights for policy 0, policy_version 67551 (0.0030) +[2024-06-18 05:28:50,986][12883] Updated weights for policy 0, policy_version 67561 (0.0029) +[2024-06-18 05:28:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 1106935808. Throughput: 0: 42388.0. Samples: 1107067060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 05:28:51,994][12645] Avg episode reward: [(0, '0.062')] +[2024-06-18 05:28:54,919][12883] Updated weights for policy 0, policy_version 67571 (0.0046) +[2024-06-18 05:28:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 1107181568. Throughput: 0: 42291.1. Samples: 1107318880. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 05:28:56,994][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 05:28:58,721][12883] Updated weights for policy 0, policy_version 67581 (0.0022) +[2024-06-18 05:29:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 1107361792. Throughput: 0: 42473.3. Samples: 1107452300. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 05:29:01,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 05:29:02,827][12883] Updated weights for policy 0, policy_version 67591 (0.0043) +[2024-06-18 05:29:06,635][12883] Updated weights for policy 0, policy_version 67601 (0.0034) +[2024-06-18 05:29:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 1107574784. Throughput: 0: 42320.0. Samples: 1107702260. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 05:29:06,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 05:29:10,730][12883] Updated weights for policy 0, policy_version 67611 (0.0036) +[2024-06-18 05:29:11,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 1107820544. Throughput: 0: 42213.2. Samples: 1107951360. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 05:29:11,994][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 05:29:14,702][12883] Updated weights for policy 0, policy_version 67621 (0.0038) +[2024-06-18 05:29:16,993][12645] Fps is (10 sec: 40960.8, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 1107984384. Throughput: 0: 42457.5. Samples: 1108087600. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 05:29:16,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 05:29:18,444][12883] Updated weights for policy 0, policy_version 67631 (0.0034) +[2024-06-18 05:29:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1108197376. Throughput: 0: 42297.5. Samples: 1108336020. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 05:29:21,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 05:29:22,148][12883] Updated weights for policy 0, policy_version 67641 (0.0025) +[2024-06-18 05:29:25,881][12883] Updated weights for policy 0, policy_version 67651 (0.0031) +[2024-06-18 05:29:26,996][12645] Fps is (10 sec: 47502.1, 60 sec: 42869.8, 300 sec: 42099.1). Total num frames: 1108459520. Throughput: 0: 42311.6. Samples: 1108591680. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 05:29:26,997][12645] Avg episode reward: [(0, '0.149')] +[2024-06-18 05:29:29,808][12883] Updated weights for policy 0, policy_version 67661 (0.0041) +[2024-06-18 05:29:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42043.3). Total num frames: 1108639744. Throughput: 0: 42340.9. Samples: 1108723500. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 05:29:31,994][12645] Avg episode reward: [(0, '0.136')] +[2024-06-18 05:29:33,469][12883] Updated weights for policy 0, policy_version 67671 (0.0028) +[2024-06-18 05:29:36,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 1108852736. Throughput: 0: 42341.7. Samples: 1108972440. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 05:29:36,994][12645] Avg episode reward: [(0, '0.265')] +[2024-06-18 05:29:37,449][12883] Updated weights for policy 0, policy_version 67681 (0.0033) +[2024-06-18 05:29:41,112][12883] Updated weights for policy 0, policy_version 67691 (0.0038) +[2024-06-18 05:29:41,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 1109098496. Throughput: 0: 42493.7. Samples: 1109231100. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 05:29:41,994][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 05:29:45,260][12883] Updated weights for policy 0, policy_version 67701 (0.0038) +[2024-06-18 05:29:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 1109278720. Throughput: 0: 42355.5. Samples: 1109358300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 05:29:46,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 05:29:48,923][12883] Updated weights for policy 0, policy_version 67711 (0.0027) +[2024-06-18 05:29:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 1109491712. Throughput: 0: 42339.5. Samples: 1109607540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 05:29:51,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 05:29:52,810][12883] Updated weights for policy 0, policy_version 67721 (0.0032) +[2024-06-18 05:29:56,775][12883] Updated weights for policy 0, policy_version 67731 (0.0031) +[2024-06-18 05:29:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 41988.4). Total num frames: 1109704704. Throughput: 0: 42566.3. Samples: 1109866840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 05:29:57,003][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 05:30:00,684][12883] Updated weights for policy 0, policy_version 67741 (0.0034) +[2024-06-18 05:30:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 1109917696. Throughput: 0: 42324.6. Samples: 1109992220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 05:30:01,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 05:30:04,611][12883] Updated weights for policy 0, policy_version 67751 (0.0036) +[2024-06-18 05:30:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 1110130688. Throughput: 0: 42337.7. Samples: 1110241220. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 05:30:06,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 05:30:07,699][12862] Signal inference workers to stop experience collection... (16050 times) +[2024-06-18 05:30:07,699][12862] Signal inference workers to resume experience collection... (16050 times) +[2024-06-18 05:30:07,722][12883] InferenceWorker_p0-w0: stopping experience collection (16050 times) +[2024-06-18 05:30:07,722][12883] InferenceWorker_p0-w0: resuming experience collection (16050 times) +[2024-06-18 05:30:08,572][12883] Updated weights for policy 0, policy_version 67761 (0.0036) +[2024-06-18 05:30:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 1110327296. Throughput: 0: 42510.1. Samples: 1110504540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 05:30:11,994][12645] Avg episode reward: [(0, '0.250')] +[2024-06-18 05:30:12,273][12883] Updated weights for policy 0, policy_version 67771 (0.0038) +[2024-06-18 05:30:16,494][12883] Updated weights for policy 0, policy_version 67781 (0.0043) +[2024-06-18 05:30:17,000][12645] Fps is (10 sec: 42571.9, 60 sec: 42866.9, 300 sec: 42097.7). Total num frames: 1110556672. Throughput: 0: 42128.9. Samples: 1110619560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 05:30:17,000][12645] Avg episode reward: [(0, '0.147')] +[2024-06-18 05:30:20,095][12883] Updated weights for policy 0, policy_version 67791 (0.0024) +[2024-06-18 05:30:21,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42265.1). Total num frames: 1110786048. Throughput: 0: 42287.9. Samples: 1110875400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) +[2024-06-18 05:30:21,994][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 05:30:24,317][12883] Updated weights for policy 0, policy_version 67801 (0.0047) +[2024-06-18 05:30:26,994][12645] Fps is (10 sec: 39346.2, 60 sec: 41507.7, 300 sec: 42043.0). Total num frames: 1110949888. 
Throughput: 0: 42315.7. Samples: 1111135300. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 05:30:26,994][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 05:30:27,767][12883] Updated weights for policy 0, policy_version 67811 (0.0038) +[2024-06-18 05:30:31,895][12883] Updated weights for policy 0, policy_version 67821 (0.0024) +[2024-06-18 05:30:31,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 1111179264. Throughput: 0: 42081.1. Samples: 1111251940. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 05:30:31,994][12645] Avg episode reward: [(0, '0.528')] +[2024-06-18 05:30:35,470][12883] Updated weights for policy 0, policy_version 67831 (0.0027) +[2024-06-18 05:30:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 1111408640. Throughput: 0: 42265.0. Samples: 1111509460. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 05:30:36,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 05:30:37,080][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000067836_1111425024.pth... +[2024-06-18 05:30:37,136][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000067217_1101283328.pth +[2024-06-18 05:30:39,381][12883] Updated weights for policy 0, policy_version 67841 (0.0033) +[2024-06-18 05:30:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.2, 300 sec: 42098.5). Total num frames: 1111588864. Throughput: 0: 42376.0. Samples: 1111773760. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 05:30:41,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 05:30:43,123][12883] Updated weights for policy 0, policy_version 67851 (0.0032) +[2024-06-18 05:30:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1111818240. Throughput: 0: 42089.4. Samples: 1111886240. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 05:30:46,994][12645] Avg episode reward: [(0, '0.152')] +[2024-06-18 05:30:47,102][12883] Updated weights for policy 0, policy_version 67861 (0.0030) +[2024-06-18 05:30:50,835][12883] Updated weights for policy 0, policy_version 67871 (0.0037) +[2024-06-18 05:30:51,994][12645] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 1112064000. Throughput: 0: 42362.7. Samples: 1112147540. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 05:30:51,994][12645] Avg episode reward: [(0, '0.259')] +[2024-06-18 05:30:55,014][12883] Updated weights for policy 0, policy_version 67881 (0.0029) +[2024-06-18 05:30:57,000][12645] Fps is (10 sec: 39297.1, 60 sec: 41774.9, 300 sec: 42097.7). Total num frames: 1112211456. Throughput: 0: 42328.0. Samples: 1112409560. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 05:30:57,000][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 05:30:58,696][12883] Updated weights for policy 0, policy_version 67891 (0.0034) +[2024-06-18 05:31:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42266.1). Total num frames: 1112473600. Throughput: 0: 42283.1. Samples: 1112522040. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 05:31:01,994][12645] Avg episode reward: [(0, '0.224')] +[2024-06-18 05:31:02,666][12883] Updated weights for policy 0, policy_version 67901 (0.0026) +[2024-06-18 05:31:06,751][12883] Updated weights for policy 0, policy_version 67911 (0.0033) +[2024-06-18 05:31:06,994][12645] Fps is (10 sec: 45903.8, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1112670208. Throughput: 0: 42368.1. Samples: 1112781960. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 05:31:06,994][12645] Avg episode reward: [(0, '0.218')] +[2024-06-18 05:31:10,327][12883] Updated weights for policy 0, policy_version 67921 (0.0043) +[2024-06-18 05:31:11,994][12645] Fps is (10 sec: 36044.9, 60 sec: 41779.2, 300 sec: 42043.3). Total num frames: 1112834048. Throughput: 0: 42301.3. Samples: 1113038860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 05:31:11,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 05:31:14,501][12883] Updated weights for policy 0, policy_version 67931 (0.0033) +[2024-06-18 05:31:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42329.7, 300 sec: 42209.6). Total num frames: 1113096192. Throughput: 0: 42422.6. Samples: 1113160960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 05:31:16,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 05:31:17,908][12883] Updated weights for policy 0, policy_version 67941 (0.0036) +[2024-06-18 05:31:21,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41506.3, 300 sec: 42154.1). Total num frames: 1113276416. Throughput: 0: 42335.2. Samples: 1113414540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 05:31:21,994][12645] Avg episode reward: [(0, '0.214')] +[2024-06-18 05:31:22,191][12883] Updated weights for policy 0, policy_version 67951 (0.0026) +[2024-06-18 05:31:26,362][12883] Updated weights for policy 0, policy_version 67961 (0.0037) +[2024-06-18 05:31:26,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1113489408. Throughput: 0: 42013.4. Samples: 1113664360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 05:31:26,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 05:31:28,076][12862] Signal inference workers to stop experience collection... (16100 times) +[2024-06-18 05:31:28,076][12862] Signal inference workers to resume experience collection... 
(16100 times) +[2024-06-18 05:31:28,096][12883] InferenceWorker_p0-w0: stopping experience collection (16100 times) +[2024-06-18 05:31:28,097][12883] InferenceWorker_p0-w0: resuming experience collection (16100 times) +[2024-06-18 05:31:29,951][12883] Updated weights for policy 0, policy_version 67971 (0.0028) +[2024-06-18 05:31:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1113718784. Throughput: 0: 42164.8. Samples: 1113783660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 05:31:32,000][12645] Avg episode reward: [(0, '0.214')] +[2024-06-18 05:31:34,033][12883] Updated weights for policy 0, policy_version 67981 (0.0035) +[2024-06-18 05:31:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41779.1, 300 sec: 42209.7). Total num frames: 1113915392. Throughput: 0: 42181.2. Samples: 1114045700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 05:31:36,994][12645] Avg episode reward: [(0, '0.152')] +[2024-06-18 05:31:37,647][12883] Updated weights for policy 0, policy_version 67991 (0.0030) +[2024-06-18 05:31:41,493][12883] Updated weights for policy 0, policy_version 68001 (0.0041) +[2024-06-18 05:31:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1114128384. Throughput: 0: 41797.4. Samples: 1114290180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 05:31:41,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 05:31:45,395][12883] Updated weights for policy 0, policy_version 68011 (0.0035) +[2024-06-18 05:31:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1114357760. Throughput: 0: 42212.5. Samples: 1114421600. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 05:31:46,994][12645] Avg episode reward: [(0, '0.596')] +[2024-06-18 05:31:49,145][12883] Updated weights for policy 0, policy_version 68021 (0.0038) +[2024-06-18 05:31:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 42209.6). Total num frames: 1114554368. Throughput: 0: 42190.2. Samples: 1114680520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:31:51,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 05:31:53,483][12883] Updated weights for policy 0, policy_version 68031 (0.0031) +[2024-06-18 05:31:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42602.8, 300 sec: 42154.1). Total num frames: 1114767360. Throughput: 0: 41899.1. Samples: 1114924320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:31:56,994][12645] Avg episode reward: [(0, '0.186')] +[2024-06-18 05:31:57,235][12883] Updated weights for policy 0, policy_version 68041 (0.0033) +[2024-06-18 05:32:01,191][12883] Updated weights for policy 0, policy_version 68051 (0.0033) +[2024-06-18 05:32:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 1114980352. Throughput: 0: 42104.8. Samples: 1115055680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:32:01,994][12645] Avg episode reward: [(0, '0.147')] +[2024-06-18 05:32:05,099][12883] Updated weights for policy 0, policy_version 68061 (0.0027) +[2024-06-18 05:32:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 1115176960. Throughput: 0: 42115.8. Samples: 1115309760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:32:06,994][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 05:32:08,732][12883] Updated weights for policy 0, policy_version 68071 (0.0034) +[2024-06-18 05:32:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42154.2). Total num frames: 1115373568. Throughput: 0: 42152.7. Samples: 1115561240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:32:11,994][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 05:32:12,922][12883] Updated weights for policy 0, policy_version 68081 (0.0027) +[2024-06-18 05:32:16,292][12883] Updated weights for policy 0, policy_version 68091 (0.0034) +[2024-06-18 05:32:16,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1115635712. Throughput: 0: 42268.3. Samples: 1115685740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:32:16,994][12645] Avg episode reward: [(0, '0.103')] +[2024-06-18 05:32:20,946][12883] Updated weights for policy 0, policy_version 68101 (0.0046) +[2024-06-18 05:32:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 1115815936. Throughput: 0: 42263.1. Samples: 1115947540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:32:21,994][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 05:32:23,857][12883] Updated weights for policy 0, policy_version 68111 (0.0024) +[2024-06-18 05:32:26,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 1116028928. Throughput: 0: 42253.7. Samples: 1116191600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:32:26,994][12645] Avg episode reward: [(0, '0.094')] +[2024-06-18 05:32:28,722][12883] Updated weights for policy 0, policy_version 68121 (0.0032) +[2024-06-18 05:32:31,648][12883] Updated weights for policy 0, policy_version 68131 (0.0036) +[2024-06-18 05:32:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42321.0). Total num frames: 1116274688. Throughput: 0: 42273.8. Samples: 1116323920. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 05:32:31,994][12645] Avg episode reward: [(0, '0.115')] +[2024-06-18 05:32:36,402][12883] Updated weights for policy 0, policy_version 68141 (0.0037) +[2024-06-18 05:32:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 1116454912. Throughput: 0: 42248.1. Samples: 1116581680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 05:32:36,994][12645] Avg episode reward: [(0, '0.185')] +[2024-06-18 05:32:37,116][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000068144_1116471296.pth... +[2024-06-18 05:32:37,184][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000067525_1106329600.pth +[2024-06-18 05:32:39,346][12883] Updated weights for policy 0, policy_version 68151 (0.0031) +[2024-06-18 05:32:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1116667904. Throughput: 0: 42302.3. Samples: 1116827920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 05:32:41,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 05:32:44,072][12883] Updated weights for policy 0, policy_version 68161 (0.0041) +[2024-06-18 05:32:46,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1116897280. Throughput: 0: 42259.1. Samples: 1116957340. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 05:32:46,995][12645] Avg episode reward: [(0, '0.234')] +[2024-06-18 05:32:47,264][12883] Updated weights for policy 0, policy_version 68171 (0.0032) +[2024-06-18 05:32:51,801][12883] Updated weights for policy 0, policy_version 68181 (0.0037) +[2024-06-18 05:32:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1117077504. Throughput: 0: 42165.0. Samples: 1117207180. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 05:32:51,994][12645] Avg episode reward: [(0, '0.124')] +[2024-06-18 05:32:54,096][12862] Signal inference workers to stop experience collection... (16150 times) +[2024-06-18 05:32:54,100][12862] Signal inference workers to resume experience collection... (16150 times) +[2024-06-18 05:32:54,126][12883] InferenceWorker_p0-w0: stopping experience collection (16150 times) +[2024-06-18 05:32:54,127][12883] InferenceWorker_p0-w0: resuming experience collection (16150 times) +[2024-06-18 05:32:55,182][12883] Updated weights for policy 0, policy_version 68191 (0.0035) +[2024-06-18 05:32:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1117306880. Throughput: 0: 41988.1. Samples: 1117450700. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 05:32:56,994][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 05:32:59,730][12883] Updated weights for policy 0, policy_version 68201 (0.0033) +[2024-06-18 05:33:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1117503488. Throughput: 0: 42289.4. Samples: 1117588760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 05:33:01,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 05:33:02,834][12883] Updated weights for policy 0, policy_version 68211 (0.0029) +[2024-06-18 05:33:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 1117716480. Throughput: 0: 42021.9. Samples: 1117838520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 05:33:06,994][12645] Avg episode reward: [(0, '0.267')] +[2024-06-18 05:33:07,339][12883] Updated weights for policy 0, policy_version 68221 (0.0045) +[2024-06-18 05:33:10,884][12883] Updated weights for policy 0, policy_version 68231 (0.0028) +[2024-06-18 05:33:11,994][12645] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42320.7). Total num frames: 1117962240. 
Throughput: 0: 41960.9. Samples: 1118079840. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) +[2024-06-18 05:33:11,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 05:33:15,066][12883] Updated weights for policy 0, policy_version 68241 (0.0035) +[2024-06-18 05:33:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.4, 300 sec: 42209.6). Total num frames: 1118142464. Throughput: 0: 42048.5. Samples: 1118216100. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) +[2024-06-18 05:33:16,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 05:33:18,569][12883] Updated weights for policy 0, policy_version 68251 (0.0039) +[2024-06-18 05:33:21,994][12645] Fps is (10 sec: 36044.3, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1118322688. Throughput: 0: 41846.9. Samples: 1118464800. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) +[2024-06-18 05:33:21,994][12645] Avg episode reward: [(0, '0.473')] +[2024-06-18 05:33:22,792][12883] Updated weights for policy 0, policy_version 68261 (0.0034) +[2024-06-18 05:33:26,227][12883] Updated weights for policy 0, policy_version 68271 (0.0031) +[2024-06-18 05:33:26,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1118584832. Throughput: 0: 41942.6. Samples: 1118715340. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) +[2024-06-18 05:33:26,994][12645] Avg episode reward: [(0, '0.253')] +[2024-06-18 05:33:30,509][12883] Updated weights for policy 0, policy_version 68281 (0.0031) +[2024-06-18 05:33:31,994][12645] Fps is (10 sec: 44237.6, 60 sec: 41506.2, 300 sec: 42209.7). Total num frames: 1118765056. Throughput: 0: 42315.3. Samples: 1118861520. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) +[2024-06-18 05:33:31,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 05:33:33,840][12883] Updated weights for policy 0, policy_version 68291 (0.0043) +[2024-06-18 05:33:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41779.0, 300 sec: 42098.5). Total num frames: 1118961664. 
Throughput: 0: 42071.0. Samples: 1119100380. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) +[2024-06-18 05:33:36,995][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 05:33:38,279][12883] Updated weights for policy 0, policy_version 68301 (0.0027) +[2024-06-18 05:33:41,590][12883] Updated weights for policy 0, policy_version 68311 (0.0042) +[2024-06-18 05:33:41,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 1119207424. Throughput: 0: 42262.5. Samples: 1119352520. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) +[2024-06-18 05:33:41,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 05:33:46,135][12883] Updated weights for policy 0, policy_version 68321 (0.0040) +[2024-06-18 05:33:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 1119404032. Throughput: 0: 42257.0. Samples: 1119490320. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) +[2024-06-18 05:33:46,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 05:33:49,609][12883] Updated weights for policy 0, policy_version 68331 (0.0033) +[2024-06-18 05:33:51,996][12645] Fps is (10 sec: 40951.5, 60 sec: 42323.8, 300 sec: 42153.8). Total num frames: 1119617024. Throughput: 0: 41930.3. Samples: 1119725480. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) +[2024-06-18 05:33:51,996][12645] Avg episode reward: [(0, '0.413')] +[2024-06-18 05:33:53,405][12862] Signal inference workers to stop experience collection... (16200 times) +[2024-06-18 05:33:53,439][12883] InferenceWorker_p0-w0: stopping experience collection (16200 times) +[2024-06-18 05:33:53,450][12862] Signal inference workers to resume experience collection... 
(16200 times) +[2024-06-18 05:33:53,460][12883] InferenceWorker_p0-w0: resuming experience collection (16200 times) +[2024-06-18 05:33:54,375][12883] Updated weights for policy 0, policy_version 68341 (0.0028) +[2024-06-18 05:33:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1119830016. Throughput: 0: 42250.3. Samples: 1119981100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) +[2024-06-18 05:33:56,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 05:33:57,218][12883] Updated weights for policy 0, policy_version 68351 (0.0038) +[2024-06-18 05:34:01,994][12645] Fps is (10 sec: 39330.8, 60 sec: 41779.4, 300 sec: 42154.1). Total num frames: 1120010240. Throughput: 0: 42087.6. Samples: 1120110040. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) +[2024-06-18 05:34:01,994][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 05:34:02,098][12883] Updated weights for policy 0, policy_version 68361 (0.0028) +[2024-06-18 05:34:05,234][12883] Updated weights for policy 0, policy_version 68371 (0.0025) +[2024-06-18 05:34:06,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 1120272384. Throughput: 0: 42251.2. Samples: 1120366100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) +[2024-06-18 05:34:06,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 05:34:09,809][12883] Updated weights for policy 0, policy_version 68381 (0.0034) +[2024-06-18 05:34:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 1120468992. Throughput: 0: 42211.2. Samples: 1120614840. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) +[2024-06-18 05:34:11,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 05:34:12,925][12883] Updated weights for policy 0, policy_version 68391 (0.0046) +[2024-06-18 05:34:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1120665600. Throughput: 0: 41798.2. Samples: 1120742440. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) +[2024-06-18 05:34:16,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 05:34:17,571][12883] Updated weights for policy 0, policy_version 68401 (0.0027) +[2024-06-18 05:34:21,049][12883] Updated weights for policy 0, policy_version 68411 (0.0033) +[2024-06-18 05:34:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42209.9). Total num frames: 1120911360. Throughput: 0: 42101.9. Samples: 1120994960. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) +[2024-06-18 05:34:21,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 05:34:25,449][12883] Updated weights for policy 0, policy_version 68421 (0.0033) +[2024-06-18 05:34:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 1121091584. Throughput: 0: 42108.5. Samples: 1121247400. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) +[2024-06-18 05:34:26,994][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 05:34:28,649][12883] Updated weights for policy 0, policy_version 68431 (0.0039) +[2024-06-18 05:34:31,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1121288192. Throughput: 0: 41737.4. Samples: 1121368500. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) +[2024-06-18 05:34:31,994][12645] Avg episode reward: [(0, '0.265')] +[2024-06-18 05:34:33,293][12883] Updated weights for policy 0, policy_version 68441 (0.0037) +[2024-06-18 05:34:36,283][12883] Updated weights for policy 0, policy_version 68451 (0.0032) +[2024-06-18 05:34:36,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42154.1). Total num frames: 1121533952. Throughput: 0: 42273.2. Samples: 1121627680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 05:34:37,000][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 05:34:37,034][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000068454_1121550336.pth... 
+[2024-06-18 05:34:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000067836_1111425024.pth +[2024-06-18 05:34:40,878][12883] Updated weights for policy 0, policy_version 68461 (0.0039) +[2024-06-18 05:34:41,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1121730560. Throughput: 0: 42200.3. Samples: 1121880120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 05:34:41,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 05:34:44,033][12883] Updated weights for policy 0, policy_version 68471 (0.0026) +[2024-06-18 05:34:46,996][12645] Fps is (10 sec: 37674.9, 60 sec: 41777.7, 300 sec: 42098.2). Total num frames: 1121910784. Throughput: 0: 42098.2. Samples: 1122004560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 05:34:46,996][12645] Avg episode reward: [(0, '0.087')] +[2024-06-18 05:34:48,317][12883] Updated weights for policy 0, policy_version 68481 (0.0037) +[2024-06-18 05:34:51,505][12883] Updated weights for policy 0, policy_version 68491 (0.0035) +[2024-06-18 05:34:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42326.9, 300 sec: 42209.6). Total num frames: 1122156544. Throughput: 0: 42227.7. Samples: 1122266340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 05:34:51,994][12645] Avg episode reward: [(0, '0.140')] +[2024-06-18 05:34:55,850][12883] Updated weights for policy 0, policy_version 68501 (0.0042) +[2024-06-18 05:34:56,994][12645] Fps is (10 sec: 45885.0, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 1122369536. Throughput: 0: 42355.4. Samples: 1122520840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 05:34:56,994][12645] Avg episode reward: [(0, '0.140')] +[2024-06-18 05:34:59,150][12883] Updated weights for policy 0, policy_version 68511 (0.0024) +[2024-06-18 05:35:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 1122566144. Throughput: 0: 42325.9. 
Samples: 1122647100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 05:35:01,994][12645] Avg episode reward: [(0, '0.140')] +[2024-06-18 05:35:03,374][12883] Updated weights for policy 0, policy_version 68521 (0.0038) +[2024-06-18 05:35:06,328][12862] Signal inference workers to stop experience collection... (16250 times) +[2024-06-18 05:35:06,328][12862] Signal inference workers to resume experience collection... (16250 times) +[2024-06-18 05:35:06,371][12883] InferenceWorker_p0-w0: stopping experience collection (16250 times) +[2024-06-18 05:35:06,371][12883] InferenceWorker_p0-w0: resuming experience collection (16250 times) +[2024-06-18 05:35:06,834][12883] Updated weights for policy 0, policy_version 68531 (0.0041) +[2024-06-18 05:35:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1122811904. Throughput: 0: 42474.2. Samples: 1122906300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 05:35:06,994][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 05:35:11,262][12883] Updated weights for policy 0, policy_version 68541 (0.0032) +[2024-06-18 05:35:11,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42155.0). Total num frames: 1122992128. Throughput: 0: 42468.4. Samples: 1123158480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 05:35:11,994][12645] Avg episode reward: [(0, '0.337')] +[2024-06-18 05:35:14,593][12883] Updated weights for policy 0, policy_version 68551 (0.0042) +[2024-06-18 05:35:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 1123205120. Throughput: 0: 42375.1. Samples: 1123275380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 05:35:16,994][12645] Avg episode reward: [(0, '0.227')] +[2024-06-18 05:35:19,009][12883] Updated weights for policy 0, policy_version 68561 (0.0037) +[2024-06-18 05:35:21,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42325.4, 300 sec: 42376.3). 
Total num frames: 1123450880. Throughput: 0: 42504.6. Samples: 1123540380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 05:35:21,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 05:35:22,089][12883] Updated weights for policy 0, policy_version 68571 (0.0041) +[2024-06-18 05:35:26,758][12883] Updated weights for policy 0, policy_version 68581 (0.0032) +[2024-06-18 05:35:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1123631104. Throughput: 0: 42642.7. Samples: 1123799040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 05:35:26,994][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 05:35:29,618][12883] Updated weights for policy 0, policy_version 68591 (0.0023) +[2024-06-18 05:35:31,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 1123827712. Throughput: 0: 42524.9. Samples: 1123918080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 05:35:31,994][12645] Avg episode reward: [(0, '0.058')] +[2024-06-18 05:35:34,470][12883] Updated weights for policy 0, policy_version 68601 (0.0039) +[2024-06-18 05:35:36,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1124073472. Throughput: 0: 42485.4. Samples: 1124178180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 05:35:36,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 05:35:37,779][12883] Updated weights for policy 0, policy_version 68611 (0.0029) +[2024-06-18 05:35:41,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1124270080. Throughput: 0: 42421.8. Samples: 1124429820. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 05:35:41,998][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 05:35:42,483][12883] Updated weights for policy 0, policy_version 68621 (0.0037) +[2024-06-18 05:35:45,516][12883] Updated weights for policy 0, policy_version 68631 (0.0037) +[2024-06-18 05:35:46,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42871.5, 300 sec: 42098.2). Total num frames: 1124483072. Throughput: 0: 42464.9. Samples: 1124558120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 05:35:46,996][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 05:35:49,998][12883] Updated weights for policy 0, policy_version 68641 (0.0033) +[2024-06-18 05:35:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42321.6). Total num frames: 1124696064. Throughput: 0: 42478.8. Samples: 1124817840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 05:35:51,994][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 05:35:53,168][12883] Updated weights for policy 0, policy_version 68651 (0.0036) +[2024-06-18 05:35:56,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1124909056. Throughput: 0: 42510.7. Samples: 1125071460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 05:35:56,994][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 05:35:57,721][12883] Updated weights for policy 0, policy_version 68661 (0.0039) +[2024-06-18 05:36:00,962][12883] Updated weights for policy 0, policy_version 68671 (0.0037) +[2024-06-18 05:36:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1125105664. Throughput: 0: 42822.6. Samples: 1125202400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-18 05:36:01,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 05:36:05,426][12883] Updated weights for policy 0, policy_version 68681 (0.0026) +[2024-06-18 05:36:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1125335040. Throughput: 0: 42749.7. Samples: 1125464120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-18 05:36:06,994][12645] Avg episode reward: [(0, '0.149')] +[2024-06-18 05:36:07,372][12862] Signal inference workers to stop experience collection... (16300 times) +[2024-06-18 05:36:07,372][12862] Signal inference workers to resume experience collection... (16300 times) +[2024-06-18 05:36:07,390][12883] InferenceWorker_p0-w0: stopping experience collection (16300 times) +[2024-06-18 05:36:07,390][12883] InferenceWorker_p0-w0: resuming experience collection (16300 times) +[2024-06-18 05:36:08,500][12883] Updated weights for policy 0, policy_version 68691 (0.0032) +[2024-06-18 05:36:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 1125564416. Throughput: 0: 42602.6. Samples: 1125716160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-18 05:36:11,994][12645] Avg episode reward: [(0, '0.149')] +[2024-06-18 05:36:12,882][12883] Updated weights for policy 0, policy_version 68701 (0.0040) +[2024-06-18 05:36:16,006][12883] Updated weights for policy 0, policy_version 68711 (0.0037) +[2024-06-18 05:36:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 1125761024. Throughput: 0: 42947.3. Samples: 1125850720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-18 05:36:16,994][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 05:36:20,458][12883] Updated weights for policy 0, policy_version 68721 (0.0032) +[2024-06-18 05:36:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.0, 300 sec: 42265.1). Total num frames: 1125957632. 
Throughput: 0: 42790.4. Samples: 1126103760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-18 05:36:21,995][12645] Avg episode reward: [(0, '0.503')] +[2024-06-18 05:36:23,603][12883] Updated weights for policy 0, policy_version 68731 (0.0035) +[2024-06-18 05:36:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1126203392. Throughput: 0: 42717.8. Samples: 1126352120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-18 05:36:26,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 05:36:28,235][12883] Updated weights for policy 0, policy_version 68741 (0.0030) +[2024-06-18 05:36:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.3, 300 sec: 42320.7). Total num frames: 1126400000. Throughput: 0: 42796.2. Samples: 1126483860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-18 05:36:31,994][12645] Avg episode reward: [(0, '0.123')] +[2024-06-18 05:36:32,158][12883] Updated weights for policy 0, policy_version 68751 (0.0040) +[2024-06-18 05:36:35,802][12883] Updated weights for policy 0, policy_version 68761 (0.0033) +[2024-06-18 05:36:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1126612992. Throughput: 0: 42565.6. Samples: 1126733300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) +[2024-06-18 05:36:36,994][12645] Avg episode reward: [(0, '0.143')] +[2024-06-18 05:36:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000068763_1126612992.pth... +[2024-06-18 05:36:37,051][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000068144_1116471296.pth +[2024-06-18 05:36:40,153][12883] Updated weights for policy 0, policy_version 68771 (0.0041) +[2024-06-18 05:36:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1126825984. Throughput: 0: 42636.9. Samples: 1126990120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 05:36:41,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 05:36:43,486][12883] Updated weights for policy 0, policy_version 68781 (0.0030) +[2024-06-18 05:36:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42600.0, 300 sec: 42320.7). Total num frames: 1127038976. Throughput: 0: 42514.3. Samples: 1127115540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 05:36:46,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 05:36:47,899][12883] Updated weights for policy 0, policy_version 68791 (0.0040) +[2024-06-18 05:36:51,119][12883] Updated weights for policy 0, policy_version 68801 (0.0044) +[2024-06-18 05:36:51,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42376.3). Total num frames: 1127268352. Throughput: 0: 42225.8. Samples: 1127364280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 05:36:51,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 05:36:55,731][12883] Updated weights for policy 0, policy_version 68811 (0.0032) +[2024-06-18 05:36:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1127448576. Throughput: 0: 42316.5. Samples: 1127620400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 05:36:56,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 05:36:58,865][12883] Updated weights for policy 0, policy_version 68821 (0.0027) +[2024-06-18 05:37:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 1127677952. Throughput: 0: 42088.1. Samples: 1127744680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 05:37:01,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 05:37:03,373][12883] Updated weights for policy 0, policy_version 68831 (0.0041) +[2024-06-18 05:37:06,471][12883] Updated weights for policy 0, policy_version 68841 (0.0026) +[2024-06-18 05:37:06,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42869.9, 300 sec: 42487.0). Total num frames: 1127907328. Throughput: 0: 42200.3. Samples: 1128002860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 05:37:06,997][12645] Avg episode reward: [(0, '0.228')] +[2024-06-18 05:37:11,106][12883] Updated weights for policy 0, policy_version 68851 (0.0034) +[2024-06-18 05:37:11,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42050.8, 300 sec: 42209.3). Total num frames: 1128087552. Throughput: 0: 42352.1. Samples: 1128258060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 05:37:11,997][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 05:37:14,089][12883] Updated weights for policy 0, policy_version 68861 (0.0028) +[2024-06-18 05:37:16,994][12645] Fps is (10 sec: 39330.8, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 1128300544. Throughput: 0: 42074.9. Samples: 1128377220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 05:37:16,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 05:37:18,712][12883] Updated weights for policy 0, policy_version 68871 (0.0022) +[2024-06-18 05:37:21,801][12883] Updated weights for policy 0, policy_version 68881 (0.0047) +[2024-06-18 05:37:21,994][12645] Fps is (10 sec: 45885.9, 60 sec: 43144.7, 300 sec: 42431.8). Total num frames: 1128546304. Throughput: 0: 42371.7. Samples: 1128640020. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 05:37:21,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 05:37:26,618][12883] Updated weights for policy 0, policy_version 68891 (0.0038) +[2024-06-18 05:37:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1128726528. Throughput: 0: 42279.2. Samples: 1128892680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:37:26,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 05:37:29,777][12883] Updated weights for policy 0, policy_version 68901 (0.0038) +[2024-06-18 05:37:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 1128939520. Throughput: 0: 42115.1. Samples: 1129010720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:37:31,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 05:37:34,178][12883] Updated weights for policy 0, policy_version 68911 (0.0045) +[2024-06-18 05:37:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 1129168896. Throughput: 0: 42402.2. Samples: 1129272380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:37:37,000][12645] Avg episode reward: [(0, '0.108')] +[2024-06-18 05:37:37,660][12883] Updated weights for policy 0, policy_version 68921 (0.0044) +[2024-06-18 05:37:38,527][12862] Signal inference workers to stop experience collection... (16350 times) +[2024-06-18 05:37:38,536][12862] Signal inference workers to resume experience collection... (16350 times) +[2024-06-18 05:37:38,574][12883] InferenceWorker_p0-w0: stopping experience collection (16350 times) +[2024-06-18 05:37:38,574][12883] InferenceWorker_p0-w0: resuming experience collection (16350 times) +[2024-06-18 05:37:41,967][12883] Updated weights for policy 0, policy_version 68931 (0.0040) +[2024-06-18 05:37:42,000][12645] Fps is (10 sec: 42571.3, 60 sec: 42320.9, 300 sec: 42264.3). Total num frames: 1129365504. 
Throughput: 0: 42255.0. Samples: 1129522140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:37:42,001][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 05:37:45,511][12883] Updated weights for policy 0, policy_version 68941 (0.0031) +[2024-06-18 05:37:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1129578496. Throughput: 0: 42159.6. Samples: 1129641860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:37:46,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 05:37:49,828][12883] Updated weights for policy 0, policy_version 68951 (0.0038) +[2024-06-18 05:37:51,994][12645] Fps is (10 sec: 42624.9, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1129791488. Throughput: 0: 42140.2. Samples: 1129899080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:37:51,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 05:37:53,439][12883] Updated weights for policy 0, policy_version 68961 (0.0027) +[2024-06-18 05:37:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1129988096. Throughput: 0: 42268.8. Samples: 1130160060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:37:56,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 05:37:57,347][12883] Updated weights for policy 0, policy_version 68971 (0.0034) +[2024-06-18 05:38:01,153][12883] Updated weights for policy 0, policy_version 68981 (0.0037) +[2024-06-18 05:38:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1130217472. Throughput: 0: 42443.0. Samples: 1130287160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 05:38:01,995][12645] Avg episode reward: [(0, '0.319')] +[2024-06-18 05:38:04,953][12883] Updated weights for policy 0, policy_version 68991 (0.0040) +[2024-06-18 05:38:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42053.9, 300 sec: 42265.2). Total num frames: 1130430464. 
Throughput: 0: 42115.5. Samples: 1130535220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 05:38:06,994][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 05:38:08,878][12883] Updated weights for policy 0, policy_version 69001 (0.0042) +[2024-06-18 05:38:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42326.9, 300 sec: 42320.7). Total num frames: 1130627072. Throughput: 0: 42362.6. Samples: 1130799000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 05:38:11,994][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 05:38:12,498][12883] Updated weights for policy 0, policy_version 69011 (0.0021) +[2024-06-18 05:38:16,832][12883] Updated weights for policy 0, policy_version 69021 (0.0037) +[2024-06-18 05:38:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1130856448. Throughput: 0: 42433.3. Samples: 1130920220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 05:38:16,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 05:38:20,192][12883] Updated weights for policy 0, policy_version 69031 (0.0028) +[2024-06-18 05:38:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.1, 300 sec: 42320.7). Total num frames: 1131069440. Throughput: 0: 42398.6. Samples: 1131180320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 05:38:21,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 05:38:24,537][12883] Updated weights for policy 0, policy_version 69041 (0.0030) +[2024-06-18 05:38:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1131266048. Throughput: 0: 42477.0. Samples: 1131433340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 05:38:26,994][12645] Avg episode reward: [(0, '0.164')] +[2024-06-18 05:38:27,907][12883] Updated weights for policy 0, policy_version 69051 (0.0031) +[2024-06-18 05:38:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 1131479040. 
Throughput: 0: 42657.7. Samples: 1131561460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 05:38:31,995][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 05:38:32,140][12883] Updated weights for policy 0, policy_version 69061 (0.0038) +[2024-06-18 05:38:36,185][12883] Updated weights for policy 0, policy_version 69071 (0.0034) +[2024-06-18 05:38:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1131692032. Throughput: 0: 42570.3. Samples: 1131814740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 05:38:36,994][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 05:38:37,091][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000069074_1131708416.pth... +[2024-06-18 05:38:37,167][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000068454_1121550336.pth +[2024-06-18 05:38:39,825][12883] Updated weights for policy 0, policy_version 69081 (0.0028) +[2024-06-18 05:38:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42329.7, 300 sec: 42376.2). Total num frames: 1131905024. Throughput: 0: 42391.8. Samples: 1132067700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 05:38:41,994][12645] Avg episode reward: [(0, '0.314')] +[2024-06-18 05:38:43,918][12883] Updated weights for policy 0, policy_version 69091 (0.0037) +[2024-06-18 05:38:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42321.0). Total num frames: 1132101632. Throughput: 0: 42265.8. Samples: 1132189120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 05:38:46,994][12645] Avg episode reward: [(0, '0.082')] +[2024-06-18 05:38:47,483][12883] Updated weights for policy 0, policy_version 69101 (0.0033) +[2024-06-18 05:38:51,412][12883] Updated weights for policy 0, policy_version 69111 (0.0032) +[2024-06-18 05:38:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1132314624. Throughput: 0: 42397.2. 
Samples: 1132443100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 05:38:51,994][12645] Avg episode reward: [(0, '0.190')] +[2024-06-18 05:38:55,647][12883] Updated weights for policy 0, policy_version 69121 (0.0029) +[2024-06-18 05:38:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1132544000. Throughput: 0: 42240.1. Samples: 1132699800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 05:38:56,994][12645] Avg episode reward: [(0, '0.216')] +[2024-06-18 05:38:59,490][12883] Updated weights for policy 0, policy_version 69131 (0.0040) +[2024-06-18 05:39:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1132756992. Throughput: 0: 42319.5. Samples: 1132824600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 05:39:01,994][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 05:39:03,344][12883] Updated weights for policy 0, policy_version 69141 (0.0035) +[2024-06-18 05:39:05,604][12862] Signal inference workers to stop experience collection... (16400 times) +[2024-06-18 05:39:05,604][12862] Signal inference workers to resume experience collection... (16400 times) +[2024-06-18 05:39:05,648][12883] InferenceWorker_p0-w0: stopping experience collection (16400 times) +[2024-06-18 05:39:05,648][12883] InferenceWorker_p0-w0: resuming experience collection (16400 times) +[2024-06-18 05:39:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1132953600. Throughput: 0: 42179.2. Samples: 1133078380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 05:39:06,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 05:39:07,105][12883] Updated weights for policy 0, policy_version 69151 (0.0033) +[2024-06-18 05:39:11,115][12883] Updated weights for policy 0, policy_version 69161 (0.0039) +[2024-06-18 05:39:11,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42052.4, 300 sec: 42320.7). 
Total num frames: 1133150208. Throughput: 0: 42138.8. Samples: 1133329580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 05:39:11,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 05:39:14,903][12883] Updated weights for policy 0, policy_version 69171 (0.0023) +[2024-06-18 05:39:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1133395968. Throughput: 0: 42045.8. Samples: 1133453520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 05:39:16,994][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 05:39:18,833][12883] Updated weights for policy 0, policy_version 69181 (0.0025) +[2024-06-18 05:39:21,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1133608960. Throughput: 0: 42177.3. Samples: 1133712720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 05:39:21,994][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 05:39:22,597][12883] Updated weights for policy 0, policy_version 69191 (0.0043) +[2024-06-18 05:39:26,413][12883] Updated weights for policy 0, policy_version 69201 (0.0027) +[2024-06-18 05:39:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1133805568. Throughput: 0: 42253.5. Samples: 1133969100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 05:39:26,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 05:39:30,165][12883] Updated weights for policy 0, policy_version 69211 (0.0042) +[2024-06-18 05:39:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 1134034944. Throughput: 0: 42374.6. Samples: 1134095980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:39:31,994][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 05:39:34,139][12883] Updated weights for policy 0, policy_version 69221 (0.0043) +[2024-06-18 05:39:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42376.2). 
Total num frames: 1134231552. Throughput: 0: 42307.5. Samples: 1134346940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:39:36,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 05:39:37,859][12883] Updated weights for policy 0, policy_version 69231 (0.0041) +[2024-06-18 05:39:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42432.1). Total num frames: 1134428160. Throughput: 0: 42345.3. Samples: 1134605340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:39:41,994][12645] Avg episode reward: [(0, '0.180')] +[2024-06-18 05:39:42,066][12883] Updated weights for policy 0, policy_version 69241 (0.0031) +[2024-06-18 05:39:45,598][12883] Updated weights for policy 0, policy_version 69251 (0.0032) +[2024-06-18 05:39:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 1134673920. Throughput: 0: 42287.6. Samples: 1134727540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:39:46,998][12645] Avg episode reward: [(0, '0.222')] +[2024-06-18 05:39:49,973][12883] Updated weights for policy 0, policy_version 69261 (0.0030) +[2024-06-18 05:39:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 1134870528. Throughput: 0: 42398.8. Samples: 1134986320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:39:51,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 05:39:53,492][12883] Updated weights for policy 0, policy_version 69271 (0.0041) +[2024-06-18 05:39:56,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1135067136. Throughput: 0: 42349.8. Samples: 1135235320. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:39:56,994][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 05:39:57,629][12883] Updated weights for policy 0, policy_version 69281 (0.0031) +[2024-06-18 05:40:01,301][12883] Updated weights for policy 0, policy_version 69291 (0.0028) +[2024-06-18 05:40:01,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1135296512. Throughput: 0: 42500.0. Samples: 1135366020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:40:01,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 05:40:05,459][12883] Updated weights for policy 0, policy_version 69301 (0.0042) +[2024-06-18 05:40:06,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1135509504. Throughput: 0: 42461.8. Samples: 1135623500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:40:06,994][12645] Avg episode reward: [(0, '0.154')] +[2024-06-18 05:40:09,208][12883] Updated weights for policy 0, policy_version 69311 (0.0034) +[2024-06-18 05:40:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42431.8). Total num frames: 1135722496. Throughput: 0: 42299.9. Samples: 1135872600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 05:40:11,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 05:40:13,369][12883] Updated weights for policy 0, policy_version 69321 (0.0041) +[2024-06-18 05:40:16,896][12883] Updated weights for policy 0, policy_version 69331 (0.0038) +[2024-06-18 05:40:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42265.1). Total num frames: 1135919104. Throughput: 0: 42294.1. Samples: 1135999220. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 05:40:16,995][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 05:40:20,997][12883] Updated weights for policy 0, policy_version 69341 (0.0043) +[2024-06-18 05:40:21,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1136132096. Throughput: 0: 42509.6. Samples: 1136259860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 05:40:21,994][12645] Avg episode reward: [(0, '0.197')] +[2024-06-18 05:40:24,468][12883] Updated weights for policy 0, policy_version 69351 (0.0038) +[2024-06-18 05:40:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1136361472. Throughput: 0: 42344.3. Samples: 1136510840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 05:40:26,995][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 05:40:28,550][12883] Updated weights for policy 0, policy_version 69361 (0.0045) +[2024-06-18 05:40:29,816][12862] Signal inference workers to stop experience collection... (16450 times) +[2024-06-18 05:40:29,817][12862] Signal inference workers to resume experience collection... (16450 times) +[2024-06-18 05:40:29,834][12883] InferenceWorker_p0-w0: stopping experience collection (16450 times) +[2024-06-18 05:40:29,834][12883] InferenceWorker_p0-w0: resuming experience collection (16450 times) +[2024-06-18 05:40:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 1136558080. Throughput: 0: 42493.9. Samples: 1136639760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 05:40:31,994][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 05:40:32,070][12883] Updated weights for policy 0, policy_version 69371 (0.0029) +[2024-06-18 05:40:36,218][12883] Updated weights for policy 0, policy_version 69381 (0.0033) +[2024-06-18 05:40:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1136771072. 
Throughput: 0: 42217.1. Samples: 1136886100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 05:40:36,994][12645] Avg episode reward: [(0, '0.039')] +[2024-06-18 05:40:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000069383_1136771072.pth... +[2024-06-18 05:40:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000068763_1126612992.pth +[2024-06-18 05:40:39,663][12883] Updated weights for policy 0, policy_version 69391 (0.0031) +[2024-06-18 05:40:41,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42376.6). Total num frames: 1136984064. Throughput: 0: 42277.6. Samples: 1137137820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 05:40:41,994][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 05:40:43,830][12883] Updated weights for policy 0, policy_version 69401 (0.0027) +[2024-06-18 05:40:46,997][12645] Fps is (10 sec: 40945.2, 60 sec: 41776.6, 300 sec: 42320.1). Total num frames: 1137180672. Throughput: 0: 42151.2. Samples: 1137262980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 05:40:46,998][12645] Avg episode reward: [(0, '0.144')] +[2024-06-18 05:40:47,686][12883] Updated weights for policy 0, policy_version 69411 (0.0028) +[2024-06-18 05:40:51,472][12883] Updated weights for policy 0, policy_version 69421 (0.0041) +[2024-06-18 05:40:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1137410048. Throughput: 0: 42092.8. Samples: 1137517680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 05:40:51,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 05:40:55,557][12883] Updated weights for policy 0, policy_version 69431 (0.0047) +[2024-06-18 05:40:56,996][12645] Fps is (10 sec: 44243.5, 60 sec: 42596.7, 300 sec: 42431.5). Total num frames: 1137623040. Throughput: 0: 42138.0. Samples: 1137768900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 05:40:56,997][12645] Avg episode reward: [(0, '0.209')] +[2024-06-18 05:40:59,262][12883] Updated weights for policy 0, policy_version 69441 (0.0032) +[2024-06-18 05:41:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1137819648. Throughput: 0: 42085.9. Samples: 1137893080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 05:41:01,994][12645] Avg episode reward: [(0, '0.098')] +[2024-06-18 05:41:03,534][12883] Updated weights for policy 0, policy_version 69451 (0.0031) +[2024-06-18 05:41:06,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1138032640. Throughput: 0: 41931.1. Samples: 1138146760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 05:41:06,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 05:41:07,088][12883] Updated weights for policy 0, policy_version 69461 (0.0029) +[2024-06-18 05:41:11,336][12883] Updated weights for policy 0, policy_version 69471 (0.0040) +[2024-06-18 05:41:11,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42323.8, 300 sec: 42375.9). Total num frames: 1138262016. Throughput: 0: 42038.9. Samples: 1138402680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 05:41:11,996][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 05:41:14,703][12883] Updated weights for policy 0, policy_version 69481 (0.0047) +[2024-06-18 05:41:16,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1138442240. Throughput: 0: 41979.4. Samples: 1138528840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 05:41:16,995][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 05:41:18,891][12883] Updated weights for policy 0, policy_version 69491 (0.0027) +[2024-06-18 05:41:22,000][12645] Fps is (10 sec: 42581.1, 60 sec: 42593.9, 300 sec: 42319.8). Total num frames: 1138688000. Throughput: 0: 42088.1. Samples: 1138780320. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 05:41:22,000][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 05:41:22,335][12883] Updated weights for policy 0, policy_version 69501 (0.0028) +[2024-06-18 05:41:26,601][12883] Updated weights for policy 0, policy_version 69511 (0.0031) +[2024-06-18 05:41:26,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41779.4, 300 sec: 42265.2). Total num frames: 1138868224. Throughput: 0: 42308.6. Samples: 1139041700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 05:41:26,994][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 05:41:29,847][12883] Updated weights for policy 0, policy_version 69521 (0.0033) +[2024-06-18 05:41:31,994][12645] Fps is (10 sec: 39346.2, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1139081216. Throughput: 0: 42252.4. Samples: 1139164180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 05:41:31,994][12645] Avg episode reward: [(0, '0.186')] +[2024-06-18 05:41:34,172][12883] Updated weights for policy 0, policy_version 69531 (0.0035) +[2024-06-18 05:41:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1139294208. Throughput: 0: 42246.2. Samples: 1139418760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 05:41:36,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 05:41:37,517][12883] Updated weights for policy 0, policy_version 69541 (0.0039) +[2024-06-18 05:41:39,395][12862] Signal inference workers to stop experience collection... (16500 times) +[2024-06-18 05:41:39,395][12862] Signal inference workers to resume experience collection... (16500 times) +[2024-06-18 05:41:39,437][12883] InferenceWorker_p0-w0: stopping experience collection (16500 times) +[2024-06-18 05:41:39,438][12883] InferenceWorker_p0-w0: resuming experience collection (16500 times) +[2024-06-18 05:41:41,995][12645] Fps is (10 sec: 42593.9, 60 sec: 42051.6, 300 sec: 42265.0). Total num frames: 1139507200. 
Throughput: 0: 42288.7. Samples: 1139671840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) +[2024-06-18 05:41:41,995][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 05:41:42,035][12883] Updated weights for policy 0, policy_version 69551 (0.0030) +[2024-06-18 05:41:46,156][12883] Updated weights for policy 0, policy_version 69561 (0.0039) +[2024-06-18 05:41:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42054.9, 300 sec: 42154.1). Total num frames: 1139703808. Throughput: 0: 42367.6. Samples: 1139799620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) +[2024-06-18 05:41:46,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 05:41:49,711][12883] Updated weights for policy 0, policy_version 69571 (0.0023) +[2024-06-18 05:41:51,994][12645] Fps is (10 sec: 42602.8, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1139933184. Throughput: 0: 42279.9. Samples: 1140049360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) +[2024-06-18 05:41:51,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 05:41:53,976][12883] Updated weights for policy 0, policy_version 69581 (0.0050) +[2024-06-18 05:41:56,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42053.8, 300 sec: 42265.2). Total num frames: 1140146176. Throughput: 0: 42224.7. Samples: 1140302700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) +[2024-06-18 05:41:56,994][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 05:41:57,623][12883] Updated weights for policy 0, policy_version 69591 (0.0033) +[2024-06-18 05:42:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42098.9). Total num frames: 1140326400. Throughput: 0: 42132.5. Samples: 1140424800. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) +[2024-06-18 05:42:01,999][12645] Avg episode reward: [(0, '0.150')] +[2024-06-18 05:42:02,157][12883] Updated weights for policy 0, policy_version 69601 (0.0039) +[2024-06-18 05:42:05,237][12883] Updated weights for policy 0, policy_version 69611 (0.0040) +[2024-06-18 05:42:06,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42050.6, 300 sec: 42265.2). Total num frames: 1140555776. Throughput: 0: 42184.6. Samples: 1140678460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) +[2024-06-18 05:42:06,996][12645] Avg episode reward: [(0, '0.537')] +[2024-06-18 05:42:09,719][12883] Updated weights for policy 0, policy_version 69621 (0.0033) +[2024-06-18 05:42:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41780.7, 300 sec: 42265.2). Total num frames: 1140768768. Throughput: 0: 42076.8. Samples: 1140935160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) +[2024-06-18 05:42:11,994][12645] Avg episode reward: [(0, '0.413')] +[2024-06-18 05:42:13,129][12883] Updated weights for policy 0, policy_version 69631 (0.0037) +[2024-06-18 05:42:16,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1140981760. Throughput: 0: 42167.5. Samples: 1141061720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) +[2024-06-18 05:42:16,994][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 05:42:17,390][12883] Updated weights for policy 0, policy_version 69641 (0.0042) +[2024-06-18 05:42:20,949][12883] Updated weights for policy 0, policy_version 69651 (0.0028) +[2024-06-18 05:42:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41783.5, 300 sec: 42265.2). Total num frames: 1141194752. Throughput: 0: 41947.6. Samples: 1141306400. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 05:42:21,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 05:42:25,215][12883] Updated weights for policy 0, policy_version 69661 (0.0026) +[2024-06-18 05:42:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 1141407744. Throughput: 0: 42041.8. Samples: 1141563680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 05:42:26,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 05:42:28,997][12883] Updated weights for policy 0, policy_version 69671 (0.0026) +[2024-06-18 05:42:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1141620736. Throughput: 0: 42074.2. Samples: 1141692960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 05:42:31,997][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 05:42:32,903][12883] Updated weights for policy 0, policy_version 69681 (0.0032) +[2024-06-18 05:42:36,651][12883] Updated weights for policy 0, policy_version 69691 (0.0033) +[2024-06-18 05:42:36,997][12645] Fps is (10 sec: 40945.8, 60 sec: 42049.8, 300 sec: 42210.0). Total num frames: 1141817344. Throughput: 0: 42068.7. Samples: 1141942600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 05:42:36,998][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 05:42:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000069691_1141817344.pth... +[2024-06-18 05:42:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000069074_1131708416.pth +[2024-06-18 05:42:40,661][12883] Updated weights for policy 0, policy_version 69701 (0.0033) +[2024-06-18 05:42:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42326.1, 300 sec: 42265.2). Total num frames: 1142046720. Throughput: 0: 42186.7. Samples: 1142201100. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 05:42:41,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 05:42:44,200][12883] Updated weights for policy 0, policy_version 69711 (0.0031) +[2024-06-18 05:42:46,994][12645] Fps is (10 sec: 44250.0, 60 sec: 42598.0, 300 sec: 42265.1). Total num frames: 1142259712. Throughput: 0: 42313.3. Samples: 1142328920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 05:42:46,995][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 05:42:48,510][12883] Updated weights for policy 0, policy_version 69721 (0.0049) +[2024-06-18 05:42:49,409][12862] Signal inference workers to stop experience collection... (16550 times) +[2024-06-18 05:42:49,462][12862] Signal inference workers to resume experience collection... (16550 times) +[2024-06-18 05:42:49,463][12883] InferenceWorker_p0-w0: stopping experience collection (16550 times) +[2024-06-18 05:42:49,488][12883] InferenceWorker_p0-w0: resuming experience collection (16550 times) +[2024-06-18 05:42:51,890][12883] Updated weights for policy 0, policy_version 69731 (0.0033) +[2024-06-18 05:42:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1142472704. Throughput: 0: 42152.7. Samples: 1142575240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 05:42:51,995][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 05:42:56,474][12883] Updated weights for policy 0, policy_version 69741 (0.0046) +[2024-06-18 05:42:56,994][12645] Fps is (10 sec: 40962.0, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1142669312. Throughput: 0: 42314.2. Samples: 1142839300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 05:42:56,994][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 05:42:59,570][12883] Updated weights for policy 0, policy_version 69751 (0.0032) +[2024-06-18 05:43:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42265.1). Total num frames: 1142898688. 
Throughput: 0: 42187.4. Samples: 1142960160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 05:43:01,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 05:43:04,129][12883] Updated weights for policy 0, policy_version 69761 (0.0029) +[2024-06-18 05:43:06,994][12645] Fps is (10 sec: 42595.8, 60 sec: 42326.4, 300 sec: 42265.1). Total num frames: 1143095296. Throughput: 0: 42366.0. Samples: 1143212900. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:43:06,995][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 05:43:07,329][12883] Updated weights for policy 0, policy_version 69771 (0.0034) +[2024-06-18 05:43:11,966][12883] Updated weights for policy 0, policy_version 69781 (0.0026) +[2024-06-18 05:43:12,000][12645] Fps is (10 sec: 39297.6, 60 sec: 42047.9, 300 sec: 42153.2). Total num frames: 1143291904. Throughput: 0: 42370.6. Samples: 1143470620. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:43:12,001][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 05:43:15,198][12883] Updated weights for policy 0, policy_version 69791 (0.0036) +[2024-06-18 05:43:16,994][12645] Fps is (10 sec: 44240.2, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 1143537664. Throughput: 0: 42225.0. Samples: 1143593080. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:43:16,994][12645] Avg episode reward: [(0, '0.227')] +[2024-06-18 05:43:19,582][12883] Updated weights for policy 0, policy_version 69801 (0.0028) +[2024-06-18 05:43:21,994][12645] Fps is (10 sec: 44265.1, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1143734272. Throughput: 0: 42372.8. Samples: 1143849220. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:43:21,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 05:43:22,801][12883] Updated weights for policy 0, policy_version 69811 (0.0029) +[2024-06-18 05:43:26,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1143930880. 
Throughput: 0: 42329.8. Samples: 1144105940. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:43:26,994][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 05:43:27,703][12883] Updated weights for policy 0, policy_version 69821 (0.0021) +[2024-06-18 05:43:30,482][12883] Updated weights for policy 0, policy_version 69831 (0.0029) +[2024-06-18 05:43:31,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1144176640. Throughput: 0: 42228.9. Samples: 1144229200. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:43:31,994][12645] Avg episode reward: [(0, '0.094')] +[2024-06-18 05:43:35,368][12883] Updated weights for policy 0, policy_version 69841 (0.0032) +[2024-06-18 05:43:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42327.8, 300 sec: 42209.6). Total num frames: 1144356864. Throughput: 0: 42524.9. Samples: 1144488860. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:43:36,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 05:43:38,106][12883] Updated weights for policy 0, policy_version 69851 (0.0030) +[2024-06-18 05:43:41,994][12645] Fps is (10 sec: 37683.7, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1144553472. Throughput: 0: 42155.7. Samples: 1144736300. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:43:41,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 05:43:43,028][12883] Updated weights for policy 0, policy_version 69861 (0.0021) +[2024-06-18 05:43:45,947][12883] Updated weights for policy 0, policy_version 69871 (0.0054) +[2024-06-18 05:43:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.7, 300 sec: 42320.7). Total num frames: 1144799232. Throughput: 0: 42307.2. Samples: 1144863980. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:43:46,994][12645] Avg episode reward: [(0, '0.138')] +[2024-06-18 05:43:50,643][12883] Updated weights for policy 0, policy_version 69881 (0.0030) +[2024-06-18 05:43:51,994][12645] Fps is (10 sec: 42597.6, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1144979456. Throughput: 0: 42390.8. Samples: 1145120460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 05:43:51,998][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 05:43:53,653][12883] Updated weights for policy 0, policy_version 69891 (0.0037) +[2024-06-18 05:43:57,000][12645] Fps is (10 sec: 40934.4, 60 sec: 42321.0, 300 sec: 42208.7). Total num frames: 1145208832. Throughput: 0: 42041.3. Samples: 1145362480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 05:43:57,000][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 05:43:58,476][12883] Updated weights for policy 0, policy_version 69901 (0.0038) +[2024-06-18 05:43:59,052][12862] Signal inference workers to stop experience collection... (16600 times) +[2024-06-18 05:43:59,052][12862] Signal inference workers to resume experience collection... (16600 times) +[2024-06-18 05:43:59,081][12883] InferenceWorker_p0-w0: stopping experience collection (16600 times) +[2024-06-18 05:43:59,081][12883] InferenceWorker_p0-w0: resuming experience collection (16600 times) +[2024-06-18 05:44:01,761][12883] Updated weights for policy 0, policy_version 69911 (0.0039) +[2024-06-18 05:44:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 1145421824. Throughput: 0: 42284.4. Samples: 1145495880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 05:44:01,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 05:44:06,070][12883] Updated weights for policy 0, policy_version 69921 (0.0026) +[2024-06-18 05:44:06,994][12645] Fps is (10 sec: 42625.1, 60 sec: 42325.8, 300 sec: 42320.7). Total num frames: 1145634816. 
Throughput: 0: 42398.1. Samples: 1145757140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 05:44:06,994][12645] Avg episode reward: [(0, '0.651')] +[2024-06-18 05:44:07,017][12862] Saving new best policy, reward=0.651! +[2024-06-18 05:44:09,690][12883] Updated weights for policy 0, policy_version 69931 (0.0028) +[2024-06-18 05:44:11,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42875.9, 300 sec: 42265.2). Total num frames: 1145864192. Throughput: 0: 42055.9. Samples: 1145998460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 05:44:11,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 05:44:13,956][12883] Updated weights for policy 0, policy_version 69941 (0.0035) +[2024-06-18 05:44:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 1146044416. Throughput: 0: 42217.7. Samples: 1146129000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 05:44:16,994][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 05:44:17,632][12883] Updated weights for policy 0, policy_version 69951 (0.0042) +[2024-06-18 05:44:21,653][12883] Updated weights for policy 0, policy_version 69961 (0.0031) +[2024-06-18 05:44:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.1, 300 sec: 42209.6). Total num frames: 1146257408. Throughput: 0: 42083.1. Samples: 1146382600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 05:44:21,997][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 05:44:25,302][12883] Updated weights for policy 0, policy_version 69971 (0.0029) +[2024-06-18 05:44:26,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42596.8, 300 sec: 42209.3). Total num frames: 1146486784. Throughput: 0: 42073.8. Samples: 1146629720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 05:44:26,996][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 05:44:29,247][12883] Updated weights for policy 0, policy_version 69981 (0.0046) +[2024-06-18 05:44:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 1146683392. Throughput: 0: 42194.6. Samples: 1146762740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 05:44:31,999][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 05:44:33,024][12883] Updated weights for policy 0, policy_version 69991 (0.0034) +[2024-06-18 05:44:36,987][12883] Updated weights for policy 0, policy_version 70001 (0.0033) +[2024-06-18 05:44:36,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1146896384. Throughput: 0: 42012.6. Samples: 1147011020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 05:44:36,994][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 05:44:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070001_1146896384.pth... +[2024-06-18 05:44:37,085][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000069383_1136771072.pth +[2024-06-18 05:44:40,718][12883] Updated weights for policy 0, policy_version 70011 (0.0033) +[2024-06-18 05:44:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 1147092992. Throughput: 0: 42295.7. Samples: 1147265520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 05:44:41,994][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 05:44:44,559][12883] Updated weights for policy 0, policy_version 70021 (0.0030) +[2024-06-18 05:44:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1147322368. Throughput: 0: 42197.8. Samples: 1147394780. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 05:44:46,994][12645] Avg episode reward: [(0, '0.239')] +[2024-06-18 05:44:48,363][12883] Updated weights for policy 0, policy_version 70031 (0.0039) +[2024-06-18 05:44:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42265.1). Total num frames: 1147535360. Throughput: 0: 42020.0. Samples: 1147648040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 05:44:51,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 05:44:52,259][12883] Updated weights for policy 0, policy_version 70041 (0.0033) +[2024-06-18 05:44:56,091][12883] Updated weights for policy 0, policy_version 70051 (0.0035) +[2024-06-18 05:44:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42329.8, 300 sec: 42209.6). Total num frames: 1147748352. Throughput: 0: 42211.7. Samples: 1147897980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 05:44:56,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 05:44:59,940][12883] Updated weights for policy 0, policy_version 70061 (0.0023) +[2024-06-18 05:45:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1147944960. Throughput: 0: 42076.5. Samples: 1148022440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 05:45:01,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 05:45:03,951][12883] Updated weights for policy 0, policy_version 70071 (0.0033) +[2024-06-18 05:45:06,995][12645] Fps is (10 sec: 40955.4, 60 sec: 42051.5, 300 sec: 42153.9). Total num frames: 1148157952. Throughput: 0: 42072.4. Samples: 1148275900. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 05:45:06,995][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 05:45:07,671][12883] Updated weights for policy 0, policy_version 70081 (0.0032) +[2024-06-18 05:45:11,847][12883] Updated weights for policy 0, policy_version 70091 (0.0037) +[2024-06-18 05:45:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1148370944. Throughput: 0: 42174.1. Samples: 1148527460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 05:45:11,994][12645] Avg episode reward: [(0, '0.165')] +[2024-06-18 05:45:15,728][12883] Updated weights for policy 0, policy_version 70101 (0.0027) +[2024-06-18 05:45:16,994][12645] Fps is (10 sec: 40964.1, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1148567552. Throughput: 0: 42092.4. Samples: 1148656900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) +[2024-06-18 05:45:17,003][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 05:45:19,655][12883] Updated weights for policy 0, policy_version 70111 (0.0038) +[2024-06-18 05:45:21,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1148780544. Throughput: 0: 42148.7. Samples: 1148907720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) +[2024-06-18 05:45:21,994][12645] Avg episode reward: [(0, '0.218')] +[2024-06-18 05:45:23,526][12883] Updated weights for policy 0, policy_version 70121 (0.0032) +[2024-06-18 05:45:26,996][12645] Fps is (10 sec: 44227.5, 60 sec: 42052.3, 300 sec: 42209.3). Total num frames: 1149009920. Throughput: 0: 41938.4. Samples: 1149152840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) +[2024-06-18 05:45:26,996][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 05:45:27,197][12883] Updated weights for policy 0, policy_version 70131 (0.0046) +[2024-06-18 05:45:31,186][12883] Updated weights for policy 0, policy_version 70141 (0.0034) +[2024-06-18 05:45:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1149222912. Throughput: 0: 42147.8. Samples: 1149291440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) +[2024-06-18 05:45:31,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 05:45:33,008][12862] Signal inference workers to stop experience collection... (16650 times) +[2024-06-18 05:45:33,009][12862] Signal inference workers to resume experience collection... (16650 times) +[2024-06-18 05:45:33,050][12883] InferenceWorker_p0-w0: stopping experience collection (16650 times) +[2024-06-18 05:45:33,056][12883] InferenceWorker_p0-w0: resuming experience collection (16650 times) +[2024-06-18 05:45:35,314][12883] Updated weights for policy 0, policy_version 70151 (0.0030) +[2024-06-18 05:45:36,994][12645] Fps is (10 sec: 39330.2, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 1149403136. Throughput: 0: 42030.2. Samples: 1149539400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) +[2024-06-18 05:45:36,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 05:45:38,960][12883] Updated weights for policy 0, policy_version 70161 (0.0032) +[2024-06-18 05:45:41,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42265.7). Total num frames: 1149648896. Throughput: 0: 41983.6. Samples: 1149787240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) +[2024-06-18 05:45:41,994][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 05:45:42,981][12883] Updated weights for policy 0, policy_version 70171 (0.0041) +[2024-06-18 05:45:46,651][12883] Updated weights for policy 0, policy_version 70181 (0.0032) +[2024-06-18 05:45:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1149845504. Throughput: 0: 42214.3. Samples: 1149922080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) +[2024-06-18 05:45:46,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 05:45:50,630][12883] Updated weights for policy 0, policy_version 70191 (0.0026) +[2024-06-18 05:45:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42098.9). Total num frames: 1150042112. Throughput: 0: 42201.9. Samples: 1150174940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) +[2024-06-18 05:45:51,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 05:45:54,452][12883] Updated weights for policy 0, policy_version 70201 (0.0032) +[2024-06-18 05:45:56,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 1150287872. Throughput: 0: 42275.0. Samples: 1150429840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 05:45:56,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 05:45:58,226][12883] Updated weights for policy 0, policy_version 70211 (0.0031) +[2024-06-18 05:46:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1150468096. Throughput: 0: 42285.4. Samples: 1150559740. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 05:46:01,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 05:46:02,463][12883] Updated weights for policy 0, policy_version 70221 (0.0033) +[2024-06-18 05:46:06,136][12883] Updated weights for policy 0, policy_version 70231 (0.0053) +[2024-06-18 05:46:06,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42053.0, 300 sec: 42098.9). Total num frames: 1150681088. Throughput: 0: 42133.9. Samples: 1150803740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 05:46:06,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 05:46:10,286][12883] Updated weights for policy 0, policy_version 70241 (0.0026) +[2024-06-18 05:46:11,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 1150926848. Throughput: 0: 42404.3. Samples: 1151060940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 05:46:11,994][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 05:46:13,811][12883] Updated weights for policy 0, policy_version 70251 (0.0037) +[2024-06-18 05:46:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42099.4). Total num frames: 1151107072. Throughput: 0: 42209.5. Samples: 1151190860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 05:46:16,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 05:46:17,698][12883] Updated weights for policy 0, policy_version 70261 (0.0037) +[2024-06-18 05:46:21,437][12883] Updated weights for policy 0, policy_version 70271 (0.0024) +[2024-06-18 05:46:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 1151320064. Throughput: 0: 42243.2. Samples: 1151440340. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 05:46:21,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 05:46:25,495][12883] Updated weights for policy 0, policy_version 70281 (0.0040) +[2024-06-18 05:46:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42326.9, 300 sec: 42265.2). Total num frames: 1151549440. Throughput: 0: 42449.7. Samples: 1151697480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 05:46:26,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 05:46:29,463][12883] Updated weights for policy 0, policy_version 70291 (0.0025) +[2024-06-18 05:46:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 1151746048. Throughput: 0: 42220.9. Samples: 1151822020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 05:46:31,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 05:46:33,113][12883] Updated weights for policy 0, policy_version 70301 (0.0037) +[2024-06-18 05:46:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42209.8). Total num frames: 1151959040. Throughput: 0: 42336.1. Samples: 1152080060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 05:46:36,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 05:46:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070310_1151959040.pth... +[2024-06-18 05:46:37,091][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000069691_1141817344.pth +[2024-06-18 05:46:37,291][12883] Updated weights for policy 0, policy_version 70311 (0.0034) +[2024-06-18 05:46:40,762][12883] Updated weights for policy 0, policy_version 70321 (0.0030) +[2024-06-18 05:46:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1152172032. Throughput: 0: 42321.6. Samples: 1152334300. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:46:41,994][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 05:46:44,783][12883] Updated weights for policy 0, policy_version 70331 (0.0038) +[2024-06-18 05:46:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1152385024. Throughput: 0: 42374.2. Samples: 1152466580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:46:46,994][12645] Avg episode reward: [(0, '0.092')] +[2024-06-18 05:46:48,375][12883] Updated weights for policy 0, policy_version 70341 (0.0032) +[2024-06-18 05:46:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1152581632. Throughput: 0: 42600.5. Samples: 1152720760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:46:51,994][12645] Avg episode reward: [(0, '0.216')] +[2024-06-18 05:46:52,348][12883] Updated weights for policy 0, policy_version 70351 (0.0040) +[2024-06-18 05:46:56,047][12883] Updated weights for policy 0, policy_version 70361 (0.0027) +[2024-06-18 05:46:56,996][12645] Fps is (10 sec: 44226.6, 60 sec: 42323.8, 300 sec: 42375.9). Total num frames: 1152827392. Throughput: 0: 42549.9. Samples: 1152975780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:46:56,997][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 05:46:59,608][12862] Signal inference workers to stop experience collection... (16700 times) +[2024-06-18 05:46:59,641][12883] InferenceWorker_p0-w0: stopping experience collection (16700 times) +[2024-06-18 05:46:59,670][12862] Signal inference workers to resume experience collection... (16700 times) +[2024-06-18 05:46:59,671][12883] InferenceWorker_p0-w0: resuming experience collection (16700 times) +[2024-06-18 05:47:00,010][12883] Updated weights for policy 0, policy_version 70371 (0.0047) +[2024-06-18 05:47:01,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42265.5). Total num frames: 1153024000. 
Throughput: 0: 42589.2. Samples: 1153107380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:47:01,994][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 05:47:03,661][12883] Updated weights for policy 0, policy_version 70381 (0.0032) +[2024-06-18 05:47:06,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 1153236992. Throughput: 0: 42528.4. Samples: 1153354120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:47:06,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 05:47:08,358][12883] Updated weights for policy 0, policy_version 70391 (0.0043) +[2024-06-18 05:47:11,639][12883] Updated weights for policy 0, policy_version 70401 (0.0039) +[2024-06-18 05:47:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1153466368. Throughput: 0: 42393.2. Samples: 1153605180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:47:11,994][12645] Avg episode reward: [(0, '0.142')] +[2024-06-18 05:47:15,969][12883] Updated weights for policy 0, policy_version 70411 (0.0042) +[2024-06-18 05:47:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1153646592. Throughput: 0: 42546.6. Samples: 1153736620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:47:16,994][12645] Avg episode reward: [(0, '0.324')] +[2024-06-18 05:47:19,387][12883] Updated weights for policy 0, policy_version 70421 (0.0045) +[2024-06-18 05:47:21,993][12645] Fps is (10 sec: 39322.7, 60 sec: 42325.4, 300 sec: 42209.7). Total num frames: 1153859584. Throughput: 0: 42375.6. Samples: 1153986960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 05:47:21,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 05:47:23,976][12883] Updated weights for policy 0, policy_version 70431 (0.0033) +[2024-06-18 05:47:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1154088960. 
Throughput: 0: 42364.3. Samples: 1154240700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:47:26,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 05:47:27,164][12883] Updated weights for policy 0, policy_version 70441 (0.0048) +[2024-06-18 05:47:31,514][12883] Updated weights for policy 0, policy_version 70451 (0.0028) +[2024-06-18 05:47:31,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42265.7). Total num frames: 1154285568. Throughput: 0: 42275.5. Samples: 1154368980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:47:31,994][12645] Avg episode reward: [(0, '0.212')] +[2024-06-18 05:47:34,850][12883] Updated weights for policy 0, policy_version 70461 (0.0042) +[2024-06-18 05:47:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1154498560. Throughput: 0: 42357.3. Samples: 1154626840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:47:36,994][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 05:47:39,448][12883] Updated weights for policy 0, policy_version 70471 (0.0034) +[2024-06-18 05:47:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42265.3). Total num frames: 1154727936. Throughput: 0: 42358.7. Samples: 1154881820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:47:41,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 05:47:42,361][12883] Updated weights for policy 0, policy_version 70481 (0.0027) +[2024-06-18 05:47:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1154924544. Throughput: 0: 42295.2. Samples: 1155010660. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:47:46,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 05:47:46,996][12883] Updated weights for policy 0, policy_version 70491 (0.0041) +[2024-06-18 05:47:50,273][12883] Updated weights for policy 0, policy_version 70501 (0.0034) +[2024-06-18 05:47:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 1155153920. Throughput: 0: 42399.5. Samples: 1155262100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:47:51,998][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 05:47:54,680][12883] Updated weights for policy 0, policy_version 70511 (0.0041) +[2024-06-18 05:47:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42053.8, 300 sec: 42209.6). Total num frames: 1155350528. Throughput: 0: 42626.7. Samples: 1155523380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:47:56,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 05:47:57,949][12883] Updated weights for policy 0, policy_version 70521 (0.0034) +[2024-06-18 05:48:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42265.3). Total num frames: 1155563520. Throughput: 0: 42365.8. Samples: 1155643080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:48:01,994][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 05:48:02,298][12883] Updated weights for policy 0, policy_version 70531 (0.0022) +[2024-06-18 05:48:05,850][12883] Updated weights for policy 0, policy_version 70541 (0.0035) +[2024-06-18 05:48:06,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42377.2). Total num frames: 1155792896. Throughput: 0: 42548.4. Samples: 1155901640. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:48:06,994][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 05:48:10,433][12883] Updated weights for policy 0, policy_version 70551 (0.0033) +[2024-06-18 05:48:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1155973120. Throughput: 0: 42439.6. Samples: 1156150480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:48:11,994][12645] Avg episode reward: [(0, '0.159')] +[2024-06-18 05:48:13,523][12883] Updated weights for policy 0, policy_version 70561 (0.0027) +[2024-06-18 05:48:16,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 1156186112. Throughput: 0: 42284.4. Samples: 1156271780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:48:16,998][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 05:48:18,035][12883] Updated weights for policy 0, policy_version 70571 (0.0020) +[2024-06-18 05:48:21,163][12883] Updated weights for policy 0, policy_version 70581 (0.0029) +[2024-06-18 05:48:21,994][12645] Fps is (10 sec: 47513.4, 60 sec: 43144.4, 300 sec: 42431.8). Total num frames: 1156448256. Throughput: 0: 42320.0. Samples: 1156531240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:48:21,994][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 05:48:25,766][12883] Updated weights for policy 0, policy_version 70591 (0.0032) +[2024-06-18 05:48:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1156612096. Throughput: 0: 42392.8. Samples: 1156789500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:48:26,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 05:48:29,070][12883] Updated weights for policy 0, policy_version 70601 (0.0036) +[2024-06-18 05:48:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1156841472. Throughput: 0: 42080.5. Samples: 1156904280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:48:31,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 05:48:33,353][12883] Updated weights for policy 0, policy_version 70611 (0.0025) +[2024-06-18 05:48:36,672][12862] Signal inference workers to stop experience collection... (16750 times) +[2024-06-18 05:48:36,672][12862] Signal inference workers to resume experience collection... (16750 times) +[2024-06-18 05:48:36,728][12883] InferenceWorker_p0-w0: stopping experience collection (16750 times) +[2024-06-18 05:48:36,728][12883] InferenceWorker_p0-w0: resuming experience collection (16750 times) +[2024-06-18 05:48:36,802][12883] Updated weights for policy 0, policy_version 70621 (0.0039) +[2024-06-18 05:48:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1157054464. Throughput: 0: 42331.1. Samples: 1157167000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:48:36,994][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 05:48:37,026][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070622_1157070848.pth... +[2024-06-18 05:48:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070001_1146896384.pth +[2024-06-18 05:48:41,073][12883] Updated weights for policy 0, policy_version 70631 (0.0031) +[2024-06-18 05:48:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1157234688. Throughput: 0: 42106.3. Samples: 1157418160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:48:41,994][12645] Avg episode reward: [(0, '0.252')] +[2024-06-18 05:48:44,806][12883] Updated weights for policy 0, policy_version 70641 (0.0038) +[2024-06-18 05:48:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1157480448. Throughput: 0: 42254.2. Samples: 1157544520. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:48:46,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 05:48:48,718][12883] Updated weights for policy 0, policy_version 70651 (0.0030) +[2024-06-18 05:48:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 42210.5). Total num frames: 1157660672. Throughput: 0: 41965.2. Samples: 1157790080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 05:48:51,994][12645] Avg episode reward: [(0, '0.265')] +[2024-06-18 05:48:52,723][12883] Updated weights for policy 0, policy_version 70661 (0.0028) +[2024-06-18 05:48:56,376][12883] Updated weights for policy 0, policy_version 70671 (0.0029) +[2024-06-18 05:48:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1157890048. Throughput: 0: 42129.8. Samples: 1158046320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 05:48:56,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 05:49:00,712][12883] Updated weights for policy 0, policy_version 70681 (0.0032) +[2024-06-18 05:49:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1158103040. Throughput: 0: 42340.1. Samples: 1158177080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 05:49:01,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 05:49:03,935][12883] Updated weights for policy 0, policy_version 70691 (0.0037) +[2024-06-18 05:49:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.1, 300 sec: 42209.6). Total num frames: 1158316032. Throughput: 0: 42174.6. Samples: 1158429100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 05:49:06,995][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 05:49:08,419][12883] Updated weights for policy 0, policy_version 70701 (0.0041) +[2024-06-18 05:49:11,648][12883] Updated weights for policy 0, policy_version 70711 (0.0036) +[2024-06-18 05:49:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1158529024. Throughput: 0: 41941.0. Samples: 1158676840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 05:49:11,994][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 05:49:16,452][12883] Updated weights for policy 0, policy_version 70721 (0.0046) +[2024-06-18 05:49:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1158742016. Throughput: 0: 42227.1. Samples: 1158804500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 05:49:16,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 05:49:19,511][12883] Updated weights for policy 0, policy_version 70731 (0.0044) +[2024-06-18 05:49:21,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41506.1, 300 sec: 42209.9). Total num frames: 1158938624. Throughput: 0: 42111.9. Samples: 1159062040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 05:49:22,008][12645] Avg episode reward: [(0, '0.276')] +[2024-06-18 05:49:24,026][12883] Updated weights for policy 0, policy_version 70741 (0.0031) +[2024-06-18 05:49:26,980][12883] Updated weights for policy 0, policy_version 70751 (0.0036) +[2024-06-18 05:49:26,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 1159184384. Throughput: 0: 42162.1. Samples: 1159315460. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 05:49:26,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 05:49:31,531][12883] Updated weights for policy 0, policy_version 70761 (0.0038) +[2024-06-18 05:49:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 1159348224. Throughput: 0: 42304.3. Samples: 1159448220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 05:49:31,994][12645] Avg episode reward: [(0, '0.132')] +[2024-06-18 05:49:34,631][12883] Updated weights for policy 0, policy_version 70771 (0.0031) +[2024-06-18 05:49:36,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1159577600. Throughput: 0: 42349.4. Samples: 1159695800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 05:49:36,994][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 05:49:39,380][12883] Updated weights for policy 0, policy_version 70781 (0.0035) +[2024-06-18 05:49:41,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1159790592. Throughput: 0: 42303.1. Samples: 1159949960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 05:49:41,994][12645] Avg episode reward: [(0, '0.131')] +[2024-06-18 05:49:42,492][12883] Updated weights for policy 0, policy_version 70791 (0.0039) +[2024-06-18 05:49:46,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1159987200. Throughput: 0: 42158.8. Samples: 1160074220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 05:49:46,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 05:49:47,053][12883] Updated weights for policy 0, policy_version 70801 (0.0030) +[2024-06-18 05:49:50,241][12883] Updated weights for policy 0, policy_version 70811 (0.0024) +[2024-06-18 05:49:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1160216576. Throughput: 0: 42110.3. Samples: 1160324060. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 05:49:51,994][12645] Avg episode reward: [(0, '0.250')] +[2024-06-18 05:49:54,772][12883] Updated weights for policy 0, policy_version 70821 (0.0027) +[2024-06-18 05:49:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1160413184. Throughput: 0: 42421.3. Samples: 1160585800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 05:49:56,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 05:49:57,870][12883] Updated weights for policy 0, policy_version 70831 (0.0038) +[2024-06-18 05:50:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42265.3). Total num frames: 1160626176. Throughput: 0: 42337.4. Samples: 1160709680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 05:50:01,994][12645] Avg episode reward: [(0, '0.268')] +[2024-06-18 05:50:02,582][12883] Updated weights for policy 0, policy_version 70841 (0.0035) +[2024-06-18 05:50:05,817][12883] Updated weights for policy 0, policy_version 70851 (0.0031) +[2024-06-18 05:50:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1160855552. Throughput: 0: 42207.2. Samples: 1160961360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 05:50:06,994][12645] Avg episode reward: [(0, '0.144')] +[2024-06-18 05:50:09,431][12862] Signal inference workers to stop experience collection... (16800 times) +[2024-06-18 05:50:09,431][12862] Signal inference workers to resume experience collection... (16800 times) +[2024-06-18 05:50:09,459][12883] InferenceWorker_p0-w0: stopping experience collection (16800 times) +[2024-06-18 05:50:09,487][12883] InferenceWorker_p0-w0: resuming experience collection (16800 times) +[2024-06-18 05:50:10,174][12883] Updated weights for policy 0, policy_version 70861 (0.0027) +[2024-06-18 05:50:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1161052160. 
Throughput: 0: 42280.6. Samples: 1161218080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 05:50:11,994][12645] Avg episode reward: [(0, '0.274')] +[2024-06-18 05:50:13,604][12883] Updated weights for policy 0, policy_version 70871 (0.0028) +[2024-06-18 05:50:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 1161248768. Throughput: 0: 42025.0. Samples: 1161339340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 05:50:16,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 05:50:17,674][12883] Updated weights for policy 0, policy_version 70881 (0.0038) +[2024-06-18 05:50:21,339][12883] Updated weights for policy 0, policy_version 70891 (0.0033) +[2024-06-18 05:50:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42321.0). Total num frames: 1161494528. Throughput: 0: 42301.9. Samples: 1161599380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 05:50:21,994][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 05:50:25,551][12883] Updated weights for policy 0, policy_version 70901 (0.0037) +[2024-06-18 05:50:26,996][12645] Fps is (10 sec: 44226.8, 60 sec: 41777.7, 300 sec: 42264.9). Total num frames: 1161691136. Throughput: 0: 42307.2. Samples: 1161853880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 05:50:26,996][12645] Avg episode reward: [(0, '0.234')] +[2024-06-18 05:50:29,318][12883] Updated weights for policy 0, policy_version 70911 (0.0037) +[2024-06-18 05:50:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1161887744. Throughput: 0: 42217.3. Samples: 1161974000. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 05:50:31,994][12645] Avg episode reward: [(0, '0.107')] +[2024-06-18 05:50:33,036][12883] Updated weights for policy 0, policy_version 70921 (0.0030) +[2024-06-18 05:50:36,787][12883] Updated weights for policy 0, policy_version 70931 (0.0033) +[2024-06-18 05:50:37,000][12645] Fps is (10 sec: 45856.6, 60 sec: 42867.0, 300 sec: 42375.3). Total num frames: 1162149888. Throughput: 0: 42549.6. Samples: 1162239060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 05:50:37,001][12645] Avg episode reward: [(0, '0.217')] +[2024-06-18 05:50:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070932_1162149888.pth... +[2024-06-18 05:50:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070310_1151959040.pth +[2024-06-18 05:50:40,665][12883] Updated weights for policy 0, policy_version 70941 (0.0027) +[2024-06-18 05:50:41,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1162330112. Throughput: 0: 42439.9. Samples: 1162495600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 05:50:41,994][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 05:50:44,466][12883] Updated weights for policy 0, policy_version 70951 (0.0029) +[2024-06-18 05:50:46,994][12645] Fps is (10 sec: 39346.3, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1162543104. Throughput: 0: 42342.2. Samples: 1162615080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 05:50:46,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 05:50:48,325][12883] Updated weights for policy 0, policy_version 70961 (0.0032) +[2024-06-18 05:50:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1162772480. Throughput: 0: 42486.2. Samples: 1162873240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 05:50:51,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 05:50:52,487][12883] Updated weights for policy 0, policy_version 70971 (0.0032) +[2024-06-18 05:50:56,319][12883] Updated weights for policy 0, policy_version 70981 (0.0040) +[2024-06-18 05:50:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1162985472. Throughput: 0: 42374.2. Samples: 1163124920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 05:50:56,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 05:51:00,331][12883] Updated weights for policy 0, policy_version 70991 (0.0032) +[2024-06-18 05:51:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1163182080. Throughput: 0: 42488.9. Samples: 1163251340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 05:51:01,994][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 05:51:04,058][12883] Updated weights for policy 0, policy_version 71001 (0.0036) +[2024-06-18 05:51:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1163411456. Throughput: 0: 42396.4. Samples: 1163507220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 05:51:06,994][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 05:51:08,108][12883] Updated weights for policy 0, policy_version 71011 (0.0040) +[2024-06-18 05:51:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1163591680. Throughput: 0: 42428.8. Samples: 1163763080. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 05:51:11,994][12645] Avg episode reward: [(0, '0.319')] +[2024-06-18 05:51:12,025][12883] Updated weights for policy 0, policy_version 71021 (0.0039) +[2024-06-18 05:51:16,333][12883] Updated weights for policy 0, policy_version 71031 (0.0032) +[2024-06-18 05:51:16,996][12645] Fps is (10 sec: 37674.7, 60 sec: 42323.7, 300 sec: 42264.8). Total num frames: 1163788288. Throughput: 0: 42368.6. Samples: 1163880680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 05:51:16,996][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 05:51:19,731][12883] Updated weights for policy 0, policy_version 71041 (0.0028) +[2024-06-18 05:51:21,748][12862] Signal inference workers to stop experience collection... (16850 times) +[2024-06-18 05:51:21,749][12862] Signal inference workers to resume experience collection... (16850 times) +[2024-06-18 05:51:21,790][12883] InferenceWorker_p0-w0: stopping experience collection (16850 times) +[2024-06-18 05:51:21,790][12883] InferenceWorker_p0-w0: resuming experience collection (16850 times) +[2024-06-18 05:51:21,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1164050432. Throughput: 0: 42151.6. Samples: 1164135620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 05:51:21,994][12645] Avg episode reward: [(0, '0.151')] +[2024-06-18 05:51:24,049][12883] Updated weights for policy 0, policy_version 71051 (0.0041) +[2024-06-18 05:51:26,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42326.9, 300 sec: 42320.7). Total num frames: 1164230656. Throughput: 0: 42182.2. Samples: 1164393800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 05:51:26,994][12645] Avg episode reward: [(0, '0.313')] +[2024-06-18 05:51:27,566][12883] Updated weights for policy 0, policy_version 71061 (0.0026) +[2024-06-18 05:51:31,631][12883] Updated weights for policy 0, policy_version 71071 (0.0036) +[2024-06-18 05:51:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1164443648. Throughput: 0: 42231.1. Samples: 1164515480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 05:51:31,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 05:51:35,283][12883] Updated weights for policy 0, policy_version 71081 (0.0027) +[2024-06-18 05:51:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42056.6, 300 sec: 42376.2). Total num frames: 1164673024. Throughput: 0: 42225.7. Samples: 1164773400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 05:51:36,994][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 05:51:39,327][12883] Updated weights for policy 0, policy_version 71091 (0.0023) +[2024-06-18 05:51:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1164853248. Throughput: 0: 42343.6. Samples: 1165030380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 05:51:41,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 05:51:42,956][12883] Updated weights for policy 0, policy_version 71101 (0.0040) +[2024-06-18 05:51:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1165066240. Throughput: 0: 42324.9. Samples: 1165155960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) +[2024-06-18 05:51:46,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 05:51:47,002][12883] Updated weights for policy 0, policy_version 71111 (0.0039) +[2024-06-18 05:51:51,107][12883] Updated weights for policy 0, policy_version 71121 (0.0030) +[2024-06-18 05:51:51,996][12645] Fps is (10 sec: 42588.6, 60 sec: 41777.6, 300 sec: 42209.6). Total num frames: 1165279232. Throughput: 0: 42250.3. Samples: 1165408580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) +[2024-06-18 05:51:51,997][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 05:51:54,606][12883] Updated weights for policy 0, policy_version 71131 (0.0039) +[2024-06-18 05:51:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 1165492224. Throughput: 0: 42216.9. Samples: 1165662840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) +[2024-06-18 05:51:56,994][12645] Avg episode reward: [(0, '0.172')] +[2024-06-18 05:51:58,870][12883] Updated weights for policy 0, policy_version 71141 (0.0026) +[2024-06-18 05:52:01,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1165705216. Throughput: 0: 42337.7. Samples: 1165785780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) +[2024-06-18 05:52:01,994][12645] Avg episode reward: [(0, '0.267')] +[2024-06-18 05:52:02,840][12883] Updated weights for policy 0, policy_version 71151 (0.0031) +[2024-06-18 05:52:06,526][12883] Updated weights for policy 0, policy_version 71161 (0.0046) +[2024-06-18 05:52:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42209.7). Total num frames: 1165918208. Throughput: 0: 42363.3. Samples: 1166041960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) +[2024-06-18 05:52:06,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 05:52:10,480][12883] Updated weights for policy 0, policy_version 71171 (0.0045) +[2024-06-18 05:52:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1166098432. Throughput: 0: 42300.6. Samples: 1166297320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) +[2024-06-18 05:52:11,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 05:52:14,379][12883] Updated weights for policy 0, policy_version 71181 (0.0031) +[2024-06-18 05:52:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42599.9, 300 sec: 42320.7). Total num frames: 1166344192. Throughput: 0: 42254.1. Samples: 1166416920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) +[2024-06-18 05:52:16,994][12645] Avg episode reward: [(0, '0.062')] +[2024-06-18 05:52:18,158][12883] Updated weights for policy 0, policy_version 71191 (0.0029) +[2024-06-18 05:52:21,994][12645] Fps is (10 sec: 44235.9, 60 sec: 41506.1, 300 sec: 42209.6). Total num frames: 1166540800. Throughput: 0: 42064.9. Samples: 1166666320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) +[2024-06-18 05:52:21,994][12645] Avg episode reward: [(0, '0.182')] +[2024-06-18 05:52:22,413][12883] Updated weights for policy 0, policy_version 71201 (0.0035) +[2024-06-18 05:52:25,752][12883] Updated weights for policy 0, policy_version 71211 (0.0032) +[2024-06-18 05:52:26,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42050.7, 300 sec: 42264.9). Total num frames: 1166753792. Throughput: 0: 41997.0. Samples: 1166920340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 05:52:26,997][12645] Avg episode reward: [(0, '0.182')] +[2024-06-18 05:52:29,995][12883] Updated weights for policy 0, policy_version 71221 (0.0030) +[2024-06-18 05:52:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1166983168. Throughput: 0: 42117.7. Samples: 1167051260. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 05:52:31,994][12645] Avg episode reward: [(0, '0.182')] +[2024-06-18 05:52:33,431][12883] Updated weights for policy 0, policy_version 71231 (0.0035) +[2024-06-18 05:52:36,994][12645] Fps is (10 sec: 42608.0, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1167179776. Throughput: 0: 42135.0. Samples: 1167304560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 05:52:36,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 05:52:37,090][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000071240_1167196160.pth... +[2024-06-18 05:52:37,150][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070622_1157070848.pth +[2024-06-18 05:52:37,543][12883] Updated weights for policy 0, policy_version 71241 (0.0035) +[2024-06-18 05:52:41,420][12883] Updated weights for policy 0, policy_version 71251 (0.0041) +[2024-06-18 05:52:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1167392768. Throughput: 0: 41959.0. Samples: 1167551000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 05:52:41,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 05:52:45,386][12883] Updated weights for policy 0, policy_version 71261 (0.0043) +[2024-06-18 05:52:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1167605760. Throughput: 0: 41987.4. Samples: 1167675220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 05:52:46,994][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 05:52:48,970][12883] Updated weights for policy 0, policy_version 71271 (0.0030) +[2024-06-18 05:52:50,944][12862] Signal inference workers to stop experience collection... 
(16900 times) +[2024-06-18 05:52:50,976][12883] InferenceWorker_p0-w0: stopping experience collection (16900 times) +[2024-06-18 05:52:51,001][12862] Signal inference workers to resume experience collection... (16900 times) +[2024-06-18 05:52:51,008][12883] InferenceWorker_p0-w0: resuming experience collection (16900 times) +[2024-06-18 05:52:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42327.0, 300 sec: 42265.2). Total num frames: 1167818752. Throughput: 0: 41983.1. Samples: 1167931200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 05:52:51,994][12645] Avg episode reward: [(0, '0.135')] +[2024-06-18 05:52:53,096][12883] Updated weights for policy 0, policy_version 71281 (0.0031) +[2024-06-18 05:52:56,654][12883] Updated weights for policy 0, policy_version 71291 (0.0041) +[2024-06-18 05:52:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 1168031744. Throughput: 0: 41912.3. Samples: 1168183380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 05:52:56,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 05:53:01,023][12883] Updated weights for policy 0, policy_version 71301 (0.0045) +[2024-06-18 05:53:01,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42052.1, 300 sec: 42154.1). Total num frames: 1168228352. Throughput: 0: 42074.1. Samples: 1168310260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 05:53:01,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 05:53:04,540][12883] Updated weights for policy 0, policy_version 71311 (0.0048) +[2024-06-18 05:53:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.2, 300 sec: 42376.2). Total num frames: 1168474112. Throughput: 0: 42248.0. Samples: 1168567480. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 05:53:06,995][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 05:53:08,563][12883] Updated weights for policy 0, policy_version 71321 (0.0026) +[2024-06-18 05:53:11,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1168654336. Throughput: 0: 42374.2. Samples: 1168827080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 05:53:11,994][12645] Avg episode reward: [(0, '0.111')] +[2024-06-18 05:53:12,366][12883] Updated weights for policy 0, policy_version 71331 (0.0035) +[2024-06-18 05:53:16,213][12883] Updated weights for policy 0, policy_version 71341 (0.0032) +[2024-06-18 05:53:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 1168867328. Throughput: 0: 42156.0. Samples: 1168948280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 05:53:16,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 05:53:20,461][12883] Updated weights for policy 0, policy_version 71351 (0.0025) +[2024-06-18 05:53:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1169096704. Throughput: 0: 42258.7. Samples: 1169206200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 05:53:21,994][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 05:53:23,859][12883] Updated weights for policy 0, policy_version 71361 (0.0028) +[2024-06-18 05:53:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42326.9, 300 sec: 42209.6). Total num frames: 1169293312. Throughput: 0: 42410.2. Samples: 1169459460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 05:53:26,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 05:53:28,184][12883] Updated weights for policy 0, policy_version 71371 (0.0030) +[2024-06-18 05:53:31,407][12883] Updated weights for policy 0, policy_version 71381 (0.0042) +[2024-06-18 05:53:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1169506304. Throughput: 0: 42390.7. Samples: 1169582800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 05:53:31,994][12645] Avg episode reward: [(0, '0.207')] +[2024-06-18 05:53:35,884][12883] Updated weights for policy 0, policy_version 71391 (0.0039) +[2024-06-18 05:53:36,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 1169752064. Throughput: 0: 42564.8. Samples: 1169846620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 05:53:36,995][12645] Avg episode reward: [(0, '0.235')] +[2024-06-18 05:53:39,032][12883] Updated weights for policy 0, policy_version 71401 (0.0037) +[2024-06-18 05:53:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1169915904. Throughput: 0: 42554.2. Samples: 1170098320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 05:53:41,994][12645] Avg episode reward: [(0, '0.212')] +[2024-06-18 05:53:43,501][12883] Updated weights for policy 0, policy_version 71411 (0.0044) +[2024-06-18 05:53:46,996][12645] Fps is (10 sec: 39313.1, 60 sec: 42323.8, 300 sec: 42320.4). Total num frames: 1170145280. Throughput: 0: 42381.6. Samples: 1170217520. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 05:53:46,996][12645] Avg episode reward: [(0, '0.212')] +[2024-06-18 05:53:47,082][12883] Updated weights for policy 0, policy_version 71421 (0.0040) +[2024-06-18 05:53:51,190][12883] Updated weights for policy 0, policy_version 71431 (0.0029) +[2024-06-18 05:53:51,994][12645] Fps is (10 sec: 45876.3, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1170374656. Throughput: 0: 42514.5. Samples: 1170480620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 05:53:51,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 05:53:54,667][12883] Updated weights for policy 0, policy_version 71441 (0.0033) +[2024-06-18 05:53:56,994][12645] Fps is (10 sec: 39329.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1170538496. Throughput: 0: 42237.6. Samples: 1170727780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:53:56,994][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 05:53:58,978][12883] Updated weights for policy 0, policy_version 71451 (0.0037) +[2024-06-18 05:54:01,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 1170784256. Throughput: 0: 42183.6. Samples: 1170846540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:54:01,994][12645] Avg episode reward: [(0, '0.155')] +[2024-06-18 05:54:02,438][12883] Updated weights for policy 0, policy_version 71461 (0.0032) +[2024-06-18 05:54:06,846][12883] Updated weights for policy 0, policy_version 71471 (0.0021) +[2024-06-18 05:54:06,993][12645] Fps is (10 sec: 45876.7, 60 sec: 42052.5, 300 sec: 42265.2). Total num frames: 1170997248. Throughput: 0: 42322.8. Samples: 1171110720. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:54:06,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 05:54:10,079][12883] Updated weights for policy 0, policy_version 71481 (0.0028) +[2024-06-18 05:54:10,658][12862] Signal inference workers to stop experience collection... (16950 times) +[2024-06-18 05:54:10,658][12862] Signal inference workers to resume experience collection... (16950 times) +[2024-06-18 05:54:10,682][12883] InferenceWorker_p0-w0: stopping experience collection (16950 times) +[2024-06-18 05:54:10,682][12883] InferenceWorker_p0-w0: resuming experience collection (16950 times) +[2024-06-18 05:54:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1171193856. Throughput: 0: 42296.5. Samples: 1171362800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:54:11,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 05:54:14,512][12883] Updated weights for policy 0, policy_version 71491 (0.0043) +[2024-06-18 05:54:16,996][12645] Fps is (10 sec: 42588.0, 60 sec: 42596.8, 300 sec: 42320.4). Total num frames: 1171423232. Throughput: 0: 42297.9. Samples: 1171486300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:54:16,996][12645] Avg episode reward: [(0, '0.463')] +[2024-06-18 05:54:17,731][12883] Updated weights for policy 0, policy_version 71501 (0.0033) +[2024-06-18 05:54:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 1171603456. Throughput: 0: 42018.3. Samples: 1171737440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:54:21,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 05:54:22,305][12883] Updated weights for policy 0, policy_version 71511 (0.0041) +[2024-06-18 05:54:26,140][12883] Updated weights for policy 0, policy_version 71521 (0.0030) +[2024-06-18 05:54:26,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1171832832. 
Throughput: 0: 42026.8. Samples: 1171989520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:54:26,994][12645] Avg episode reward: [(0, '0.570')] +[2024-06-18 05:54:30,099][12883] Updated weights for policy 0, policy_version 71531 (0.0032) +[2024-06-18 05:54:31,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42323.8, 300 sec: 42264.8). Total num frames: 1172045824. Throughput: 0: 42338.2. Samples: 1172122740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:54:31,997][12645] Avg episode reward: [(0, '0.654')] +[2024-06-18 05:54:32,043][12862] Saving new best policy, reward=0.654! +[2024-06-18 05:54:33,862][12883] Updated weights for policy 0, policy_version 71541 (0.0038) +[2024-06-18 05:54:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 42209.6). Total num frames: 1172242432. Throughput: 0: 42025.1. Samples: 1172371760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 05:54:36,994][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 05:54:37,000][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000071548_1172242432.pth... +[2024-06-18 05:54:37,051][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070932_1162149888.pth +[2024-06-18 05:54:37,723][12883] Updated weights for policy 0, policy_version 71551 (0.0028) +[2024-06-18 05:54:41,460][12883] Updated weights for policy 0, policy_version 71561 (0.0034) +[2024-06-18 05:54:41,994][12645] Fps is (10 sec: 40969.8, 60 sec: 42325.5, 300 sec: 42265.2). Total num frames: 1172455424. Throughput: 0: 42182.5. Samples: 1172625980. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:54:41,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 05:54:45,591][12883] Updated weights for policy 0, policy_version 71571 (0.0025) +[2024-06-18 05:54:46,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42599.9, 300 sec: 42320.7). Total num frames: 1172701184. Throughput: 0: 42594.6. Samples: 1172763300. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:54:46,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 05:54:49,192][12883] Updated weights for policy 0, policy_version 71581 (0.0030) +[2024-06-18 05:54:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.0, 300 sec: 42209.6). Total num frames: 1172865024. Throughput: 0: 41980.7. Samples: 1172999860. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:54:51,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 05:54:53,476][12883] Updated weights for policy 0, policy_version 71591 (0.0041) +[2024-06-18 05:54:56,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42265.1). Total num frames: 1173094400. Throughput: 0: 42110.1. Samples: 1173257760. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:54:56,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 05:54:57,267][12883] Updated weights for policy 0, policy_version 71601 (0.0032) +[2024-06-18 05:55:01,025][12883] Updated weights for policy 0, policy_version 71611 (0.0023) +[2024-06-18 05:55:01,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1173340160. Throughput: 0: 42325.3. Samples: 1173390840. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:55:01,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 05:55:05,132][12883] Updated weights for policy 0, policy_version 71621 (0.0029) +[2024-06-18 05:55:06,996][12645] Fps is (10 sec: 42589.4, 60 sec: 42050.6, 300 sec: 42264.8). Total num frames: 1173520384. Throughput: 0: 42289.0. Samples: 1173640540. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:55:06,996][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 05:55:08,624][12883] Updated weights for policy 0, policy_version 71631 (0.0045) +[2024-06-18 05:55:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1173733376. Throughput: 0: 42403.1. Samples: 1173897660. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:55:11,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 05:55:12,874][12883] Updated weights for policy 0, policy_version 71641 (0.0030) +[2024-06-18 05:55:16,345][12883] Updated weights for policy 0, policy_version 71651 (0.0031) +[2024-06-18 05:55:16,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42326.9, 300 sec: 42265.1). Total num frames: 1173962752. Throughput: 0: 42262.1. Samples: 1174024440. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) +[2024-06-18 05:55:16,995][12645] Avg episode reward: [(0, '0.095')] +[2024-06-18 05:55:20,639][12883] Updated weights for policy 0, policy_version 71661 (0.0031) +[2024-06-18 05:55:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42265.5). Total num frames: 1174159360. Throughput: 0: 42353.9. Samples: 1174277680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 05:55:21,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 05:55:24,133][12883] Updated weights for policy 0, policy_version 71671 (0.0045) +[2024-06-18 05:55:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1174372352. Throughput: 0: 42306.0. Samples: 1174529760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 05:55:26,994][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 05:55:28,303][12883] Updated weights for policy 0, policy_version 71681 (0.0040) +[2024-06-18 05:55:28,717][12862] Signal inference workers to stop experience collection... (17000 times) +[2024-06-18 05:55:28,770][12862] Signal inference workers to resume experience collection... 
(17000 times) +[2024-06-18 05:55:28,771][12883] InferenceWorker_p0-w0: stopping experience collection (17000 times) +[2024-06-18 05:55:28,791][12883] InferenceWorker_p0-w0: resuming experience collection (17000 times) +[2024-06-18 05:55:31,676][12883] Updated weights for policy 0, policy_version 71691 (0.0027) +[2024-06-18 05:55:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42600.1, 300 sec: 42210.5). Total num frames: 1174601728. Throughput: 0: 42080.6. Samples: 1174656920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 05:55:31,994][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 05:55:36,012][12883] Updated weights for policy 0, policy_version 71701 (0.0026) +[2024-06-18 05:55:36,996][12645] Fps is (10 sec: 44227.7, 60 sec: 42870.0, 300 sec: 42320.4). Total num frames: 1174814720. Throughput: 0: 42551.3. Samples: 1174914760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 05:55:36,996][12645] Avg episode reward: [(0, '0.570')] +[2024-06-18 05:55:39,442][12883] Updated weights for policy 0, policy_version 71711 (0.0033) +[2024-06-18 05:55:41,994][12645] Fps is (10 sec: 39320.6, 60 sec: 42325.1, 300 sec: 42209.6). Total num frames: 1174994944. Throughput: 0: 42373.3. Samples: 1175164560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 05:55:41,994][12645] Avg episode reward: [(0, '0.631')] +[2024-06-18 05:55:43,743][12883] Updated weights for policy 0, policy_version 71721 (0.0040) +[2024-06-18 05:55:46,994][12645] Fps is (10 sec: 40968.6, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1175224320. Throughput: 0: 42220.3. Samples: 1175290760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 05:55:46,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 05:55:47,342][12883] Updated weights for policy 0, policy_version 71731 (0.0048) +[2024-06-18 05:55:51,282][12883] Updated weights for policy 0, policy_version 71741 (0.0038) +[2024-06-18 05:55:51,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 1175420928. Throughput: 0: 42423.0. Samples: 1175549480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 05:55:51,994][12645] Avg episode reward: [(0, '0.114')] +[2024-06-18 05:55:55,267][12883] Updated weights for policy 0, policy_version 71751 (0.0030) +[2024-06-18 05:55:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1175633920. Throughput: 0: 42186.2. Samples: 1175796040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 05:55:56,994][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 05:55:59,330][12883] Updated weights for policy 0, policy_version 71761 (0.0047) +[2024-06-18 05:56:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1175846912. Throughput: 0: 42266.3. Samples: 1175926420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 05:56:01,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 05:56:02,841][12883] Updated weights for policy 0, policy_version 71771 (0.0037) +[2024-06-18 05:56:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42053.9, 300 sec: 42209.6). Total num frames: 1176043520. Throughput: 0: 42210.3. Samples: 1176177140. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:56:06,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 05:56:07,070][12883] Updated weights for policy 0, policy_version 71781 (0.0034) +[2024-06-18 05:56:10,573][12883] Updated weights for policy 0, policy_version 71791 (0.0027) +[2024-06-18 05:56:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42376.5). Total num frames: 1176289280. Throughput: 0: 42120.0. Samples: 1176425160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:56:11,994][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 05:56:14,986][12883] Updated weights for policy 0, policy_version 71801 (0.0037) +[2024-06-18 05:56:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 1176469504. Throughput: 0: 42248.0. Samples: 1176558080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:56:16,994][12645] Avg episode reward: [(0, '0.170')] +[2024-06-18 05:56:18,117][12883] Updated weights for policy 0, policy_version 71811 (0.0042) +[2024-06-18 05:56:21,994][12645] Fps is (10 sec: 37683.9, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1176666112. Throughput: 0: 42046.1. Samples: 1176806740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:56:21,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 05:56:22,695][12883] Updated weights for policy 0, policy_version 71821 (0.0038) +[2024-06-18 05:56:25,924][12883] Updated weights for policy 0, policy_version 71831 (0.0043) +[2024-06-18 05:56:26,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.4, 300 sec: 42265.1). Total num frames: 1176911872. Throughput: 0: 42090.3. Samples: 1177058620. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:56:26,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 05:56:30,299][12883] Updated weights for policy 0, policy_version 71841 (0.0030) +[2024-06-18 05:56:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 1177108480. Throughput: 0: 42139.2. Samples: 1177187020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:56:31,994][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 05:56:33,661][12883] Updated weights for policy 0, policy_version 71851 (0.0033) +[2024-06-18 05:56:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41780.7, 300 sec: 42265.2). Total num frames: 1177321472. Throughput: 0: 41981.7. Samples: 1177438660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:56:36,994][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 05:56:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000071858_1177321472.pth... +[2024-06-18 05:56:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000071240_1167196160.pth +[2024-06-18 05:56:37,958][12883] Updated weights for policy 0, policy_version 71861 (0.0039) +[2024-06-18 05:56:41,519][12883] Updated weights for policy 0, policy_version 71871 (0.0047) +[2024-06-18 05:56:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1177550848. Throughput: 0: 42012.8. Samples: 1177686620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:56:41,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 05:56:46,278][12883] Updated weights for policy 0, policy_version 71881 (0.0036) +[2024-06-18 05:56:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42265.5). Total num frames: 1177747456. Throughput: 0: 42100.0. Samples: 1177820920. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 05:56:46,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 05:56:48,092][12862] Signal inference workers to stop experience collection... (17050 times) +[2024-06-18 05:56:48,092][12862] Signal inference workers to resume experience collection... (17050 times) +[2024-06-18 05:56:48,133][12883] InferenceWorker_p0-w0: stopping experience collection (17050 times) +[2024-06-18 05:56:48,133][12883] InferenceWorker_p0-w0: resuming experience collection (17050 times) +[2024-06-18 05:56:49,427][12883] Updated weights for policy 0, policy_version 71891 (0.0037) +[2024-06-18 05:56:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1177944064. Throughput: 0: 42094.1. Samples: 1178071380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 05:56:51,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 05:56:53,945][12883] Updated weights for policy 0, policy_version 71901 (0.0038) +[2024-06-18 05:56:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1178173440. Throughput: 0: 42097.9. Samples: 1178319560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 05:56:56,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 05:56:57,053][12883] Updated weights for policy 0, policy_version 71911 (0.0033) +[2024-06-18 05:57:01,557][12883] Updated weights for policy 0, policy_version 71921 (0.0032) +[2024-06-18 05:57:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1178353664. Throughput: 0: 42068.8. Samples: 1178451180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 05:57:01,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 05:57:04,610][12883] Updated weights for policy 0, policy_version 71931 (0.0035) +[2024-06-18 05:57:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1178583040. 
Throughput: 0: 42221.2. Samples: 1178706700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 05:57:06,994][12645] Avg episode reward: [(0, '0.280')] +[2024-06-18 05:57:09,211][12883] Updated weights for policy 0, policy_version 71941 (0.0031) +[2024-06-18 05:57:11,995][12645] Fps is (10 sec: 47507.5, 60 sec: 42324.5, 300 sec: 42320.5). Total num frames: 1178828800. Throughput: 0: 42142.9. Samples: 1178955100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 05:57:11,995][12645] Avg episode reward: [(0, '0.187')] +[2024-06-18 05:57:12,159][12883] Updated weights for policy 0, policy_version 71951 (0.0034) +[2024-06-18 05:57:16,799][12883] Updated weights for policy 0, policy_version 71961 (0.0027) +[2024-06-18 05:57:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1179009024. Throughput: 0: 42160.9. Samples: 1179084260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 05:57:16,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 05:57:20,179][12883] Updated weights for policy 0, policy_version 71971 (0.0032) +[2024-06-18 05:57:21,994][12645] Fps is (10 sec: 37688.4, 60 sec: 42325.4, 300 sec: 42210.0). Total num frames: 1179205632. Throughput: 0: 42066.8. Samples: 1179331660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 05:57:21,994][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 05:57:25,025][12883] Updated weights for policy 0, policy_version 71981 (0.0045) +[2024-06-18 05:57:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 1179435008. Throughput: 0: 42225.9. Samples: 1179586780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 05:57:26,994][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 05:57:27,856][12883] Updated weights for policy 0, policy_version 71991 (0.0035) +[2024-06-18 05:57:31,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1179631616. 
Throughput: 0: 42017.2. Samples: 1179711700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 05:57:31,994][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 05:57:32,724][12883] Updated weights for policy 0, policy_version 72001 (0.0032) +[2024-06-18 05:57:35,454][12883] Updated weights for policy 0, policy_version 72011 (0.0048) +[2024-06-18 05:57:36,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42323.7, 300 sec: 42264.9). Total num frames: 1179860992. Throughput: 0: 41913.5. Samples: 1179957580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 05:57:36,997][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 05:57:40,280][12883] Updated weights for policy 0, policy_version 72021 (0.0042) +[2024-06-18 05:57:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1180057600. Throughput: 0: 42388.0. Samples: 1180227020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 05:57:41,994][12645] Avg episode reward: [(0, '0.600')] +[2024-06-18 05:57:43,479][12883] Updated weights for policy 0, policy_version 72031 (0.0036) +[2024-06-18 05:57:46,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1180270592. Throughput: 0: 42072.8. Samples: 1180344460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 05:57:46,994][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 05:57:48,307][12883] Updated weights for policy 0, policy_version 72041 (0.0045) +[2024-06-18 05:57:51,279][12883] Updated weights for policy 0, policy_version 72051 (0.0025) +[2024-06-18 05:57:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1180499968. Throughput: 0: 42074.7. Samples: 1180600060. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 05:57:51,994][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 05:57:55,902][12883] Updated weights for policy 0, policy_version 72061 (0.0027) +[2024-06-18 05:57:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 1180663808. Throughput: 0: 42292.2. Samples: 1180858200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 05:57:56,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 05:57:59,082][12883] Updated weights for policy 0, policy_version 72071 (0.0035) +[2024-06-18 05:58:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 1180893184. Throughput: 0: 41928.4. Samples: 1180971040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 05:58:01,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 05:58:03,731][12883] Updated weights for policy 0, policy_version 72081 (0.0029) +[2024-06-18 05:58:06,814][12883] Updated weights for policy 0, policy_version 72091 (0.0026) +[2024-06-18 05:58:06,994][12645] Fps is (10 sec: 47514.5, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1181138944. Throughput: 0: 42294.2. Samples: 1181234900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 05:58:06,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 05:58:11,415][12883] Updated weights for policy 0, policy_version 72101 (0.0040) +[2024-06-18 05:58:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.9, 300 sec: 42154.1). Total num frames: 1181302784. Throughput: 0: 42283.9. Samples: 1181489560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 05:58:11,994][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 05:58:14,719][12883] Updated weights for policy 0, policy_version 72111 (0.0044) +[2024-06-18 05:58:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1181532160. Throughput: 0: 42242.4. Samples: 1181612600. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 05:58:16,994][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 05:58:19,335][12883] Updated weights for policy 0, policy_version 72121 (0.0035) +[2024-06-18 05:58:21,415][12862] Signal inference workers to stop experience collection... (17100 times) +[2024-06-18 05:58:21,456][12883] InferenceWorker_p0-w0: stopping experience collection (17100 times) +[2024-06-18 05:58:21,488][12862] Signal inference workers to resume experience collection... (17100 times) +[2024-06-18 05:58:21,492][12883] InferenceWorker_p0-w0: resuming experience collection (17100 times) +[2024-06-18 05:58:21,994][12645] Fps is (10 sec: 47514.4, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1181777920. Throughput: 0: 42575.1. Samples: 1181873360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 05:58:21,994][12645] Avg episode reward: [(0, '0.187')] +[2024-06-18 05:58:22,358][12883] Updated weights for policy 0, policy_version 72131 (0.0035) +[2024-06-18 05:58:26,892][12883] Updated weights for policy 0, policy_version 72141 (0.0027) +[2024-06-18 05:58:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1181958144. Throughput: 0: 42268.0. Samples: 1182129080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 05:58:26,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 05:58:30,276][12883] Updated weights for policy 0, policy_version 72151 (0.0041) +[2024-06-18 05:58:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 1182187520. Throughput: 0: 42322.3. Samples: 1182248960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 05:58:31,994][12645] Avg episode reward: [(0, '0.413')] +[2024-06-18 05:58:34,543][12883] Updated weights for policy 0, policy_version 72161 (0.0041) +[2024-06-18 05:58:37,000][12645] Fps is (10 sec: 44209.3, 60 sec: 42322.5, 300 sec: 42319.8). Total num frames: 1182400512. 
Throughput: 0: 42416.0. Samples: 1182509040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 05:58:37,000][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 05:58:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000072168_1182400512.pth... +[2024-06-18 05:58:37,093][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000071548_1172242432.pth +[2024-06-18 05:58:37,801][12883] Updated weights for policy 0, policy_version 72171 (0.0026) +[2024-06-18 05:58:41,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42154.4). Total num frames: 1182580736. Throughput: 0: 42427.3. Samples: 1182767420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 05:58:41,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 05:58:42,393][12883] Updated weights for policy 0, policy_version 72181 (0.0028) +[2024-06-18 05:58:45,384][12883] Updated weights for policy 0, policy_version 72191 (0.0039) +[2024-06-18 05:58:46,994][12645] Fps is (10 sec: 42624.9, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 1182826496. Throughput: 0: 42537.4. Samples: 1182885220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 05:58:46,994][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 05:58:50,374][12883] Updated weights for policy 0, policy_version 72201 (0.0032) +[2024-06-18 05:58:51,994][12645] Fps is (10 sec: 45874.2, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1183039488. Throughput: 0: 42485.1. Samples: 1183146740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 05:58:52,006][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 05:58:53,512][12883] Updated weights for policy 0, policy_version 72211 (0.0042) +[2024-06-18 05:58:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 1183219712. Throughput: 0: 42248.9. Samples: 1183390760. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 05:58:56,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 05:58:58,212][12883] Updated weights for policy 0, policy_version 72221 (0.0029) +[2024-06-18 05:59:01,219][12883] Updated weights for policy 0, policy_version 72231 (0.0035) +[2024-06-18 05:59:01,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42869.9, 300 sec: 42264.8). Total num frames: 1183465472. Throughput: 0: 42335.1. Samples: 1183517780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 05:59:01,997][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 05:59:06,158][12883] Updated weights for policy 0, policy_version 72241 (0.0042) +[2024-06-18 05:59:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 1183645696. Throughput: 0: 42279.0. Samples: 1183775920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 05:59:06,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 05:59:08,817][12883] Updated weights for policy 0, policy_version 72251 (0.0028) +[2024-06-18 05:59:11,994][12645] Fps is (10 sec: 39330.2, 60 sec: 42598.4, 300 sec: 42154.4). Total num frames: 1183858688. Throughput: 0: 42019.5. Samples: 1184019960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 05:59:11,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 05:59:13,906][12883] Updated weights for policy 0, policy_version 72261 (0.0032) +[2024-06-18 05:59:16,827][12883] Updated weights for policy 0, policy_version 72271 (0.0027) +[2024-06-18 05:59:16,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42596.8, 300 sec: 42320.4). Total num frames: 1184088064. Throughput: 0: 42289.9. Samples: 1184152100. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 05:59:16,996][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 05:59:21,532][12883] Updated weights for policy 0, policy_version 72281 (0.0031) +[2024-06-18 05:59:21,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41233.1, 300 sec: 42098.6). Total num frames: 1184251904. Throughput: 0: 42164.1. Samples: 1184406160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 05:59:21,994][12645] Avg episode reward: [(0, '0.103')] +[2024-06-18 05:59:24,634][12883] Updated weights for policy 0, policy_version 72291 (0.0032) +[2024-06-18 05:59:26,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.3, 300 sec: 42209.9). Total num frames: 1184497664. Throughput: 0: 41817.2. Samples: 1184649200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 05:59:26,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 05:59:29,529][12883] Updated weights for policy 0, policy_version 72301 (0.0033) +[2024-06-18 05:59:31,994][12645] Fps is (10 sec: 47513.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1184727040. Throughput: 0: 42251.6. Samples: 1184786540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 05:59:31,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 05:59:32,345][12883] Updated weights for policy 0, policy_version 72311 (0.0026) +[2024-06-18 05:59:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41510.4, 300 sec: 42154.1). Total num frames: 1184890880. Throughput: 0: 42061.0. Samples: 1185039480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 05:59:36,994][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 05:59:37,251][12883] Updated weights for policy 0, policy_version 72321 (0.0034) +[2024-06-18 05:59:40,220][12883] Updated weights for policy 0, policy_version 72331 (0.0033) +[2024-06-18 05:59:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 1185153024. Throughput: 0: 41984.5. Samples: 1185280060. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) +[2024-06-18 05:59:41,994][12645] Avg episode reward: [(0, '0.209')] +[2024-06-18 05:59:45,017][12883] Updated weights for policy 0, policy_version 72341 (0.0034) +[2024-06-18 05:59:45,652][12862] Signal inference workers to stop experience collection... (17150 times) +[2024-06-18 05:59:45,652][12862] Signal inference workers to resume experience collection... (17150 times) +[2024-06-18 05:59:45,680][12883] InferenceWorker_p0-w0: stopping experience collection (17150 times) +[2024-06-18 05:59:45,680][12883] InferenceWorker_p0-w0: resuming experience collection (17150 times) +[2024-06-18 05:59:46,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1185349632. Throughput: 0: 42202.0. Samples: 1185416780. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) +[2024-06-18 05:59:46,994][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 05:59:47,852][12883] Updated weights for policy 0, policy_version 72351 (0.0032) +[2024-06-18 05:59:51,994][12645] Fps is (10 sec: 36045.1, 60 sec: 41233.2, 300 sec: 42098.6). Total num frames: 1185513472. Throughput: 0: 42004.1. Samples: 1185666100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) +[2024-06-18 05:59:51,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 05:59:52,593][12883] Updated weights for policy 0, policy_version 72361 (0.0034) +[2024-06-18 05:59:55,714][12883] Updated weights for policy 0, policy_version 72371 (0.0028) +[2024-06-18 05:59:56,995][12645] Fps is (10 sec: 44229.8, 60 sec: 42870.3, 300 sec: 42209.4). Total num frames: 1185792000. Throughput: 0: 41913.1. Samples: 1185906120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) +[2024-06-18 05:59:56,996][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 06:00:00,533][12883] Updated weights for policy 0, policy_version 72381 (0.0029) +[2024-06-18 06:00:01,996][12645] Fps is (10 sec: 45865.5, 60 sec: 41779.3, 300 sec: 42209.7). Total num frames: 1185972224. 
Throughput: 0: 42229.0. Samples: 1186052400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) +[2024-06-18 06:00:01,996][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 06:00:03,494][12883] Updated weights for policy 0, policy_version 72391 (0.0032) +[2024-06-18 06:00:06,994][12645] Fps is (10 sec: 37689.6, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1186168832. Throughput: 0: 42015.9. Samples: 1186296880. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) +[2024-06-18 06:00:06,994][12645] Avg episode reward: [(0, '0.140')] +[2024-06-18 06:00:08,202][12883] Updated weights for policy 0, policy_version 72401 (0.0026) +[2024-06-18 06:00:11,354][12883] Updated weights for policy 0, policy_version 72411 (0.0035) +[2024-06-18 06:00:11,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42598.5, 300 sec: 42209.7). Total num frames: 1186414592. Throughput: 0: 42280.1. Samples: 1186551800. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) +[2024-06-18 06:00:11,994][12645] Avg episode reward: [(0, '0.161')] +[2024-06-18 06:00:15,746][12883] Updated weights for policy 0, policy_version 72421 (0.0043) +[2024-06-18 06:00:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42053.9, 300 sec: 42209.6). Total num frames: 1186611200. Throughput: 0: 42243.2. Samples: 1186687480. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) +[2024-06-18 06:00:16,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 06:00:18,854][12883] Updated weights for policy 0, policy_version 72431 (0.0035) +[2024-06-18 06:00:21,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 1186824192. Throughput: 0: 42109.7. Samples: 1186934420. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) +[2024-06-18 06:00:21,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 06:00:23,296][12883] Updated weights for policy 0, policy_version 72441 (0.0046) +[2024-06-18 06:00:26,596][12883] Updated weights for policy 0, policy_version 72451 (0.0038) +[2024-06-18 06:00:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1187037184. Throughput: 0: 42515.2. Samples: 1187193240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) +[2024-06-18 06:00:26,994][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 06:00:31,190][12883] Updated weights for policy 0, policy_version 72461 (0.0047) +[2024-06-18 06:00:31,994][12645] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 42043.3). Total num frames: 1187217408. Throughput: 0: 42298.9. Samples: 1187320220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 06:00:31,994][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 06:00:34,376][12883] Updated weights for policy 0, policy_version 72471 (0.0045) +[2024-06-18 06:00:37,000][12645] Fps is (10 sec: 42571.5, 60 sec: 42867.0, 300 sec: 42264.3). Total num frames: 1187463168. Throughput: 0: 42223.4. Samples: 1187566420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 06:00:37,000][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 06:00:37,021][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000072477_1187463168.pth... +[2024-06-18 06:00:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000071858_1177321472.pth +[2024-06-18 06:00:38,911][12883] Updated weights for policy 0, policy_version 72481 (0.0033) +[2024-06-18 06:00:41,909][12883] Updated weights for policy 0, policy_version 72491 (0.0045) +[2024-06-18 06:00:41,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1187692544. Throughput: 0: 42612.8. Samples: 1187823620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 06:00:41,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 06:00:46,615][12883] Updated weights for policy 0, policy_version 72501 (0.0035) +[2024-06-18 06:00:46,994][12645] Fps is (10 sec: 39346.3, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1187856384. Throughput: 0: 42146.8. Samples: 1187948920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 06:00:46,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 06:00:50,181][12883] Updated weights for policy 0, policy_version 72511 (0.0040) +[2024-06-18 06:00:51,996][12645] Fps is (10 sec: 40950.5, 60 sec: 43142.9, 300 sec: 42264.8). Total num frames: 1188102144. Throughput: 0: 42302.8. Samples: 1188200600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 06:00:51,997][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 06:00:54,505][12883] Updated weights for policy 0, policy_version 72521 (0.0047) +[2024-06-18 06:00:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 41780.4, 300 sec: 42209.6). Total num frames: 1188298752. Throughput: 0: 42314.2. Samples: 1188455940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 06:00:56,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 06:00:57,888][12883] Updated weights for policy 0, policy_version 72531 (0.0027) +[2024-06-18 06:01:01,981][12883] Updated weights for policy 0, policy_version 72541 (0.0032) +[2024-06-18 06:01:01,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42326.8, 300 sec: 42265.2). Total num frames: 1188511744. Throughput: 0: 42119.1. Samples: 1188582840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 06:01:01,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 06:01:03,367][12862] Signal inference workers to stop experience collection... (17200 times) +[2024-06-18 06:01:03,367][12862] Signal inference workers to resume experience collection... 
(17200 times) +[2024-06-18 06:01:03,386][12883] InferenceWorker_p0-w0: stopping experience collection (17200 times) +[2024-06-18 06:01:03,386][12883] InferenceWorker_p0-w0: resuming experience collection (17200 times) +[2024-06-18 06:01:05,513][12883] Updated weights for policy 0, policy_version 72551 (0.0032) +[2024-06-18 06:01:06,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 1188741120. Throughput: 0: 42270.2. Samples: 1188836580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 06:01:06,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 06:01:09,669][12883] Updated weights for policy 0, policy_version 72561 (0.0050) +[2024-06-18 06:01:11,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.1, 300 sec: 42265.1). Total num frames: 1188937728. Throughput: 0: 42236.3. Samples: 1189093880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 06:01:11,994][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 06:01:13,319][12883] Updated weights for policy 0, policy_version 72571 (0.0022) +[2024-06-18 06:01:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1189150720. Throughput: 0: 42208.0. Samples: 1189219580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 06:01:16,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 06:01:17,241][12883] Updated weights for policy 0, policy_version 72581 (0.0038) +[2024-06-18 06:01:20,966][12883] Updated weights for policy 0, policy_version 72591 (0.0036) +[2024-06-18 06:01:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1189363712. Throughput: 0: 42427.6. Samples: 1189475400. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 06:01:21,994][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 06:01:25,251][12883] Updated weights for policy 0, policy_version 72601 (0.0035) +[2024-06-18 06:01:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1189560320. Throughput: 0: 42333.8. Samples: 1189728640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 06:01:26,994][12645] Avg episode reward: [(0, '0.253')] +[2024-06-18 06:01:28,770][12883] Updated weights for policy 0, policy_version 72611 (0.0040) +[2024-06-18 06:01:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 1189789696. Throughput: 0: 42220.4. Samples: 1189848840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 06:01:31,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 06:01:32,767][12883] Updated weights for policy 0, policy_version 72621 (0.0032) +[2024-06-18 06:01:36,365][12883] Updated weights for policy 0, policy_version 72631 (0.0039) +[2024-06-18 06:01:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42329.7, 300 sec: 42209.6). Total num frames: 1190002688. Throughput: 0: 42440.3. Samples: 1190110320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 06:01:36,994][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 06:01:40,475][12883] Updated weights for policy 0, policy_version 72641 (0.0030) +[2024-06-18 06:01:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 1190199296. Throughput: 0: 42317.2. Samples: 1190360220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 06:01:41,994][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 06:01:44,009][12883] Updated weights for policy 0, policy_version 72651 (0.0037) +[2024-06-18 06:01:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1190395904. Throughput: 0: 42248.0. Samples: 1190484000. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 06:01:46,994][12645] Avg episode reward: [(0, '0.124')] +[2024-06-18 06:01:48,611][12883] Updated weights for policy 0, policy_version 72661 (0.0031) +[2024-06-18 06:01:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42053.9, 300 sec: 42209.6). Total num frames: 1190625280. Throughput: 0: 42265.4. Samples: 1190738520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 06:01:51,994][12645] Avg episode reward: [(0, '0.222')] +[2024-06-18 06:01:52,221][12883] Updated weights for policy 0, policy_version 72671 (0.0032) +[2024-06-18 06:01:56,535][12883] Updated weights for policy 0, policy_version 72681 (0.0029) +[2024-06-18 06:01:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1190838272. Throughput: 0: 42241.9. Samples: 1190994760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 06:01:56,994][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 06:01:59,788][12883] Updated weights for policy 0, policy_version 72691 (0.0044) +[2024-06-18 06:02:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1191034880. Throughput: 0: 42179.0. Samples: 1191117640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 06:02:01,994][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 06:02:04,263][12883] Updated weights for policy 0, policy_version 72701 (0.0038) +[2024-06-18 06:02:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42154.3). Total num frames: 1191264256. Throughput: 0: 42158.2. Samples: 1191372520. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 06:02:06,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 06:02:07,654][12883] Updated weights for policy 0, policy_version 72711 (0.0031) +[2024-06-18 06:02:11,868][12883] Updated weights for policy 0, policy_version 72721 (0.0029) +[2024-06-18 06:02:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1191460864. Throughput: 0: 42244.2. Samples: 1191629640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 06:02:11,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 06:02:15,230][12883] Updated weights for policy 0, policy_version 72731 (0.0034) +[2024-06-18 06:02:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1191690240. Throughput: 0: 42331.9. Samples: 1191753780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 06:02:16,994][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 06:02:19,499][12883] Updated weights for policy 0, policy_version 72741 (0.0032) +[2024-06-18 06:02:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1191903232. Throughput: 0: 42336.1. Samples: 1192015440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 06:02:21,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 06:02:22,881][12883] Updated weights for policy 0, policy_version 72751 (0.0040) +[2024-06-18 06:02:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1192099840. Throughput: 0: 42467.2. Samples: 1192271240. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 06:02:26,994][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 06:02:27,078][12883] Updated weights for policy 0, policy_version 72761 (0.0031) +[2024-06-18 06:02:30,462][12883] Updated weights for policy 0, policy_version 72771 (0.0029) +[2024-06-18 06:02:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42321.0). Total num frames: 1192345600. Throughput: 0: 42435.9. Samples: 1192393620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 06:02:31,994][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 06:02:34,799][12883] Updated weights for policy 0, policy_version 72781 (0.0022) +[2024-06-18 06:02:35,855][12862] Signal inference workers to stop experience collection... (17250 times) +[2024-06-18 06:02:35,855][12862] Signal inference workers to resume experience collection... (17250 times) +[2024-06-18 06:02:35,897][12883] InferenceWorker_p0-w0: stopping experience collection (17250 times) +[2024-06-18 06:02:35,897][12883] InferenceWorker_p0-w0: resuming experience collection (17250 times) +[2024-06-18 06:02:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1192542208. Throughput: 0: 42633.4. Samples: 1192657020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 06:02:36,994][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 06:02:37,064][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000072788_1192558592.pth... +[2024-06-18 06:02:37,116][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000072168_1182400512.pth +[2024-06-18 06:02:38,338][12883] Updated weights for policy 0, policy_version 72791 (0.0035) +[2024-06-18 06:02:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1192738816. Throughput: 0: 42577.8. Samples: 1192910760. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 06:02:41,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 06:02:42,319][12883] Updated weights for policy 0, policy_version 72801 (0.0030) +[2024-06-18 06:02:45,870][12883] Updated weights for policy 0, policy_version 72811 (0.0023) +[2024-06-18 06:02:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42320.7). Total num frames: 1192984576. Throughput: 0: 42660.5. Samples: 1193037360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 06:02:46,994][12645] Avg episode reward: [(0, '0.171')] +[2024-06-18 06:02:50,248][12883] Updated weights for policy 0, policy_version 72821 (0.0036) +[2024-06-18 06:02:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1193181184. Throughput: 0: 42819.2. Samples: 1193299380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 06:02:51,994][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 06:02:53,477][12883] Updated weights for policy 0, policy_version 72831 (0.0029) +[2024-06-18 06:02:57,000][12645] Fps is (10 sec: 39296.7, 60 sec: 42320.9, 300 sec: 42319.8). Total num frames: 1193377792. Throughput: 0: 42719.9. Samples: 1193552300. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 06:02:57,001][12645] Avg episode reward: [(0, '0.172')] +[2024-06-18 06:02:58,204][12883] Updated weights for policy 0, policy_version 72841 (0.0029) +[2024-06-18 06:03:01,100][12883] Updated weights for policy 0, policy_version 72851 (0.0032) +[2024-06-18 06:03:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42265.1). Total num frames: 1193607168. Throughput: 0: 42738.7. Samples: 1193677020. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 06:03:01,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 06:03:05,803][12883] Updated weights for policy 0, policy_version 72861 (0.0032) +[2024-06-18 06:03:06,994][12645] Fps is (10 sec: 44264.2, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1193820160. Throughput: 0: 42618.6. Samples: 1193933280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 06:03:06,994][12645] Avg episode reward: [(0, '0.250')] +[2024-06-18 06:03:08,999][12883] Updated weights for policy 0, policy_version 72871 (0.0038) +[2024-06-18 06:03:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1194033152. Throughput: 0: 42632.8. Samples: 1194189720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 06:03:11,994][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 06:03:13,332][12883] Updated weights for policy 0, policy_version 72881 (0.0035) +[2024-06-18 06:03:16,506][12883] Updated weights for policy 0, policy_version 72891 (0.0031) +[2024-06-18 06:03:16,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42320.7). Total num frames: 1194262528. Throughput: 0: 42813.5. Samples: 1194320220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 06:03:16,994][12645] Avg episode reward: [(0, '0.239')] +[2024-06-18 06:03:20,814][12883] Updated weights for policy 0, policy_version 72901 (0.0036) +[2024-06-18 06:03:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1194442752. Throughput: 0: 42710.1. Samples: 1194578980. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 06:03:21,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 06:03:24,285][12883] Updated weights for policy 0, policy_version 72911 (0.0030) +[2024-06-18 06:03:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 1194655744. Throughput: 0: 42734.3. Samples: 1194833800. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) +[2024-06-18 06:03:26,994][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 06:03:28,470][12883] Updated weights for policy 0, policy_version 72921 (0.0043) +[2024-06-18 06:03:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42321.6). Total num frames: 1194885120. Throughput: 0: 42771.9. Samples: 1194962100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 06:03:31,997][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 06:03:32,363][12883] Updated weights for policy 0, policy_version 72931 (0.0023) +[2024-06-18 06:03:36,109][12883] Updated weights for policy 0, policy_version 72941 (0.0039) +[2024-06-18 06:03:36,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1195098112. Throughput: 0: 42615.8. Samples: 1195217100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 06:03:36,994][12645] Avg episode reward: [(0, '0.485')] +[2024-06-18 06:03:39,839][12883] Updated weights for policy 0, policy_version 72951 (0.0037) +[2024-06-18 06:03:41,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42869.8, 300 sec: 42320.4). Total num frames: 1195311104. Throughput: 0: 42559.8. Samples: 1195467320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 06:03:41,997][12645] Avg episode reward: [(0, '0.473')] +[2024-06-18 06:03:43,872][12883] Updated weights for policy 0, policy_version 72961 (0.0039) +[2024-06-18 06:03:46,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1195540480. Throughput: 0: 42594.8. Samples: 1195593780. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 06:03:46,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 06:03:47,347][12883] Updated weights for policy 0, policy_version 72971 (0.0032) +[2024-06-18 06:03:51,636][12883] Updated weights for policy 0, policy_version 72981 (0.0031) +[2024-06-18 06:03:51,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1195737088. Throughput: 0: 42870.7. Samples: 1195862460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 06:03:51,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 06:03:54,721][12883] Updated weights for policy 0, policy_version 72991 (0.0039) +[2024-06-18 06:03:56,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42875.8, 300 sec: 42321.0). Total num frames: 1195950080. Throughput: 0: 42787.5. Samples: 1196115160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 06:03:56,994][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 06:03:59,215][12883] Updated weights for policy 0, policy_version 73001 (0.0034) +[2024-06-18 06:04:01,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1196195840. Throughput: 0: 42669.6. Samples: 1196240360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 06:04:01,994][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 06:04:02,486][12883] Updated weights for policy 0, policy_version 73011 (0.0044) +[2024-06-18 06:04:04,486][12862] Signal inference workers to stop experience collection... (17300 times) +[2024-06-18 06:04:04,524][12883] InferenceWorker_p0-w0: stopping experience collection (17300 times) +[2024-06-18 06:04:04,543][12862] Signal inference workers to resume experience collection... 
(17300 times) +[2024-06-18 06:04:04,544][12883] InferenceWorker_p0-w0: resuming experience collection (17300 times) +[2024-06-18 06:04:06,843][12883] Updated weights for policy 0, policy_version 73021 (0.0051) +[2024-06-18 06:04:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1196376064. Throughput: 0: 42721.3. Samples: 1196501440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 06:04:06,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 06:04:10,281][12883] Updated weights for policy 0, policy_version 73031 (0.0036) +[2024-06-18 06:04:11,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42321.0). Total num frames: 1196572672. Throughput: 0: 42587.8. Samples: 1196750260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 06:04:11,994][12645] Avg episode reward: [(0, '0.276')] +[2024-06-18 06:04:14,630][12883] Updated weights for policy 0, policy_version 73041 (0.0037) +[2024-06-18 06:04:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1196818432. Throughput: 0: 42581.4. Samples: 1196878260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-18 06:04:16,994][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 06:04:17,693][12883] Updated weights for policy 0, policy_version 73051 (0.0030) +[2024-06-18 06:04:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1196982272. Throughput: 0: 42539.5. Samples: 1197131380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-18 06:04:21,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 06:04:22,486][12883] Updated weights for policy 0, policy_version 73061 (0.0037) +[2024-06-18 06:04:25,757][12883] Updated weights for policy 0, policy_version 73071 (0.0047) +[2024-06-18 06:04:26,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42376.2). Total num frames: 1197228032. Throughput: 0: 42578.9. Samples: 1197383280. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-18 06:04:26,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 06:04:30,256][12883] Updated weights for policy 0, policy_version 73081 (0.0038) +[2024-06-18 06:04:31,996][12645] Fps is (10 sec: 49141.8, 60 sec: 43143.0, 300 sec: 42653.6). Total num frames: 1197473792. Throughput: 0: 42785.0. Samples: 1197519200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-18 06:04:31,996][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 06:04:33,566][12883] Updated weights for policy 0, policy_version 73091 (0.0028) +[2024-06-18 06:04:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1197637632. Throughput: 0: 42413.2. Samples: 1197771060. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-18 06:04:36,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 06:04:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000073098_1197637632.pth... +[2024-06-18 06:04:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000072477_1187463168.pth +[2024-06-18 06:04:37,979][12883] Updated weights for policy 0, policy_version 73101 (0.0030) +[2024-06-18 06:04:41,048][12883] Updated weights for policy 0, policy_version 73111 (0.0036) +[2024-06-18 06:04:41,994][12645] Fps is (10 sec: 39330.5, 60 sec: 42600.1, 300 sec: 42431.8). Total num frames: 1197867008. Throughput: 0: 42334.4. Samples: 1198020200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-18 06:04:41,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 06:04:45,498][12883] Updated weights for policy 0, policy_version 73121 (0.0040) +[2024-06-18 06:04:46,994][12645] Fps is (10 sec: 45876.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1198096384. Throughput: 0: 42550.4. Samples: 1198155120. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-18 06:04:46,994][12645] Avg episode reward: [(0, '0.117')] +[2024-06-18 06:04:48,593][12883] Updated weights for policy 0, policy_version 73131 (0.0038) +[2024-06-18 06:04:51,996][12645] Fps is (10 sec: 39312.5, 60 sec: 42050.7, 300 sec: 42265.1). Total num frames: 1198260224. Throughput: 0: 42403.7. Samples: 1198409700. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-18 06:04:51,996][12645] Avg episode reward: [(0, '0.161')] +[2024-06-18 06:04:53,119][12883] Updated weights for policy 0, policy_version 73141 (0.0033) +[2024-06-18 06:04:56,329][12883] Updated weights for policy 0, policy_version 73151 (0.0030) +[2024-06-18 06:04:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42487.6). Total num frames: 1198505984. Throughput: 0: 42241.9. Samples: 1198651140. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) +[2024-06-18 06:04:56,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 06:05:00,928][12883] Updated weights for policy 0, policy_version 73161 (0.0022) +[2024-06-18 06:05:01,994][12645] Fps is (10 sec: 44247.2, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 1198702592. Throughput: 0: 42416.1. Samples: 1198786980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 06:05:01,994][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 06:05:03,982][12883] Updated weights for policy 0, policy_version 73171 (0.0045) +[2024-06-18 06:05:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1198899200. Throughput: 0: 42307.6. Samples: 1199035220. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 06:05:06,994][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 06:05:08,533][12883] Updated weights for policy 0, policy_version 73181 (0.0036) +[2024-06-18 06:05:11,567][12883] Updated weights for policy 0, policy_version 73191 (0.0044) +[2024-06-18 06:05:11,994][12645] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 1199161344. Throughput: 0: 42099.6. Samples: 1199277760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 06:05:11,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 06:05:16,789][12883] Updated weights for policy 0, policy_version 73201 (0.0043) +[2024-06-18 06:05:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.1, 300 sec: 42376.2). Total num frames: 1199325184. Throughput: 0: 42203.8. Samples: 1199418280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 06:05:16,994][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 06:05:18,105][12862] Signal inference workers to stop experience collection... (17350 times) +[2024-06-18 06:05:18,105][12862] Signal inference workers to resume experience collection... (17350 times) +[2024-06-18 06:05:18,157][12883] InferenceWorker_p0-w0: stopping experience collection (17350 times) +[2024-06-18 06:05:18,157][12883] InferenceWorker_p0-w0: resuming experience collection (17350 times) +[2024-06-18 06:05:19,876][12883] Updated weights for policy 0, policy_version 73211 (0.0037) +[2024-06-18 06:05:21,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1199538176. Throughput: 0: 42121.8. Samples: 1199666540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 06:05:21,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 06:05:24,391][12883] Updated weights for policy 0, policy_version 73221 (0.0029) +[2024-06-18 06:05:26,996][12645] Fps is (10 sec: 45865.2, 60 sec: 42596.9, 300 sec: 42598.1). Total num frames: 1199783936. 
Throughput: 0: 42116.0. Samples: 1199915520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 06:05:27,005][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 06:05:27,385][12883] Updated weights for policy 0, policy_version 73231 (0.0032) +[2024-06-18 06:05:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41507.6, 300 sec: 42377.1). Total num frames: 1199964160. Throughput: 0: 42115.8. Samples: 1200050340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 06:05:32,000][12645] Avg episode reward: [(0, '0.302')] +[2024-06-18 06:05:32,227][12883] Updated weights for policy 0, policy_version 73241 (0.0032) +[2024-06-18 06:05:35,590][12883] Updated weights for policy 0, policy_version 73251 (0.0038) +[2024-06-18 06:05:36,994][12645] Fps is (10 sec: 37691.3, 60 sec: 42052.3, 300 sec: 42265.1). Total num frames: 1200160768. Throughput: 0: 41950.0. Samples: 1200297360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 06:05:36,995][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 06:05:39,888][12883] Updated weights for policy 0, policy_version 73261 (0.0033) +[2024-06-18 06:05:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1200406528. Throughput: 0: 42163.9. Samples: 1200548520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 06:05:41,995][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 06:05:43,650][12883] Updated weights for policy 0, policy_version 73271 (0.0033) +[2024-06-18 06:05:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 41779.1, 300 sec: 42376.6). Total num frames: 1200603136. Throughput: 0: 42071.8. Samples: 1200680220. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 06:05:46,994][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 06:05:47,717][12883] Updated weights for policy 0, policy_version 73281 (0.0034) +[2024-06-18 06:05:51,131][12883] Updated weights for policy 0, policy_version 73291 (0.0035) +[2024-06-18 06:05:51,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42600.0, 300 sec: 42431.8). Total num frames: 1200816128. Throughput: 0: 42043.6. Samples: 1200927180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 06:05:51,994][12645] Avg episode reward: [(0, '0.101')] +[2024-06-18 06:05:55,345][12883] Updated weights for policy 0, policy_version 73301 (0.0026) +[2024-06-18 06:05:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1201045504. Throughput: 0: 42423.6. Samples: 1201186820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 06:05:56,994][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 06:05:58,649][12883] Updated weights for policy 0, policy_version 73311 (0.0046) +[2024-06-18 06:06:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1201225728. Throughput: 0: 42251.2. Samples: 1201319580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 06:06:01,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 06:06:02,879][12883] Updated weights for policy 0, policy_version 73321 (0.0030) +[2024-06-18 06:06:06,410][12883] Updated weights for policy 0, policy_version 73331 (0.0036) +[2024-06-18 06:06:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1201455104. Throughput: 0: 42250.3. Samples: 1201567800. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 06:06:06,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 06:06:10,660][12883] Updated weights for policy 0, policy_version 73341 (0.0036) +[2024-06-18 06:06:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 1201668096. Throughput: 0: 42411.1. Samples: 1201823920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 06:06:11,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 06:06:14,595][12883] Updated weights for policy 0, policy_version 73351 (0.0030) +[2024-06-18 06:06:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1201864704. Throughput: 0: 42152.9. Samples: 1201947220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 06:06:16,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 06:06:18,383][12883] Updated weights for policy 0, policy_version 73361 (0.0031) +[2024-06-18 06:06:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1202094080. Throughput: 0: 42383.3. Samples: 1202204600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 06:06:21,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 06:06:22,110][12883] Updated weights for policy 0, policy_version 73371 (0.0023) +[2024-06-18 06:06:26,139][12883] Updated weights for policy 0, policy_version 73381 (0.0027) +[2024-06-18 06:06:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42053.8, 300 sec: 42431.8). Total num frames: 1202307072. Throughput: 0: 42490.2. Samples: 1202460580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 06:06:26,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 06:06:29,669][12883] Updated weights for policy 0, policy_version 73391 (0.0030) +[2024-06-18 06:06:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1202503680. Throughput: 0: 42315.6. Samples: 1202584420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:06:31,994][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 06:06:33,823][12862] Signal inference workers to stop experience collection... (17400 times) +[2024-06-18 06:06:33,828][12862] Signal inference workers to resume experience collection... (17400 times) +[2024-06-18 06:06:33,857][12883] InferenceWorker_p0-w0: stopping experience collection (17400 times) +[2024-06-18 06:06:33,857][12883] InferenceWorker_p0-w0: resuming experience collection (17400 times) +[2024-06-18 06:06:33,978][12883] Updated weights for policy 0, policy_version 73401 (0.0029) +[2024-06-18 06:06:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1202733056. Throughput: 0: 42395.5. Samples: 1202834980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:06:37,004][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 06:06:37,149][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000073410_1202749440.pth... +[2024-06-18 06:06:37,214][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000072788_1192558592.pth +[2024-06-18 06:06:37,773][12883] Updated weights for policy 0, policy_version 73411 (0.0041) +[2024-06-18 06:06:41,715][12883] Updated weights for policy 0, policy_version 73421 (0.0032) +[2024-06-18 06:06:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1202929664. Throughput: 0: 42292.9. Samples: 1203090000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:06:41,994][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 06:06:45,629][12883] Updated weights for policy 0, policy_version 73431 (0.0043) +[2024-06-18 06:06:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1203142656. Throughput: 0: 42257.3. Samples: 1203221160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:06:46,994][12645] Avg episode reward: [(0, '0.243')] +[2024-06-18 06:06:49,505][12883] Updated weights for policy 0, policy_version 73441 (0.0033) +[2024-06-18 06:06:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1203355648. Throughput: 0: 42123.6. Samples: 1203463360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:06:51,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 06:06:53,615][12883] Updated weights for policy 0, policy_version 73451 (0.0028) +[2024-06-18 06:06:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1203568640. Throughput: 0: 42142.2. Samples: 1203720320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:06:56,994][12645] Avg episode reward: [(0, '0.510')] +[2024-06-18 06:06:57,176][12883] Updated weights for policy 0, policy_version 73461 (0.0034) +[2024-06-18 06:07:01,334][12883] Updated weights for policy 0, policy_version 73471 (0.0047) +[2024-06-18 06:07:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1203781632. Throughput: 0: 42297.4. Samples: 1203850600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:07:01,994][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 06:07:05,109][12883] Updated weights for policy 0, policy_version 73481 (0.0038) +[2024-06-18 06:07:06,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 1203978240. Throughput: 0: 42047.2. Samples: 1204096720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:07:06,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 06:07:09,183][12883] Updated weights for policy 0, policy_version 73491 (0.0027) +[2024-06-18 06:07:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1204191232. Throughput: 0: 42024.9. Samples: 1204351700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:07:11,994][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 06:07:12,802][12883] Updated weights for policy 0, policy_version 73501 (0.0035) +[2024-06-18 06:07:16,955][12883] Updated weights for policy 0, policy_version 73511 (0.0037) +[2024-06-18 06:07:16,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1204404224. Throughput: 0: 42118.1. Samples: 1204479740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 06:07:16,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 06:07:20,717][12883] Updated weights for policy 0, policy_version 73521 (0.0042) +[2024-06-18 06:07:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1204617216. Throughput: 0: 42153.8. Samples: 1204731900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 06:07:21,995][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 06:07:25,128][12883] Updated weights for policy 0, policy_version 73531 (0.0039) +[2024-06-18 06:07:26,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 1204813824. Throughput: 0: 42180.0. Samples: 1204988100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 06:07:26,994][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 06:07:28,387][12883] Updated weights for policy 0, policy_version 73541 (0.0027) +[2024-06-18 06:07:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1205026816. Throughput: 0: 41996.9. Samples: 1205111020. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 06:07:31,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 06:07:32,687][12883] Updated weights for policy 0, policy_version 73551 (0.0026) +[2024-06-18 06:07:35,998][12883] Updated weights for policy 0, policy_version 73561 (0.0032) +[2024-06-18 06:07:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1205256192. Throughput: 0: 42355.5. Samples: 1205369360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 06:07:36,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 06:07:40,232][12883] Updated weights for policy 0, policy_version 73571 (0.0028) +[2024-06-18 06:07:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1205452800. Throughput: 0: 42393.8. Samples: 1205628040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 06:07:41,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 06:07:43,619][12883] Updated weights for policy 0, policy_version 73581 (0.0025) +[2024-06-18 06:07:46,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1205665792. Throughput: 0: 42253.4. Samples: 1205752000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 06:07:46,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 06:07:48,009][12883] Updated weights for policy 0, policy_version 73591 (0.0032) +[2024-06-18 06:07:51,306][12883] Updated weights for policy 0, policy_version 73601 (0.0022) +[2024-06-18 06:07:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42432.7). Total num frames: 1205895168. Throughput: 0: 42397.3. Samples: 1206004600. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 06:07:51,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 06:07:56,351][12883] Updated weights for policy 0, policy_version 73611 (0.0031) +[2024-06-18 06:07:56,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 42265.2). Total num frames: 1206075392. Throughput: 0: 42427.5. Samples: 1206260940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 06:07:56,994][12645] Avg episode reward: [(0, '0.133')] +[2024-06-18 06:07:57,444][12862] Signal inference workers to stop experience collection... (17450 times) +[2024-06-18 06:07:57,445][12862] Signal inference workers to resume experience collection... (17450 times) +[2024-06-18 06:07:57,485][12883] InferenceWorker_p0-w0: stopping experience collection (17450 times) +[2024-06-18 06:07:57,485][12883] InferenceWorker_p0-w0: resuming experience collection (17450 times) +[2024-06-18 06:07:59,153][12883] Updated weights for policy 0, policy_version 73621 (0.0035) +[2024-06-18 06:08:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1206304768. Throughput: 0: 42065.5. Samples: 1206372680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 06:08:01,994][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 06:08:04,088][12883] Updated weights for policy 0, policy_version 73631 (0.0040) +[2024-06-18 06:08:06,925][12883] Updated weights for policy 0, policy_version 73641 (0.0040) +[2024-06-18 06:08:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1206534144. Throughput: 0: 42207.1. Samples: 1206631220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 06:08:06,994][12645] Avg episode reward: [(0, '0.250')] +[2024-06-18 06:08:11,689][12883] Updated weights for policy 0, policy_version 73651 (0.0037) +[2024-06-18 06:08:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1206697984. 
Throughput: 0: 42244.4. Samples: 1206889100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 06:08:11,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 06:08:14,818][12883] Updated weights for policy 0, policy_version 73661 (0.0031) +[2024-06-18 06:08:16,996][12645] Fps is (10 sec: 39312.7, 60 sec: 42050.7, 300 sec: 42320.4). Total num frames: 1206927360. Throughput: 0: 42117.3. Samples: 1207006400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 06:08:16,997][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 06:08:19,415][12883] Updated weights for policy 0, policy_version 73671 (0.0028) +[2024-06-18 06:08:21,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1207156736. Throughput: 0: 42067.6. Samples: 1207262400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 06:08:21,994][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 06:08:22,637][12883] Updated weights for policy 0, policy_version 73681 (0.0045) +[2024-06-18 06:08:26,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1207336960. Throughput: 0: 42039.0. Samples: 1207519800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 06:08:26,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 06:08:27,232][12883] Updated weights for policy 0, policy_version 73691 (0.0033) +[2024-06-18 06:08:30,589][12883] Updated weights for policy 0, policy_version 73701 (0.0032) +[2024-06-18 06:08:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1207582720. Throughput: 0: 42045.7. Samples: 1207644060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 06:08:31,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 06:08:34,798][12883] Updated weights for policy 0, policy_version 73711 (0.0043) +[2024-06-18 06:08:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42265.5). Total num frames: 1207779328. 
Throughput: 0: 42175.9. Samples: 1207902520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 06:08:36,994][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 06:08:37,047][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000073718_1207795712.pth... +[2024-06-18 06:08:37,113][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000073098_1197637632.pth +[2024-06-18 06:08:38,345][12883] Updated weights for policy 0, policy_version 73721 (0.0031) +[2024-06-18 06:08:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1207975936. Throughput: 0: 42111.1. Samples: 1208155940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 06:08:41,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 06:08:42,278][12883] Updated weights for policy 0, policy_version 73731 (0.0031) +[2024-06-18 06:08:46,009][12883] Updated weights for policy 0, policy_version 73741 (0.0031) +[2024-06-18 06:08:46,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 1208238080. Throughput: 0: 42485.4. Samples: 1208284520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 06:08:46,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 06:08:50,099][12883] Updated weights for policy 0, policy_version 73751 (0.0041) +[2024-06-18 06:08:51,999][12645] Fps is (10 sec: 42575.6, 60 sec: 41775.4, 300 sec: 42208.9). Total num frames: 1208401920. Throughput: 0: 42226.6. Samples: 1208531640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 06:08:52,000][12645] Avg episode reward: [(0, '0.024')] +[2024-06-18 06:08:53,973][12883] Updated weights for policy 0, policy_version 73761 (0.0027) +[2024-06-18 06:08:56,994][12645] Fps is (10 sec: 37682.9, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 1208614912. Throughput: 0: 42134.3. Samples: 1208785140. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 06:08:56,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 06:08:57,789][12883] Updated weights for policy 0, policy_version 73771 (0.0042) +[2024-06-18 06:09:01,632][12883] Updated weights for policy 0, policy_version 73781 (0.0028) +[2024-06-18 06:09:01,994][12645] Fps is (10 sec: 44261.0, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1208844288. Throughput: 0: 42449.4. Samples: 1208916520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 06:09:01,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 06:09:05,456][12883] Updated weights for policy 0, policy_version 73791 (0.0023) +[2024-06-18 06:09:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 1209040896. Throughput: 0: 42351.5. Samples: 1209168220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 06:09:06,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 06:09:09,355][12883] Updated weights for policy 0, policy_version 73801 (0.0035) +[2024-06-18 06:09:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 1209270272. Throughput: 0: 42112.5. Samples: 1209414860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 06:09:11,994][12645] Avg episode reward: [(0, '0.243')] +[2024-06-18 06:09:13,294][12883] Updated weights for policy 0, policy_version 73811 (0.0043) +[2024-06-18 06:09:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42326.9, 300 sec: 42320.7). Total num frames: 1209466880. Throughput: 0: 42426.2. Samples: 1209553240. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 06:09:16,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 06:09:17,047][12883] Updated weights for policy 0, policy_version 73821 (0.0038) +[2024-06-18 06:09:21,099][12883] Updated weights for policy 0, policy_version 73831 (0.0026) +[2024-06-18 06:09:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1209679872. Throughput: 0: 42124.4. Samples: 1209798120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 06:09:21,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 06:09:24,877][12883] Updated weights for policy 0, policy_version 73841 (0.0038) +[2024-06-18 06:09:26,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42209.9). Total num frames: 1209925632. Throughput: 0: 41983.6. Samples: 1210045200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) +[2024-06-18 06:09:26,994][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 06:09:28,728][12883] Updated weights for policy 0, policy_version 73851 (0.0028) +[2024-06-18 06:09:31,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42050.7, 300 sec: 42264.9). Total num frames: 1210105856. Throughput: 0: 42205.8. Samples: 1210183880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:09:31,997][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 06:09:32,352][12862] Signal inference workers to stop experience collection... (17500 times) +[2024-06-18 06:09:32,352][12862] Signal inference workers to resume experience collection... 
(17500 times) +[2024-06-18 06:09:32,391][12883] InferenceWorker_p0-w0: stopping experience collection (17500 times) +[2024-06-18 06:09:32,391][12883] InferenceWorker_p0-w0: resuming experience collection (17500 times) +[2024-06-18 06:09:32,495][12883] Updated weights for policy 0, policy_version 73861 (0.0047) +[2024-06-18 06:09:36,303][12883] Updated weights for policy 0, policy_version 73871 (0.0041) +[2024-06-18 06:09:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1210302464. Throughput: 0: 42215.2. Samples: 1210431100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:09:36,998][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 06:09:40,270][12883] Updated weights for policy 0, policy_version 73881 (0.0040) +[2024-06-18 06:09:41,994][12645] Fps is (10 sec: 44246.7, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 1210548224. Throughput: 0: 42148.9. Samples: 1210681840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:09:41,994][12645] Avg episode reward: [(0, '0.120')] +[2024-06-18 06:09:43,955][12883] Updated weights for policy 0, policy_version 73891 (0.0037) +[2024-06-18 06:09:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41779.1, 300 sec: 42321.0). Total num frames: 1210744832. Throughput: 0: 42245.8. Samples: 1210817580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:09:46,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 06:09:48,020][12883] Updated weights for policy 0, policy_version 73901 (0.0038) +[2024-06-18 06:09:51,614][12883] Updated weights for policy 0, policy_version 73911 (0.0033) +[2024-06-18 06:09:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42602.3, 300 sec: 42209.6). Total num frames: 1210957824. Throughput: 0: 42120.5. Samples: 1211063640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:09:51,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 06:09:55,659][12883] Updated weights for policy 0, policy_version 73921 (0.0040) +[2024-06-18 06:09:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1211187200. Throughput: 0: 42362.3. Samples: 1211321160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:09:56,994][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 06:09:59,233][12883] Updated weights for policy 0, policy_version 73931 (0.0038) +[2024-06-18 06:10:01,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42052.1, 300 sec: 42265.1). Total num frames: 1211367424. Throughput: 0: 42070.6. Samples: 1211446420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:10:01,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 06:10:03,352][12883] Updated weights for policy 0, policy_version 73941 (0.0046) +[2024-06-18 06:10:06,996][12645] Fps is (10 sec: 40950.4, 60 sec: 42596.8, 300 sec: 42153.8). Total num frames: 1211596800. Throughput: 0: 42341.0. Samples: 1211703560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:10:06,997][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 06:10:07,231][12883] Updated weights for policy 0, policy_version 73951 (0.0042) +[2024-06-18 06:10:11,178][12883] Updated weights for policy 0, policy_version 73961 (0.0041) +[2024-06-18 06:10:11,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1211809792. Throughput: 0: 42503.1. Samples: 1211957840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:10:11,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 06:10:14,932][12883] Updated weights for policy 0, policy_version 73971 (0.0038) +[2024-06-18 06:10:16,994][12645] Fps is (10 sec: 39330.2, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1211990016. Throughput: 0: 42106.9. Samples: 1212078600. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 06:10:16,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 06:10:19,038][12883] Updated weights for policy 0, policy_version 73981 (0.0038) +[2024-06-18 06:10:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42209.9). Total num frames: 1212235776. Throughput: 0: 42348.0. Samples: 1212336760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 06:10:21,994][12645] Avg episode reward: [(0, '0.570')] +[2024-06-18 06:10:22,640][12883] Updated weights for policy 0, policy_version 73991 (0.0028) +[2024-06-18 06:10:26,754][12883] Updated weights for policy 0, policy_version 74001 (0.0029) +[2024-06-18 06:10:26,994][12645] Fps is (10 sec: 44237.7, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 1212432384. Throughput: 0: 42425.0. Samples: 1212590960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 06:10:26,994][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 06:10:30,688][12883] Updated weights for policy 0, policy_version 74011 (0.0033) +[2024-06-18 06:10:31,996][12645] Fps is (10 sec: 39313.3, 60 sec: 42052.3, 300 sec: 42264.9). Total num frames: 1212628992. Throughput: 0: 42199.2. Samples: 1212716640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 06:10:31,996][12645] Avg episode reward: [(0, '0.189')] +[2024-06-18 06:10:34,338][12883] Updated weights for policy 0, policy_version 74021 (0.0028) +[2024-06-18 06:10:36,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 1212858368. Throughput: 0: 42364.3. Samples: 1212970040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 06:10:36,994][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 06:10:37,112][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074028_1212874752.pth... 
+[2024-06-18 06:10:37,161][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000073410_1202749440.pth +[2024-06-18 06:10:38,390][12883] Updated weights for policy 0, policy_version 74031 (0.0035) +[2024-06-18 06:10:41,994][12645] Fps is (10 sec: 44246.3, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1213071360. Throughput: 0: 42169.2. Samples: 1213218780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 06:10:41,994][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 06:10:42,020][12883] Updated weights for policy 0, policy_version 74041 (0.0042) +[2024-06-18 06:10:46,698][12883] Updated weights for policy 0, policy_version 74051 (0.0039) +[2024-06-18 06:10:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1213267968. Throughput: 0: 42154.4. Samples: 1213343360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 06:10:46,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 06:10:49,884][12883] Updated weights for policy 0, policy_version 74061 (0.0026) +[2024-06-18 06:10:51,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1213480960. Throughput: 0: 42131.1. Samples: 1213599360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 06:10:51,994][12645] Avg episode reward: [(0, '0.412')] +[2024-06-18 06:10:54,329][12883] Updated weights for policy 0, policy_version 74071 (0.0030) +[2024-06-18 06:10:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1213710336. Throughput: 0: 42108.1. Samples: 1213852700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) +[2024-06-18 06:10:56,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 06:10:57,480][12883] Updated weights for policy 0, policy_version 74081 (0.0037) +[2024-06-18 06:11:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.5, 300 sec: 42154.1). Total num frames: 1213890560. Throughput: 0: 42241.5. 
Samples: 1213979460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:11:01,994][12645] Avg episode reward: [(0, '0.224')] +[2024-06-18 06:11:02,013][12883] Updated weights for policy 0, policy_version 74091 (0.0030) +[2024-06-18 06:11:05,061][12883] Updated weights for policy 0, policy_version 74101 (0.0042) +[2024-06-18 06:11:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42053.8, 300 sec: 42209.6). Total num frames: 1214119936. Throughput: 0: 42106.7. Samples: 1214231560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:11:06,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 06:11:09,853][12883] Updated weights for policy 0, policy_version 74111 (0.0027) +[2024-06-18 06:11:11,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 1214332928. Throughput: 0: 42117.3. Samples: 1214486240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:11:11,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 06:11:13,049][12883] Updated weights for policy 0, policy_version 74121 (0.0038) +[2024-06-18 06:11:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 1214513152. Throughput: 0: 42179.4. Samples: 1214614620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:11:16,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 06:11:17,639][12883] Updated weights for policy 0, policy_version 74131 (0.0039) +[2024-06-18 06:11:20,836][12883] Updated weights for policy 0, policy_version 74141 (0.0032) +[2024-06-18 06:11:21,996][12645] Fps is (10 sec: 40950.1, 60 sec: 41777.7, 300 sec: 42153.8). Total num frames: 1214742528. Throughput: 0: 42025.5. Samples: 1214861280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:11:21,997][12645] Avg episode reward: [(0, '0.252')] +[2024-06-18 06:11:25,395][12883] Updated weights for policy 0, policy_version 74151 (0.0027) +[2024-06-18 06:11:26,392][12862] Signal inference workers to stop experience collection... (17550 times) +[2024-06-18 06:11:26,392][12862] Signal inference workers to resume experience collection... (17550 times) +[2024-06-18 06:11:26,428][12883] InferenceWorker_p0-w0: stopping experience collection (17550 times) +[2024-06-18 06:11:26,428][12883] InferenceWorker_p0-w0: resuming experience collection (17550 times) +[2024-06-18 06:11:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1214955520. Throughput: 0: 42327.6. Samples: 1215123520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:11:26,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 06:11:28,339][12883] Updated weights for policy 0, policy_version 74161 (0.0027) +[2024-06-18 06:11:31,994][12645] Fps is (10 sec: 42608.4, 60 sec: 42326.9, 300 sec: 42154.1). Total num frames: 1215168512. Throughput: 0: 42451.6. Samples: 1215253680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:11:31,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 06:11:33,279][12883] Updated weights for policy 0, policy_version 74171 (0.0027) +[2024-06-18 06:11:35,924][12883] Updated weights for policy 0, policy_version 74181 (0.0023) +[2024-06-18 06:11:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1215381504. Throughput: 0: 42219.4. Samples: 1215499240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:11:36,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 06:11:40,980][12883] Updated weights for policy 0, policy_version 74191 (0.0043) +[2024-06-18 06:11:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1215594496. 
Throughput: 0: 42468.3. Samples: 1215763780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:11:41,994][12645] Avg episode reward: [(0, '0.241')] +[2024-06-18 06:11:43,566][12883] Updated weights for policy 0, policy_version 74201 (0.0034) +[2024-06-18 06:11:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1215791104. Throughput: 0: 42416.3. Samples: 1215888200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 06:11:46,994][12645] Avg episode reward: [(0, '0.280')] +[2024-06-18 06:11:48,518][12883] Updated weights for policy 0, policy_version 74211 (0.0042) +[2024-06-18 06:11:51,683][12883] Updated weights for policy 0, policy_version 74221 (0.0027) +[2024-06-18 06:11:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 1216036864. Throughput: 0: 42376.0. Samples: 1216138480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 06:11:51,994][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 06:11:56,056][12883] Updated weights for policy 0, policy_version 74231 (0.0027) +[2024-06-18 06:11:56,996][12645] Fps is (10 sec: 44227.4, 60 sec: 42050.7, 300 sec: 42209.3). Total num frames: 1216233472. Throughput: 0: 42510.3. Samples: 1216399300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 06:11:56,996][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 06:11:59,225][12883] Updated weights for policy 0, policy_version 74241 (0.0041) +[2024-06-18 06:12:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 1216446464. Throughput: 0: 42402.3. Samples: 1216522720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 06:12:01,994][12645] Avg episode reward: [(0, '0.313')] +[2024-06-18 06:12:03,925][12883] Updated weights for policy 0, policy_version 74251 (0.0024) +[2024-06-18 06:12:06,994][12645] Fps is (10 sec: 44246.0, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1216675840. 
Throughput: 0: 42585.2. Samples: 1216777520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 06:12:06,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 06:12:07,368][12883] Updated weights for policy 0, policy_version 74261 (0.0037) +[2024-06-18 06:12:11,494][12883] Updated weights for policy 0, policy_version 74271 (0.0027) +[2024-06-18 06:12:11,993][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1216872448. Throughput: 0: 42408.2. Samples: 1217031880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 06:12:11,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 06:12:15,471][12883] Updated weights for policy 0, policy_version 74281 (0.0044) +[2024-06-18 06:12:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 1217085440. Throughput: 0: 42317.7. Samples: 1217157980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 06:12:16,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 06:12:19,373][12883] Updated weights for policy 0, policy_version 74291 (0.0037) +[2024-06-18 06:12:21,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42873.0, 300 sec: 42376.2). Total num frames: 1217314816. Throughput: 0: 42501.3. Samples: 1217411800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 06:12:21,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 06:12:23,133][12883] Updated weights for policy 0, policy_version 74301 (0.0029) +[2024-06-18 06:12:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1217495040. Throughput: 0: 42383.6. Samples: 1217671040. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 06:12:26,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 06:12:27,092][12883] Updated weights for policy 0, policy_version 74311 (0.0025) +[2024-06-18 06:12:30,608][12883] Updated weights for policy 0, policy_version 74321 (0.0028) +[2024-06-18 06:12:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1217708032. Throughput: 0: 42392.9. Samples: 1217795880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 06:12:31,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 06:12:33,927][12862] Signal inference workers to stop experience collection... (17600 times) +[2024-06-18 06:12:33,927][12862] Signal inference workers to resume experience collection... (17600 times) +[2024-06-18 06:12:33,942][12883] InferenceWorker_p0-w0: stopping experience collection (17600 times) +[2024-06-18 06:12:33,956][12883] InferenceWorker_p0-w0: resuming experience collection (17600 times) +[2024-06-18 06:12:34,583][12883] Updated weights for policy 0, policy_version 74331 (0.0040) +[2024-06-18 06:12:36,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1217953792. Throughput: 0: 42453.7. Samples: 1218048900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 06:12:36,994][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 06:12:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074338_1217953792.pth... +[2024-06-18 06:12:37,055][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000073718_1207795712.pth +[2024-06-18 06:12:38,495][12883] Updated weights for policy 0, policy_version 74341 (0.0037) +[2024-06-18 06:12:41,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42596.8, 300 sec: 42320.4). Total num frames: 1218150400. Throughput: 0: 42438.2. Samples: 1218309020. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 06:12:41,996][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 06:12:42,285][12883] Updated weights for policy 0, policy_version 74351 (0.0046) +[2024-06-18 06:12:46,300][12883] Updated weights for policy 0, policy_version 74361 (0.0038) +[2024-06-18 06:12:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 1218363392. Throughput: 0: 42361.3. Samples: 1218428980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 06:12:46,994][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 06:12:50,152][12883] Updated weights for policy 0, policy_version 74371 (0.0040) +[2024-06-18 06:12:51,996][12645] Fps is (10 sec: 44236.7, 60 sec: 42596.8, 300 sec: 42431.5). Total num frames: 1218592768. Throughput: 0: 42322.8. Samples: 1218682140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 06:12:51,997][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 06:12:53,861][12883] Updated weights for policy 0, policy_version 74381 (0.0030) +[2024-06-18 06:12:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42326.8, 300 sec: 42265.1). Total num frames: 1218772992. Throughput: 0: 42556.2. Samples: 1218946920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 06:12:56,994][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 06:12:57,747][12883] Updated weights for policy 0, policy_version 74391 (0.0031) +[2024-06-18 06:13:01,499][12883] Updated weights for policy 0, policy_version 74401 (0.0037) +[2024-06-18 06:13:01,994][12645] Fps is (10 sec: 39330.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1218985984. Throughput: 0: 42340.4. Samples: 1219063300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 06:13:01,994][12645] Avg episode reward: [(0, '0.179')] +[2024-06-18 06:13:05,521][12883] Updated weights for policy 0, policy_version 74411 (0.0053) +[2024-06-18 06:13:06,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42323.8, 300 sec: 42431.5). Total num frames: 1219215360. Throughput: 0: 42413.5. Samples: 1219320500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 06:13:06,997][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 06:13:09,115][12883] Updated weights for policy 0, policy_version 74421 (0.0036) +[2024-06-18 06:13:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.1, 300 sec: 42265.5). Total num frames: 1219395584. Throughput: 0: 42444.3. Samples: 1219581040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 06:13:11,994][12645] Avg episode reward: [(0, '0.280')] +[2024-06-18 06:13:13,263][12883] Updated weights for policy 0, policy_version 74431 (0.0050) +[2024-06-18 06:13:16,754][12883] Updated weights for policy 0, policy_version 74441 (0.0042) +[2024-06-18 06:13:16,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1219641344. Throughput: 0: 42324.5. Samples: 1219700480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 06:13:16,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 06:13:20,667][12883] Updated weights for policy 0, policy_version 74451 (0.0022) +[2024-06-18 06:13:22,000][12645] Fps is (10 sec: 47483.8, 60 sec: 42594.0, 300 sec: 42486.4). Total num frames: 1219870720. Throughput: 0: 42480.3. Samples: 1219960780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 06:13:22,001][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 06:13:24,435][12883] Updated weights for policy 0, policy_version 74461 (0.0032) +[2024-06-18 06:13:26,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 1220034560. Throughput: 0: 42503.8. Samples: 1220221600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 06:13:27,003][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 06:13:28,393][12883] Updated weights for policy 0, policy_version 74471 (0.0028) +[2024-06-18 06:13:31,994][12645] Fps is (10 sec: 40985.9, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1220280320. Throughput: 0: 42480.9. Samples: 1220340620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 06:13:31,996][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 06:13:32,171][12883] Updated weights for policy 0, policy_version 74481 (0.0033) +[2024-06-18 06:13:35,959][12883] Updated weights for policy 0, policy_version 74491 (0.0034) +[2024-06-18 06:13:36,994][12645] Fps is (10 sec: 45876.3, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 1220493312. Throughput: 0: 42652.0. Samples: 1220601380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 06:13:36,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 06:13:39,913][12883] Updated weights for policy 0, policy_version 74501 (0.0021) +[2024-06-18 06:13:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42326.9, 300 sec: 42209.6). Total num frames: 1220689920. Throughput: 0: 42562.3. Samples: 1220862220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 06:13:41,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 06:13:43,474][12862] Signal inference workers to stop experience collection... (17650 times) +[2024-06-18 06:13:43,510][12883] InferenceWorker_p0-w0: stopping experience collection (17650 times) +[2024-06-18 06:13:43,536][12862] Signal inference workers to resume experience collection... (17650 times) +[2024-06-18 06:13:43,537][12883] InferenceWorker_p0-w0: resuming experience collection (17650 times) +[2024-06-18 06:13:43,695][12883] Updated weights for policy 0, policy_version 74511 (0.0038) +[2024-06-18 06:13:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42377.0). Total num frames: 1220902912. 
Throughput: 0: 42736.6. Samples: 1220986440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 06:13:46,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 06:13:47,502][12883] Updated weights for policy 0, policy_version 74521 (0.0033) +[2024-06-18 06:13:51,315][12883] Updated weights for policy 0, policy_version 74531 (0.0045) +[2024-06-18 06:13:51,996][12645] Fps is (10 sec: 45865.3, 60 sec: 42598.4, 300 sec: 42487.0). Total num frames: 1221148672. Throughput: 0: 42865.4. Samples: 1221249440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 06:13:51,997][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 06:13:55,596][12883] Updated weights for policy 0, policy_version 74541 (0.0036) +[2024-06-18 06:13:56,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42596.9, 300 sec: 42320.4). Total num frames: 1221328896. Throughput: 0: 42672.1. Samples: 1221501380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 06:13:56,996][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 06:13:58,986][12883] Updated weights for policy 0, policy_version 74551 (0.0036) +[2024-06-18 06:14:01,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1221558272. Throughput: 0: 42698.3. Samples: 1221621900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:14:01,994][12645] Avg episode reward: [(0, '0.274')] +[2024-06-18 06:14:03,312][12883] Updated weights for policy 0, policy_version 74561 (0.0029) +[2024-06-18 06:14:06,638][12883] Updated weights for policy 0, policy_version 74571 (0.0041) +[2024-06-18 06:14:06,994][12645] Fps is (10 sec: 47524.1, 60 sec: 43146.1, 300 sec: 42487.3). Total num frames: 1221804032. Throughput: 0: 42776.2. Samples: 1221885440. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:14:06,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 06:14:10,993][12883] Updated weights for policy 0, policy_version 74581 (0.0034) +[2024-06-18 06:14:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1221967872. Throughput: 0: 42515.6. Samples: 1222134800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:14:11,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 06:14:14,334][12883] Updated weights for policy 0, policy_version 74591 (0.0022) +[2024-06-18 06:14:16,994][12645] Fps is (10 sec: 37682.9, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1222180864. Throughput: 0: 42626.5. Samples: 1222258820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:14:16,994][12645] Avg episode reward: [(0, '0.200')] +[2024-06-18 06:14:19,010][12883] Updated weights for policy 0, policy_version 74601 (0.0031) +[2024-06-18 06:14:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42329.8, 300 sec: 42320.7). Total num frames: 1222410240. Throughput: 0: 42606.6. Samples: 1222518680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:14:21,994][12645] Avg episode reward: [(0, '0.229')] +[2024-06-18 06:14:22,187][12883] Updated weights for policy 0, policy_version 74611 (0.0039) +[2024-06-18 06:14:26,479][12883] Updated weights for policy 0, policy_version 74621 (0.0030) +[2024-06-18 06:14:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42376.6). Total num frames: 1222606848. Throughput: 0: 42464.0. Samples: 1222773100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:14:26,994][12645] Avg episode reward: [(0, '0.097')] +[2024-06-18 06:14:29,893][12883] Updated weights for policy 0, policy_version 74631 (0.0022) +[2024-06-18 06:14:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1222836224. Throughput: 0: 42496.4. Samples: 1222898780. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:14:31,994][12645] Avg episode reward: [(0, '0.165')] +[2024-06-18 06:14:34,180][12883] Updated weights for policy 0, policy_version 74641 (0.0036) +[2024-06-18 06:14:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.1, 300 sec: 42320.7). Total num frames: 1223032832. Throughput: 0: 42343.3. Samples: 1223154800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:14:36,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 06:14:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074648_1223032832.pth... +[2024-06-18 06:14:37,097][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074028_1212874752.pth +[2024-06-18 06:14:37,665][12883] Updated weights for policy 0, policy_version 74651 (0.0036) +[2024-06-18 06:14:41,802][12883] Updated weights for policy 0, policy_version 74661 (0.0026) +[2024-06-18 06:14:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 1223245824. Throughput: 0: 42434.6. Samples: 1223410840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:14:41,994][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 06:14:45,504][12883] Updated weights for policy 0, policy_version 74671 (0.0044) +[2024-06-18 06:14:46,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 1223475200. Throughput: 0: 42542.2. Samples: 1223536300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 06:14:46,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 06:14:49,198][12862] Signal inference workers to stop experience collection... (17700 times) +[2024-06-18 06:14:49,199][12862] Signal inference workers to resume experience collection... 
(17700 times) +[2024-06-18 06:14:49,243][12883] InferenceWorker_p0-w0: stopping experience collection (17700 times) +[2024-06-18 06:14:49,244][12883] InferenceWorker_p0-w0: resuming experience collection (17700 times) +[2024-06-18 06:14:49,348][12883] Updated weights for policy 0, policy_version 74681 (0.0036) +[2024-06-18 06:14:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42053.8, 300 sec: 42320.7). Total num frames: 1223671808. Throughput: 0: 42349.7. Samples: 1223791180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 06:14:51,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 06:14:53,450][12883] Updated weights for policy 0, policy_version 74691 (0.0039) +[2024-06-18 06:14:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42599.9, 300 sec: 42431.8). Total num frames: 1223884800. Throughput: 0: 42444.4. Samples: 1224044800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 06:14:56,994][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 06:14:57,430][12883] Updated weights for policy 0, policy_version 74701 (0.0036) +[2024-06-18 06:15:01,120][12883] Updated weights for policy 0, policy_version 74711 (0.0029) +[2024-06-18 06:15:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42432.1). Total num frames: 1224114176. Throughput: 0: 42442.3. Samples: 1224168720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 06:15:01,994][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 06:15:05,213][12883] Updated weights for policy 0, policy_version 74721 (0.0036) +[2024-06-18 06:15:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1224310784. Throughput: 0: 42364.4. Samples: 1224425080. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 06:15:06,994][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 06:15:09,086][12883] Updated weights for policy 0, policy_version 74731 (0.0031) +[2024-06-18 06:15:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42487.4). Total num frames: 1224523776. Throughput: 0: 42227.7. Samples: 1224673340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 06:15:11,994][12645] Avg episode reward: [(0, '0.185')] +[2024-06-18 06:15:12,942][12883] Updated weights for policy 0, policy_version 74741 (0.0038) +[2024-06-18 06:15:16,673][12883] Updated weights for policy 0, policy_version 74751 (0.0037) +[2024-06-18 06:15:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1224720384. Throughput: 0: 42245.2. Samples: 1224799820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 06:15:16,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 06:15:20,557][12883] Updated weights for policy 0, policy_version 74761 (0.0038) +[2024-06-18 06:15:21,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42431.7). Total num frames: 1224949760. Throughput: 0: 42116.0. Samples: 1225050020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 06:15:21,994][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 06:15:24,548][12883] Updated weights for policy 0, policy_version 74771 (0.0041) +[2024-06-18 06:15:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42376.6). Total num frames: 1225129984. Throughput: 0: 42085.3. Samples: 1225304680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 06:15:26,994][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 06:15:28,351][12883] Updated weights for policy 0, policy_version 74781 (0.0028) +[2024-06-18 06:15:31,994][12645] Fps is (10 sec: 39322.5, 60 sec: 41779.3, 300 sec: 42320.7). Total num frames: 1225342976. Throughput: 0: 42041.4. Samples: 1225428160. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 06:15:31,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 06:15:32,282][12883] Updated weights for policy 0, policy_version 74791 (0.0030) +[2024-06-18 06:15:36,177][12883] Updated weights for policy 0, policy_version 74801 (0.0038) +[2024-06-18 06:15:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1225588736. Throughput: 0: 42104.5. Samples: 1225685880. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 06:15:36,994][12645] Avg episode reward: [(0, '0.447')] +[2024-06-18 06:15:40,094][12883] Updated weights for policy 0, policy_version 74811 (0.0027) +[2024-06-18 06:15:41,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1225768960. Throughput: 0: 42052.5. Samples: 1225937160. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 06:15:41,996][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 06:15:43,702][12883] Updated weights for policy 0, policy_version 74821 (0.0034) +[2024-06-18 06:15:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1225981952. Throughput: 0: 42098.2. Samples: 1226063140. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 06:15:46,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 06:15:48,069][12883] Updated weights for policy 0, policy_version 74831 (0.0036) +[2024-06-18 06:15:51,437][12883] Updated weights for policy 0, policy_version 74841 (0.0031) +[2024-06-18 06:15:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1226211328. Throughput: 0: 42131.2. Samples: 1226320980. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 06:15:51,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 06:15:52,134][12862] Signal inference workers to stop experience collection... 
(17750 times) +[2024-06-18 06:15:52,189][12883] InferenceWorker_p0-w0: stopping experience collection (17750 times) +[2024-06-18 06:15:52,251][12862] Signal inference workers to resume experience collection... (17750 times) +[2024-06-18 06:15:52,251][12883] InferenceWorker_p0-w0: resuming experience collection (17750 times) +[2024-06-18 06:15:55,882][12883] Updated weights for policy 0, policy_version 74851 (0.0040) +[2024-06-18 06:15:56,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 1226424320. Throughput: 0: 42238.7. Samples: 1226574080. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 06:15:56,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 06:15:59,168][12883] Updated weights for policy 0, policy_version 74861 (0.0038) +[2024-06-18 06:16:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41506.1, 300 sec: 42320.7). Total num frames: 1226604544. Throughput: 0: 42141.8. Samples: 1226696200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 06:16:01,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 06:16:03,658][12883] Updated weights for policy 0, policy_version 74871 (0.0036) +[2024-06-18 06:16:06,989][12883] Updated weights for policy 0, policy_version 74881 (0.0033) +[2024-06-18 06:16:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1226850304. Throughput: 0: 42358.8. Samples: 1226956160. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 06:16:06,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 06:16:11,274][12883] Updated weights for policy 0, policy_version 74891 (0.0029) +[2024-06-18 06:16:11,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1227063296. Throughput: 0: 42331.7. Samples: 1227209600. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 06:16:11,994][12645] Avg episode reward: [(0, '0.252')] +[2024-06-18 06:16:14,925][12883] Updated weights for policy 0, policy_version 74901 (0.0035) +[2024-06-18 06:16:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42432.1). Total num frames: 1227259904. Throughput: 0: 42329.1. Samples: 1227332980. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 06:16:16,994][12645] Avg episode reward: [(0, '0.319')] +[2024-06-18 06:16:19,002][12883] Updated weights for policy 0, policy_version 74911 (0.0028) +[2024-06-18 06:16:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.3, 300 sec: 42376.3). Total num frames: 1227456512. Throughput: 0: 42285.4. Samples: 1227588720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 06:16:21,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 06:16:22,615][12883] Updated weights for policy 0, policy_version 74921 (0.0030) +[2024-06-18 06:16:26,717][12883] Updated weights for policy 0, policy_version 74931 (0.0039) +[2024-06-18 06:16:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1227669504. Throughput: 0: 42393.8. Samples: 1227844880. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 06:16:26,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 06:16:30,591][12883] Updated weights for policy 0, policy_version 74941 (0.0038) +[2024-06-18 06:16:31,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 1227915264. Throughput: 0: 42333.8. Samples: 1227968160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 06:16:31,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 06:16:34,471][12883] Updated weights for policy 0, policy_version 74951 (0.0029) +[2024-06-18 06:16:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1228095488. Throughput: 0: 42209.8. Samples: 1228220420. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 06:16:36,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 06:16:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074958_1228111872.pth... +[2024-06-18 06:16:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074338_1217953792.pth +[2024-06-18 06:16:38,176][12883] Updated weights for policy 0, policy_version 74961 (0.0023) +[2024-06-18 06:16:41,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1228292096. Throughput: 0: 42047.5. Samples: 1228466220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 06:16:41,994][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 06:16:42,445][12883] Updated weights for policy 0, policy_version 74971 (0.0024) +[2024-06-18 06:16:46,250][12883] Updated weights for policy 0, policy_version 74981 (0.0054) +[2024-06-18 06:16:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1228521472. Throughput: 0: 42084.1. Samples: 1228589980. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 06:16:46,994][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 06:16:50,359][12883] Updated weights for policy 0, policy_version 74991 (0.0037) +[2024-06-18 06:16:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42321.0). Total num frames: 1228718080. Throughput: 0: 41693.4. Samples: 1228832360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 06:16:51,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 06:16:54,131][12883] Updated weights for policy 0, policy_version 75001 (0.0034) +[2024-06-18 06:16:56,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41232.9, 300 sec: 42209.6). Total num frames: 1228898304. Throughput: 0: 41713.2. Samples: 1229086700. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 06:16:56,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 06:16:58,273][12883] Updated weights for policy 0, policy_version 75011 (0.0038) +[2024-06-18 06:17:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1229127680. Throughput: 0: 41684.1. Samples: 1229208760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 06:17:01,994][12645] Avg episode reward: [(0, '0.401')] +[2024-06-18 06:17:02,057][12883] Updated weights for policy 0, policy_version 75021 (0.0024) +[2024-06-18 06:17:06,039][12883] Updated weights for policy 0, policy_version 75031 (0.0031) +[2024-06-18 06:17:06,994][12645] Fps is (10 sec: 45875.4, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 1229357056. Throughput: 0: 41710.1. Samples: 1229465680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:17:06,998][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 06:17:09,864][12883] Updated weights for policy 0, policy_version 75041 (0.0031) +[2024-06-18 06:17:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 42265.2). Total num frames: 1229553664. Throughput: 0: 41507.5. Samples: 1229712720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:17:11,994][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 06:17:13,933][12883] Updated weights for policy 0, policy_version 75051 (0.0037) +[2024-06-18 06:17:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1229783040. Throughput: 0: 41543.1. Samples: 1229837600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:17:16,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 06:17:17,485][12883] Updated weights for policy 0, policy_version 75061 (0.0040) +[2024-06-18 06:17:21,075][12862] Signal inference workers to stop experience collection... 
(17800 times) +[2024-06-18 06:17:21,076][12862] Signal inference workers to resume experience collection... (17800 times) +[2024-06-18 06:17:21,088][12883] InferenceWorker_p0-w0: stopping experience collection (17800 times) +[2024-06-18 06:17:21,088][12883] InferenceWorker_p0-w0: resuming experience collection (17800 times) +[2024-06-18 06:17:21,755][12883] Updated weights for policy 0, policy_version 75071 (0.0023) +[2024-06-18 06:17:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1229979648. Throughput: 0: 41710.6. Samples: 1230097400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:17:21,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 06:17:24,998][12883] Updated weights for policy 0, policy_version 75081 (0.0027) +[2024-06-18 06:17:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1230209024. Throughput: 0: 41822.2. Samples: 1230348220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:17:26,994][12645] Avg episode reward: [(0, '0.124')] +[2024-06-18 06:17:29,453][12883] Updated weights for policy 0, policy_version 75091 (0.0050) +[2024-06-18 06:17:31,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41506.2, 300 sec: 42209.7). Total num frames: 1230405632. Throughput: 0: 41963.7. Samples: 1230478340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:17:31,994][12645] Avg episode reward: [(0, '0.114')] +[2024-06-18 06:17:32,543][12883] Updated weights for policy 0, policy_version 75101 (0.0035) +[2024-06-18 06:17:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42209.9). Total num frames: 1230602240. Throughput: 0: 42252.9. Samples: 1230733740. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:17:36,994][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 06:17:37,086][12883] Updated weights for policy 0, policy_version 75111 (0.0037) +[2024-06-18 06:17:40,304][12883] Updated weights for policy 0, policy_version 75121 (0.0046) +[2024-06-18 06:17:41,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1230815232. Throughput: 0: 42063.6. Samples: 1230979560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:17:41,999][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 06:17:44,736][12883] Updated weights for policy 0, policy_version 75131 (0.0035) +[2024-06-18 06:17:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.4, 300 sec: 42210.0). Total num frames: 1231044608. Throughput: 0: 42240.6. Samples: 1231109580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:17:46,994][12645] Avg episode reward: [(0, '0.451')] +[2024-06-18 06:17:48,062][12883] Updated weights for policy 0, policy_version 75141 (0.0038) +[2024-06-18 06:17:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1231241216. Throughput: 0: 42112.8. Samples: 1231360760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:17:51,994][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 06:17:52,779][12883] Updated weights for policy 0, policy_version 75151 (0.0034) +[2024-06-18 06:17:56,673][12883] Updated weights for policy 0, policy_version 75161 (0.0045) +[2024-06-18 06:17:56,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1231454208. Throughput: 0: 42227.0. Samples: 1231612940. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:17:56,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 06:18:00,544][12883] Updated weights for policy 0, policy_version 75171 (0.0033) +[2024-06-18 06:18:01,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42210.0). Total num frames: 1231667200. Throughput: 0: 42305.0. Samples: 1231741320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:18:01,994][12645] Avg episode reward: [(0, '0.190')] +[2024-06-18 06:18:04,368][12883] Updated weights for policy 0, policy_version 75181 (0.0032) +[2024-06-18 06:18:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 42209.6). Total num frames: 1231847424. Throughput: 0: 42051.6. Samples: 1231989720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:18:06,996][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 06:18:08,257][12883] Updated weights for policy 0, policy_version 75191 (0.0033) +[2024-06-18 06:18:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1232076800. Throughput: 0: 42049.8. Samples: 1232240460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:18:11,994][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 06:18:12,022][12883] Updated weights for policy 0, policy_version 75201 (0.0029) +[2024-06-18 06:18:15,964][12883] Updated weights for policy 0, policy_version 75211 (0.0028) +[2024-06-18 06:18:16,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 42099.5). Total num frames: 1232289792. Throughput: 0: 42121.3. Samples: 1232373800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:18:16,994][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 06:18:19,749][12883] Updated weights for policy 0, policy_version 75221 (0.0032) +[2024-06-18 06:18:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1232502784. Throughput: 0: 42183.1. Samples: 1232631980. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:18:21,994][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 06:18:23,580][12883] Updated weights for policy 0, policy_version 75231 (0.0038) +[2024-06-18 06:18:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1232732160. Throughput: 0: 42271.6. Samples: 1232881780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:18:26,994][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 06:18:27,409][12883] Updated weights for policy 0, policy_version 75241 (0.0040) +[2024-06-18 06:18:31,651][12883] Updated weights for policy 0, policy_version 75251 (0.0028) +[2024-06-18 06:18:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1232928768. Throughput: 0: 42335.9. Samples: 1233014700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 06:18:31,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 06:18:32,696][12862] Signal inference workers to stop experience collection... (17850 times) +[2024-06-18 06:18:32,744][12883] InferenceWorker_p0-w0: stopping experience collection (17850 times) +[2024-06-18 06:18:32,747][12862] Signal inference workers to resume experience collection... (17850 times) +[2024-06-18 06:18:32,757][12883] InferenceWorker_p0-w0: resuming experience collection (17850 times) +[2024-06-18 06:18:35,014][12883] Updated weights for policy 0, policy_version 75261 (0.0046) +[2024-06-18 06:18:36,996][12645] Fps is (10 sec: 39312.6, 60 sec: 42050.7, 300 sec: 42153.8). Total num frames: 1233125376. Throughput: 0: 42313.9. Samples: 1233264980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 06:18:36,996][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 06:18:37,042][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000075265_1233141760.pth... 
+[2024-06-18 06:18:37,093][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074648_1223032832.pth +[2024-06-18 06:18:39,341][12883] Updated weights for policy 0, policy_version 75271 (0.0031) +[2024-06-18 06:18:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42265.1). Total num frames: 1233371136. Throughput: 0: 42285.8. Samples: 1233515800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 06:18:41,998][12645] Avg episode reward: [(0, '0.209')] +[2024-06-18 06:18:42,558][12883] Updated weights for policy 0, policy_version 75281 (0.0037) +[2024-06-18 06:18:46,994][12645] Fps is (10 sec: 42608.1, 60 sec: 41779.1, 300 sec: 42043.3). Total num frames: 1233551360. Throughput: 0: 42435.0. Samples: 1233650900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 06:18:46,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 06:18:47,009][12883] Updated weights for policy 0, policy_version 75291 (0.0038) +[2024-06-18 06:18:50,231][12883] Updated weights for policy 0, policy_version 75301 (0.0039) +[2024-06-18 06:18:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 42154.4). Total num frames: 1233764352. Throughput: 0: 42394.3. Samples: 1233897460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 06:18:51,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 06:18:54,716][12883] Updated weights for policy 0, policy_version 75311 (0.0046) +[2024-06-18 06:18:56,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 1234010112. Throughput: 0: 42363.1. Samples: 1234146800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 06:18:56,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 06:18:58,032][12883] Updated weights for policy 0, policy_version 75321 (0.0046) +[2024-06-18 06:19:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 1234173952. Throughput: 0: 42459.0. 
Samples: 1234284460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 06:19:01,994][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 06:19:02,489][12883] Updated weights for policy 0, policy_version 75331 (0.0039) +[2024-06-18 06:19:05,818][12883] Updated weights for policy 0, policy_version 75341 (0.0049) +[2024-06-18 06:19:06,995][12645] Fps is (10 sec: 39316.7, 60 sec: 42597.6, 300 sec: 42153.9). Total num frames: 1234403328. Throughput: 0: 42172.6. Samples: 1234529800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 06:19:06,995][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 06:19:10,287][12883] Updated weights for policy 0, policy_version 75351 (0.0028) +[2024-06-18 06:19:11,994][12645] Fps is (10 sec: 47513.8, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 1234649088. Throughput: 0: 42139.1. Samples: 1234778040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 06:19:11,994][12645] Avg episode reward: [(0, '0.412')] +[2024-06-18 06:19:14,014][12883] Updated weights for policy 0, policy_version 75361 (0.0042) +[2024-06-18 06:19:17,000][12645] Fps is (10 sec: 39301.7, 60 sec: 41774.8, 300 sec: 41986.6). Total num frames: 1234796544. Throughput: 0: 42097.3. Samples: 1234909340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 06:19:17,001][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 06:19:17,854][12883] Updated weights for policy 0, policy_version 75371 (0.0029) +[2024-06-18 06:19:21,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 1235025920. Throughput: 0: 41995.9. Samples: 1235154700. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 06:19:21,994][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 06:19:22,084][12883] Updated weights for policy 0, policy_version 75381 (0.0040) +[2024-06-18 06:19:25,638][12883] Updated weights for policy 0, policy_version 75391 (0.0027) +[2024-06-18 06:19:26,994][12645] Fps is (10 sec: 47543.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1235271680. Throughput: 0: 41961.8. Samples: 1235404080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:19:26,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 06:19:29,846][12883] Updated weights for policy 0, policy_version 75401 (0.0031) +[2024-06-18 06:19:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 1235435520. Throughput: 0: 41800.0. Samples: 1235531900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:19:31,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 06:19:33,597][12883] Updated weights for policy 0, policy_version 75411 (0.0032) +[2024-06-18 06:19:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42600.1, 300 sec: 42154.1). Total num frames: 1235681280. Throughput: 0: 41803.1. Samples: 1235778600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:19:36,994][12645] Avg episode reward: [(0, '0.280')] +[2024-06-18 06:19:38,156][12883] Updated weights for policy 0, policy_version 75421 (0.0025) +[2024-06-18 06:19:41,472][12883] Updated weights for policy 0, policy_version 75431 (0.0034) +[2024-06-18 06:19:41,996][12645] Fps is (10 sec: 45865.0, 60 sec: 42050.8, 300 sec: 42098.2). Total num frames: 1235894272. Throughput: 0: 42021.0. Samples: 1236037840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:19:41,996][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 06:19:45,678][12883] Updated weights for policy 0, policy_version 75441 (0.0037) +[2024-06-18 06:19:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 1236074496. Throughput: 0: 41791.9. Samples: 1236165100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:19:46,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 06:19:47,948][12862] Signal inference workers to stop experience collection... (17900 times) +[2024-06-18 06:19:47,983][12883] InferenceWorker_p0-w0: stopping experience collection (17900 times) +[2024-06-18 06:19:48,005][12862] Signal inference workers to resume experience collection... (17900 times) +[2024-06-18 06:19:48,006][12883] InferenceWorker_p0-w0: resuming experience collection (17900 times) +[2024-06-18 06:19:49,004][12883] Updated weights for policy 0, policy_version 75451 (0.0035) +[2024-06-18 06:19:51,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 1236320256. Throughput: 0: 42004.3. Samples: 1236419940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:19:51,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 06:19:53,218][12883] Updated weights for policy 0, policy_version 75461 (0.0046) +[2024-06-18 06:19:56,605][12883] Updated weights for policy 0, policy_version 75471 (0.0036) +[2024-06-18 06:19:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 1236516864. Throughput: 0: 42193.7. Samples: 1236676760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:19:56,994][12645] Avg episode reward: [(0, '0.287')] +[2024-06-18 06:20:00,725][12883] Updated weights for policy 0, policy_version 75481 (0.0040) +[2024-06-18 06:20:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 1236713472. 
Throughput: 0: 42014.4. Samples: 1236799720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:20:01,994][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 06:20:04,440][12883] Updated weights for policy 0, policy_version 75491 (0.0036) +[2024-06-18 06:20:07,000][12645] Fps is (10 sec: 44209.9, 60 sec: 42594.8, 300 sec: 42153.2). Total num frames: 1236959232. Throughput: 0: 42255.1. Samples: 1237056440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:20:07,000][12645] Avg episode reward: [(0, '0.215')] +[2024-06-18 06:20:08,273][12883] Updated weights for policy 0, policy_version 75501 (0.0031) +[2024-06-18 06:20:11,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1237155840. Throughput: 0: 42459.7. Samples: 1237314760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:20:11,994][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 06:20:12,038][12883] Updated weights for policy 0, policy_version 75511 (0.0043) +[2024-06-18 06:20:15,949][12883] Updated weights for policy 0, policy_version 75521 (0.0042) +[2024-06-18 06:20:16,994][12645] Fps is (10 sec: 37706.4, 60 sec: 42329.7, 300 sec: 41987.5). Total num frames: 1237336064. Throughput: 0: 42308.8. Samples: 1237435800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:20:16,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 06:20:19,928][12883] Updated weights for policy 0, policy_version 75531 (0.0035) +[2024-06-18 06:20:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 1237581824. Throughput: 0: 42376.8. Samples: 1237685560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:20:21,994][12645] Avg episode reward: [(0, '0.200')] +[2024-06-18 06:20:23,999][12883] Updated weights for policy 0, policy_version 75541 (0.0038) +[2024-06-18 06:20:26,994][12645] Fps is (10 sec: 44237.6, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1237778432. 
Throughput: 0: 42435.1. Samples: 1237947320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:20:26,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 06:20:27,751][12883] Updated weights for policy 0, policy_version 75551 (0.0041) +[2024-06-18 06:20:31,786][12883] Updated weights for policy 0, policy_version 75561 (0.0041) +[2024-06-18 06:20:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 1237991424. Throughput: 0: 42229.0. Samples: 1238065400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:20:31,996][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 06:20:35,503][12883] Updated weights for policy 0, policy_version 75571 (0.0034) +[2024-06-18 06:20:36,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 1238220800. Throughput: 0: 42206.1. Samples: 1238319220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:20:36,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 06:20:37,119][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000075576_1238237184.pth... +[2024-06-18 06:20:37,174][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074958_1228111872.pth +[2024-06-18 06:20:39,638][12883] Updated weights for policy 0, policy_version 75581 (0.0037) +[2024-06-18 06:20:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42053.9, 300 sec: 42154.1). Total num frames: 1238417408. Throughput: 0: 42277.5. Samples: 1238579240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:20:41,994][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 06:20:43,115][12883] Updated weights for policy 0, policy_version 75591 (0.0033) +[2024-06-18 06:20:46,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42596.9, 300 sec: 42098.2). Total num frames: 1238630400. Throughput: 0: 42208.1. Samples: 1238699180. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:20:46,996][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 06:20:47,828][12883] Updated weights for policy 0, policy_version 75601 (0.0043) +[2024-06-18 06:20:50,712][12883] Updated weights for policy 0, policy_version 75611 (0.0036) +[2024-06-18 06:20:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1238843392. Throughput: 0: 42263.2. Samples: 1238958020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:20:51,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 06:20:55,533][12883] Updated weights for policy 0, policy_version 75621 (0.0027) +[2024-06-18 06:20:56,994][12645] Fps is (10 sec: 40968.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1239040000. Throughput: 0: 42195.9. Samples: 1239213580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 06:20:56,994][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 06:20:58,354][12883] Updated weights for policy 0, policy_version 75631 (0.0037) +[2024-06-18 06:21:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 1239269376. Throughput: 0: 42313.9. Samples: 1239339920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 06:21:01,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 06:21:03,071][12883] Updated weights for policy 0, policy_version 75641 (0.0027) +[2024-06-18 06:21:06,473][12883] Updated weights for policy 0, policy_version 75651 (0.0031) +[2024-06-18 06:21:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41783.5, 300 sec: 42043.0). Total num frames: 1239465984. Throughput: 0: 42327.1. Samples: 1239590280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 06:21:06,994][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 06:21:08,972][12862] Signal inference workers to stop experience collection... 
(17950 times) +[2024-06-18 06:21:08,973][12862] Signal inference workers to resume experience collection... (17950 times) +[2024-06-18 06:21:09,004][12883] InferenceWorker_p0-w0: stopping experience collection (17950 times) +[2024-06-18 06:21:09,004][12883] InferenceWorker_p0-w0: resuming experience collection (17950 times) +[2024-06-18 06:21:10,787][12883] Updated weights for policy 0, policy_version 75661 (0.0038) +[2024-06-18 06:21:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1239695360. Throughput: 0: 42137.2. Samples: 1239843500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 06:21:11,994][12645] Avg episode reward: [(0, '0.278')] +[2024-06-18 06:21:14,035][12883] Updated weights for policy 0, policy_version 75671 (0.0030) +[2024-06-18 06:21:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 1239908352. Throughput: 0: 42445.7. Samples: 1239975460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 06:21:16,994][12645] Avg episode reward: [(0, '0.181')] +[2024-06-18 06:21:18,328][12883] Updated weights for policy 0, policy_version 75681 (0.0024) +[2024-06-18 06:21:21,963][12883] Updated weights for policy 0, policy_version 75691 (0.0041) +[2024-06-18 06:21:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1240121344. Throughput: 0: 42512.0. Samples: 1240232260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 06:21:21,995][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 06:21:26,134][12883] Updated weights for policy 0, policy_version 75701 (0.0033) +[2024-06-18 06:21:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 1240334336. Throughput: 0: 42303.4. Samples: 1240482900. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 06:21:26,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 06:21:29,947][12883] Updated weights for policy 0, policy_version 75711 (0.0032) +[2024-06-18 06:21:31,994][12645] Fps is (10 sec: 40961.0, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1240530944. Throughput: 0: 42405.8. Samples: 1240607340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 06:21:31,994][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 06:21:33,723][12883] Updated weights for policy 0, policy_version 75721 (0.0035) +[2024-06-18 06:21:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1240743936. Throughput: 0: 42296.9. Samples: 1240861380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 06:21:36,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 06:21:37,605][12883] Updated weights for policy 0, policy_version 75731 (0.0036) +[2024-06-18 06:21:41,317][12883] Updated weights for policy 0, policy_version 75741 (0.0043) +[2024-06-18 06:21:41,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1240940544. Throughput: 0: 42148.0. Samples: 1241110240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 06:21:41,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 06:21:45,526][12883] Updated weights for policy 0, policy_version 75751 (0.0039) +[2024-06-18 06:21:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42326.8, 300 sec: 42209.6). Total num frames: 1241169920. Throughput: 0: 42099.0. Samples: 1241234380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 06:21:46,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 06:21:49,237][12883] Updated weights for policy 0, policy_version 75761 (0.0044) +[2024-06-18 06:21:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1241382912. Throughput: 0: 42177.4. Samples: 1241488260. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 06:21:51,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 06:21:53,223][12883] Updated weights for policy 0, policy_version 75771 (0.0040) +[2024-06-18 06:21:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1241579520. Throughput: 0: 42115.5. Samples: 1241738700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 06:21:56,994][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 06:21:57,262][12883] Updated weights for policy 0, policy_version 75781 (0.0029) +[2024-06-18 06:22:01,191][12883] Updated weights for policy 0, policy_version 75791 (0.0044) +[2024-06-18 06:22:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1241792512. Throughput: 0: 41934.3. Samples: 1241862500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 06:22:01,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 06:22:05,073][12883] Updated weights for policy 0, policy_version 75801 (0.0037) +[2024-06-18 06:22:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1242005504. Throughput: 0: 41894.4. Samples: 1242117500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 06:22:06,994][12645] Avg episode reward: [(0, '0.665')] +[2024-06-18 06:22:07,073][12862] Saving new best policy, reward=0.665! +[2024-06-18 06:22:09,046][12883] Updated weights for policy 0, policy_version 75811 (0.0042) +[2024-06-18 06:22:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 1242218496. Throughput: 0: 41886.8. Samples: 1242367800. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 06:22:11,994][12645] Avg episode reward: [(0, '0.548')] +[2024-06-18 06:22:12,702][12883] Updated weights for policy 0, policy_version 75821 (0.0034) +[2024-06-18 06:22:16,814][12883] Updated weights for policy 0, policy_version 75831 (0.0032) +[2024-06-18 06:22:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1242415104. Throughput: 0: 41936.7. Samples: 1242494500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 06:22:16,994][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 06:22:20,423][12883] Updated weights for policy 0, policy_version 75841 (0.0033) +[2024-06-18 06:22:21,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 1242611712. Throughput: 0: 41879.1. Samples: 1242745940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 06:22:21,994][12645] Avg episode reward: [(0, '0.248')] +[2024-06-18 06:22:24,551][12883] Updated weights for policy 0, policy_version 75851 (0.0038) +[2024-06-18 06:22:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1242841088. Throughput: 0: 42069.5. Samples: 1243003360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 06:22:26,994][12645] Avg episode reward: [(0, '0.144')] +[2024-06-18 06:22:28,104][12883] Updated weights for policy 0, policy_version 75861 (0.0031) +[2024-06-18 06:22:31,996][12645] Fps is (10 sec: 42588.6, 60 sec: 41777.5, 300 sec: 42153.8). Total num frames: 1243037696. Throughput: 0: 42075.7. Samples: 1243127880. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 06:22:31,997][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 06:22:32,400][12883] Updated weights for policy 0, policy_version 75871 (0.0032) +[2024-06-18 06:22:36,119][12883] Updated weights for policy 0, policy_version 75881 (0.0034) +[2024-06-18 06:22:36,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 1243250688. Throughput: 0: 41950.6. Samples: 1243376040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 06:22:36,996][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 06:22:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000075882_1243250688.pth... +[2024-06-18 06:22:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000075265_1233141760.pth +[2024-06-18 06:22:40,197][12883] Updated weights for policy 0, policy_version 75891 (0.0042) +[2024-06-18 06:22:41,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 1243463680. Throughput: 0: 42073.4. Samples: 1243632000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 06:22:41,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 06:22:43,678][12883] Updated weights for policy 0, policy_version 75901 (0.0022) +[2024-06-18 06:22:45,404][12862] Signal inference workers to stop experience collection... (18000 times) +[2024-06-18 06:22:45,405][12862] Signal inference workers to resume experience collection... (18000 times) +[2024-06-18 06:22:45,422][12883] InferenceWorker_p0-w0: stopping experience collection (18000 times) +[2024-06-18 06:22:45,449][12883] InferenceWorker_p0-w0: resuming experience collection (18000 times) +[2024-06-18 06:22:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1243676672. Throughput: 0: 42082.2. Samples: 1243756200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 06:22:46,994][12645] Avg episode reward: [(0, '0.267')] +[2024-06-18 06:22:47,869][12883] Updated weights for policy 0, policy_version 75911 (0.0054) +[2024-06-18 06:22:51,856][12883] Updated weights for policy 0, policy_version 75921 (0.0028) +[2024-06-18 06:22:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1243889664. Throughput: 0: 42000.8. Samples: 1244007540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 06:22:51,994][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 06:22:55,525][12883] Updated weights for policy 0, policy_version 75931 (0.0039) +[2024-06-18 06:22:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1244119040. Throughput: 0: 42171.4. Samples: 1244265520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 06:22:56,994][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 06:22:59,577][12883] Updated weights for policy 0, policy_version 75941 (0.0037) +[2024-06-18 06:23:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1244315648. Throughput: 0: 42085.4. Samples: 1244388340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 06:23:01,994][12645] Avg episode reward: [(0, '0.302')] +[2024-06-18 06:23:03,198][12883] Updated weights for policy 0, policy_version 75951 (0.0028) +[2024-06-18 06:23:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1244528640. Throughput: 0: 42143.5. Samples: 1244642400. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 06:23:06,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 06:23:07,194][12883] Updated weights for policy 0, policy_version 75961 (0.0030) +[2024-06-18 06:23:10,919][12883] Updated weights for policy 0, policy_version 75971 (0.0025) +[2024-06-18 06:23:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1244741632. Throughput: 0: 42214.2. Samples: 1244903000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) +[2024-06-18 06:23:11,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 06:23:14,911][12883] Updated weights for policy 0, policy_version 75981 (0.0033) +[2024-06-18 06:23:16,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42596.9, 300 sec: 42264.8). Total num frames: 1244971008. Throughput: 0: 42279.6. Samples: 1245030460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 06:23:16,996][12645] Avg episode reward: [(0, '0.186')] +[2024-06-18 06:23:18,623][12883] Updated weights for policy 0, policy_version 75991 (0.0032) +[2024-06-18 06:23:21,996][12645] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42098.2). Total num frames: 1245151232. Throughput: 0: 42381.1. Samples: 1245283280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 06:23:21,997][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 06:23:22,686][12883] Updated weights for policy 0, policy_version 76001 (0.0036) +[2024-06-18 06:23:26,449][12883] Updated weights for policy 0, policy_version 76011 (0.0040) +[2024-06-18 06:23:26,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 1245380608. Throughput: 0: 42401.7. Samples: 1245540080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 06:23:26,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 06:23:30,487][12883] Updated weights for policy 0, policy_version 76021 (0.0039) +[2024-06-18 06:23:31,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42600.1, 300 sec: 42265.5). Total num frames: 1245593600. Throughput: 0: 42449.8. Samples: 1245666440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 06:23:31,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 06:23:33,928][12883] Updated weights for policy 0, policy_version 76031 (0.0022) +[2024-06-18 06:23:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 1245806592. Throughput: 0: 42468.4. Samples: 1245918620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 06:23:36,994][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 06:23:38,197][12883] Updated weights for policy 0, policy_version 76041 (0.0027) +[2024-06-18 06:23:41,521][12883] Updated weights for policy 0, policy_version 76051 (0.0031) +[2024-06-18 06:23:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1246035968. Throughput: 0: 42378.4. Samples: 1246172540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 06:23:41,994][12645] Avg episode reward: [(0, '0.088')] +[2024-06-18 06:23:45,796][12883] Updated weights for policy 0, policy_version 76061 (0.0051) +[2024-06-18 06:23:46,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1246216192. Throughput: 0: 42585.0. Samples: 1246304660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 06:23:46,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 06:23:49,192][12883] Updated weights for policy 0, policy_version 76071 (0.0038) +[2024-06-18 06:23:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 1246445568. Throughput: 0: 42529.0. Samples: 1246556200. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 06:23:51,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 06:23:53,267][12883] Updated weights for policy 0, policy_version 76081 (0.0032) +[2024-06-18 06:23:56,741][12883] Updated weights for policy 0, policy_version 76091 (0.0039) +[2024-06-18 06:23:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 1246674944. Throughput: 0: 42456.4. Samples: 1246813540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 06:23:56,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 06:24:01,291][12883] Updated weights for policy 0, policy_version 76101 (0.0039) +[2024-06-18 06:24:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42209.8). Total num frames: 1246855168. Throughput: 0: 42648.4. Samples: 1246949540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:24:01,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 06:24:04,769][12883] Updated weights for policy 0, policy_version 76111 (0.0029) +[2024-06-18 06:24:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 1247084544. Throughput: 0: 42469.7. Samples: 1247194320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:24:06,994][12645] Avg episode reward: [(0, '0.172')] +[2024-06-18 06:24:08,739][12883] Updated weights for policy 0, policy_version 76121 (0.0032) +[2024-06-18 06:24:11,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42377.1). Total num frames: 1247297536. Throughput: 0: 42506.7. Samples: 1247452880. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:24:11,994][12645] Avg episode reward: [(0, '0.096')] +[2024-06-18 06:24:12,514][12883] Updated weights for policy 0, policy_version 76131 (0.0028) +[2024-06-18 06:24:16,513][12883] Updated weights for policy 0, policy_version 76141 (0.0035) +[2024-06-18 06:24:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42326.9, 300 sec: 42320.7). Total num frames: 1247510528. Throughput: 0: 42466.6. Samples: 1247577440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:24:16,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 06:24:20,262][12883] Updated weights for policy 0, policy_version 76151 (0.0027) +[2024-06-18 06:24:20,487][12862] Signal inference workers to stop experience collection... (18050 times) +[2024-06-18 06:24:20,513][12883] InferenceWorker_p0-w0: stopping experience collection (18050 times) +[2024-06-18 06:24:20,545][12862] Signal inference workers to resume experience collection... (18050 times) +[2024-06-18 06:24:20,546][12883] InferenceWorker_p0-w0: resuming experience collection (18050 times) +[2024-06-18 06:24:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42873.1, 300 sec: 42209.6). Total num frames: 1247723520. Throughput: 0: 42560.1. Samples: 1247833820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:24:21,994][12645] Avg episode reward: [(0, '0.171')] +[2024-06-18 06:24:24,171][12883] Updated weights for policy 0, policy_version 76161 (0.0023) +[2024-06-18 06:24:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1247920128. Throughput: 0: 42822.6. Samples: 1248099560. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:24:26,998][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 06:24:27,736][12883] Updated weights for policy 0, policy_version 76171 (0.0023) +[2024-06-18 06:24:31,863][12883] Updated weights for policy 0, policy_version 76181 (0.0047) +[2024-06-18 06:24:31,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42596.8, 300 sec: 42264.8). Total num frames: 1248149504. Throughput: 0: 42468.1. Samples: 1248215820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:24:31,996][12645] Avg episode reward: [(0, '0.562')] +[2024-06-18 06:24:35,675][12883] Updated weights for policy 0, policy_version 76191 (0.0038) +[2024-06-18 06:24:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42321.0). Total num frames: 1248378880. Throughput: 0: 42608.4. Samples: 1248473580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:24:37,000][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 06:24:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000076195_1248378880.pth... +[2024-06-18 06:24:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000075576_1238237184.pth +[2024-06-18 06:24:39,663][12883] Updated weights for policy 0, policy_version 76201 (0.0035) +[2024-06-18 06:24:41,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1248559104. Throughput: 0: 42684.5. Samples: 1248734340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:24:41,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 06:24:43,147][12883] Updated weights for policy 0, policy_version 76211 (0.0030) +[2024-06-18 06:24:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42265.1). Total num frames: 1248788480. Throughput: 0: 42385.2. Samples: 1248856880. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 06:24:46,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 06:24:47,197][12883] Updated weights for policy 0, policy_version 76221 (0.0037) +[2024-06-18 06:24:50,658][12883] Updated weights for policy 0, policy_version 76231 (0.0034) +[2024-06-18 06:24:51,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 1249017856. Throughput: 0: 42743.6. Samples: 1249117780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 06:24:51,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 06:24:54,689][12883] Updated weights for policy 0, policy_version 76241 (0.0037) +[2024-06-18 06:24:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1249214464. Throughput: 0: 42733.4. Samples: 1249375880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 06:24:56,994][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 06:24:58,501][12883] Updated weights for policy 0, policy_version 76251 (0.0026) +[2024-06-18 06:25:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42266.0). Total num frames: 1249427456. Throughput: 0: 42678.6. Samples: 1249497980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 06:25:01,994][12645] Avg episode reward: [(0, '0.143')] +[2024-06-18 06:25:02,781][12883] Updated weights for policy 0, policy_version 76261 (0.0044) +[2024-06-18 06:25:06,163][12883] Updated weights for policy 0, policy_version 76271 (0.0024) +[2024-06-18 06:25:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 1249656832. Throughput: 0: 42632.9. Samples: 1249752300. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 06:25:06,994][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 06:25:10,441][12883] Updated weights for policy 0, policy_version 76281 (0.0027) +[2024-06-18 06:25:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1249820672. Throughput: 0: 42513.4. Samples: 1250012660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 06:25:11,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 06:25:13,737][12883] Updated weights for policy 0, policy_version 76291 (0.0040) +[2024-06-18 06:25:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1250066432. Throughput: 0: 42618.1. Samples: 1250133540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 06:25:16,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 06:25:18,038][12883] Updated weights for policy 0, policy_version 76301 (0.0038) +[2024-06-18 06:25:21,587][12883] Updated weights for policy 0, policy_version 76311 (0.0030) +[2024-06-18 06:25:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1250279424. Throughput: 0: 42535.2. Samples: 1250387660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 06:25:21,994][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 06:25:25,706][12883] Updated weights for policy 0, policy_version 76321 (0.0037) +[2024-06-18 06:25:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1250459648. Throughput: 0: 42565.6. Samples: 1250649800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 06:25:26,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 06:25:29,355][12883] Updated weights for policy 0, policy_version 76331 (0.0024) +[2024-06-18 06:25:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42326.9, 300 sec: 42265.2). Total num frames: 1250689024. Throughput: 0: 42461.4. Samples: 1250767640. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 06:25:31,994][12645] Avg episode reward: [(0, '0.612')] +[2024-06-18 06:25:33,463][12883] Updated weights for policy 0, policy_version 76341 (0.0037) +[2024-06-18 06:25:36,922][12883] Updated weights for policy 0, policy_version 76351 (0.0035) +[2024-06-18 06:25:36,996][12645] Fps is (10 sec: 47503.0, 60 sec: 42596.8, 300 sec: 42431.4). Total num frames: 1250934784. Throughput: 0: 42396.0. Samples: 1251025700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) +[2024-06-18 06:25:36,997][12645] Avg episode reward: [(0, '0.692')] +[2024-06-18 06:25:37,013][12862] Saving new best policy, reward=0.692! +[2024-06-18 06:25:41,105][12883] Updated weights for policy 0, policy_version 76361 (0.0026) +[2024-06-18 06:25:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42265.5). Total num frames: 1251098624. Throughput: 0: 42446.3. Samples: 1251285960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) +[2024-06-18 06:25:41,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 06:25:44,359][12883] Updated weights for policy 0, policy_version 76371 (0.0028) +[2024-06-18 06:25:46,994][12645] Fps is (10 sec: 39330.5, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1251328000. Throughput: 0: 42330.2. Samples: 1251402840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) +[2024-06-18 06:25:46,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 06:25:48,274][12862] Signal inference workers to stop experience collection... (18100 times) +[2024-06-18 06:25:48,274][12862] Signal inference workers to resume experience collection... 
(18100 times) +[2024-06-18 06:25:48,293][12883] InferenceWorker_p0-w0: stopping experience collection (18100 times) +[2024-06-18 06:25:48,322][12883] InferenceWorker_p0-w0: resuming experience collection (18100 times) +[2024-06-18 06:25:48,898][12883] Updated weights for policy 0, policy_version 76381 (0.0032) +[2024-06-18 06:25:51,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1251557376. Throughput: 0: 42392.4. Samples: 1251659960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) +[2024-06-18 06:25:51,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 06:25:52,492][12883] Updated weights for policy 0, policy_version 76391 (0.0028) +[2024-06-18 06:25:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1251737600. Throughput: 0: 42306.7. Samples: 1251916460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) +[2024-06-18 06:25:56,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 06:25:57,079][12883] Updated weights for policy 0, policy_version 76401 (0.0036) +[2024-06-18 06:26:00,081][12883] Updated weights for policy 0, policy_version 76411 (0.0040) +[2024-06-18 06:26:02,000][12645] Fps is (10 sec: 39297.4, 60 sec: 42048.0, 300 sec: 42319.8). Total num frames: 1251950592. Throughput: 0: 42287.1. Samples: 1252036720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) +[2024-06-18 06:26:02,001][12645] Avg episode reward: [(0, '0.451')] +[2024-06-18 06:26:04,913][12883] Updated weights for policy 0, policy_version 76421 (0.0035) +[2024-06-18 06:26:06,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1252196352. Throughput: 0: 42394.2. Samples: 1252295400. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) +[2024-06-18 06:26:06,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 06:26:07,729][12883] Updated weights for policy 0, policy_version 76431 (0.0029) +[2024-06-18 06:26:11,994][12645] Fps is (10 sec: 40985.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1252360192. Throughput: 0: 42252.0. Samples: 1252551140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) +[2024-06-18 06:26:11,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 06:26:12,489][12883] Updated weights for policy 0, policy_version 76441 (0.0037) +[2024-06-18 06:26:15,439][12883] Updated weights for policy 0, policy_version 76451 (0.0040) +[2024-06-18 06:26:16,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1252589568. Throughput: 0: 42292.4. Samples: 1252670800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) +[2024-06-18 06:26:16,994][12645] Avg episode reward: [(0, '0.609')] +[2024-06-18 06:26:20,112][12883] Updated weights for policy 0, policy_version 76461 (0.0032) +[2024-06-18 06:26:21,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1252835328. Throughput: 0: 42461.7. Samples: 1252936380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-18 06:26:21,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 06:26:23,167][12883] Updated weights for policy 0, policy_version 76471 (0.0028) +[2024-06-18 06:26:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1253015552. Throughput: 0: 42397.3. Samples: 1253193840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-18 06:26:26,994][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 06:26:27,790][12883] Updated weights for policy 0, policy_version 76481 (0.0045) +[2024-06-18 06:26:30,710][12883] Updated weights for policy 0, policy_version 76491 (0.0023) +[2024-06-18 06:26:31,994][12645] Fps is (10 sec: 40958.8, 60 sec: 42598.2, 300 sec: 42376.2). Total num frames: 1253244928. Throughput: 0: 42475.3. Samples: 1253314240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-18 06:26:31,995][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 06:26:35,877][12883] Updated weights for policy 0, policy_version 76501 (0.0038) +[2024-06-18 06:26:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42053.8, 300 sec: 42431.8). Total num frames: 1253457920. Throughput: 0: 42497.3. Samples: 1253572340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-18 06:26:36,995][12645] Avg episode reward: [(0, '0.359')] +[2024-06-18 06:26:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000076505_1253457920.pth... +[2024-06-18 06:26:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000075882_1243250688.pth +[2024-06-18 06:26:39,008][12883] Updated weights for policy 0, policy_version 76511 (0.0033) +[2024-06-18 06:26:41,996][12645] Fps is (10 sec: 39313.9, 60 sec: 42323.7, 300 sec: 42264.9). Total num frames: 1253638144. Throughput: 0: 42401.0. Samples: 1253824600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-18 06:26:41,997][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 06:26:43,323][12883] Updated weights for policy 0, policy_version 76521 (0.0023) +[2024-06-18 06:26:46,839][12883] Updated weights for policy 0, policy_version 76531 (0.0034) +[2024-06-18 06:26:46,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42596.8, 300 sec: 42375.9). Total num frames: 1253883904. Throughput: 0: 42445.9. Samples: 1253946620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-18 06:26:46,996][12645] Avg episode reward: [(0, '0.063')] +[2024-06-18 06:26:51,221][12883] Updated weights for policy 0, policy_version 76541 (0.0044) +[2024-06-18 06:26:51,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1254096896. Throughput: 0: 42539.0. Samples: 1254209660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-18 06:26:51,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 06:26:54,485][12883] Updated weights for policy 0, policy_version 76551 (0.0031) +[2024-06-18 06:26:56,994][12645] Fps is (10 sec: 40968.9, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1254293504. Throughput: 0: 42409.8. Samples: 1254459580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-18 06:26:56,994][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 06:26:58,917][12883] Updated weights for policy 0, policy_version 76561 (0.0036) +[2024-06-18 06:26:59,200][12862] Signal inference workers to stop experience collection... (18150 times) +[2024-06-18 06:26:59,200][12862] Signal inference workers to resume experience collection... (18150 times) +[2024-06-18 06:26:59,239][12883] InferenceWorker_p0-w0: stopping experience collection (18150 times) +[2024-06-18 06:26:59,239][12883] InferenceWorker_p0-w0: resuming experience collection (18150 times) +[2024-06-18 06:27:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42875.8, 300 sec: 42431.8). Total num frames: 1254522880. Throughput: 0: 42589.8. Samples: 1254587340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-18 06:27:01,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 06:27:02,411][12883] Updated weights for policy 0, policy_version 76571 (0.0033) +[2024-06-18 06:27:06,705][12883] Updated weights for policy 0, policy_version 76581 (0.0032) +[2024-06-18 06:27:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1254719488. 
Throughput: 0: 42528.0. Samples: 1254850140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) +[2024-06-18 06:27:06,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 06:27:10,185][12883] Updated weights for policy 0, policy_version 76591 (0.0042) +[2024-06-18 06:27:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 1254948864. Throughput: 0: 42131.4. Samples: 1255089760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) +[2024-06-18 06:27:11,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 06:27:14,342][12883] Updated weights for policy 0, policy_version 76601 (0.0043) +[2024-06-18 06:27:16,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1255178240. Throughput: 0: 42487.4. Samples: 1255226160. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) +[2024-06-18 06:27:16,994][12645] Avg episode reward: [(0, '0.259')] +[2024-06-18 06:27:17,651][12883] Updated weights for policy 0, policy_version 76611 (0.0035) +[2024-06-18 06:27:21,994][12645] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1255342080. Throughput: 0: 42541.0. Samples: 1255486680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) +[2024-06-18 06:27:21,994][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 06:27:22,023][12883] Updated weights for policy 0, policy_version 76621 (0.0029) +[2024-06-18 06:27:25,815][12883] Updated weights for policy 0, policy_version 76631 (0.0031) +[2024-06-18 06:27:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 1255587840. Throughput: 0: 42298.6. Samples: 1255727940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) +[2024-06-18 06:27:26,994][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 06:27:29,958][12883] Updated weights for policy 0, policy_version 76641 (0.0028) +[2024-06-18 06:27:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.7, 300 sec: 42542.9). Total num frames: 1255800832. 
Throughput: 0: 42580.4. Samples: 1255862640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) +[2024-06-18 06:27:31,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 06:27:33,318][12883] Updated weights for policy 0, policy_version 76651 (0.0030) +[2024-06-18 06:27:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1255997440. Throughput: 0: 42418.3. Samples: 1256118480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) +[2024-06-18 06:27:36,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 06:27:37,412][12883] Updated weights for policy 0, policy_version 76661 (0.0024) +[2024-06-18 06:27:40,916][12883] Updated weights for policy 0, policy_version 76671 (0.0038) +[2024-06-18 06:27:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43146.2, 300 sec: 42542.9). Total num frames: 1256226816. Throughput: 0: 42439.7. Samples: 1256369360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) +[2024-06-18 06:27:41,994][12645] Avg episode reward: [(0, '0.401')] +[2024-06-18 06:27:44,981][12883] Updated weights for policy 0, policy_version 76681 (0.0041) +[2024-06-18 06:27:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 1256423424. Throughput: 0: 42525.4. Samples: 1256500980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) +[2024-06-18 06:27:46,994][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 06:27:48,708][12883] Updated weights for policy 0, policy_version 76691 (0.0042) +[2024-06-18 06:27:51,996][12645] Fps is (10 sec: 39312.5, 60 sec: 42050.7, 300 sec: 42375.9). Total num frames: 1256620032. Throughput: 0: 42257.0. Samples: 1256751800. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) +[2024-06-18 06:27:51,997][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 06:27:52,932][12883] Updated weights for policy 0, policy_version 76701 (0.0030) +[2024-06-18 06:27:56,283][12883] Updated weights for policy 0, policy_version 76711 (0.0031) +[2024-06-18 06:27:56,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1256849408. Throughput: 0: 42474.7. Samples: 1257001120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:27:56,994][12645] Avg episode reward: [(0, '0.217')] +[2024-06-18 06:28:00,570][12883] Updated weights for policy 0, policy_version 76721 (0.0037) +[2024-06-18 06:28:01,994][12645] Fps is (10 sec: 44246.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1257062400. Throughput: 0: 42384.1. Samples: 1257133440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:28:01,994][12645] Avg episode reward: [(0, '0.228')] +[2024-06-18 06:28:03,932][12883] Updated weights for policy 0, policy_version 76731 (0.0027) +[2024-06-18 06:28:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1257259008. Throughput: 0: 42224.0. Samples: 1257386760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:28:06,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 06:28:08,268][12883] Updated weights for policy 0, policy_version 76741 (0.0040) +[2024-06-18 06:28:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42376.6). Total num frames: 1257472000. Throughput: 0: 42359.6. Samples: 1257634120. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:28:11,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 06:28:12,055][12883] Updated weights for policy 0, policy_version 76751 (0.0033) +[2024-06-18 06:28:16,066][12883] Updated weights for policy 0, policy_version 76761 (0.0034) +[2024-06-18 06:28:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 1257701376. Throughput: 0: 42295.5. Samples: 1257765940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:28:16,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 06:28:19,695][12883] Updated weights for policy 0, policy_version 76771 (0.0029) +[2024-06-18 06:28:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1257897984. Throughput: 0: 42189.2. Samples: 1258017000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:28:21,994][12645] Avg episode reward: [(0, '0.551')] +[2024-06-18 06:28:23,831][12883] Updated weights for policy 0, policy_version 76781 (0.0028) +[2024-06-18 06:28:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1258127360. Throughput: 0: 42225.2. Samples: 1258269500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:28:26,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 06:28:27,218][12883] Updated weights for policy 0, policy_version 76791 (0.0034) +[2024-06-18 06:28:30,796][12862] Signal inference workers to stop experience collection... (18200 times) +[2024-06-18 06:28:30,828][12883] InferenceWorker_p0-w0: stopping experience collection (18200 times) +[2024-06-18 06:28:30,854][12862] Signal inference workers to resume experience collection... 
(18200 times) +[2024-06-18 06:28:30,855][12883] InferenceWorker_p0-w0: resuming experience collection (18200 times) +[2024-06-18 06:28:31,377][12883] Updated weights for policy 0, policy_version 76801 (0.0023) +[2024-06-18 06:28:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1258323968. Throughput: 0: 42267.1. Samples: 1258403000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:28:31,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 06:28:34,694][12883] Updated weights for policy 0, policy_version 76811 (0.0032) +[2024-06-18 06:28:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1258536960. Throughput: 0: 42454.4. Samples: 1258662160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:28:37,000][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 06:28:37,026][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000076815_1258536960.pth... +[2024-06-18 06:28:37,093][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000076195_1248378880.pth +[2024-06-18 06:28:39,434][12883] Updated weights for policy 0, policy_version 76821 (0.0036) +[2024-06-18 06:28:41,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1258766336. Throughput: 0: 42407.5. Samples: 1258909460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:28:41,995][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 06:28:42,443][12883] Updated weights for policy 0, policy_version 76831 (0.0030) +[2024-06-18 06:28:46,881][12883] Updated weights for policy 0, policy_version 76841 (0.0034) +[2024-06-18 06:28:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1258962944. Throughput: 0: 42459.9. Samples: 1259044140. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:28:46,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 06:28:50,492][12883] Updated weights for policy 0, policy_version 76851 (0.0035) +[2024-06-18 06:28:51,994][12645] Fps is (10 sec: 39322.5, 60 sec: 42327.0, 300 sec: 42320.7). Total num frames: 1259159552. Throughput: 0: 42350.7. Samples: 1259292540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:28:51,994][12645] Avg episode reward: [(0, '0.106')] +[2024-06-18 06:28:54,712][12883] Updated weights for policy 0, policy_version 76861 (0.0027) +[2024-06-18 06:28:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1259388928. Throughput: 0: 42580.8. Samples: 1259550260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:28:56,994][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 06:28:58,170][12883] Updated weights for policy 0, policy_version 76871 (0.0034) +[2024-06-18 06:29:01,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1259601920. Throughput: 0: 42532.8. Samples: 1259679920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:29:01,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 06:29:02,296][12883] Updated weights for policy 0, policy_version 76881 (0.0043) +[2024-06-18 06:29:05,831][12883] Updated weights for policy 0, policy_version 76891 (0.0041) +[2024-06-18 06:29:06,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.2, 300 sec: 42431.8). Total num frames: 1259814912. Throughput: 0: 42375.4. Samples: 1259923900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:29:06,994][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 06:29:10,209][12883] Updated weights for policy 0, policy_version 76901 (0.0031) +[2024-06-18 06:29:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1260027904. Throughput: 0: 42215.5. Samples: 1260169200. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:29:11,994][12645] Avg episode reward: [(0, '0.359')] +[2024-06-18 06:29:14,312][12883] Updated weights for policy 0, policy_version 76911 (0.0034) +[2024-06-18 06:29:16,994][12645] Fps is (10 sec: 39322.5, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 1260208128. Throughput: 0: 42168.0. Samples: 1260300560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:29:16,994][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 06:29:17,912][12883] Updated weights for policy 0, policy_version 76921 (0.0050) +[2024-06-18 06:29:21,972][12883] Updated weights for policy 0, policy_version 76931 (0.0031) +[2024-06-18 06:29:21,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42323.8, 300 sec: 42431.5). Total num frames: 1260437504. Throughput: 0: 42054.4. Samples: 1260554700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:29:21,997][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 06:29:25,498][12883] Updated weights for policy 0, policy_version 76941 (0.0026) +[2024-06-18 06:29:26,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42432.1). Total num frames: 1260666880. Throughput: 0: 42219.6. Samples: 1260809340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:29:26,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 06:29:29,633][12883] Updated weights for policy 0, policy_version 76951 (0.0027) +[2024-06-18 06:29:31,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1260847104. Throughput: 0: 42148.5. Samples: 1260940820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:29:31,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 06:29:33,494][12883] Updated weights for policy 0, policy_version 76961 (0.0031) +[2024-06-18 06:29:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1261076480. Throughput: 0: 42256.3. Samples: 1261194080. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:29:36,994][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 06:29:37,142][12883] Updated weights for policy 0, policy_version 76971 (0.0033) +[2024-06-18 06:29:41,091][12883] Updated weights for policy 0, policy_version 76981 (0.0033) +[2024-06-18 06:29:41,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 1261289472. Throughput: 0: 42202.2. Samples: 1261449360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:29:41,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 06:29:44,820][12883] Updated weights for policy 0, policy_version 76991 (0.0031) +[2024-06-18 06:29:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1261486080. Throughput: 0: 42102.7. Samples: 1261574540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:29:46,998][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 06:29:48,638][12883] Updated weights for policy 0, policy_version 77001 (0.0037) +[2024-06-18 06:29:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1261715456. Throughput: 0: 42366.0. Samples: 1261830360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:29:51,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 06:29:52,408][12883] Updated weights for policy 0, policy_version 77011 (0.0034) +[2024-06-18 06:29:56,345][12883] Updated weights for policy 0, policy_version 77021 (0.0031) +[2024-06-18 06:29:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1261912064. Throughput: 0: 42528.9. Samples: 1262083000. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:29:56,994][12645] Avg episode reward: [(0, '0.533')] +[2024-06-18 06:30:00,402][12883] Updated weights for policy 0, policy_version 77031 (0.0032) +[2024-06-18 06:30:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1262141440. Throughput: 0: 42519.5. Samples: 1262213940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:30:01,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 06:30:03,861][12883] Updated weights for policy 0, policy_version 77041 (0.0027) +[2024-06-18 06:30:05,556][12862] Signal inference workers to stop experience collection... (18250 times) +[2024-06-18 06:30:05,556][12862] Signal inference workers to resume experience collection... (18250 times) +[2024-06-18 06:30:05,597][12883] InferenceWorker_p0-w0: stopping experience collection (18250 times) +[2024-06-18 06:30:05,598][12883] InferenceWorker_p0-w0: resuming experience collection (18250 times) +[2024-06-18 06:30:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.5, 300 sec: 42431.8). Total num frames: 1262338048. Throughput: 0: 42520.0. Samples: 1262468000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:30:06,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 06:30:08,153][12883] Updated weights for policy 0, policy_version 77051 (0.0032) +[2024-06-18 06:30:11,882][12883] Updated weights for policy 0, policy_version 77061 (0.0044) +[2024-06-18 06:30:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1262567424. Throughput: 0: 42530.8. Samples: 1262723220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:30:11,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 06:30:15,957][12883] Updated weights for policy 0, policy_version 77071 (0.0039) +[2024-06-18 06:30:16,997][12645] Fps is (10 sec: 44222.9, 60 sec: 42869.3, 300 sec: 42375.8). Total num frames: 1262780416. 
Throughput: 0: 42510.4. Samples: 1262853920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:30:16,997][12645] Avg episode reward: [(0, '0.229')] +[2024-06-18 06:30:19,846][12883] Updated weights for policy 0, policy_version 77081 (0.0034) +[2024-06-18 06:30:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 1262993408. Throughput: 0: 42426.3. Samples: 1263103260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:30:21,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 06:30:23,794][12883] Updated weights for policy 0, policy_version 77091 (0.0053) +[2024-06-18 06:30:26,998][12645] Fps is (10 sec: 40956.7, 60 sec: 42049.6, 300 sec: 42375.7). Total num frames: 1263190016. Throughput: 0: 42359.1. Samples: 1263355680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:30:26,998][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 06:30:27,775][12883] Updated weights for policy 0, policy_version 77101 (0.0036) +[2024-06-18 06:30:31,513][12883] Updated weights for policy 0, policy_version 77111 (0.0033) +[2024-06-18 06:30:31,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42596.8, 300 sec: 42265.2). Total num frames: 1263403008. Throughput: 0: 42397.9. Samples: 1263482540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:30:31,996][12645] Avg episode reward: [(0, '0.243')] +[2024-06-18 06:30:35,255][12883] Updated weights for policy 0, policy_version 77121 (0.0025) +[2024-06-18 06:30:36,994][12645] Fps is (10 sec: 44254.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1263632384. Throughput: 0: 42413.8. Samples: 1263738980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:30:36,994][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 06:30:37,026][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000077126_1263632384.pth... 
+[2024-06-18 06:30:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000076505_1253457920.pth +[2024-06-18 06:30:39,252][12883] Updated weights for policy 0, policy_version 77131 (0.0049) +[2024-06-18 06:30:41,994][12645] Fps is (10 sec: 42607.5, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1263828992. Throughput: 0: 42265.3. Samples: 1263984940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:30:41,994][12645] Avg episode reward: [(0, '0.170')] +[2024-06-18 06:30:43,339][12883] Updated weights for policy 0, policy_version 77141 (0.0038) +[2024-06-18 06:30:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1264025600. Throughput: 0: 42197.0. Samples: 1264112800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:30:46,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 06:30:47,015][12883] Updated weights for policy 0, policy_version 77151 (0.0024) +[2024-06-18 06:30:50,770][12883] Updated weights for policy 0, policy_version 77161 (0.0033) +[2024-06-18 06:30:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 1264254976. Throughput: 0: 42334.9. Samples: 1264373080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:30:51,994][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 06:30:54,704][12883] Updated weights for policy 0, policy_version 77171 (0.0041) +[2024-06-18 06:30:56,996][12645] Fps is (10 sec: 45864.5, 60 sec: 42869.9, 300 sec: 42487.9). Total num frames: 1264484352. Throughput: 0: 42216.0. Samples: 1264623040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:30:56,997][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 06:30:58,238][12883] Updated weights for policy 0, policy_version 77181 (0.0028) +[2024-06-18 06:31:01,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 1264664576. Throughput: 0: 42355.0. 
Samples: 1264759760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:31:01,994][12645] Avg episode reward: [(0, '0.140')] +[2024-06-18 06:31:02,532][12883] Updated weights for policy 0, policy_version 77191 (0.0022) +[2024-06-18 06:31:05,800][12883] Updated weights for policy 0, policy_version 77201 (0.0049) +[2024-06-18 06:31:06,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1264893952. Throughput: 0: 42405.8. Samples: 1265011520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:31:06,994][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 06:31:10,190][12883] Updated weights for policy 0, policy_version 77211 (0.0037) +[2024-06-18 06:31:11,996][12645] Fps is (10 sec: 47502.5, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 1265139712. Throughput: 0: 42377.1. Samples: 1265262580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:31:11,996][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 06:31:13,716][12883] Updated weights for policy 0, policy_version 77221 (0.0035) +[2024-06-18 06:31:16,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42054.5, 300 sec: 42265.2). Total num frames: 1265303552. Throughput: 0: 42471.5. Samples: 1265393660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:31:16,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 06:31:18,104][12883] Updated weights for policy 0, policy_version 77231 (0.0039) +[2024-06-18 06:31:21,287][12883] Updated weights for policy 0, policy_version 77241 (0.0031) +[2024-06-18 06:31:21,994][12645] Fps is (10 sec: 37691.7, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1265516544. Throughput: 0: 42294.2. Samples: 1265642220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:31:21,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 06:31:24,220][12862] Signal inference workers to stop experience collection... 
(18300 times) +[2024-06-18 06:31:24,224][12862] Signal inference workers to resume experience collection... (18300 times) +[2024-06-18 06:31:24,260][12883] InferenceWorker_p0-w0: stopping experience collection (18300 times) +[2024-06-18 06:31:24,260][12883] InferenceWorker_p0-w0: resuming experience collection (18300 times) +[2024-06-18 06:31:25,625][12883] Updated weights for policy 0, policy_version 77251 (0.0021) +[2024-06-18 06:31:26,994][12645] Fps is (10 sec: 45874.2, 60 sec: 42874.1, 300 sec: 42431.8). Total num frames: 1265762304. Throughput: 0: 42421.8. Samples: 1265893920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:31:26,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 06:31:28,893][12883] Updated weights for policy 0, policy_version 77261 (0.0045) +[2024-06-18 06:31:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42326.9, 300 sec: 42320.7). Total num frames: 1265942528. Throughput: 0: 42602.2. Samples: 1266029900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:31:31,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 06:31:33,224][12883] Updated weights for policy 0, policy_version 77271 (0.0034) +[2024-06-18 06:31:36,538][12883] Updated weights for policy 0, policy_version 77281 (0.0045) +[2024-06-18 06:31:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 1266171904. Throughput: 0: 42472.6. Samples: 1266284340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:31:36,994][12645] Avg episode reward: [(0, '0.149')] +[2024-06-18 06:31:40,787][12883] Updated weights for policy 0, policy_version 77291 (0.0039) +[2024-06-18 06:31:41,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42432.1). Total num frames: 1266401280. Throughput: 0: 42648.4. Samples: 1266542120. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:31:41,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 06:31:43,924][12883] Updated weights for policy 0, policy_version 77301 (0.0048) +[2024-06-18 06:31:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1266581504. Throughput: 0: 42487.1. Samples: 1266671680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:31:46,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 06:31:48,362][12883] Updated weights for policy 0, policy_version 77311 (0.0033) +[2024-06-18 06:31:51,418][12883] Updated weights for policy 0, policy_version 77321 (0.0034) +[2024-06-18 06:31:51,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1266827264. Throughput: 0: 42589.8. Samples: 1266928060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:31:51,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 06:31:56,001][12883] Updated weights for policy 0, policy_version 77331 (0.0036) +[2024-06-18 06:31:56,996][12645] Fps is (10 sec: 45864.6, 60 sec: 42598.4, 300 sec: 42431.5). Total num frames: 1267040256. Throughput: 0: 42772.9. Samples: 1267187360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:31:56,996][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 06:31:59,132][12883] Updated weights for policy 0, policy_version 77341 (0.0029) +[2024-06-18 06:32:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1267220480. Throughput: 0: 42751.9. Samples: 1267317500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:32:01,994][12645] Avg episode reward: [(0, '0.152')] +[2024-06-18 06:32:03,890][12883] Updated weights for policy 0, policy_version 77351 (0.0026) +[2024-06-18 06:32:06,665][12883] Updated weights for policy 0, policy_version 77361 (0.0032) +[2024-06-18 06:32:06,994][12645] Fps is (10 sec: 44247.1, 60 sec: 43144.6, 300 sec: 42487.4). Total num frames: 1267482624. Throughput: 0: 42961.4. Samples: 1267575480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:32:06,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 06:32:11,483][12883] Updated weights for policy 0, policy_version 77371 (0.0031) +[2024-06-18 06:32:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42053.9, 300 sec: 42320.7). Total num frames: 1267662848. Throughput: 0: 43163.7. Samples: 1267836280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:32:11,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 06:32:14,198][12883] Updated weights for policy 0, policy_version 77381 (0.0035) +[2024-06-18 06:32:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1267875840. Throughput: 0: 42838.2. Samples: 1267957620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:32:16,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 06:32:19,085][12883] Updated weights for policy 0, policy_version 77391 (0.0036) +[2024-06-18 06:32:21,900][12883] Updated weights for policy 0, policy_version 77401 (0.0032) +[2024-06-18 06:32:21,994][12645] Fps is (10 sec: 47513.8, 60 sec: 43690.7, 300 sec: 42542.9). Total num frames: 1268137984. Throughput: 0: 43075.2. Samples: 1268222720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:32:21,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 06:32:26,761][12883] Updated weights for policy 0, policy_version 77411 (0.0036) +[2024-06-18 06:32:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1268301824. Throughput: 0: 43098.6. Samples: 1268481560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:32:26,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 06:32:29,670][12883] Updated weights for policy 0, policy_version 77421 (0.0027) +[2024-06-18 06:32:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 1268531200. Throughput: 0: 42888.8. Samples: 1268601680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:32:31,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 06:32:34,394][12883] Updated weights for policy 0, policy_version 77431 (0.0036) +[2024-06-18 06:32:36,994][12645] Fps is (10 sec: 47513.3, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 1268776960. Throughput: 0: 42969.9. Samples: 1268861700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:32:36,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 06:32:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000077440_1268776960.pth... +[2024-06-18 06:32:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000076815_1258536960.pth +[2024-06-18 06:32:37,249][12883] Updated weights for policy 0, policy_version 77441 (0.0031) +[2024-06-18 06:32:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1268940800. Throughput: 0: 43179.1. Samples: 1269130320. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:32:41,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 06:32:42,016][12883] Updated weights for policy 0, policy_version 77451 (0.0032) +[2024-06-18 06:32:44,076][12862] Signal inference workers to stop experience collection... (18350 times) +[2024-06-18 06:32:44,119][12883] InferenceWorker_p0-w0: stopping experience collection (18350 times) +[2024-06-18 06:32:44,128][12862] Signal inference workers to resume experience collection... (18350 times) +[2024-06-18 06:32:44,135][12883] InferenceWorker_p0-w0: resuming experience collection (18350 times) +[2024-06-18 06:32:44,939][12883] Updated weights for policy 0, policy_version 77461 (0.0046) +[2024-06-18 06:32:46,994][12645] Fps is (10 sec: 39320.9, 60 sec: 43144.4, 300 sec: 42543.2). Total num frames: 1269170176. Throughput: 0: 42812.3. Samples: 1269244060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:32:46,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 06:32:49,559][12883] Updated weights for policy 0, policy_version 77471 (0.0030) +[2024-06-18 06:32:51,994][12645] Fps is (10 sec: 49152.4, 60 sec: 43417.7, 300 sec: 42654.0). Total num frames: 1269432320. Throughput: 0: 42927.6. Samples: 1269507220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:32:51,994][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 06:32:52,786][12883] Updated weights for policy 0, policy_version 77481 (0.0036) +[2024-06-18 06:32:57,000][12645] Fps is (10 sec: 42572.4, 60 sec: 42595.5, 300 sec: 42486.4). Total num frames: 1269596160. Throughput: 0: 43053.9. Samples: 1269773980. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:32:57,001][12645] Avg episode reward: [(0, '0.280')] +[2024-06-18 06:32:57,137][12883] Updated weights for policy 0, policy_version 77491 (0.0042) +[2024-06-18 06:33:00,809][12883] Updated weights for policy 0, policy_version 77501 (0.0030) +[2024-06-18 06:33:01,996][12645] Fps is (10 sec: 39312.4, 60 sec: 43416.0, 300 sec: 42598.1). Total num frames: 1269825536. Throughput: 0: 42888.9. Samples: 1269887720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:33:01,997][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 06:33:05,065][12883] Updated weights for policy 0, policy_version 77511 (0.0029) +[2024-06-18 06:33:06,994][12645] Fps is (10 sec: 45904.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1270054912. Throughput: 0: 42858.6. Samples: 1270151360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:33:06,994][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 06:33:08,547][12883] Updated weights for policy 0, policy_version 77521 (0.0040) +[2024-06-18 06:33:11,994][12645] Fps is (10 sec: 37691.9, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1270202368. Throughput: 0: 42864.4. Samples: 1270410460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:33:11,994][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 06:33:12,837][12883] Updated weights for policy 0, policy_version 77531 (0.0032) +[2024-06-18 06:33:16,277][12883] Updated weights for policy 0, policy_version 77541 (0.0029) +[2024-06-18 06:33:16,996][12645] Fps is (10 sec: 40950.8, 60 sec: 43142.9, 300 sec: 42598.1). Total num frames: 1270464512. Throughput: 0: 42729.0. Samples: 1270524580. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:33:16,996][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 06:33:20,564][12883] Updated weights for policy 0, policy_version 77551 (0.0029) +[2024-06-18 06:33:21,994][12645] Fps is (10 sec: 47513.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1270677504. Throughput: 0: 42741.0. Samples: 1270785040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 06:33:21,994][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 06:33:23,790][12883] Updated weights for policy 0, policy_version 77561 (0.0037) +[2024-06-18 06:33:26,994][12645] Fps is (10 sec: 37692.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1270841344. Throughput: 0: 42477.0. Samples: 1271041780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:33:26,994][12645] Avg episode reward: [(0, '0.178')] +[2024-06-18 06:33:28,216][12883] Updated weights for policy 0, policy_version 77571 (0.0031) +[2024-06-18 06:33:31,550][12883] Updated weights for policy 0, policy_version 77581 (0.0041) +[2024-06-18 06:33:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1271103488. Throughput: 0: 42583.8. Samples: 1271160320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:33:31,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 06:33:35,819][12883] Updated weights for policy 0, policy_version 77591 (0.0028) +[2024-06-18 06:33:36,994][12645] Fps is (10 sec: 47512.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1271316480. Throughput: 0: 42641.7. Samples: 1271426100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:33:36,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 06:33:39,290][12883] Updated weights for policy 0, policy_version 77601 (0.0029) +[2024-06-18 06:33:41,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1271496704. Throughput: 0: 42255.6. Samples: 1271675220. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:33:41,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 06:33:43,672][12883] Updated weights for policy 0, policy_version 77611 (0.0027) +[2024-06-18 06:33:46,743][12883] Updated weights for policy 0, policy_version 77621 (0.0038) +[2024-06-18 06:33:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1271742464. Throughput: 0: 42520.2. Samples: 1271801040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:33:46,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 06:33:51,256][12883] Updated weights for policy 0, policy_version 77631 (0.0039) +[2024-06-18 06:33:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 41779.1, 300 sec: 42542.8). Total num frames: 1271939072. Throughput: 0: 42475.4. Samples: 1272062760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:33:51,995][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 06:33:54,644][12883] Updated weights for policy 0, policy_version 77641 (0.0048) +[2024-06-18 06:33:56,996][12645] Fps is (10 sec: 40951.3, 60 sec: 42601.3, 300 sec: 42542.5). Total num frames: 1272152064. Throughput: 0: 42184.5. Samples: 1272308860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:33:56,997][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 06:33:59,067][12883] Updated weights for policy 0, policy_version 77651 (0.0035) +[2024-06-18 06:34:01,999][12645] Fps is (10 sec: 44211.7, 60 sec: 42595.9, 300 sec: 42597.6). Total num frames: 1272381440. Throughput: 0: 42486.4. Samples: 1272436620. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:34:02,000][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 06:34:02,485][12883] Updated weights for policy 0, policy_version 77661 (0.0043) +[2024-06-18 06:34:06,716][12883] Updated weights for policy 0, policy_version 77671 (0.0027) +[2024-06-18 06:34:06,994][12645] Fps is (10 sec: 40969.4, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 1272561664. Throughput: 0: 42462.7. Samples: 1272695860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) +[2024-06-18 06:34:06,994][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 06:34:10,044][12883] Updated weights for policy 0, policy_version 77681 (0.0047) +[2024-06-18 06:34:11,994][12645] Fps is (10 sec: 42623.0, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1272807424. Throughput: 0: 42238.1. Samples: 1272942500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:34:11,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 06:34:14,235][12883] Updated weights for policy 0, policy_version 77691 (0.0037) +[2024-06-18 06:34:16,796][12862] Signal inference workers to stop experience collection... (18400 times) +[2024-06-18 06:34:16,796][12862] Signal inference workers to resume experience collection... (18400 times) +[2024-06-18 06:34:16,819][12883] InferenceWorker_p0-w0: stopping experience collection (18400 times) +[2024-06-18 06:34:16,819][12883] InferenceWorker_p0-w0: resuming experience collection (18400 times) +[2024-06-18 06:34:16,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42600.0, 300 sec: 42654.3). Total num frames: 1273020416. Throughput: 0: 42599.9. Samples: 1273077320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:34:16,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 06:34:17,768][12883] Updated weights for policy 0, policy_version 77701 (0.0031) +[2024-06-18 06:34:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1273200640. 
Throughput: 0: 42345.8. Samples: 1273331660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:34:21,994][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 06:34:22,068][12883] Updated weights for policy 0, policy_version 77711 (0.0037) +[2024-06-18 06:34:25,341][12883] Updated weights for policy 0, policy_version 77721 (0.0043) +[2024-06-18 06:34:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 1273446400. Throughput: 0: 42287.1. Samples: 1273578140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:34:26,994][12645] Avg episode reward: [(0, '0.216')] +[2024-06-18 06:34:29,705][12883] Updated weights for policy 0, policy_version 77731 (0.0021) +[2024-06-18 06:34:31,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1273643008. Throughput: 0: 42526.7. Samples: 1273714740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:34:31,998][12645] Avg episode reward: [(0, '0.181')] +[2024-06-18 06:34:33,033][12883] Updated weights for policy 0, policy_version 77741 (0.0036) +[2024-06-18 06:34:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1273856000. Throughput: 0: 42360.6. Samples: 1273968980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:34:36,994][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 06:34:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000077750_1273856000.pth... +[2024-06-18 06:34:37,072][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000077126_1263632384.pth +[2024-06-18 06:34:37,483][12883] Updated weights for policy 0, policy_version 77751 (0.0032) +[2024-06-18 06:34:40,924][12883] Updated weights for policy 0, policy_version 77761 (0.0041) +[2024-06-18 06:34:41,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1274085376. Throughput: 0: 42381.3. 
Samples: 1274215920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:34:41,994][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 06:34:45,193][12883] Updated weights for policy 0, policy_version 77771 (0.0036) +[2024-06-18 06:34:46,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 1274265600. Throughput: 0: 42537.8. Samples: 1274350580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:34:46,994][12645] Avg episode reward: [(0, '0.074')] +[2024-06-18 06:34:48,590][12883] Updated weights for policy 0, policy_version 77781 (0.0043) +[2024-06-18 06:34:51,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1274478592. Throughput: 0: 42403.4. Samples: 1274604020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:34:51,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 06:34:52,756][12883] Updated weights for policy 0, policy_version 77791 (0.0038) +[2024-06-18 06:34:56,447][12883] Updated weights for policy 0, policy_version 77801 (0.0035) +[2024-06-18 06:34:56,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 1274724352. Throughput: 0: 42456.4. Samples: 1274853040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 06:34:56,994][12645] Avg episode reward: [(0, '0.259')] +[2024-06-18 06:35:00,753][12883] Updated weights for policy 0, policy_version 77811 (0.0039) +[2024-06-18 06:35:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42056.3, 300 sec: 42598.4). Total num frames: 1274904576. Throughput: 0: 42318.7. Samples: 1274981660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:35:01,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 06:35:04,143][12883] Updated weights for policy 0, policy_version 77821 (0.0024) +[2024-06-18 06:35:06,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1275101184. Throughput: 0: 42257.3. 
Samples: 1275233240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:35:06,994][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 06:35:08,383][12883] Updated weights for policy 0, policy_version 77831 (0.0036) +[2024-06-18 06:35:11,837][12883] Updated weights for policy 0, policy_version 77841 (0.0033) +[2024-06-18 06:35:11,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42598.8). Total num frames: 1275346944. Throughput: 0: 42523.5. Samples: 1275491700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:35:11,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 06:35:16,205][12883] Updated weights for policy 0, policy_version 77851 (0.0036) +[2024-06-18 06:35:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 1275527168. Throughput: 0: 42318.2. Samples: 1275619060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:35:16,994][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 06:35:19,612][12883] Updated weights for policy 0, policy_version 77861 (0.0031) +[2024-06-18 06:35:21,994][12645] Fps is (10 sec: 39322.5, 60 sec: 42325.4, 300 sec: 42543.4). Total num frames: 1275740160. Throughput: 0: 42232.1. Samples: 1275869420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:35:21,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 06:35:23,708][12883] Updated weights for policy 0, policy_version 77871 (0.0028) +[2024-06-18 06:35:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42598.7). Total num frames: 1275969536. Throughput: 0: 42386.2. Samples: 1276123300. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:35:26,994][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 06:35:27,316][12883] Updated weights for policy 0, policy_version 77881 (0.0035) +[2024-06-18 06:35:31,314][12883] Updated weights for policy 0, policy_version 77891 (0.0032) +[2024-06-18 06:35:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1276166144. Throughput: 0: 42238.7. Samples: 1276251320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:35:31,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 06:35:35,491][12883] Updated weights for policy 0, policy_version 77901 (0.0031) +[2024-06-18 06:35:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1276379136. Throughput: 0: 42228.2. Samples: 1276504280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:35:36,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 06:35:38,917][12883] Updated weights for policy 0, policy_version 77911 (0.0038) +[2024-06-18 06:35:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1276608512. Throughput: 0: 42307.1. Samples: 1276756860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 06:35:41,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 06:35:42,310][12862] Signal inference workers to stop experience collection... (18450 times) +[2024-06-18 06:35:42,339][12883] InferenceWorker_p0-w0: stopping experience collection (18450 times) +[2024-06-18 06:35:42,364][12862] Signal inference workers to resume experience collection... (18450 times) +[2024-06-18 06:35:42,365][12883] InferenceWorker_p0-w0: resuming experience collection (18450 times) +[2024-06-18 06:35:43,261][12883] Updated weights for policy 0, policy_version 77921 (0.0050) +[2024-06-18 06:35:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1276805120. 
Throughput: 0: 42373.3. Samples: 1276888460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 06:35:46,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 06:35:47,291][12883] Updated weights for policy 0, policy_version 77931 (0.0050) +[2024-06-18 06:35:50,842][12883] Updated weights for policy 0, policy_version 77941 (0.0039) +[2024-06-18 06:35:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 1277018112. Throughput: 0: 42430.2. Samples: 1277142600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 06:35:51,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 06:35:54,700][12883] Updated weights for policy 0, policy_version 77951 (0.0032) +[2024-06-18 06:35:56,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 1277231104. Throughput: 0: 42372.5. Samples: 1277398460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 06:35:56,995][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 06:35:58,502][12883] Updated weights for policy 0, policy_version 77961 (0.0032) +[2024-06-18 06:36:01,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1277460480. Throughput: 0: 42519.1. Samples: 1277532420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 06:36:01,994][12645] Avg episode reward: [(0, '0.114')] +[2024-06-18 06:36:02,166][12883] Updated weights for policy 0, policy_version 77971 (0.0033) +[2024-06-18 06:36:06,346][12883] Updated weights for policy 0, policy_version 77981 (0.0036) +[2024-06-18 06:36:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42432.1). Total num frames: 1277657088. Throughput: 0: 42644.7. Samples: 1277788440. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 06:36:06,994][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 06:36:09,833][12883] Updated weights for policy 0, policy_version 77991 (0.0039) +[2024-06-18 06:36:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1277886464. Throughput: 0: 42572.0. Samples: 1278039040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 06:36:11,994][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 06:36:14,221][12883] Updated weights for policy 0, policy_version 78001 (0.0037) +[2024-06-18 06:36:16,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1278099456. Throughput: 0: 42549.8. Samples: 1278166060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 06:36:16,994][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 06:36:17,617][12883] Updated weights for policy 0, policy_version 78011 (0.0034) +[2024-06-18 06:36:21,803][12883] Updated weights for policy 0, policy_version 78021 (0.0033) +[2024-06-18 06:36:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 1278296064. Throughput: 0: 42636.2. Samples: 1278422920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 06:36:21,995][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 06:36:25,209][12883] Updated weights for policy 0, policy_version 78031 (0.0041) +[2024-06-18 06:36:26,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1278525440. Throughput: 0: 42679.5. Samples: 1278677440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 06:36:26,994][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 06:36:29,634][12883] Updated weights for policy 0, policy_version 78041 (0.0026) +[2024-06-18 06:36:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1278738432. Throughput: 0: 42713.2. Samples: 1278810560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 06:36:31,994][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 06:36:32,976][12883] Updated weights for policy 0, policy_version 78051 (0.0040) +[2024-06-18 06:36:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1278935040. Throughput: 0: 42516.8. Samples: 1279055860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 06:36:36,994][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 06:36:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078060_1278935040.pth... +[2024-06-18 06:36:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000077440_1268776960.pth +[2024-06-18 06:36:37,234][12883] Updated weights for policy 0, policy_version 78061 (0.0038) +[2024-06-18 06:36:40,715][12883] Updated weights for policy 0, policy_version 78071 (0.0024) +[2024-06-18 06:36:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1279148032. Throughput: 0: 42464.5. Samples: 1279309360. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 06:36:41,994][12645] Avg episode reward: [(0, '0.179')] +[2024-06-18 06:36:44,865][12883] Updated weights for policy 0, policy_version 78081 (0.0039) +[2024-06-18 06:36:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1279361024. Throughput: 0: 42358.8. Samples: 1279438560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 06:36:47,000][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 06:36:48,222][12883] Updated weights for policy 0, policy_version 78091 (0.0044) +[2024-06-18 06:36:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 1279574016. Throughput: 0: 42355.2. Samples: 1279694420. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 06:36:51,994][12645] Avg episode reward: [(0, '0.146')] +[2024-06-18 06:36:52,314][12883] Updated weights for policy 0, policy_version 78101 (0.0031) +[2024-06-18 06:36:56,289][12883] Updated weights for policy 0, policy_version 78111 (0.0025) +[2024-06-18 06:36:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1279770624. Throughput: 0: 42321.4. Samples: 1279943500. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 06:36:56,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 06:36:59,989][12883] Updated weights for policy 0, policy_version 78121 (0.0033) +[2024-06-18 06:37:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 1280000000. Throughput: 0: 42341.0. Samples: 1280071400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 06:37:01,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 06:37:03,870][12883] Updated weights for policy 0, policy_version 78131 (0.0031) +[2024-06-18 06:37:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1280212992. Throughput: 0: 42440.6. Samples: 1280332740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 06:37:06,994][12645] Avg episode reward: [(0, '0.185')] +[2024-06-18 06:37:07,778][12883] Updated weights for policy 0, policy_version 78141 (0.0035) +[2024-06-18 06:37:09,394][12862] Signal inference workers to stop experience collection... (18500 times) +[2024-06-18 06:37:09,394][12862] Signal inference workers to resume experience collection... 
(18500 times) +[2024-06-18 06:37:09,436][12883] InferenceWorker_p0-w0: stopping experience collection (18500 times) +[2024-06-18 06:37:09,437][12883] InferenceWorker_p0-w0: resuming experience collection (18500 times) +[2024-06-18 06:37:11,587][12883] Updated weights for policy 0, policy_version 78151 (0.0030) +[2024-06-18 06:37:12,000][12645] Fps is (10 sec: 42571.2, 60 sec: 42320.9, 300 sec: 42542.0). Total num frames: 1280425984. Throughput: 0: 42325.4. Samples: 1280582340. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 06:37:12,000][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 06:37:15,590][12883] Updated weights for policy 0, policy_version 78161 (0.0026) +[2024-06-18 06:37:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1280622592. Throughput: 0: 42220.6. Samples: 1280710480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 06:37:16,994][12645] Avg episode reward: [(0, '0.250')] +[2024-06-18 06:37:19,263][12883] Updated weights for policy 0, policy_version 78171 (0.0038) +[2024-06-18 06:37:21,994][12645] Fps is (10 sec: 40985.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1280835584. Throughput: 0: 42474.2. Samples: 1280967200. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 06:37:21,994][12645] Avg episode reward: [(0, '0.131')] +[2024-06-18 06:37:23,305][12883] Updated weights for policy 0, policy_version 78181 (0.0041) +[2024-06-18 06:37:26,915][12883] Updated weights for policy 0, policy_version 78191 (0.0030) +[2024-06-18 06:37:26,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1281081344. Throughput: 0: 42440.6. Samples: 1281219180. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 06:37:26,994][12645] Avg episode reward: [(0, '0.122')] +[2024-06-18 06:37:31,313][12883] Updated weights for policy 0, policy_version 78201 (0.0044) +[2024-06-18 06:37:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1281261568. Throughput: 0: 42437.3. Samples: 1281348240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 06:37:31,994][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 06:37:34,717][12883] Updated weights for policy 0, policy_version 78211 (0.0024) +[2024-06-18 06:37:36,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1281474560. Throughput: 0: 42319.9. Samples: 1281598820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 06:37:36,994][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 06:37:39,004][12883] Updated weights for policy 0, policy_version 78221 (0.0045) +[2024-06-18 06:37:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1281703936. Throughput: 0: 42472.4. Samples: 1281854760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 06:37:41,994][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 06:37:42,478][12883] Updated weights for policy 0, policy_version 78231 (0.0036) +[2024-06-18 06:37:46,995][12645] Fps is (10 sec: 40953.9, 60 sec: 42051.1, 300 sec: 42209.4). Total num frames: 1281884160. Throughput: 0: 42522.9. Samples: 1281985000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 06:37:46,996][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 06:37:47,498][12883] Updated weights for policy 0, policy_version 78241 (0.0028) +[2024-06-18 06:37:50,181][12883] Updated weights for policy 0, policy_version 78251 (0.0045) +[2024-06-18 06:37:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42432.7). Total num frames: 1282113536. Throughput: 0: 42145.3. Samples: 1282229280. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 06:37:51,994][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 06:37:55,002][12883] Updated weights for policy 0, policy_version 78261 (0.0036) +[2024-06-18 06:37:56,994][12645] Fps is (10 sec: 45882.7, 60 sec: 42871.5, 300 sec: 42432.1). Total num frames: 1282342912. Throughput: 0: 42312.2. Samples: 1282486120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 06:37:56,994][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 06:37:57,890][12883] Updated weights for policy 0, policy_version 78271 (0.0034) +[2024-06-18 06:38:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1282539520. Throughput: 0: 42462.0. Samples: 1282621280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 06:38:01,994][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 06:38:02,439][12883] Updated weights for policy 0, policy_version 78281 (0.0030) +[2024-06-18 06:38:05,407][12883] Updated weights for policy 0, policy_version 78291 (0.0027) +[2024-06-18 06:38:07,000][12645] Fps is (10 sec: 42571.5, 60 sec: 42594.0, 300 sec: 42597.5). Total num frames: 1282768896. Throughput: 0: 42367.5. Samples: 1282874000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 06:38:07,000][12645] Avg episode reward: [(0, '0.190')] +[2024-06-18 06:38:09,913][12883] Updated weights for policy 0, policy_version 78301 (0.0032) +[2024-06-18 06:38:11,993][12645] Fps is (10 sec: 42599.5, 60 sec: 42329.9, 300 sec: 42376.6). Total num frames: 1282965504. Throughput: 0: 42501.9. Samples: 1283131760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 06:38:11,994][12645] Avg episode reward: [(0, '0.125')] +[2024-06-18 06:38:12,926][12883] Updated weights for policy 0, policy_version 78311 (0.0037) +[2024-06-18 06:38:16,994][12645] Fps is (10 sec: 39346.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1283162112. Throughput: 0: 42433.8. Samples: 1283257760. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 06:38:16,994][12645] Avg episode reward: [(0, '0.155')] +[2024-06-18 06:38:17,857][12883] Updated weights for policy 0, policy_version 78321 (0.0029) +[2024-06-18 06:38:20,892][12883] Updated weights for policy 0, policy_version 78331 (0.0041) +[2024-06-18 06:38:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1283407872. Throughput: 0: 42412.1. Samples: 1283507360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 06:38:21,994][12645] Avg episode reward: [(0, '0.160')] +[2024-06-18 06:38:25,469][12883] Updated weights for policy 0, policy_version 78341 (0.0031) +[2024-06-18 06:38:26,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42050.6, 300 sec: 42375.9). Total num frames: 1283604480. Throughput: 0: 42364.5. Samples: 1283761260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 06:38:26,996][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 06:38:28,657][12883] Updated weights for policy 0, policy_version 78351 (0.0033) +[2024-06-18 06:38:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1283817472. Throughput: 0: 42382.8. Samples: 1283892160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 06:38:31,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 06:38:32,998][12883] Updated weights for policy 0, policy_version 78361 (0.0036) +[2024-06-18 06:38:36,915][12883] Updated weights for policy 0, policy_version 78371 (0.0030) +[2024-06-18 06:38:36,994][12645] Fps is (10 sec: 42607.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1284030464. Throughput: 0: 42467.9. Samples: 1284140340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 06:38:36,995][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 06:38:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078371_1284030464.pth... 
+[2024-06-18 06:38:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000077750_1273856000.pth +[2024-06-18 06:38:40,636][12883] Updated weights for policy 0, policy_version 78381 (0.0023) +[2024-06-18 06:38:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1284227072. Throughput: 0: 42481.7. Samples: 1284397800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 06:38:41,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 06:38:44,668][12883] Updated weights for policy 0, policy_version 78391 (0.0031) +[2024-06-18 06:38:45,876][12862] Signal inference workers to stop experience collection... (18550 times) +[2024-06-18 06:38:45,876][12862] Signal inference workers to resume experience collection... (18550 times) +[2024-06-18 06:38:45,901][12883] InferenceWorker_p0-w0: stopping experience collection (18550 times) +[2024-06-18 06:38:45,901][12883] InferenceWorker_p0-w0: resuming experience collection (18550 times) +[2024-06-18 06:38:46,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42872.6, 300 sec: 42431.8). Total num frames: 1284456448. Throughput: 0: 42336.6. Samples: 1284526420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 06:38:46,994][12645] Avg episode reward: [(0, '0.120')] +[2024-06-18 06:38:48,768][12883] Updated weights for policy 0, policy_version 78401 (0.0028) +[2024-06-18 06:38:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42432.1). Total num frames: 1284669440. Throughput: 0: 42286.2. Samples: 1284776620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 06:38:51,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 06:38:52,170][12883] Updated weights for policy 0, policy_version 78411 (0.0030) +[2024-06-18 06:38:56,245][12883] Updated weights for policy 0, policy_version 78421 (0.0035) +[2024-06-18 06:38:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42321.6). Total num frames: 1284866048. 
Throughput: 0: 42311.1. Samples: 1285035760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 06:38:56,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 06:38:59,754][12883] Updated weights for policy 0, policy_version 78431 (0.0029) +[2024-06-18 06:39:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1285079040. Throughput: 0: 42287.1. Samples: 1285160680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 06:39:01,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 06:39:03,809][12883] Updated weights for policy 0, policy_version 78441 (0.0036) +[2024-06-18 06:39:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42329.8, 300 sec: 42376.3). Total num frames: 1285308416. Throughput: 0: 42393.8. Samples: 1285415080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 06:39:06,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 06:39:07,539][12883] Updated weights for policy 0, policy_version 78451 (0.0028) +[2024-06-18 06:39:11,568][12883] Updated weights for policy 0, policy_version 78461 (0.0035) +[2024-06-18 06:39:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1285505024. Throughput: 0: 42455.9. Samples: 1285671680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 06:39:11,994][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 06:39:15,235][12883] Updated weights for policy 0, policy_version 78471 (0.0035) +[2024-06-18 06:39:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1285734400. Throughput: 0: 42338.2. Samples: 1285797380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 06:39:16,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 06:39:19,231][12883] Updated weights for policy 0, policy_version 78481 (0.0034) +[2024-06-18 06:39:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1285947392. 
Throughput: 0: 42573.4. Samples: 1286056140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 06:39:21,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 06:39:22,923][12883] Updated weights for policy 0, policy_version 78491 (0.0041) +[2024-06-18 06:39:26,910][12883] Updated weights for policy 0, policy_version 78501 (0.0049) +[2024-06-18 06:39:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42600.0, 300 sec: 42431.8). Total num frames: 1286160384. Throughput: 0: 42507.2. Samples: 1286310620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 06:39:26,994][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 06:39:30,862][12883] Updated weights for policy 0, policy_version 78511 (0.0043) +[2024-06-18 06:39:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1286373376. Throughput: 0: 42347.0. Samples: 1286432040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 06:39:31,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 06:39:34,882][12883] Updated weights for policy 0, policy_version 78521 (0.0044) +[2024-06-18 06:39:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 1286569984. Throughput: 0: 42534.0. Samples: 1286690640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 06:39:36,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 06:39:38,446][12883] Updated weights for policy 0, policy_version 78531 (0.0035) +[2024-06-18 06:39:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1286799360. Throughput: 0: 42353.6. Samples: 1286941680. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 06:39:41,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 06:39:43,019][12883] Updated weights for policy 0, policy_version 78541 (0.0035) +[2024-06-18 06:39:46,275][12883] Updated weights for policy 0, policy_version 78551 (0.0024) +[2024-06-18 06:39:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1286995968. Throughput: 0: 42404.9. Samples: 1287068900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:39:46,994][12645] Avg episode reward: [(0, '0.552')] +[2024-06-18 06:39:50,793][12883] Updated weights for policy 0, policy_version 78561 (0.0030) +[2024-06-18 06:39:51,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 1287208960. Throughput: 0: 42586.7. Samples: 1287331480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:39:51,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 06:39:54,148][12883] Updated weights for policy 0, policy_version 78571 (0.0039) +[2024-06-18 06:39:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1287438336. Throughput: 0: 42412.0. Samples: 1287580220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:39:56,994][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 06:39:58,634][12883] Updated weights for policy 0, policy_version 78581 (0.0028) +[2024-06-18 06:40:01,803][12883] Updated weights for policy 0, policy_version 78591 (0.0023) +[2024-06-18 06:40:01,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42869.9, 300 sec: 42542.5). Total num frames: 1287651328. Throughput: 0: 42438.3. Samples: 1287707200. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:40:01,997][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 06:40:06,186][12883] Updated weights for policy 0, policy_version 78601 (0.0038) +[2024-06-18 06:40:06,707][12862] Signal inference workers to stop experience collection... (18600 times) +[2024-06-18 06:40:06,708][12862] Signal inference workers to resume experience collection... (18600 times) +[2024-06-18 06:40:06,726][12883] InferenceWorker_p0-w0: stopping experience collection (18600 times) +[2024-06-18 06:40:06,726][12883] InferenceWorker_p0-w0: resuming experience collection (18600 times) +[2024-06-18 06:40:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1287831552. Throughput: 0: 42324.0. Samples: 1287960720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:40:06,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 06:40:09,391][12883] Updated weights for policy 0, policy_version 78611 (0.0040) +[2024-06-18 06:40:11,994][12645] Fps is (10 sec: 40968.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1288060928. Throughput: 0: 42346.5. Samples: 1288216220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:40:11,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 06:40:13,747][12883] Updated weights for policy 0, policy_version 78621 (0.0035) +[2024-06-18 06:40:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1288257536. Throughput: 0: 42506.3. Samples: 1288344820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:40:16,994][12645] Avg episode reward: [(0, '0.209')] +[2024-06-18 06:40:17,478][12883] Updated weights for policy 0, policy_version 78631 (0.0025) +[2024-06-18 06:40:21,599][12883] Updated weights for policy 0, policy_version 78641 (0.0036) +[2024-06-18 06:40:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1288470528. 
Throughput: 0: 42426.5. Samples: 1288599840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:40:21,994][12645] Avg episode reward: [(0, '0.359')] +[2024-06-18 06:40:24,939][12883] Updated weights for policy 0, policy_version 78651 (0.0035) +[2024-06-18 06:40:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1288699904. Throughput: 0: 42562.6. Samples: 1288857000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:40:26,994][12645] Avg episode reward: [(0, '0.150')] +[2024-06-18 06:40:29,277][12883] Updated weights for policy 0, policy_version 78661 (0.0030) +[2024-06-18 06:40:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1288912896. Throughput: 0: 42712.8. Samples: 1288990980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 06:40:31,994][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 06:40:32,605][12883] Updated weights for policy 0, policy_version 78671 (0.0041) +[2024-06-18 06:40:36,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1289093120. Throughput: 0: 42402.2. Samples: 1289239580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 06:40:36,994][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 06:40:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078681_1289109504.pth... +[2024-06-18 06:40:37,027][12883] Updated weights for policy 0, policy_version 78681 (0.0033) +[2024-06-18 06:40:37,088][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078060_1278935040.pth +[2024-06-18 06:40:40,327][12883] Updated weights for policy 0, policy_version 78691 (0.0031) +[2024-06-18 06:40:41,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1289355264. Throughput: 0: 42517.2. Samples: 1289493500. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 06:40:41,994][12645] Avg episode reward: [(0, '0.581')] +[2024-06-18 06:40:44,673][12883] Updated weights for policy 0, policy_version 78701 (0.0032) +[2024-06-18 06:40:46,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1289535488. Throughput: 0: 42628.3. Samples: 1289625380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 06:40:46,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 06:40:47,959][12883] Updated weights for policy 0, policy_version 78711 (0.0051) +[2024-06-18 06:40:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 1289748480. Throughput: 0: 42488.4. Samples: 1289872700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 06:40:51,994][12645] Avg episode reward: [(0, '0.438')] +[2024-06-18 06:40:52,269][12883] Updated weights for policy 0, policy_version 78721 (0.0029) +[2024-06-18 06:40:55,767][12883] Updated weights for policy 0, policy_version 78731 (0.0031) +[2024-06-18 06:40:56,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1289977856. Throughput: 0: 42570.9. Samples: 1290131900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 06:40:56,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 06:40:59,902][12883] Updated weights for policy 0, policy_version 78741 (0.0032) +[2024-06-18 06:41:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42053.8, 300 sec: 42431.8). Total num frames: 1290174464. Throughput: 0: 42624.8. Samples: 1290262940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 06:41:01,998][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 06:41:03,663][12883] Updated weights for policy 0, policy_version 78751 (0.0026) +[2024-06-18 06:41:06,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1290387456. Throughput: 0: 42443.5. Samples: 1290509800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 06:41:06,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 06:41:07,678][12883] Updated weights for policy 0, policy_version 78761 (0.0032) +[2024-06-18 06:41:11,322][12883] Updated weights for policy 0, policy_version 78771 (0.0043) +[2024-06-18 06:41:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1290600448. Throughput: 0: 42589.4. Samples: 1290773520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 06:41:11,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 06:41:12,923][12862] Signal inference workers to stop experience collection... (18650 times) +[2024-06-18 06:41:12,923][12862] Signal inference workers to resume experience collection... (18650 times) +[2024-06-18 06:41:12,968][12883] InferenceWorker_p0-w0: stopping experience collection (18650 times) +[2024-06-18 06:41:12,968][12883] InferenceWorker_p0-w0: resuming experience collection (18650 times) +[2024-06-18 06:41:15,520][12883] Updated weights for policy 0, policy_version 78781 (0.0030) +[2024-06-18 06:41:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1290829824. Throughput: 0: 42376.4. Samples: 1290897920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 06:41:16,996][12645] Avg episode reward: [(0, '0.214')] +[2024-06-18 06:41:19,017][12883] Updated weights for policy 0, policy_version 78791 (0.0037) +[2024-06-18 06:41:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1291026432. Throughput: 0: 42428.8. Samples: 1291148880. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 06:41:21,999][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 06:41:23,232][12883] Updated weights for policy 0, policy_version 78801 (0.0033) +[2024-06-18 06:41:26,893][12883] Updated weights for policy 0, policy_version 78811 (0.0030) +[2024-06-18 06:41:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 1291239424. Throughput: 0: 42460.2. Samples: 1291404200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 06:41:26,994][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 06:41:30,897][12883] Updated weights for policy 0, policy_version 78821 (0.0035) +[2024-06-18 06:41:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1291436032. Throughput: 0: 42422.2. Samples: 1291534380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 06:41:31,996][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 06:41:34,415][12883] Updated weights for policy 0, policy_version 78831 (0.0032) +[2024-06-18 06:41:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 1291681792. Throughput: 0: 42497.8. Samples: 1291785100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 06:41:36,996][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 06:41:38,633][12883] Updated weights for policy 0, policy_version 78841 (0.0043) +[2024-06-18 06:41:41,877][12883] Updated weights for policy 0, policy_version 78851 (0.0035) +[2024-06-18 06:41:41,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1291894784. Throughput: 0: 42499.4. Samples: 1292044380. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 06:41:41,994][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 06:41:46,264][12883] Updated weights for policy 0, policy_version 78861 (0.0040) +[2024-06-18 06:41:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1292075008. Throughput: 0: 42364.0. Samples: 1292169320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 06:41:46,994][12645] Avg episode reward: [(0, '0.538')] +[2024-06-18 06:41:49,344][12883] Updated weights for policy 0, policy_version 78871 (0.0025) +[2024-06-18 06:41:51,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1292320768. Throughput: 0: 42614.4. Samples: 1292427440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 06:41:51,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 06:41:54,104][12883] Updated weights for policy 0, policy_version 78881 (0.0042) +[2024-06-18 06:41:56,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1292533760. Throughput: 0: 42478.1. Samples: 1292685040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 06:41:56,994][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 06:41:57,052][12883] Updated weights for policy 0, policy_version 78891 (0.0038) +[2024-06-18 06:42:01,675][12883] Updated weights for policy 0, policy_version 78901 (0.0028) +[2024-06-18 06:42:01,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1292713984. Throughput: 0: 42538.2. Samples: 1292812140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 06:42:01,994][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 06:42:04,980][12883] Updated weights for policy 0, policy_version 78911 (0.0036) +[2024-06-18 06:42:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42432.7). Total num frames: 1292943360. Throughput: 0: 42459.5. Samples: 1293059560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 06:42:06,994][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 06:42:09,220][12883] Updated weights for policy 0, policy_version 78921 (0.0038) +[2024-06-18 06:42:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1293172736. Throughput: 0: 42588.7. Samples: 1293320700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 06:42:11,995][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 06:42:12,500][12883] Updated weights for policy 0, policy_version 78931 (0.0034) +[2024-06-18 06:42:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1293352960. Throughput: 0: 42512.0. Samples: 1293447420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 06:42:16,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 06:42:17,173][12883] Updated weights for policy 0, policy_version 78941 (0.0039) +[2024-06-18 06:42:18,405][12862] Signal inference workers to stop experience collection... (18700 times) +[2024-06-18 06:42:18,445][12883] InferenceWorker_p0-w0: stopping experience collection (18700 times) +[2024-06-18 06:42:18,465][12862] Signal inference workers to resume experience collection... (18700 times) +[2024-06-18 06:42:18,466][12883] InferenceWorker_p0-w0: resuming experience collection (18700 times) +[2024-06-18 06:42:20,062][12883] Updated weights for policy 0, policy_version 78951 (0.0029) +[2024-06-18 06:42:21,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1293598720. Throughput: 0: 42495.1. Samples: 1293697380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 06:42:21,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 06:42:24,669][12883] Updated weights for policy 0, policy_version 78961 (0.0039) +[2024-06-18 06:42:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 1293811712. 
Throughput: 0: 42664.0. Samples: 1293964260. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 06:42:26,994][12645] Avg episode reward: [(0, '0.172')] +[2024-06-18 06:42:27,726][12883] Updated weights for policy 0, policy_version 78971 (0.0034) +[2024-06-18 06:42:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1294008320. Throughput: 0: 42687.1. Samples: 1294090240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 06:42:31,994][12645] Avg episode reward: [(0, '0.172')] +[2024-06-18 06:42:32,473][12883] Updated weights for policy 0, policy_version 78981 (0.0038) +[2024-06-18 06:42:35,316][12883] Updated weights for policy 0, policy_version 78991 (0.0030) +[2024-06-18 06:42:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1294221312. Throughput: 0: 42519.0. Samples: 1294340800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 06:42:36,994][12645] Avg episode reward: [(0, '0.072')] +[2024-06-18 06:42:37,068][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078994_1294237696.pth... +[2024-06-18 06:42:37,118][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078371_1284030464.pth +[2024-06-18 06:42:40,019][12883] Updated weights for policy 0, policy_version 79001 (0.0043) +[2024-06-18 06:42:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42543.1). Total num frames: 1294434304. Throughput: 0: 42600.9. Samples: 1294602080. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 06:42:41,994][12645] Avg episode reward: [(0, '0.218')] +[2024-06-18 06:42:42,963][12883] Updated weights for policy 0, policy_version 79011 (0.0034) +[2024-06-18 06:42:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1294647296. Throughput: 0: 42588.0. Samples: 1294728600. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 06:42:46,994][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 06:42:47,663][12883] Updated weights for policy 0, policy_version 79021 (0.0034) +[2024-06-18 06:42:50,555][12883] Updated weights for policy 0, policy_version 79031 (0.0031) +[2024-06-18 06:42:51,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1294876672. Throughput: 0: 42536.6. Samples: 1294973700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 06:42:51,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 06:42:55,560][12883] Updated weights for policy 0, policy_version 79041 (0.0039) +[2024-06-18 06:42:56,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42596.9, 300 sec: 42542.6). Total num frames: 1295089664. Throughput: 0: 42631.3. Samples: 1295239200. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 06:42:56,996][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 06:42:58,159][12883] Updated weights for policy 0, policy_version 79051 (0.0029) +[2024-06-18 06:43:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42432.7). Total num frames: 1295286272. Throughput: 0: 42604.0. Samples: 1295364600. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) +[2024-06-18 06:43:01,994][12645] Avg episode reward: [(0, '0.224')] +[2024-06-18 06:43:03,219][12883] Updated weights for policy 0, policy_version 79061 (0.0036) +[2024-06-18 06:43:05,895][12883] Updated weights for policy 0, policy_version 79071 (0.0038) +[2024-06-18 06:43:06,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1295515648. Throughput: 0: 42610.2. Samples: 1295614840. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) +[2024-06-18 06:43:06,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 06:43:10,974][12883] Updated weights for policy 0, policy_version 79081 (0.0033) +[2024-06-18 06:43:11,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1295728640. Throughput: 0: 42580.4. Samples: 1295880380. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) +[2024-06-18 06:43:11,995][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 06:43:13,619][12883] Updated weights for policy 0, policy_version 79091 (0.0041) +[2024-06-18 06:43:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1295908864. Throughput: 0: 42443.5. Samples: 1296000200. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) +[2024-06-18 06:43:16,994][12645] Avg episode reward: [(0, '0.302')] +[2024-06-18 06:43:18,763][12883] Updated weights for policy 0, policy_version 79101 (0.0038) +[2024-06-18 06:43:21,719][12883] Updated weights for policy 0, policy_version 79111 (0.0029) +[2024-06-18 06:43:21,997][12645] Fps is (10 sec: 42584.1, 60 sec: 42595.9, 300 sec: 42542.7). Total num frames: 1296154624. Throughput: 0: 42425.6. Samples: 1296250100. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) +[2024-06-18 06:43:21,998][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 06:43:26,370][12883] Updated weights for policy 0, policy_version 79121 (0.0027) +[2024-06-18 06:43:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1296351232. Throughput: 0: 42380.4. Samples: 1296509200. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) +[2024-06-18 06:43:26,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 06:43:29,608][12883] Updated weights for policy 0, policy_version 79131 (0.0036) +[2024-06-18 06:43:31,994][12645] Fps is (10 sec: 37697.0, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 1296531456. Throughput: 0: 42270.8. Samples: 1296630780. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) +[2024-06-18 06:43:31,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 06:43:33,866][12883] Updated weights for policy 0, policy_version 79141 (0.0040) +[2024-06-18 06:43:34,578][12862] Signal inference workers to stop experience collection... (18750 times) +[2024-06-18 06:43:34,578][12862] Signal inference workers to resume experience collection... (18750 times) +[2024-06-18 06:43:34,593][12883] InferenceWorker_p0-w0: stopping experience collection (18750 times) +[2024-06-18 06:43:34,593][12883] InferenceWorker_p0-w0: resuming experience collection (18750 times) +[2024-06-18 06:43:36,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1296777216. Throughput: 0: 42502.6. Samples: 1296886320. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) +[2024-06-18 06:43:36,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 06:43:37,282][12883] Updated weights for policy 0, policy_version 79151 (0.0025) +[2024-06-18 06:43:41,759][12883] Updated weights for policy 0, policy_version 79161 (0.0031) +[2024-06-18 06:43:41,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1296973824. Throughput: 0: 42410.1. Samples: 1297147560. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) +[2024-06-18 06:43:41,994][12645] Avg episode reward: [(0, '0.128')] +[2024-06-18 06:43:45,032][12883] Updated weights for policy 0, policy_version 79171 (0.0029) +[2024-06-18 06:43:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1297186816. Throughput: 0: 42389.7. Samples: 1297272140. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) +[2024-06-18 06:43:46,994][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 06:43:49,349][12883] Updated weights for policy 0, policy_version 79181 (0.0039) +[2024-06-18 06:43:51,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1297432576. 
Throughput: 0: 42447.2. Samples: 1297524960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:43:51,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 06:43:52,646][12883] Updated weights for policy 0, policy_version 79191 (0.0037) +[2024-06-18 06:43:56,952][12883] Updated weights for policy 0, policy_version 79201 (0.0029) +[2024-06-18 06:43:56,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42325.3, 300 sec: 42542.5). Total num frames: 1297629184. Throughput: 0: 42320.7. Samples: 1297784900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:43:56,997][12645] Avg episode reward: [(0, '0.160')] +[2024-06-18 06:44:00,388][12883] Updated weights for policy 0, policy_version 79211 (0.0039) +[2024-06-18 06:44:01,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1297809408. Throughput: 0: 42331.7. Samples: 1297905120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:44:01,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 06:44:04,606][12883] Updated weights for policy 0, policy_version 79221 (0.0023) +[2024-06-18 06:44:06,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1298071552. Throughput: 0: 42574.9. Samples: 1298165820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:44:06,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 06:44:07,951][12883] Updated weights for policy 0, policy_version 79231 (0.0024) +[2024-06-18 06:44:11,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1298268160. Throughput: 0: 42611.2. Samples: 1298426700. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:44:11,994][12645] Avg episode reward: [(0, '0.060')] +[2024-06-18 06:44:12,476][12883] Updated weights for policy 0, policy_version 79241 (0.0026) +[2024-06-18 06:44:15,583][12883] Updated weights for policy 0, policy_version 79251 (0.0028) +[2024-06-18 06:44:16,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1298464768. Throughput: 0: 42647.7. Samples: 1298549940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:44:16,994][12645] Avg episode reward: [(0, '0.090')] +[2024-06-18 06:44:20,318][12883] Updated weights for policy 0, policy_version 79261 (0.0042) +[2024-06-18 06:44:21,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42326.2, 300 sec: 42487.0). Total num frames: 1298694144. Throughput: 0: 42658.3. Samples: 1298806040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:44:21,996][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 06:44:23,610][12883] Updated weights for policy 0, policy_version 79271 (0.0037) +[2024-06-18 06:44:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1298890752. Throughput: 0: 42453.7. Samples: 1299057980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:44:26,994][12645] Avg episode reward: [(0, '0.636')] +[2024-06-18 06:44:28,097][12883] Updated weights for policy 0, policy_version 79281 (0.0040) +[2024-06-18 06:44:31,678][12883] Updated weights for policy 0, policy_version 79291 (0.0032) +[2024-06-18 06:44:32,000][12645] Fps is (10 sec: 40943.7, 60 sec: 42866.9, 300 sec: 42486.4). Total num frames: 1299103744. Throughput: 0: 42397.3. Samples: 1299180280. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:44:32,000][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 06:44:35,581][12883] Updated weights for policy 0, policy_version 79301 (0.0038) +[2024-06-18 06:44:36,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1299333120. Throughput: 0: 42493.7. Samples: 1299437180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 06:44:36,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 06:44:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000079305_1299333120.pth... +[2024-06-18 06:44:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078681_1289109504.pth +[2024-06-18 06:44:39,299][12883] Updated weights for policy 0, policy_version 79311 (0.0029) +[2024-06-18 06:44:41,994][12645] Fps is (10 sec: 42624.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1299529728. Throughput: 0: 42583.4. Samples: 1299701060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 06:44:42,000][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 06:44:43,146][12883] Updated weights for policy 0, policy_version 79321 (0.0029) +[2024-06-18 06:44:46,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42596.9, 300 sec: 42487.0). Total num frames: 1299742720. Throughput: 0: 42545.8. Samples: 1299819780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 06:44:46,997][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 06:44:47,218][12883] Updated weights for policy 0, policy_version 79331 (0.0026) +[2024-06-18 06:44:50,832][12883] Updated weights for policy 0, policy_version 79341 (0.0028) +[2024-06-18 06:44:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1299988480. Throughput: 0: 42565.7. Samples: 1300081280. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 06:44:51,994][12645] Avg episode reward: [(0, '0.253')] +[2024-06-18 06:44:54,745][12883] Updated weights for policy 0, policy_version 79351 (0.0043) +[2024-06-18 06:44:56,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42326.9, 300 sec: 42432.1). Total num frames: 1300168704. Throughput: 0: 42569.9. Samples: 1300342340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 06:44:56,994][12645] Avg episode reward: [(0, '0.219')] +[2024-06-18 06:44:58,473][12883] Updated weights for policy 0, policy_version 79361 (0.0036) +[2024-06-18 06:45:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1300381696. Throughput: 0: 42534.3. Samples: 1300463980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 06:45:01,996][12645] Avg episode reward: [(0, '0.138')] +[2024-06-18 06:45:02,571][12883] Updated weights for policy 0, policy_version 79371 (0.0043) +[2024-06-18 06:45:06,091][12883] Updated weights for policy 0, policy_version 79381 (0.0034) +[2024-06-18 06:45:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1300611072. Throughput: 0: 42625.6. Samples: 1300724100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 06:45:06,994][12645] Avg episode reward: [(0, '0.276')] +[2024-06-18 06:45:09,944][12883] Updated weights for policy 0, policy_version 79391 (0.0037) +[2024-06-18 06:45:11,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 1300807680. Throughput: 0: 42750.9. Samples: 1300981860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 06:45:11,996][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 06:45:12,576][12862] Signal inference workers to stop experience collection... 
(18800 times) +[2024-06-18 06:45:12,616][12883] InferenceWorker_p0-w0: stopping experience collection (18800 times) +[2024-06-18 06:45:12,639][12862] Signal inference workers to resume experience collection... (18800 times) +[2024-06-18 06:45:12,644][12883] InferenceWorker_p0-w0: resuming experience collection (18800 times) +[2024-06-18 06:45:13,830][12883] Updated weights for policy 0, policy_version 79401 (0.0040) +[2024-06-18 06:45:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1301037056. Throughput: 0: 42804.1. Samples: 1301106200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 06:45:16,994][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 06:45:17,446][12883] Updated weights for policy 0, policy_version 79411 (0.0030) +[2024-06-18 06:45:21,651][12883] Updated weights for policy 0, policy_version 79421 (0.0032) +[2024-06-18 06:45:21,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 1301250048. Throughput: 0: 42875.6. Samples: 1301366580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 06:45:21,994][12645] Avg episode reward: [(0, '0.228')] +[2024-06-18 06:45:24,867][12883] Updated weights for policy 0, policy_version 79431 (0.0047) +[2024-06-18 06:45:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1301446656. Throughput: 0: 42742.7. Samples: 1301624480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 06:45:26,994][12645] Avg episode reward: [(0, '0.276')] +[2024-06-18 06:45:29,072][12883] Updated weights for policy 0, policy_version 79441 (0.0031) +[2024-06-18 06:45:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43149.1, 300 sec: 42709.5). Total num frames: 1301692416. Throughput: 0: 42934.2. Samples: 1301751720. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 06:45:31,994][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 06:45:32,791][12883] Updated weights for policy 0, policy_version 79451 (0.0025) +[2024-06-18 06:45:36,931][12883] Updated weights for policy 0, policy_version 79461 (0.0032) +[2024-06-18 06:45:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1301889024. Throughput: 0: 42840.4. Samples: 1302009100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 06:45:36,998][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 06:45:40,335][12883] Updated weights for policy 0, policy_version 79471 (0.0037) +[2024-06-18 06:45:41,994][12645] Fps is (10 sec: 39320.7, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1302085632. Throughput: 0: 42832.8. Samples: 1302269820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 06:45:41,995][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 06:45:44,453][12883] Updated weights for policy 0, policy_version 79481 (0.0035) +[2024-06-18 06:45:46,995][12645] Fps is (10 sec: 44231.2, 60 sec: 43145.2, 300 sec: 42653.8). Total num frames: 1302331392. Throughput: 0: 42826.8. Samples: 1302391240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 06:45:46,996][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 06:45:48,025][12883] Updated weights for policy 0, policy_version 79491 (0.0030) +[2024-06-18 06:45:51,992][12883] Updated weights for policy 0, policy_version 79501 (0.0029) +[2024-06-18 06:45:51,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1302544384. Throughput: 0: 42718.7. Samples: 1302646440. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 06:45:51,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 06:45:55,646][12883] Updated weights for policy 0, policy_version 79511 (0.0031) +[2024-06-18 06:45:56,994][12645] Fps is (10 sec: 39326.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1302724608. Throughput: 0: 42583.8. Samples: 1302898040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 06:45:56,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 06:45:59,549][12883] Updated weights for policy 0, policy_version 79521 (0.0040) +[2024-06-18 06:46:01,993][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1302953984. Throughput: 0: 42608.6. Samples: 1303023580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 06:46:01,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 06:46:03,504][12883] Updated weights for policy 0, policy_version 79531 (0.0038) +[2024-06-18 06:46:06,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1303183360. Throughput: 0: 42556.1. Samples: 1303281600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 06:46:06,994][12645] Avg episode reward: [(0, '0.236')] +[2024-06-18 06:46:07,430][12883] Updated weights for policy 0, policy_version 79541 (0.0027) +[2024-06-18 06:46:11,308][12883] Updated weights for policy 0, policy_version 79551 (0.0033) +[2024-06-18 06:46:11,994][12645] Fps is (10 sec: 42597.2, 60 sec: 42873.0, 300 sec: 42542.9). Total num frames: 1303379968. Throughput: 0: 42446.1. Samples: 1303534560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 06:46:11,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 06:46:15,011][12883] Updated weights for policy 0, policy_version 79561 (0.0027) +[2024-06-18 06:46:16,994][12645] Fps is (10 sec: 37682.5, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1303560192. Throughput: 0: 42407.8. Samples: 1303660080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 06:46:16,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 06:46:19,006][12883] Updated weights for policy 0, policy_version 79571 (0.0032) +[2024-06-18 06:46:21,996][12645] Fps is (10 sec: 42589.5, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1303805952. Throughput: 0: 42300.2. Samples: 1303912700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 06:46:21,996][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 06:46:22,719][12883] Updated weights for policy 0, policy_version 79581 (0.0029) +[2024-06-18 06:46:26,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1304002560. Throughput: 0: 42223.7. Samples: 1304169880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 06:46:26,996][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 06:46:27,071][12883] Updated weights for policy 0, policy_version 79591 (0.0035) +[2024-06-18 06:46:29,361][12862] Signal inference workers to stop experience collection... (18850 times) +[2024-06-18 06:46:29,388][12883] InferenceWorker_p0-w0: stopping experience collection (18850 times) +[2024-06-18 06:46:29,423][12862] Signal inference workers to resume experience collection... (18850 times) +[2024-06-18 06:46:29,424][12883] InferenceWorker_p0-w0: resuming experience collection (18850 times) +[2024-06-18 06:46:30,273][12883] Updated weights for policy 0, policy_version 79601 (0.0030) +[2024-06-18 06:46:31,994][12645] Fps is (10 sec: 39330.7, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 1304199168. Throughput: 0: 42268.0. Samples: 1304293240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 06:46:31,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 06:46:34,873][12883] Updated weights for policy 0, policy_version 79611 (0.0044) +[2024-06-18 06:46:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1304428544. 
Throughput: 0: 42284.5. Samples: 1304549240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 06:46:36,994][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 06:46:37,035][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000079617_1304444928.pth... +[2024-06-18 06:46:37,095][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078994_1294237696.pth +[2024-06-18 06:46:37,909][12883] Updated weights for policy 0, policy_version 79621 (0.0028) +[2024-06-18 06:46:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 1304625152. Throughput: 0: 42421.9. Samples: 1304807020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 06:46:41,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 06:46:42,672][12883] Updated weights for policy 0, policy_version 79631 (0.0038) +[2024-06-18 06:46:45,641][12883] Updated weights for policy 0, policy_version 79641 (0.0033) +[2024-06-18 06:46:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42053.2, 300 sec: 42487.3). Total num frames: 1304854528. Throughput: 0: 42458.6. Samples: 1304934220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 06:46:46,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 06:46:50,298][12883] Updated weights for policy 0, policy_version 79651 (0.0022) +[2024-06-18 06:46:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1305067520. Throughput: 0: 42408.4. Samples: 1305189980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 06:46:51,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 06:46:53,404][12883] Updated weights for policy 0, policy_version 79661 (0.0036) +[2024-06-18 06:46:56,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1305280512. Throughput: 0: 42470.3. Samples: 1305445720. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 06:46:56,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 06:46:57,884][12883] Updated weights for policy 0, policy_version 79671 (0.0037) +[2024-06-18 06:47:00,971][12883] Updated weights for policy 0, policy_version 79681 (0.0033) +[2024-06-18 06:47:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 1305509888. Throughput: 0: 42539.1. Samples: 1305574340. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) +[2024-06-18 06:47:01,994][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 06:47:05,744][12883] Updated weights for policy 0, policy_version 79691 (0.0029) +[2024-06-18 06:47:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.1, 300 sec: 42542.9). Total num frames: 1305722880. Throughput: 0: 42628.6. Samples: 1305830900. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) +[2024-06-18 06:47:07,000][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 06:47:08,696][12883] Updated weights for policy 0, policy_version 79701 (0.0021) +[2024-06-18 06:47:11,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1305919488. Throughput: 0: 42587.8. Samples: 1306086340. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) +[2024-06-18 06:47:11,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 06:47:13,512][12883] Updated weights for policy 0, policy_version 79711 (0.0033) +[2024-06-18 06:47:16,413][12883] Updated weights for policy 0, policy_version 79721 (0.0030) +[2024-06-18 06:47:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 1306165248. Throughput: 0: 42654.9. Samples: 1306212720. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) +[2024-06-18 06:47:16,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 06:47:21,056][12883] Updated weights for policy 0, policy_version 79731 (0.0028) +[2024-06-18 06:47:21,994][12645] Fps is (10 sec: 45876.4, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 1306378240. Throughput: 0: 42917.2. Samples: 1306480520. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) +[2024-06-18 06:47:21,994][12645] Avg episode reward: [(0, '0.302')] +[2024-06-18 06:47:24,075][12883] Updated weights for policy 0, policy_version 79741 (0.0036) +[2024-06-18 06:47:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1306558464. Throughput: 0: 42738.2. Samples: 1306730240. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) +[2024-06-18 06:47:26,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 06:47:28,685][12883] Updated weights for policy 0, policy_version 79751 (0.0027) +[2024-06-18 06:47:31,746][12883] Updated weights for policy 0, policy_version 79761 (0.0046) +[2024-06-18 06:47:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 1306804224. Throughput: 0: 42651.4. Samples: 1306853540. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) +[2024-06-18 06:47:31,994][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 06:47:36,282][12883] Updated weights for policy 0, policy_version 79771 (0.0034) +[2024-06-18 06:47:36,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1307000832. Throughput: 0: 42816.7. Samples: 1307116740. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) +[2024-06-18 06:47:36,995][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 06:47:39,560][12883] Updated weights for policy 0, policy_version 79781 (0.0035) +[2024-06-18 06:47:41,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1307197440. Throughput: 0: 42676.0. Samples: 1307366140. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) +[2024-06-18 06:47:41,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 06:47:43,893][12883] Updated weights for policy 0, policy_version 79791 (0.0034) +[2024-06-18 06:47:45,935][12862] Signal inference workers to stop experience collection... (18900 times) +[2024-06-18 06:47:46,000][12883] InferenceWorker_p0-w0: stopping experience collection (18900 times) +[2024-06-18 06:47:46,051][12862] Signal inference workers to resume experience collection... (18900 times) +[2024-06-18 06:47:46,052][12883] InferenceWorker_p0-w0: resuming experience collection (18900 times) +[2024-06-18 06:47:46,994][12645] Fps is (10 sec: 40961.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1307410432. Throughput: 0: 42607.7. Samples: 1307491680. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) +[2024-06-18 06:47:46,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 06:47:47,528][12883] Updated weights for policy 0, policy_version 79801 (0.0027) +[2024-06-18 06:47:51,671][12883] Updated weights for policy 0, policy_version 79811 (0.0037) +[2024-06-18 06:47:51,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42869.8, 300 sec: 42542.9). Total num frames: 1307639808. Throughput: 0: 42719.4. Samples: 1307753360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:47:51,997][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 06:47:55,270][12883] Updated weights for policy 0, policy_version 79821 (0.0034) +[2024-06-18 06:47:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1307852800. Throughput: 0: 42529.2. Samples: 1308000140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:47:56,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 06:47:59,480][12883] Updated weights for policy 0, policy_version 79831 (0.0025) +[2024-06-18 06:48:01,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1308065792. 
Throughput: 0: 42597.8. Samples: 1308129620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:48:01,998][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 06:48:02,846][12883] Updated weights for policy 0, policy_version 79841 (0.0046) +[2024-06-18 06:48:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.5, 300 sec: 42487.4). Total num frames: 1308262400. Throughput: 0: 42378.7. Samples: 1308387560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:48:06,994][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 06:48:07,089][12883] Updated weights for policy 0, policy_version 79851 (0.0033) +[2024-06-18 06:48:10,732][12883] Updated weights for policy 0, policy_version 79861 (0.0034) +[2024-06-18 06:48:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1308491776. Throughput: 0: 42270.1. Samples: 1308632400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:48:11,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 06:48:14,841][12883] Updated weights for policy 0, policy_version 79871 (0.0044) +[2024-06-18 06:48:16,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42487.8). Total num frames: 1308688384. Throughput: 0: 42549.3. Samples: 1308768260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:48:16,995][12645] Avg episode reward: [(0, '0.463')] +[2024-06-18 06:48:18,366][12883] Updated weights for policy 0, policy_version 79881 (0.0033) +[2024-06-18 06:48:21,994][12645] Fps is (10 sec: 37683.8, 60 sec: 41506.2, 300 sec: 42431.8). Total num frames: 1308868608. Throughput: 0: 42297.1. Samples: 1309020100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:48:21,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 06:48:22,572][12883] Updated weights for policy 0, policy_version 79891 (0.0026) +[2024-06-18 06:48:26,178][12883] Updated weights for policy 0, policy_version 79901 (0.0033) +[2024-06-18 06:48:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1309130752. Throughput: 0: 42400.4. Samples: 1309274160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:48:26,994][12645] Avg episode reward: [(0, '0.280')] +[2024-06-18 06:48:30,254][12883] Updated weights for policy 0, policy_version 79911 (0.0035) +[2024-06-18 06:48:31,996][12645] Fps is (10 sec: 45864.6, 60 sec: 42050.8, 300 sec: 42542.5). Total num frames: 1309327360. Throughput: 0: 42579.6. Samples: 1309407860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:48:31,996][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 06:48:33,930][12883] Updated weights for policy 0, policy_version 79921 (0.0043) +[2024-06-18 06:48:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 1309507584. Throughput: 0: 42156.2. Samples: 1309650300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:48:36,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 06:48:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000079926_1309507584.pth... +[2024-06-18 06:48:37,090][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000079305_1299333120.pth +[2024-06-18 06:48:37,988][12883] Updated weights for policy 0, policy_version 79931 (0.0032) +[2024-06-18 06:48:41,661][12883] Updated weights for policy 0, policy_version 79941 (0.0036) +[2024-06-18 06:48:41,995][12645] Fps is (10 sec: 42603.9, 60 sec: 42597.8, 300 sec: 42598.3). Total num frames: 1309753344. Throughput: 0: 42334.6. Samples: 1309905240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:48:41,995][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 06:48:45,788][12883] Updated weights for policy 0, policy_version 79951 (0.0032) +[2024-06-18 06:48:46,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 1309966336. Throughput: 0: 42440.4. Samples: 1310039440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:48:46,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 06:48:49,422][12883] Updated weights for policy 0, policy_version 79961 (0.0024) +[2024-06-18 06:48:51,994][12645] Fps is (10 sec: 40963.4, 60 sec: 42053.8, 300 sec: 42487.6). Total num frames: 1310162944. Throughput: 0: 42335.9. Samples: 1310292680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:48:51,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 06:48:53,431][12883] Updated weights for policy 0, policy_version 79971 (0.0039) +[2024-06-18 06:48:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1310392320. Throughput: 0: 42455.7. Samples: 1310542900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:48:56,994][12645] Avg episode reward: [(0, '0.212')] +[2024-06-18 06:48:57,053][12883] Updated weights for policy 0, policy_version 79981 (0.0039) +[2024-06-18 06:49:01,177][12883] Updated weights for policy 0, policy_version 79991 (0.0045) +[2024-06-18 06:49:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1310605312. Throughput: 0: 42388.2. Samples: 1310675720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:49:01,994][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 06:49:03,960][12862] Signal inference workers to stop experience collection... 
(18950 times) +[2024-06-18 06:49:04,006][12883] InferenceWorker_p0-w0: stopping experience collection (18950 times) +[2024-06-18 06:49:04,015][12862] Signal inference workers to resume experience collection... (18950 times) +[2024-06-18 06:49:04,028][12883] InferenceWorker_p0-w0: resuming experience collection (18950 times) +[2024-06-18 06:49:04,665][12883] Updated weights for policy 0, policy_version 80001 (0.0040) +[2024-06-18 06:49:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1310801920. Throughput: 0: 42427.4. Samples: 1310929340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:49:06,994][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 06:49:08,812][12883] Updated weights for policy 0, policy_version 80011 (0.0028) +[2024-06-18 06:49:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1311047680. Throughput: 0: 42321.8. Samples: 1311178640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:49:11,999][12645] Avg episode reward: [(0, '0.216')] +[2024-06-18 06:49:12,346][12883] Updated weights for policy 0, policy_version 80021 (0.0042) +[2024-06-18 06:49:16,395][12883] Updated weights for policy 0, policy_version 80031 (0.0028) +[2024-06-18 06:49:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 1311244288. Throughput: 0: 42246.9. Samples: 1311308880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:49:16,994][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 06:49:20,456][12883] Updated weights for policy 0, policy_version 80041 (0.0030) +[2024-06-18 06:49:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1311440896. Throughput: 0: 42568.9. Samples: 1311565900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:49:21,994][12645] Avg episode reward: [(0, '0.337')] +[2024-06-18 06:49:23,882][12883] Updated weights for policy 0, policy_version 80051 (0.0037) +[2024-06-18 06:49:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 1311670272. Throughput: 0: 42556.4. Samples: 1311820240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 06:49:26,994][12645] Avg episode reward: [(0, '0.656')] +[2024-06-18 06:49:28,026][12883] Updated weights for policy 0, policy_version 80061 (0.0035) +[2024-06-18 06:49:31,991][12883] Updated weights for policy 0, policy_version 80071 (0.0038) +[2024-06-18 06:49:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 1311883264. Throughput: 0: 42430.3. Samples: 1311948800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 06:49:31,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 06:49:35,924][12883] Updated weights for policy 0, policy_version 80081 (0.0033) +[2024-06-18 06:49:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1312079872. Throughput: 0: 42440.0. Samples: 1312202480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 06:49:36,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 06:49:39,810][12883] Updated weights for policy 0, policy_version 80091 (0.0021) +[2024-06-18 06:49:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42599.0, 300 sec: 42598.7). Total num frames: 1312309248. Throughput: 0: 42408.8. Samples: 1312451300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 06:49:41,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 06:49:44,095][12883] Updated weights for policy 0, policy_version 80101 (0.0028) +[2024-06-18 06:49:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1312522240. Throughput: 0: 42439.0. Samples: 1312585480. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 06:49:46,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 06:49:47,454][12883] Updated weights for policy 0, policy_version 80111 (0.0033) +[2024-06-18 06:49:51,994][12645] Fps is (10 sec: 37683.9, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 1312686080. Throughput: 0: 42241.1. Samples: 1312830180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 06:49:51,994][12645] Avg episode reward: [(0, '0.517')] +[2024-06-18 06:49:52,074][12883] Updated weights for policy 0, policy_version 80121 (0.0030) +[2024-06-18 06:49:55,085][12883] Updated weights for policy 0, policy_version 80131 (0.0031) +[2024-06-18 06:49:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1312948224. Throughput: 0: 42319.6. Samples: 1313083020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 06:49:56,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 06:50:00,005][12883] Updated weights for policy 0, policy_version 80141 (0.0035) +[2024-06-18 06:50:01,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1313144832. Throughput: 0: 42490.2. Samples: 1313220940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 06:50:01,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 06:50:03,164][12883] Updated weights for policy 0, policy_version 80151 (0.0050) +[2024-06-18 06:50:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42487.6). Total num frames: 1313341440. Throughput: 0: 42181.4. Samples: 1313464060. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 06:50:06,994][12645] Avg episode reward: [(0, '0.274')] +[2024-06-18 06:50:07,529][12883] Updated weights for policy 0, policy_version 80161 (0.0041) +[2024-06-18 06:50:10,784][12883] Updated weights for policy 0, policy_version 80171 (0.0028) +[2024-06-18 06:50:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1313587200. Throughput: 0: 42229.7. Samples: 1313720580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) +[2024-06-18 06:50:11,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 06:50:15,091][12883] Updated weights for policy 0, policy_version 80181 (0.0038) +[2024-06-18 06:50:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 1313767424. Throughput: 0: 42290.3. Samples: 1313851860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:50:16,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 06:50:18,679][12883] Updated weights for policy 0, policy_version 80191 (0.0043) +[2024-06-18 06:50:19,419][12862] Signal inference workers to stop experience collection... (19000 times) +[2024-06-18 06:50:19,451][12883] InferenceWorker_p0-w0: stopping experience collection (19000 times) +[2024-06-18 06:50:19,537][12862] Signal inference workers to resume experience collection... (19000 times) +[2024-06-18 06:50:19,537][12883] InferenceWorker_p0-w0: resuming experience collection (19000 times) +[2024-06-18 06:50:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1313996800. Throughput: 0: 42091.7. Samples: 1314096600. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:50:21,994][12645] Avg episode reward: [(0, '0.626')] +[2024-06-18 06:50:22,587][12883] Updated weights for policy 0, policy_version 80201 (0.0036) +[2024-06-18 06:50:26,248][12883] Updated weights for policy 0, policy_version 80211 (0.0028) +[2024-06-18 06:50:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1314209792. Throughput: 0: 42373.4. Samples: 1314358100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:50:26,994][12645] Avg episode reward: [(0, '0.626')] +[2024-06-18 06:50:30,180][12883] Updated weights for policy 0, policy_version 80221 (0.0036) +[2024-06-18 06:50:31,994][12645] Fps is (10 sec: 39321.0, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1314390016. Throughput: 0: 42141.8. Samples: 1314481860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:50:31,994][12645] Avg episode reward: [(0, '0.666')] +[2024-06-18 06:50:33,926][12883] Updated weights for policy 0, policy_version 80231 (0.0042) +[2024-06-18 06:50:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1314635776. Throughput: 0: 42317.6. Samples: 1314734480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:50:36,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 06:50:37,086][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000080240_1314652160.pth... +[2024-06-18 06:50:37,132][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000079617_1304444928.pth +[2024-06-18 06:50:37,654][12883] Updated weights for policy 0, policy_version 80241 (0.0042) +[2024-06-18 06:50:41,708][12883] Updated weights for policy 0, policy_version 80251 (0.0039) +[2024-06-18 06:50:41,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42432.0). Total num frames: 1314848768. Throughput: 0: 42556.8. Samples: 1314998080. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:50:41,994][12645] Avg episode reward: [(0, '0.174')] +[2024-06-18 06:50:46,066][12883] Updated weights for policy 0, policy_version 80261 (0.0038) +[2024-06-18 06:50:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 42320.7). Total num frames: 1315028992. Throughput: 0: 42325.4. Samples: 1315125580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:50:46,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 06:50:49,347][12883] Updated weights for policy 0, policy_version 80271 (0.0037) +[2024-06-18 06:50:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1315241984. Throughput: 0: 42351.5. Samples: 1315369880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:50:51,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 06:50:53,766][12883] Updated weights for policy 0, policy_version 80281 (0.0027) +[2024-06-18 06:50:56,795][12883] Updated weights for policy 0, policy_version 80291 (0.0034) +[2024-06-18 06:50:56,994][12645] Fps is (10 sec: 47513.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1315504128. Throughput: 0: 42464.9. Samples: 1315631500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:50:56,994][12645] Avg episode reward: [(0, '0.182')] +[2024-06-18 06:51:01,546][12883] Updated weights for policy 0, policy_version 80301 (0.0032) +[2024-06-18 06:51:01,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42050.7, 300 sec: 42320.4). Total num frames: 1315667968. Throughput: 0: 42456.5. Samples: 1315762500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 06:51:01,997][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 06:51:04,551][12883] Updated weights for policy 0, policy_version 80311 (0.0036) +[2024-06-18 06:51:06,996][12645] Fps is (10 sec: 39313.2, 60 sec: 42596.9, 300 sec: 42431.5). Total num frames: 1315897344. Throughput: 0: 42455.6. Samples: 1316007200. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 06:51:06,996][12645] Avg episode reward: [(0, '0.296')] +[2024-06-18 06:51:09,379][12883] Updated weights for policy 0, policy_version 80321 (0.0023) +[2024-06-18 06:51:11,994][12645] Fps is (10 sec: 44247.2, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 1316110336. Throughput: 0: 42290.8. Samples: 1316261180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 06:51:12,000][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 06:51:12,352][12883] Updated weights for policy 0, policy_version 80331 (0.0038) +[2024-06-18 06:51:16,994][12645] Fps is (10 sec: 39330.7, 60 sec: 42052.3, 300 sec: 42321.0). Total num frames: 1316290560. Throughput: 0: 42331.7. Samples: 1316386780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 06:51:16,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 06:51:17,270][12883] Updated weights for policy 0, policy_version 80341 (0.0046) +[2024-06-18 06:51:20,310][12883] Updated weights for policy 0, policy_version 80351 (0.0045) +[2024-06-18 06:51:21,994][12645] Fps is (10 sec: 44235.6, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 1316552704. Throughput: 0: 42315.9. Samples: 1316638700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 06:51:21,994][12645] Avg episode reward: [(0, '0.203')] +[2024-06-18 06:51:25,017][12883] Updated weights for policy 0, policy_version 80361 (0.0039) +[2024-06-18 06:51:26,994][12645] Fps is (10 sec: 45874.3, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1316749312. Throughput: 0: 42282.6. Samples: 1316900800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 06:51:26,994][12645] Avg episode reward: [(0, '0.207')] +[2024-06-18 06:51:28,082][12883] Updated weights for policy 0, policy_version 80371 (0.0032) +[2024-06-18 06:51:31,994][12645] Fps is (10 sec: 36045.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1316913152. Throughput: 0: 42095.0. Samples: 1317019860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 06:51:31,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 06:51:33,163][12883] Updated weights for policy 0, policy_version 80381 (0.0043) +[2024-06-18 06:51:33,636][12862] Signal inference workers to stop experience collection... (19050 times) +[2024-06-18 06:51:33,636][12862] Signal inference workers to resume experience collection... (19050 times) +[2024-06-18 06:51:33,677][12883] InferenceWorker_p0-w0: stopping experience collection (19050 times) +[2024-06-18 06:51:33,677][12883] InferenceWorker_p0-w0: resuming experience collection (19050 times) +[2024-06-18 06:51:35,673][12883] Updated weights for policy 0, policy_version 80391 (0.0023) +[2024-06-18 06:51:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1317175296. Throughput: 0: 42326.8. Samples: 1317274580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 06:51:36,994][12645] Avg episode reward: [(0, '0.156')] +[2024-06-18 06:51:40,915][12883] Updated weights for policy 0, policy_version 80401 (0.0036) +[2024-06-18 06:51:41,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1317371904. Throughput: 0: 42208.1. Samples: 1317530860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 06:51:41,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 06:51:43,213][12883] Updated weights for policy 0, policy_version 80411 (0.0030) +[2024-06-18 06:51:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1317568512. Throughput: 0: 41996.8. Samples: 1317652260. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 06:51:46,994][12645] Avg episode reward: [(0, '0.100')] +[2024-06-18 06:51:48,583][12883] Updated weights for policy 0, policy_version 80421 (0.0022) +[2024-06-18 06:51:50,807][12883] Updated weights for policy 0, policy_version 80431 (0.0032) +[2024-06-18 06:51:51,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1317830656. Throughput: 0: 42297.5. Samples: 1317910500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 06:51:51,994][12645] Avg episode reward: [(0, '0.110')] +[2024-06-18 06:51:56,236][12883] Updated weights for policy 0, policy_version 80441 (0.0031) +[2024-06-18 06:51:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41233.0, 300 sec: 42265.2). Total num frames: 1317978112. Throughput: 0: 42492.7. Samples: 1318173360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 06:51:56,995][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 06:51:58,573][12883] Updated weights for policy 0, policy_version 80451 (0.0031) +[2024-06-18 06:52:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42600.0, 300 sec: 42376.3). Total num frames: 1318223872. Throughput: 0: 42151.0. Samples: 1318283580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 06:52:01,994][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 06:52:03,861][12883] Updated weights for policy 0, policy_version 80461 (0.0027) +[2024-06-18 06:52:06,607][12883] Updated weights for policy 0, policy_version 80471 (0.0045) +[2024-06-18 06:52:06,994][12645] Fps is (10 sec: 49152.0, 60 sec: 42873.0, 300 sec: 42542.9). Total num frames: 1318469632. Throughput: 0: 42389.8. Samples: 1318546240. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 06:52:06,994][12645] Avg episode reward: [(0, '0.168')] +[2024-06-18 06:52:11,782][12883] Updated weights for policy 0, policy_version 80481 (0.0032) +[2024-06-18 06:52:11,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 1318600704. Throughput: 0: 42397.9. Samples: 1318808700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 06:52:11,994][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 06:52:14,223][12883] Updated weights for policy 0, policy_version 80491 (0.0033) +[2024-06-18 06:52:16,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 1318846464. Throughput: 0: 42323.1. Samples: 1318924400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 06:52:16,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 06:52:19,421][12883] Updated weights for policy 0, policy_version 80501 (0.0050) +[2024-06-18 06:52:21,923][12883] Updated weights for policy 0, policy_version 80511 (0.0045) +[2024-06-18 06:52:21,994][12645] Fps is (10 sec: 49151.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1319092224. Throughput: 0: 42391.0. Samples: 1319182180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 06:52:21,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 06:52:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 1319239680. Throughput: 0: 42559.4. Samples: 1319446040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 06:52:26,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 06:52:27,131][12883] Updated weights for policy 0, policy_version 80521 (0.0034) +[2024-06-18 06:52:29,374][12862] Signal inference workers to stop experience collection... 
(19100 times) +[2024-06-18 06:52:29,419][12883] InferenceWorker_p0-w0: stopping experience collection (19100 times) +[2024-06-18 06:52:29,427][12862] Signal inference workers to resume experience collection... (19100 times) +[2024-06-18 06:52:29,437][12883] InferenceWorker_p0-w0: resuming experience collection (19100 times) +[2024-06-18 06:52:29,591][12883] Updated weights for policy 0, policy_version 80531 (0.0025) +[2024-06-18 06:52:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42376.3). Total num frames: 1319501824. Throughput: 0: 42297.7. Samples: 1319555660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 06:52:31,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 06:52:34,673][12883] Updated weights for policy 0, policy_version 80541 (0.0033) +[2024-06-18 06:52:36,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1319714816. Throughput: 0: 42479.6. Samples: 1319822080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 06:52:36,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 06:52:37,137][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000080551_1319747584.pth... +[2024-06-18 06:52:37,147][12883] Updated weights for policy 0, policy_version 80551 (0.0027) +[2024-06-18 06:52:37,188][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000079926_1309507584.pth +[2024-06-18 06:52:41,994][12645] Fps is (10 sec: 36045.5, 60 sec: 41506.2, 300 sec: 42209.6). Total num frames: 1319862272. Throughput: 0: 42351.3. Samples: 1320079160. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 06:52:41,994][12645] Avg episode reward: [(0, '0.236')] +[2024-06-18 06:52:42,438][12883] Updated weights for policy 0, policy_version 80561 (0.0052) +[2024-06-18 06:52:45,239][12883] Updated weights for policy 0, policy_version 80571 (0.0030) +[2024-06-18 06:52:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42376.6). Total num frames: 1320140800. Throughput: 0: 42420.9. Samples: 1320192520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) +[2024-06-18 06:52:46,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 06:52:50,226][12883] Updated weights for policy 0, policy_version 80581 (0.0037) +[2024-06-18 06:52:51,994][12645] Fps is (10 sec: 47513.5, 60 sec: 41779.3, 300 sec: 42320.7). Total num frames: 1320337408. Throughput: 0: 42350.0. Samples: 1320451980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) +[2024-06-18 06:52:51,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 06:52:53,195][12883] Updated weights for policy 0, policy_version 80591 (0.0034) +[2024-06-18 06:52:56,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1320517632. Throughput: 0: 42050.1. Samples: 1320700960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) +[2024-06-18 06:52:56,994][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 06:52:58,199][12883] Updated weights for policy 0, policy_version 80601 (0.0024) +[2024-06-18 06:53:01,125][12883] Updated weights for policy 0, policy_version 80611 (0.0034) +[2024-06-18 06:53:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1320779776. Throughput: 0: 42213.9. Samples: 1320824020. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) +[2024-06-18 06:53:01,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 06:53:06,055][12883] Updated weights for policy 0, policy_version 80621 (0.0025) +[2024-06-18 06:53:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41506.2, 300 sec: 42265.2). Total num frames: 1320960000. Throughput: 0: 42307.6. Samples: 1321086020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) +[2024-06-18 06:53:06,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 06:53:08,706][12883] Updated weights for policy 0, policy_version 80631 (0.0026) +[2024-06-18 06:53:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1321172992. Throughput: 0: 41922.3. Samples: 1321332540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) +[2024-06-18 06:53:11,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 06:53:13,682][12883] Updated weights for policy 0, policy_version 80641 (0.0033) +[2024-06-18 06:53:16,709][12883] Updated weights for policy 0, policy_version 80651 (0.0038) +[2024-06-18 06:53:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1321385984. Throughput: 0: 42237.0. Samples: 1321456320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) +[2024-06-18 06:53:16,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 06:53:21,322][12883] Updated weights for policy 0, policy_version 80661 (0.0039) +[2024-06-18 06:53:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 42209.6). Total num frames: 1321582592. Throughput: 0: 42028.1. Samples: 1321713340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) +[2024-06-18 06:53:21,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 06:53:23,074][12862] Signal inference workers to stop experience collection... (19150 times) +[2024-06-18 06:53:23,075][12862] Signal inference workers to resume experience collection... 
(19150 times) +[2024-06-18 06:53:23,091][12883] InferenceWorker_p0-w0: stopping experience collection (19150 times) +[2024-06-18 06:53:23,091][12883] InferenceWorker_p0-w0: resuming experience collection (19150 times) +[2024-06-18 06:53:24,462][12883] Updated weights for policy 0, policy_version 80671 (0.0038) +[2024-06-18 06:53:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.5, 300 sec: 42210.0). Total num frames: 1321779200. Throughput: 0: 41827.1. Samples: 1321961380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) +[2024-06-18 06:53:26,994][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 06:53:29,031][12883] Updated weights for policy 0, policy_version 80681 (0.0038) +[2024-06-18 06:53:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1322024960. Throughput: 0: 42147.1. Samples: 1322089140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 06:53:31,994][12645] Avg episode reward: [(0, '0.160')] +[2024-06-18 06:53:32,144][12883] Updated weights for policy 0, policy_version 80691 (0.0051) +[2024-06-18 06:53:36,748][12883] Updated weights for policy 0, policy_version 80701 (0.0034) +[2024-06-18 06:53:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41506.3, 300 sec: 42209.8). Total num frames: 1322205184. Throughput: 0: 42160.5. Samples: 1322349200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 06:53:36,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 06:53:40,194][12883] Updated weights for policy 0, policy_version 80711 (0.0025) +[2024-06-18 06:53:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 1322418176. Throughput: 0: 42080.9. Samples: 1322594600. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 06:53:41,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 06:53:44,530][12883] Updated weights for policy 0, policy_version 80721 (0.0024) +[2024-06-18 06:53:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 1322647552. Throughput: 0: 42108.4. Samples: 1322718900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 06:53:46,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 06:53:47,936][12883] Updated weights for policy 0, policy_version 80731 (0.0031) +[2024-06-18 06:53:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 1322844160. Throughput: 0: 42020.1. Samples: 1322976920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 06:53:51,994][12645] Avg episode reward: [(0, '0.172')] +[2024-06-18 06:53:52,055][12883] Updated weights for policy 0, policy_version 80741 (0.0030) +[2024-06-18 06:53:55,565][12883] Updated weights for policy 0, policy_version 80751 (0.0033) +[2024-06-18 06:53:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1323073536. Throughput: 0: 42109.3. Samples: 1323227460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 06:53:56,994][12645] Avg episode reward: [(0, '0.224')] +[2024-06-18 06:53:59,665][12883] Updated weights for policy 0, policy_version 80761 (0.0045) +[2024-06-18 06:54:01,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1323302912. Throughput: 0: 42271.2. Samples: 1323358520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 06:54:01,994][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 06:54:03,182][12883] Updated weights for policy 0, policy_version 80771 (0.0036) +[2024-06-18 06:54:06,994][12645] Fps is (10 sec: 39321.0, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 1323466752. Throughput: 0: 42247.4. Samples: 1323614480. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 06:54:06,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 06:54:07,666][12883] Updated weights for policy 0, policy_version 80781 (0.0037) +[2024-06-18 06:54:11,027][12883] Updated weights for policy 0, policy_version 80791 (0.0033) +[2024-06-18 06:54:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1323712512. Throughput: 0: 42168.8. Samples: 1323858980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 06:54:11,994][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 06:54:15,410][12883] Updated weights for policy 0, policy_version 80801 (0.0030) +[2024-06-18 06:54:16,996][12645] Fps is (10 sec: 47503.7, 60 sec: 42596.8, 300 sec: 42375.9). Total num frames: 1323941888. Throughput: 0: 42329.5. Samples: 1323994060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 06:54:16,996][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 06:54:18,641][12883] Updated weights for policy 0, policy_version 80811 (0.0047) +[2024-06-18 06:54:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1324105728. Throughput: 0: 42180.8. Samples: 1324247340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:54:21,994][12645] Avg episode reward: [(0, '0.222')] +[2024-06-18 06:54:23,148][12883] Updated weights for policy 0, policy_version 80821 (0.0045) +[2024-06-18 06:54:26,198][12883] Updated weights for policy 0, policy_version 80831 (0.0043) +[2024-06-18 06:54:26,994][12645] Fps is (10 sec: 40968.6, 60 sec: 42871.3, 300 sec: 42265.2). Total num frames: 1324351488. Throughput: 0: 42331.0. Samples: 1324499500. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:54:26,994][12645] Avg episode reward: [(0, '0.140')] +[2024-06-18 06:54:30,679][12883] Updated weights for policy 0, policy_version 80841 (0.0033) +[2024-06-18 06:54:31,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1324564480. Throughput: 0: 42591.9. Samples: 1324635540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:54:31,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 06:54:34,270][12883] Updated weights for policy 0, policy_version 80851 (0.0047) +[2024-06-18 06:54:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 1324744704. Throughput: 0: 42316.8. Samples: 1324881180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:54:36,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 06:54:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000080856_1324744704.pth... +[2024-06-18 06:54:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000080240_1314652160.pth +[2024-06-18 06:54:38,456][12883] Updated weights for policy 0, policy_version 80861 (0.0026) +[2024-06-18 06:54:41,846][12883] Updated weights for policy 0, policy_version 80871 (0.0033) +[2024-06-18 06:54:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 1324990464. Throughput: 0: 42306.2. Samples: 1325131240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:54:41,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 06:54:46,102][12883] Updated weights for policy 0, policy_version 80881 (0.0040) +[2024-06-18 06:54:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1325187072. Throughput: 0: 42317.2. Samples: 1325262800. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:54:46,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 06:54:49,430][12883] Updated weights for policy 0, policy_version 80891 (0.0030) +[2024-06-18 06:54:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1325383680. Throughput: 0: 42340.5. Samples: 1325519800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:54:51,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 06:54:53,575][12883] Updated weights for policy 0, policy_version 80901 (0.0021) +[2024-06-18 06:54:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1325629440. Throughput: 0: 42511.1. Samples: 1325771980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:54:56,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 06:54:57,383][12883] Updated weights for policy 0, policy_version 80911 (0.0041) +[2024-06-18 06:55:01,076][12883] Updated weights for policy 0, policy_version 80921 (0.0040) +[2024-06-18 06:55:01,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1325842432. Throughput: 0: 42532.8. Samples: 1325907940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:55:01,994][12645] Avg episode reward: [(0, '0.146')] +[2024-06-18 06:55:02,507][12862] Signal inference workers to stop experience collection... (19200 times) +[2024-06-18 06:55:02,537][12883] InferenceWorker_p0-w0: stopping experience collection (19200 times) +[2024-06-18 06:55:02,572][12862] Signal inference workers to resume experience collection... (19200 times) +[2024-06-18 06:55:02,573][12883] InferenceWorker_p0-w0: resuming experience collection (19200 times) +[2024-06-18 06:55:05,083][12883] Updated weights for policy 0, policy_version 80931 (0.0045) +[2024-06-18 06:55:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 1326039040. 
Throughput: 0: 42509.7. Samples: 1326160280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 06:55:06,994][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 06:55:09,009][12883] Updated weights for policy 0, policy_version 80941 (0.0038) +[2024-06-18 06:55:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1326268416. Throughput: 0: 42351.6. Samples: 1326405320. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) +[2024-06-18 06:55:11,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 06:55:13,106][12883] Updated weights for policy 0, policy_version 80951 (0.0036) +[2024-06-18 06:55:16,816][12883] Updated weights for policy 0, policy_version 80961 (0.0039) +[2024-06-18 06:55:16,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42053.8, 300 sec: 42265.2). Total num frames: 1326465024. Throughput: 0: 42271.2. Samples: 1326537740. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) +[2024-06-18 06:55:16,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 06:55:20,808][12883] Updated weights for policy 0, policy_version 80971 (0.0038) +[2024-06-18 06:55:21,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42596.8, 300 sec: 42209.3). Total num frames: 1326661632. Throughput: 0: 42469.5. Samples: 1326792400. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) +[2024-06-18 06:55:21,997][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 06:55:24,412][12883] Updated weights for policy 0, policy_version 80981 (0.0036) +[2024-06-18 06:55:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 1326891008. Throughput: 0: 42563.3. Samples: 1327046580. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) +[2024-06-18 06:55:26,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 06:55:28,345][12883] Updated weights for policy 0, policy_version 80991 (0.0033) +[2024-06-18 06:55:31,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1327104000. 
Throughput: 0: 42568.4. Samples: 1327178380. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) +[2024-06-18 06:55:31,994][12645] Avg episode reward: [(0, '0.265')] +[2024-06-18 06:55:32,423][12883] Updated weights for policy 0, policy_version 81001 (0.0022) +[2024-06-18 06:55:36,028][12883] Updated weights for policy 0, policy_version 81011 (0.0035) +[2024-06-18 06:55:36,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 1327316992. Throughput: 0: 42546.6. Samples: 1327434400. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) +[2024-06-18 06:55:36,994][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 06:55:39,852][12883] Updated weights for policy 0, policy_version 81021 (0.0031) +[2024-06-18 06:55:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1327529984. Throughput: 0: 42604.8. Samples: 1327689200. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) +[2024-06-18 06:55:41,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 06:55:43,872][12883] Updated weights for policy 0, policy_version 81031 (0.0038) +[2024-06-18 06:55:46,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42869.9, 300 sec: 42431.5). Total num frames: 1327759360. Throughput: 0: 42462.3. Samples: 1327818840. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) +[2024-06-18 06:55:46,997][12645] Avg episode reward: [(0, '0.248')] +[2024-06-18 06:55:47,442][12883] Updated weights for policy 0, policy_version 81041 (0.0028) +[2024-06-18 06:55:51,618][12883] Updated weights for policy 0, policy_version 81051 (0.0036) +[2024-06-18 06:55:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 1327955968. Throughput: 0: 42455.3. Samples: 1328070760. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) +[2024-06-18 06:55:51,994][12645] Avg episode reward: [(0, '0.337')] +[2024-06-18 06:55:55,363][12883] Updated weights for policy 0, policy_version 81061 (0.0035) +[2024-06-18 06:55:56,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42325.3, 300 sec: 42376.6). Total num frames: 1328168960. Throughput: 0: 42580.4. Samples: 1328321440. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) +[2024-06-18 06:55:56,994][12645] Avg episode reward: [(0, '0.380')] +[2024-06-18 06:55:59,274][12883] Updated weights for policy 0, policy_version 81071 (0.0032) +[2024-06-18 06:56:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42376.6). Total num frames: 1328398336. Throughput: 0: 42570.2. Samples: 1328453400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 06:56:01,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 06:56:02,949][12883] Updated weights for policy 0, policy_version 81081 (0.0031) +[2024-06-18 06:56:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42265.1). Total num frames: 1328578560. Throughput: 0: 42579.9. Samples: 1328708400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 06:56:07,004][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 06:56:07,195][12883] Updated weights for policy 0, policy_version 81091 (0.0031) +[2024-06-18 06:56:10,627][12883] Updated weights for policy 0, policy_version 81101 (0.0041) +[2024-06-18 06:56:11,993][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 1328807936. Throughput: 0: 42503.6. Samples: 1328959240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 06:56:11,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 06:56:14,885][12883] Updated weights for policy 0, policy_version 81111 (0.0044) +[2024-06-18 06:56:16,996][12645] Fps is (10 sec: 45865.3, 60 sec: 42869.9, 300 sec: 42320.4). Total num frames: 1329037312. Throughput: 0: 42527.8. Samples: 1329092220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 06:56:16,996][12645] Avg episode reward: [(0, '0.200')] +[2024-06-18 06:56:18,354][12883] Updated weights for policy 0, policy_version 81121 (0.0033) +[2024-06-18 06:56:21,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42327.0, 300 sec: 42209.6). Total num frames: 1329201152. Throughput: 0: 42406.3. Samples: 1329342680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 06:56:21,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 06:56:22,575][12883] Updated weights for policy 0, policy_version 81131 (0.0036) +[2024-06-18 06:56:26,515][12883] Updated weights for policy 0, policy_version 81141 (0.0035) +[2024-06-18 06:56:26,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1329446912. Throughput: 0: 42431.7. Samples: 1329598620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 06:56:26,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 06:56:30,110][12862] Signal inference workers to stop experience collection... (19250 times) +[2024-06-18 06:56:30,110][12862] Signal inference workers to resume experience collection... (19250 times) +[2024-06-18 06:56:30,162][12883] InferenceWorker_p0-w0: stopping experience collection (19250 times) +[2024-06-18 06:56:30,162][12883] InferenceWorker_p0-w0: resuming experience collection (19250 times) +[2024-06-18 06:56:30,579][12883] Updated weights for policy 0, policy_version 81151 (0.0041) +[2024-06-18 06:56:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1329643520. Throughput: 0: 42461.3. Samples: 1329729500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 06:56:31,994][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 06:56:34,035][12883] Updated weights for policy 0, policy_version 81161 (0.0030) +[2024-06-18 06:56:36,994][12645] Fps is (10 sec: 37682.7, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 1329823744. 
Throughput: 0: 42333.2. Samples: 1329975760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 06:56:36,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 06:56:37,021][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000081166_1329823744.pth... +[2024-06-18 06:56:37,080][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000080551_1319747584.pth +[2024-06-18 06:56:38,253][12883] Updated weights for policy 0, policy_version 81171 (0.0030) +[2024-06-18 06:56:41,543][12883] Updated weights for policy 0, policy_version 81181 (0.0044) +[2024-06-18 06:56:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 1330102272. Throughput: 0: 42276.2. Samples: 1330223860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 06:56:41,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 06:56:46,063][12883] Updated weights for policy 0, policy_version 81191 (0.0037) +[2024-06-18 06:56:46,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42053.9, 300 sec: 42209.6). Total num frames: 1330282496. Throughput: 0: 42520.9. Samples: 1330366840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 06:56:46,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 06:56:49,175][12883] Updated weights for policy 0, policy_version 81201 (0.0034) +[2024-06-18 06:56:51,994][12645] Fps is (10 sec: 37682.7, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1330479104. Throughput: 0: 42227.5. Samples: 1330608640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 06:56:51,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 06:56:53,679][12883] Updated weights for policy 0, policy_version 81211 (0.0027) +[2024-06-18 06:56:56,825][12883] Updated weights for policy 0, policy_version 81221 (0.0035) +[2024-06-18 06:56:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 1330724864. Throughput: 0: 42343.4. 
Samples: 1330864700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 06:56:56,994][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 06:57:01,290][12883] Updated weights for policy 0, policy_version 81231 (0.0031) +[2024-06-18 06:57:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1330905088. Throughput: 0: 42399.8. Samples: 1331000120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 06:57:01,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 06:57:04,815][12883] Updated weights for policy 0, policy_version 81241 (0.0032) +[2024-06-18 06:57:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1331118080. Throughput: 0: 42284.4. Samples: 1331245480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 06:57:06,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 06:57:08,898][12883] Updated weights for policy 0, policy_version 81251 (0.0028) +[2024-06-18 06:57:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1331347456. Throughput: 0: 42374.7. Samples: 1331505480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 06:57:11,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 06:57:12,314][12883] Updated weights for policy 0, policy_version 81261 (0.0029) +[2024-06-18 06:57:16,592][12883] Updated weights for policy 0, policy_version 81271 (0.0038) +[2024-06-18 06:57:16,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42053.9, 300 sec: 42265.2). Total num frames: 1331560448. Throughput: 0: 42392.9. Samples: 1331637180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 06:57:16,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 06:57:20,050][12883] Updated weights for policy 0, policy_version 81281 (0.0049) +[2024-06-18 06:57:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1331773440. Throughput: 0: 42490.8. 
Samples: 1331887840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 06:57:21,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 06:57:24,185][12883] Updated weights for policy 0, policy_version 81291 (0.0027) +[2024-06-18 06:57:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1332002816. Throughput: 0: 42737.3. Samples: 1332147040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 06:57:26,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 06:57:27,755][12883] Updated weights for policy 0, policy_version 81301 (0.0035) +[2024-06-18 06:57:31,848][12883] Updated weights for policy 0, policy_version 81311 (0.0038) +[2024-06-18 06:57:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1332199424. Throughput: 0: 42351.2. Samples: 1332272640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 06:57:31,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 06:57:35,306][12883] Updated weights for policy 0, policy_version 81321 (0.0027) +[2024-06-18 06:57:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42542.8). Total num frames: 1332412416. Throughput: 0: 42585.0. Samples: 1332524960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 06:57:36,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 06:57:39,681][12883] Updated weights for policy 0, policy_version 81331 (0.0032) +[2024-06-18 06:57:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1332625408. Throughput: 0: 42678.3. Samples: 1332785220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 06:57:41,994][12645] Avg episode reward: [(0, '0.117')] +[2024-06-18 06:57:42,992][12883] Updated weights for policy 0, policy_version 81341 (0.0038) +[2024-06-18 06:57:46,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1332822016. Throughput: 0: 42425.7. 
Samples: 1332909280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 06:57:46,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 06:57:47,262][12883] Updated weights for policy 0, policy_version 81351 (0.0031) +[2024-06-18 06:57:50,715][12883] Updated weights for policy 0, policy_version 81361 (0.0022) +[2024-06-18 06:57:51,740][12862] Signal inference workers to stop experience collection... (19300 times) +[2024-06-18 06:57:51,740][12862] Signal inference workers to resume experience collection... (19300 times) +[2024-06-18 06:57:51,765][12883] InferenceWorker_p0-w0: stopping experience collection (19300 times) +[2024-06-18 06:57:51,765][12883] InferenceWorker_p0-w0: resuming experience collection (19300 times) +[2024-06-18 06:57:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1333067776. Throughput: 0: 42647.1. Samples: 1333164600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 06:57:51,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 06:57:54,822][12883] Updated weights for policy 0, policy_version 81371 (0.0047) +[2024-06-18 06:57:56,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1333264384. Throughput: 0: 42777.3. Samples: 1333430460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 06:57:56,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 06:57:58,235][12883] Updated weights for policy 0, policy_version 81381 (0.0032) +[2024-06-18 06:58:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1333460992. Throughput: 0: 42520.2. Samples: 1333550600. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 06:58:01,994][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 06:58:02,506][12883] Updated weights for policy 0, policy_version 81391 (0.0032) +[2024-06-18 06:58:06,590][12883] Updated weights for policy 0, policy_version 81401 (0.0036) +[2024-06-18 06:58:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 1333706752. Throughput: 0: 42621.7. Samples: 1333805820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 06:58:06,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 06:58:10,251][12883] Updated weights for policy 0, policy_version 81411 (0.0038) +[2024-06-18 06:58:11,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1333903360. Throughput: 0: 42622.2. Samples: 1334065040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 06:58:11,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 06:58:14,157][12883] Updated weights for policy 0, policy_version 81421 (0.0034) +[2024-06-18 06:58:16,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1334083584. Throughput: 0: 42439.9. Samples: 1334182440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 06:58:16,994][12645] Avg episode reward: [(0, '0.397')] +[2024-06-18 06:58:17,986][12883] Updated weights for policy 0, policy_version 81431 (0.0037) +[2024-06-18 06:58:21,755][12883] Updated weights for policy 0, policy_version 81441 (0.0047) +[2024-06-18 06:58:21,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1334329344. Throughput: 0: 42674.1. Samples: 1334445300. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 06:58:21,995][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 06:58:25,619][12883] Updated weights for policy 0, policy_version 81451 (0.0039) +[2024-06-18 06:58:26,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1334542336. Throughput: 0: 42611.0. Samples: 1334702720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 06:58:26,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 06:58:29,254][12883] Updated weights for policy 0, policy_version 81461 (0.0040) +[2024-06-18 06:58:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1334738944. Throughput: 0: 42621.8. Samples: 1334827260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:58:31,994][12645] Avg episode reward: [(0, '0.239')] +[2024-06-18 06:58:33,129][12883] Updated weights for policy 0, policy_version 81471 (0.0029) +[2024-06-18 06:58:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1334968320. Throughput: 0: 42649.9. Samples: 1335083840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:58:36,994][12645] Avg episode reward: [(0, '0.182')] +[2024-06-18 06:58:37,063][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000081481_1334984704.pth... +[2024-06-18 06:58:37,075][12883] Updated weights for policy 0, policy_version 81481 (0.0041) +[2024-06-18 06:58:37,131][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000080856_1324744704.pth +[2024-06-18 06:58:41,214][12883] Updated weights for policy 0, policy_version 81491 (0.0038) +[2024-06-18 06:58:41,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1335197696. Throughput: 0: 42472.3. Samples: 1335341720. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:58:41,994][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 06:58:44,715][12883] Updated weights for policy 0, policy_version 81501 (0.0031) +[2024-06-18 06:58:46,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1335394304. Throughput: 0: 42677.8. Samples: 1335471100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:58:47,000][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 06:58:48,829][12883] Updated weights for policy 0, policy_version 81511 (0.0049) +[2024-06-18 06:58:51,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1335590912. Throughput: 0: 42613.0. Samples: 1335723400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:58:51,994][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 06:58:52,774][12883] Updated weights for policy 0, policy_version 81521 (0.0047) +[2024-06-18 06:58:56,559][12883] Updated weights for policy 0, policy_version 81531 (0.0050) +[2024-06-18 06:58:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1335803904. Throughput: 0: 42494.0. Samples: 1335977280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:58:56,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 06:59:00,461][12883] Updated weights for policy 0, policy_version 81541 (0.0035) +[2024-06-18 06:59:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1336033280. Throughput: 0: 42687.2. Samples: 1336103360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:59:01,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 06:59:02,717][12862] Signal inference workers to stop experience collection... (19350 times) +[2024-06-18 06:59:02,765][12862] Signal inference workers to resume experience collection... 
(19350 times) +[2024-06-18 06:59:02,766][12883] InferenceWorker_p0-w0: stopping experience collection (19350 times) +[2024-06-18 06:59:02,792][12883] InferenceWorker_p0-w0: resuming experience collection (19350 times) +[2024-06-18 06:59:04,339][12883] Updated weights for policy 0, policy_version 81551 (0.0030) +[2024-06-18 06:59:06,996][12645] Fps is (10 sec: 44227.5, 60 sec: 42323.8, 300 sec: 42487.0). Total num frames: 1336246272. Throughput: 0: 42451.8. Samples: 1336355720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:59:06,996][12645] Avg episode reward: [(0, '0.265')] +[2024-06-18 06:59:08,278][12883] Updated weights for policy 0, policy_version 81561 (0.0035) +[2024-06-18 06:59:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42376.6). Total num frames: 1336442880. Throughput: 0: 42479.7. Samples: 1336614300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:59:11,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 06:59:12,034][12883] Updated weights for policy 0, policy_version 81571 (0.0033) +[2024-06-18 06:59:16,111][12883] Updated weights for policy 0, policy_version 81581 (0.0043) +[2024-06-18 06:59:16,993][12645] Fps is (10 sec: 40969.9, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1336655872. Throughput: 0: 42554.9. Samples: 1336742220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 06:59:16,994][12645] Avg episode reward: [(0, '0.548')] +[2024-06-18 06:59:19,623][12883] Updated weights for policy 0, policy_version 81591 (0.0036) +[2024-06-18 06:59:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 1336868864. Throughput: 0: 42324.4. Samples: 1336988440. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:59:21,994][12645] Avg episode reward: [(0, '0.189')] +[2024-06-18 06:59:23,887][12883] Updated weights for policy 0, policy_version 81601 (0.0031) +[2024-06-18 06:59:26,995][12645] Fps is (10 sec: 44229.6, 60 sec: 42597.4, 300 sec: 42487.1). Total num frames: 1337098240. Throughput: 0: 42444.9. Samples: 1337251800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:59:26,996][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 06:59:27,431][12883] Updated weights for policy 0, policy_version 81611 (0.0036) +[2024-06-18 06:59:31,738][12883] Updated weights for policy 0, policy_version 81621 (0.0027) +[2024-06-18 06:59:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1337294848. Throughput: 0: 42372.6. Samples: 1337377860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:59:31,994][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 06:59:34,948][12883] Updated weights for policy 0, policy_version 81631 (0.0043) +[2024-06-18 06:59:36,994][12645] Fps is (10 sec: 40966.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1337507840. Throughput: 0: 42287.0. Samples: 1337626320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:59:36,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 06:59:39,238][12883] Updated weights for policy 0, policy_version 81641 (0.0039) +[2024-06-18 06:59:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1337737216. Throughput: 0: 42426.4. Samples: 1337886460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:59:41,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 06:59:42,483][12883] Updated weights for policy 0, policy_version 81651 (0.0026) +[2024-06-18 06:59:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 1337917440. Throughput: 0: 42483.9. Samples: 1338015140. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:59:46,994][12645] Avg episode reward: [(0, '0.119')] +[2024-06-18 06:59:47,071][12883] Updated weights for policy 0, policy_version 81661 (0.0030) +[2024-06-18 06:59:50,308][12883] Updated weights for policy 0, policy_version 81671 (0.0027) +[2024-06-18 06:59:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1338163200. Throughput: 0: 42622.7. Samples: 1338273640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:59:51,994][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 06:59:54,633][12883] Updated weights for policy 0, policy_version 81681 (0.0028) +[2024-06-18 06:59:56,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1338376192. Throughput: 0: 42607.3. Samples: 1338531640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 06:59:56,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 06:59:57,917][12883] Updated weights for policy 0, policy_version 81691 (0.0035) +[2024-06-18 07:00:01,994][12645] Fps is (10 sec: 39320.8, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1338556416. Throughput: 0: 42435.3. Samples: 1338651820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 07:00:01,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 07:00:02,585][12883] Updated weights for policy 0, policy_version 81701 (0.0039) +[2024-06-18 07:00:05,490][12883] Updated weights for policy 0, policy_version 81711 (0.0024) +[2024-06-18 07:00:06,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 1338802176. Throughput: 0: 42704.9. Samples: 1338910160. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:00:06,994][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 07:00:10,199][12883] Updated weights for policy 0, policy_version 81721 (0.0028) +[2024-06-18 07:00:11,994][12645] Fps is (10 sec: 47513.8, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 1339031552. Throughput: 0: 42597.8. Samples: 1339168640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:00:11,994][12645] Avg episode reward: [(0, '0.197')] +[2024-06-18 07:00:13,357][12883] Updated weights for policy 0, policy_version 81731 (0.0038) +[2024-06-18 07:00:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.2, 300 sec: 42543.2). Total num frames: 1339211776. Throughput: 0: 42642.5. Samples: 1339296780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:00:16,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 07:00:17,897][12883] Updated weights for policy 0, policy_version 81741 (0.0029) +[2024-06-18 07:00:20,927][12883] Updated weights for policy 0, policy_version 81751 (0.0025) +[2024-06-18 07:00:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 1339457536. Throughput: 0: 42784.8. Samples: 1339551640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:00:21,994][12645] Avg episode reward: [(0, '0.453')] +[2024-06-18 07:00:25,588][12883] Updated weights for policy 0, policy_version 81761 (0.0023) +[2024-06-18 07:00:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42599.5, 300 sec: 42542.9). Total num frames: 1339654144. Throughput: 0: 42696.0. Samples: 1339807780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:00:26,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 07:00:28,460][12883] Updated weights for policy 0, policy_version 81771 (0.0026) +[2024-06-18 07:00:29,877][12862] Signal inference workers to stop experience collection... 
(19400 times) +[2024-06-18 07:00:29,878][12862] Signal inference workers to resume experience collection... (19400 times) +[2024-06-18 07:00:29,917][12883] InferenceWorker_p0-w0: stopping experience collection (19400 times) +[2024-06-18 07:00:29,918][12883] InferenceWorker_p0-w0: resuming experience collection (19400 times) +[2024-06-18 07:00:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1339850752. Throughput: 0: 42768.4. Samples: 1339939720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:00:31,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 07:00:33,229][12883] Updated weights for policy 0, policy_version 81781 (0.0034) +[2024-06-18 07:00:36,015][12883] Updated weights for policy 0, policy_version 81791 (0.0029) +[2024-06-18 07:00:36,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 1340080128. Throughput: 0: 42663.7. Samples: 1340193520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:00:36,995][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 07:00:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000081792_1340080128.pth... +[2024-06-18 07:00:37,054][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000081166_1329823744.pth +[2024-06-18 07:00:40,850][12883] Updated weights for policy 0, policy_version 81801 (0.0026) +[2024-06-18 07:00:41,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42487.7). Total num frames: 1340293120. Throughput: 0: 42710.5. Samples: 1340453600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:00:41,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 07:00:43,743][12883] Updated weights for policy 0, policy_version 81811 (0.0039) +[2024-06-18 07:00:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1340489728. Throughput: 0: 42850.6. Samples: 1340580100. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:00:46,994][12645] Avg episode reward: [(0, '0.268')] +[2024-06-18 07:00:48,427][12883] Updated weights for policy 0, policy_version 81821 (0.0029) +[2024-06-18 07:00:51,528][12883] Updated weights for policy 0, policy_version 81831 (0.0046) +[2024-06-18 07:00:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1340719104. Throughput: 0: 42692.0. Samples: 1340831300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:00:51,994][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 07:00:55,870][12883] Updated weights for policy 0, policy_version 81841 (0.0028) +[2024-06-18 07:00:56,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 1340915712. Throughput: 0: 42988.1. Samples: 1341103100. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 07:00:56,994][12645] Avg episode reward: [(0, '0.274')] +[2024-06-18 07:00:59,036][12883] Updated weights for policy 0, policy_version 81851 (0.0040) +[2024-06-18 07:01:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1341128704. Throughput: 0: 42740.2. Samples: 1341220080. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 07:01:01,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 07:01:03,262][12883] Updated weights for policy 0, policy_version 81861 (0.0039) +[2024-06-18 07:01:06,587][12883] Updated weights for policy 0, policy_version 81871 (0.0032) +[2024-06-18 07:01:06,994][12645] Fps is (10 sec: 47512.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1341390848. Throughput: 0: 42920.5. Samples: 1341483060. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 07:01:06,994][12645] Avg episode reward: [(0, '0.107')] +[2024-06-18 07:01:10,791][12883] Updated weights for policy 0, policy_version 81881 (0.0042) +[2024-06-18 07:01:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42432.1). Total num frames: 1341554688. Throughput: 0: 43033.8. Samples: 1341744300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 07:01:11,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 07:01:14,222][12883] Updated weights for policy 0, policy_version 81891 (0.0024) +[2024-06-18 07:01:16,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1341784064. Throughput: 0: 42735.0. Samples: 1341862800. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 07:01:16,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 07:01:18,781][12883] Updated weights for policy 0, policy_version 81901 (0.0023) +[2024-06-18 07:01:21,860][12883] Updated weights for policy 0, policy_version 81911 (0.0037) +[2024-06-18 07:01:21,994][12645] Fps is (10 sec: 47513.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1342029824. Throughput: 0: 42944.1. Samples: 1342126000. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 07:01:21,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 07:01:26,536][12883] Updated weights for policy 0, policy_version 81921 (0.0040) +[2024-06-18 07:01:26,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1342210048. Throughput: 0: 42923.5. Samples: 1342385160. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 07:01:26,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 07:01:29,544][12883] Updated weights for policy 0, policy_version 81931 (0.0030) +[2024-06-18 07:01:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1342439424. Throughput: 0: 42793.9. Samples: 1342505820. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 07:01:31,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 07:01:33,931][12883] Updated weights for policy 0, policy_version 81941 (0.0035) +[2024-06-18 07:01:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 1342668800. Throughput: 0: 43144.8. Samples: 1342772820. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 07:01:36,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 07:01:37,555][12883] Updated weights for policy 0, policy_version 81951 (0.0031) +[2024-06-18 07:01:41,422][12883] Updated weights for policy 0, policy_version 81961 (0.0034) +[2024-06-18 07:01:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1342865408. Throughput: 0: 42799.1. Samples: 1343029060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 07:01:41,994][12645] Avg episode reward: [(0, '0.187')] +[2024-06-18 07:01:42,642][12862] Signal inference workers to stop experience collection... (19450 times) +[2024-06-18 07:01:42,642][12862] Signal inference workers to resume experience collection... (19450 times) +[2024-06-18 07:01:42,661][12883] InferenceWorker_p0-w0: stopping experience collection (19450 times) +[2024-06-18 07:01:42,662][12883] InferenceWorker_p0-w0: resuming experience collection (19450 times) +[2024-06-18 07:01:44,979][12883] Updated weights for policy 0, policy_version 81971 (0.0037) +[2024-06-18 07:01:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1343094784. Throughput: 0: 43001.6. Samples: 1343155160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 07:01:46,995][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 07:01:48,904][12883] Updated weights for policy 0, policy_version 81981 (0.0039) +[2024-06-18 07:01:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1343291392. 
Throughput: 0: 42938.4. Samples: 1343415280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 07:01:51,994][12645] Avg episode reward: [(0, '0.142')] +[2024-06-18 07:01:52,968][12883] Updated weights for policy 0, policy_version 81991 (0.0031) +[2024-06-18 07:01:56,525][12883] Updated weights for policy 0, policy_version 82001 (0.0040) +[2024-06-18 07:01:56,996][12645] Fps is (10 sec: 40951.1, 60 sec: 43142.9, 300 sec: 42709.2). Total num frames: 1343504384. Throughput: 0: 42586.3. Samples: 1343660780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 07:01:56,996][12645] Avg episode reward: [(0, '0.183')] +[2024-06-18 07:02:00,609][12883] Updated weights for policy 0, policy_version 82011 (0.0047) +[2024-06-18 07:02:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1343717376. Throughput: 0: 42998.0. Samples: 1343797700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 07:02:01,994][12645] Avg episode reward: [(0, '0.146')] +[2024-06-18 07:02:04,599][12883] Updated weights for policy 0, policy_version 82021 (0.0028) +[2024-06-18 07:02:06,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1343930368. Throughput: 0: 42860.0. Samples: 1344054700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 07:02:06,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 07:02:08,166][12883] Updated weights for policy 0, policy_version 82031 (0.0044) +[2024-06-18 07:02:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1344143360. Throughput: 0: 42701.3. Samples: 1344306720. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 07:02:11,994][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 07:02:12,412][12883] Updated weights for policy 0, policy_version 82041 (0.0032) +[2024-06-18 07:02:15,982][12883] Updated weights for policy 0, policy_version 82051 (0.0033) +[2024-06-18 07:02:16,993][12645] Fps is (10 sec: 42599.5, 60 sec: 42871.7, 300 sec: 42654.0). Total num frames: 1344356352. Throughput: 0: 42920.2. Samples: 1344437220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 07:02:16,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 07:02:20,158][12883] Updated weights for policy 0, policy_version 82061 (0.0033) +[2024-06-18 07:02:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1344585728. Throughput: 0: 42753.3. Samples: 1344696720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 07:02:21,994][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 07:02:23,438][12883] Updated weights for policy 0, policy_version 82071 (0.0038) +[2024-06-18 07:02:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1344782336. Throughput: 0: 42809.3. Samples: 1344955480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 07:02:26,994][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 07:02:27,838][12883] Updated weights for policy 0, policy_version 82081 (0.0035) +[2024-06-18 07:02:31,299][12883] Updated weights for policy 0, policy_version 82091 (0.0031) +[2024-06-18 07:02:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1344995328. Throughput: 0: 42703.7. Samples: 1345076820. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 07:02:31,994][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 07:02:35,342][12883] Updated weights for policy 0, policy_version 82101 (0.0035) +[2024-06-18 07:02:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1345224704. Throughput: 0: 42665.7. Samples: 1345335240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) +[2024-06-18 07:02:36,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 07:02:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000082106_1345224704.pth... +[2024-06-18 07:02:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000081481_1334984704.pth +[2024-06-18 07:02:38,832][12883] Updated weights for policy 0, policy_version 82111 (0.0042) +[2024-06-18 07:02:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1345421312. Throughput: 0: 42937.7. Samples: 1345592880. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) +[2024-06-18 07:02:41,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 07:02:42,891][12883] Updated weights for policy 0, policy_version 82121 (0.0043) +[2024-06-18 07:02:46,438][12883] Updated weights for policy 0, policy_version 82131 (0.0038) +[2024-06-18 07:02:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1345634304. Throughput: 0: 42656.4. Samples: 1345717240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) +[2024-06-18 07:02:46,994][12645] Avg episode reward: [(0, '0.267')] +[2024-06-18 07:02:50,343][12883] Updated weights for policy 0, policy_version 82141 (0.0034) +[2024-06-18 07:02:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1345847296. Throughput: 0: 42650.8. Samples: 1345973980. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) +[2024-06-18 07:02:51,994][12645] Avg episode reward: [(0, '0.241')] +[2024-06-18 07:02:54,212][12883] Updated weights for policy 0, policy_version 82151 (0.0027) +[2024-06-18 07:02:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 1346076672. Throughput: 0: 42808.8. Samples: 1346233120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) +[2024-06-18 07:02:56,994][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 07:02:57,906][12883] Updated weights for policy 0, policy_version 82161 (0.0029) +[2024-06-18 07:03:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1346289664. Throughput: 0: 42782.5. Samples: 1346362440. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) +[2024-06-18 07:03:01,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 07:03:02,000][12883] Updated weights for policy 0, policy_version 82171 (0.0032) +[2024-06-18 07:03:05,838][12883] Updated weights for policy 0, policy_version 82181 (0.0032) +[2024-06-18 07:03:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1346486272. Throughput: 0: 42753.4. Samples: 1346620620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) +[2024-06-18 07:03:06,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 07:03:09,521][12883] Updated weights for policy 0, policy_version 82191 (0.0030) +[2024-06-18 07:03:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1346699264. Throughput: 0: 42628.8. Samples: 1346873780. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) +[2024-06-18 07:03:11,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 07:03:13,655][12883] Updated weights for policy 0, policy_version 82201 (0.0036) +[2024-06-18 07:03:14,741][12862] Signal inference workers to stop experience collection... 
(19500 times) +[2024-06-18 07:03:14,742][12862] Signal inference workers to resume experience collection... (19500 times) +[2024-06-18 07:03:14,796][12883] InferenceWorker_p0-w0: stopping experience collection (19500 times) +[2024-06-18 07:03:14,796][12883] InferenceWorker_p0-w0: resuming experience collection (19500 times) +[2024-06-18 07:03:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1346928640. Throughput: 0: 42779.5. Samples: 1347001900. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) +[2024-06-18 07:03:16,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 07:03:17,113][12883] Updated weights for policy 0, policy_version 82211 (0.0039) +[2024-06-18 07:03:21,252][12883] Updated weights for policy 0, policy_version 82221 (0.0029) +[2024-06-18 07:03:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1347125248. Throughput: 0: 42664.0. Samples: 1347255120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) +[2024-06-18 07:03:21,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 07:03:24,763][12883] Updated weights for policy 0, policy_version 82231 (0.0037) +[2024-06-18 07:03:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1347338240. Throughput: 0: 42760.9. Samples: 1347517120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:03:26,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 07:03:28,914][12883] Updated weights for policy 0, policy_version 82241 (0.0044) +[2024-06-18 07:03:31,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1347584000. Throughput: 0: 42912.4. Samples: 1347648300. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:03:31,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 07:03:32,390][12883] Updated weights for policy 0, policy_version 82251 (0.0027) +[2024-06-18 07:03:36,564][12883] Updated weights for policy 0, policy_version 82261 (0.0028) +[2024-06-18 07:03:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1347764224. Throughput: 0: 42817.2. Samples: 1347900760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:03:36,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 07:03:39,951][12883] Updated weights for policy 0, policy_version 82271 (0.0030) +[2024-06-18 07:03:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1347977216. Throughput: 0: 42746.6. Samples: 1348156720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:03:41,994][12645] Avg episode reward: [(0, '0.314')] +[2024-06-18 07:03:44,291][12883] Updated weights for policy 0, policy_version 82281 (0.0030) +[2024-06-18 07:03:46,994][12645] Fps is (10 sec: 47514.6, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1348239360. Throughput: 0: 42625.4. Samples: 1348280580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:03:46,994][12645] Avg episode reward: [(0, '0.222')] +[2024-06-18 07:03:47,636][12883] Updated weights for policy 0, policy_version 82291 (0.0027) +[2024-06-18 07:03:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1348386816. Throughput: 0: 42421.5. Samples: 1348529600. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:03:51,994][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 07:03:52,399][12883] Updated weights for policy 0, policy_version 82301 (0.0032) +[2024-06-18 07:03:55,409][12883] Updated weights for policy 0, policy_version 82311 (0.0037) +[2024-06-18 07:03:56,994][12645] Fps is (10 sec: 37682.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1348616192. Throughput: 0: 42407.0. Samples: 1348782100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:03:56,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 07:04:00,049][12883] Updated weights for policy 0, policy_version 82321 (0.0032) +[2024-06-18 07:04:01,994][12645] Fps is (10 sec: 47514.0, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 1348861952. Throughput: 0: 42530.6. Samples: 1348915780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:04:01,994][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 07:04:03,585][12883] Updated weights for policy 0, policy_version 82331 (0.0046) +[2024-06-18 07:04:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1349025792. Throughput: 0: 42388.9. Samples: 1349162620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:04:06,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 07:04:07,977][12883] Updated weights for policy 0, policy_version 82341 (0.0026) +[2024-06-18 07:04:11,155][12883] Updated weights for policy 0, policy_version 82351 (0.0029) +[2024-06-18 07:04:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 1349255168. Throughput: 0: 42249.2. Samples: 1349418340. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:04:11,995][12645] Avg episode reward: [(0, '0.486')] +[2024-06-18 07:04:15,764][12883] Updated weights for policy 0, policy_version 82361 (0.0036) +[2024-06-18 07:04:16,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1349468160. Throughput: 0: 42141.7. Samples: 1349544680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 07:04:16,994][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 07:04:18,757][12883] Updated weights for policy 0, policy_version 82371 (0.0042) +[2024-06-18 07:04:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42654.1). Total num frames: 1349681152. Throughput: 0: 42029.8. Samples: 1349792100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 07:04:21,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 07:04:23,411][12883] Updated weights for policy 0, policy_version 82381 (0.0031) +[2024-06-18 07:04:26,689][12883] Updated weights for policy 0, policy_version 82391 (0.0035) +[2024-06-18 07:04:26,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1349894144. Throughput: 0: 41929.0. Samples: 1350043520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 07:04:26,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 07:04:30,899][12883] Updated weights for policy 0, policy_version 82401 (0.0041) +[2024-06-18 07:04:31,996][12645] Fps is (10 sec: 40951.2, 60 sec: 41777.7, 300 sec: 42653.6). Total num frames: 1350090752. Throughput: 0: 42076.9. Samples: 1350174140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 07:04:31,997][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 07:04:34,626][12883] Updated weights for policy 0, policy_version 82411 (0.0033) +[2024-06-18 07:04:36,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1350320128. Throughput: 0: 42233.4. Samples: 1350430100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 07:04:36,994][12645] Avg episode reward: [(0, '0.183')] +[2024-06-18 07:04:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000082417_1350320128.pth... +[2024-06-18 07:04:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000081792_1340080128.pth +[2024-06-18 07:04:38,462][12883] Updated weights for policy 0, policy_version 82421 (0.0024) +[2024-06-18 07:04:40,769][12862] Signal inference workers to stop experience collection... (19550 times) +[2024-06-18 07:04:40,819][12883] InferenceWorker_p0-w0: stopping experience collection (19550 times) +[2024-06-18 07:04:40,883][12862] Signal inference workers to resume experience collection... (19550 times) +[2024-06-18 07:04:40,883][12883] InferenceWorker_p0-w0: resuming experience collection (19550 times) +[2024-06-18 07:04:41,994][12645] Fps is (10 sec: 44246.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1350533120. Throughput: 0: 42334.6. Samples: 1350687160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 07:04:41,994][12645] Avg episode reward: [(0, '0.147')] +[2024-06-18 07:04:42,544][12883] Updated weights for policy 0, policy_version 82431 (0.0031) +[2024-06-18 07:04:46,318][12883] Updated weights for policy 0, policy_version 82441 (0.0027) +[2024-06-18 07:04:46,994][12645] Fps is (10 sec: 42596.9, 60 sec: 41778.8, 300 sec: 42653.9). Total num frames: 1350746112. Throughput: 0: 42076.1. Samples: 1350809220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 07:04:46,995][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 07:04:50,341][12883] Updated weights for policy 0, policy_version 82451 (0.0036) +[2024-06-18 07:04:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1350959104. Throughput: 0: 42239.1. Samples: 1351063380. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 07:04:51,995][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 07:04:54,183][12883] Updated weights for policy 0, policy_version 82461 (0.0043) +[2024-06-18 07:04:56,994][12645] Fps is (10 sec: 40961.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1351155712. Throughput: 0: 42377.0. Samples: 1351325300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 07:04:56,994][12645] Avg episode reward: [(0, '0.314')] +[2024-06-18 07:04:58,119][12883] Updated weights for policy 0, policy_version 82471 (0.0040) +[2024-06-18 07:05:01,875][12883] Updated weights for policy 0, policy_version 82481 (0.0027) +[2024-06-18 07:05:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 1351368704. Throughput: 0: 42165.1. Samples: 1351442100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 07:05:01,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 07:05:05,648][12883] Updated weights for policy 0, policy_version 82491 (0.0042) +[2024-06-18 07:05:06,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1351614464. Throughput: 0: 42486.7. Samples: 1351704000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 07:05:06,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 07:05:09,965][12883] Updated weights for policy 0, policy_version 82501 (0.0036) +[2024-06-18 07:05:11,993][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.6, 300 sec: 42654.0). Total num frames: 1351794688. Throughput: 0: 42533.9. Samples: 1351957540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) +[2024-06-18 07:05:11,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 07:05:13,221][12883] Updated weights for policy 0, policy_version 82511 (0.0037) +[2024-06-18 07:05:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1352007680. Throughput: 0: 42310.5. Samples: 1352078020. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) +[2024-06-18 07:05:17,003][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 07:05:17,625][12883] Updated weights for policy 0, policy_version 82521 (0.0038) +[2024-06-18 07:05:20,702][12883] Updated weights for policy 0, policy_version 82531 (0.0028) +[2024-06-18 07:05:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1352237056. Throughput: 0: 42386.9. Samples: 1352337500. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) +[2024-06-18 07:05:21,994][12645] Avg episode reward: [(0, '0.217')] +[2024-06-18 07:05:25,178][12883] Updated weights for policy 0, policy_version 82541 (0.0035) +[2024-06-18 07:05:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1352417280. Throughput: 0: 42391.2. Samples: 1352594760. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) +[2024-06-18 07:05:26,994][12645] Avg episode reward: [(0, '0.164')] +[2024-06-18 07:05:28,407][12883] Updated weights for policy 0, policy_version 82551 (0.0031) +[2024-06-18 07:05:31,994][12645] Fps is (10 sec: 39320.4, 60 sec: 42326.8, 300 sec: 42542.9). Total num frames: 1352630272. Throughput: 0: 42447.8. Samples: 1352719360. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) +[2024-06-18 07:05:31,994][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 07:05:32,834][12883] Updated weights for policy 0, policy_version 82561 (0.0053) +[2024-06-18 07:05:36,118][12883] Updated weights for policy 0, policy_version 82571 (0.0032) +[2024-06-18 07:05:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1352859648. Throughput: 0: 42508.9. Samples: 1352976280. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) +[2024-06-18 07:05:36,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 07:05:40,425][12883] Updated weights for policy 0, policy_version 82581 (0.0043) +[2024-06-18 07:05:41,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1353072640. Throughput: 0: 42303.1. Samples: 1353228940. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) +[2024-06-18 07:05:41,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 07:05:44,047][12883] Updated weights for policy 0, policy_version 82591 (0.0037) +[2024-06-18 07:05:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.5, 300 sec: 42542.8). Total num frames: 1353269248. Throughput: 0: 42526.9. Samples: 1353355820. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) +[2024-06-18 07:05:46,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 07:05:48,241][12883] Updated weights for policy 0, policy_version 82601 (0.0027) +[2024-06-18 07:05:51,790][12883] Updated weights for policy 0, policy_version 82611 (0.0037) +[2024-06-18 07:05:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1353498624. Throughput: 0: 42387.5. Samples: 1353611440. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) +[2024-06-18 07:05:51,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 07:05:55,816][12883] Updated weights for policy 0, policy_version 82621 (0.0034) +[2024-06-18 07:05:56,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 1353695232. Throughput: 0: 42249.2. Samples: 1353858760. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) +[2024-06-18 07:05:56,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 07:05:58,935][12862] Signal inference workers to stop experience collection... (19600 times) +[2024-06-18 07:05:58,935][12862] Signal inference workers to resume experience collection... 
(19600 times) +[2024-06-18 07:05:58,945][12883] InferenceWorker_p0-w0: stopping experience collection (19600 times) +[2024-06-18 07:05:58,945][12883] InferenceWorker_p0-w0: resuming experience collection (19600 times) +[2024-06-18 07:05:59,590][12883] Updated weights for policy 0, policy_version 82631 (0.0025) +[2024-06-18 07:06:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1353924608. Throughput: 0: 42376.5. Samples: 1353984960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) +[2024-06-18 07:06:01,994][12645] Avg episode reward: [(0, '0.159')] +[2024-06-18 07:06:03,533][12883] Updated weights for policy 0, policy_version 82641 (0.0041) +[2024-06-18 07:06:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 1354121216. Throughput: 0: 42384.4. Samples: 1354244800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) +[2024-06-18 07:06:06,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 07:06:07,294][12883] Updated weights for policy 0, policy_version 82651 (0.0034) +[2024-06-18 07:06:11,263][12883] Updated weights for policy 0, policy_version 82661 (0.0022) +[2024-06-18 07:06:11,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.1, 300 sec: 42598.4). Total num frames: 1354350592. Throughput: 0: 42183.0. Samples: 1354493000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) +[2024-06-18 07:06:11,995][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 07:06:15,304][12883] Updated weights for policy 0, policy_version 82671 (0.0039) +[2024-06-18 07:06:16,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1354563584. Throughput: 0: 42304.2. Samples: 1354623040. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) +[2024-06-18 07:06:16,995][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 07:06:18,965][12883] Updated weights for policy 0, policy_version 82681 (0.0041) +[2024-06-18 07:06:21,994][12645] Fps is (10 sec: 39322.6, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 1354743808. Throughput: 0: 42221.9. Samples: 1354876260. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) +[2024-06-18 07:06:21,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 07:06:23,033][12883] Updated weights for policy 0, policy_version 82691 (0.0037) +[2024-06-18 07:06:26,787][12883] Updated weights for policy 0, policy_version 82701 (0.0032) +[2024-06-18 07:06:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1354973184. Throughput: 0: 42283.6. Samples: 1355131700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) +[2024-06-18 07:06:27,003][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 07:06:30,821][12883] Updated weights for policy 0, policy_version 82711 (0.0029) +[2024-06-18 07:06:31,994][12645] Fps is (10 sec: 47512.9, 60 sec: 43144.6, 300 sec: 42542.8). Total num frames: 1355218944. Throughput: 0: 42386.3. Samples: 1355263200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) +[2024-06-18 07:06:31,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 07:06:34,403][12883] Updated weights for policy 0, policy_version 82721 (0.0040) +[2024-06-18 07:06:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1355382784. Throughput: 0: 42304.9. Samples: 1355515160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) +[2024-06-18 07:06:36,999][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 07:06:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000082726_1355382784.pth... 
+[2024-06-18 07:06:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000082106_1345224704.pth +[2024-06-18 07:06:38,487][12883] Updated weights for policy 0, policy_version 82731 (0.0034) +[2024-06-18 07:06:41,929][12883] Updated weights for policy 0, policy_version 82741 (0.0031) +[2024-06-18 07:06:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1355628544. Throughput: 0: 42512.2. Samples: 1355771820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) +[2024-06-18 07:06:41,994][12645] Avg episode reward: [(0, '0.070')] +[2024-06-18 07:06:46,168][12883] Updated weights for policy 0, policy_version 82751 (0.0037) +[2024-06-18 07:06:46,994][12645] Fps is (10 sec: 47514.4, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 1355857920. Throughput: 0: 42570.3. Samples: 1355900620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) +[2024-06-18 07:06:46,994][12645] Avg episode reward: [(0, '0.324')] +[2024-06-18 07:06:49,673][12883] Updated weights for policy 0, policy_version 82761 (0.0035) +[2024-06-18 07:06:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42487.6). Total num frames: 1356038144. Throughput: 0: 42505.3. Samples: 1356157540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:06:51,994][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 07:06:53,776][12883] Updated weights for policy 0, policy_version 82772 (0.0034) +[2024-06-18 07:06:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1356267520. Throughput: 0: 42609.9. Samples: 1356410440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:06:56,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 07:06:58,415][12883] Updated weights for policy 0, policy_version 82782 (0.0029) +[2024-06-18 07:07:01,568][12883] Updated weights for policy 0, policy_version 82792 (0.0034) +[2024-06-18 07:07:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1356480512. Throughput: 0: 42581.5. Samples: 1356539200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:07:01,994][12645] Avg episode reward: [(0, '0.526')] +[2024-06-18 07:07:06,050][12883] Updated weights for policy 0, policy_version 82802 (0.0034) +[2024-06-18 07:07:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 1356677120. Throughput: 0: 42659.7. Samples: 1356795960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:07:06,994][12645] Avg episode reward: [(0, '0.183')] +[2024-06-18 07:07:09,123][12883] Updated weights for policy 0, policy_version 82812 (0.0025) +[2024-06-18 07:07:11,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42596.9, 300 sec: 42542.5). Total num frames: 1356906496. Throughput: 0: 42673.0. Samples: 1357052080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:07:11,996][12645] Avg episode reward: [(0, '0.212')] +[2024-06-18 07:07:13,603][12883] Updated weights for policy 0, policy_version 82822 (0.0024) +[2024-06-18 07:07:16,797][12883] Updated weights for policy 0, policy_version 82832 (0.0034) +[2024-06-18 07:07:16,997][12645] Fps is (10 sec: 45860.8, 60 sec: 42869.1, 300 sec: 42542.4). Total num frames: 1357135872. Throughput: 0: 42664.4. Samples: 1357183240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:07:16,998][12645] Avg episode reward: [(0, '0.431')] +[2024-06-18 07:07:21,087][12883] Updated weights for policy 0, policy_version 82842 (0.0039) +[2024-06-18 07:07:21,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1357316096. Throughput: 0: 42726.4. Samples: 1357437840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:07:21,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 07:07:24,926][12883] Updated weights for policy 0, policy_version 82852 (0.0025) +[2024-06-18 07:07:26,994][12645] Fps is (10 sec: 40974.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1357545472. Throughput: 0: 42583.8. Samples: 1357688080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:07:26,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 07:07:29,266][12883] Updated weights for policy 0, policy_version 82862 (0.0042) +[2024-06-18 07:07:29,284][12862] Signal inference workers to stop experience collection... (19650 times) +[2024-06-18 07:07:29,284][12862] Signal inference workers to resume experience collection... (19650 times) +[2024-06-18 07:07:29,302][12883] InferenceWorker_p0-w0: stopping experience collection (19650 times) +[2024-06-18 07:07:29,303][12883] InferenceWorker_p0-w0: resuming experience collection (19650 times) +[2024-06-18 07:07:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1357742080. Throughput: 0: 42688.4. Samples: 1357821600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:07:31,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 07:07:32,416][12883] Updated weights for policy 0, policy_version 82872 (0.0029) +[2024-06-18 07:07:36,971][12883] Updated weights for policy 0, policy_version 82882 (0.0031) +[2024-06-18 07:07:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1357938688. 
Throughput: 0: 42508.1. Samples: 1358070400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:07:36,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 07:07:39,976][12883] Updated weights for policy 0, policy_version 82892 (0.0028) +[2024-06-18 07:07:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1358184448. Throughput: 0: 42497.8. Samples: 1358322840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 07:07:41,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 07:07:44,473][12883] Updated weights for policy 0, policy_version 82902 (0.0035) +[2024-06-18 07:07:46,994][12645] Fps is (10 sec: 42597.7, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 1358364672. Throughput: 0: 42659.0. Samples: 1358458860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 07:07:46,994][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 07:07:47,753][12883] Updated weights for policy 0, policy_version 82912 (0.0034) +[2024-06-18 07:07:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1358594048. Throughput: 0: 42345.4. Samples: 1358701500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 07:07:51,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 07:07:51,997][12883] Updated weights for policy 0, policy_version 82922 (0.0043) +[2024-06-18 07:07:55,928][12883] Updated weights for policy 0, policy_version 82932 (0.0033) +[2024-06-18 07:07:56,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1358807040. Throughput: 0: 42212.6. Samples: 1358951560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 07:07:56,995][12645] Avg episode reward: [(0, '0.182')] +[2024-06-18 07:07:59,725][12883] Updated weights for policy 0, policy_version 82942 (0.0034) +[2024-06-18 07:08:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1359003648. 
Throughput: 0: 42285.8. Samples: 1359085960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 07:08:01,994][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 07:08:03,634][12883] Updated weights for policy 0, policy_version 82952 (0.0029) +[2024-06-18 07:08:06,994][12645] Fps is (10 sec: 40961.3, 60 sec: 42325.6, 300 sec: 42431.8). Total num frames: 1359216640. Throughput: 0: 42254.7. Samples: 1359339300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 07:08:06,994][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 07:08:07,280][12883] Updated weights for policy 0, policy_version 82962 (0.0033) +[2024-06-18 07:08:11,216][12883] Updated weights for policy 0, policy_version 82972 (0.0036) +[2024-06-18 07:08:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42053.8, 300 sec: 42376.2). Total num frames: 1359429632. Throughput: 0: 42395.9. Samples: 1359595900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 07:08:11,994][12645] Avg episode reward: [(0, '0.324')] +[2024-06-18 07:08:15,035][12883] Updated weights for policy 0, policy_version 82982 (0.0038) +[2024-06-18 07:08:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41508.5, 300 sec: 42376.3). Total num frames: 1359626240. Throughput: 0: 42167.2. Samples: 1359719120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 07:08:16,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 07:08:18,819][12883] Updated weights for policy 0, policy_version 82992 (0.0025) +[2024-06-18 07:08:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1359872000. Throughput: 0: 42328.7. Samples: 1359975200. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 07:08:21,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 07:08:23,099][12883] Updated weights for policy 0, policy_version 83002 (0.0035) +[2024-06-18 07:08:26,271][12883] Updated weights for policy 0, policy_version 83012 (0.0027) +[2024-06-18 07:08:26,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1360068608. Throughput: 0: 42499.6. Samples: 1360235320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 07:08:26,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 07:08:30,617][12883] Updated weights for policy 0, policy_version 83022 (0.0027) +[2024-06-18 07:08:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1360281600. Throughput: 0: 42264.1. Samples: 1360360740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 07:08:31,994][12645] Avg episode reward: [(0, '0.571')] +[2024-06-18 07:08:34,373][12883] Updated weights for policy 0, policy_version 83032 (0.0034) +[2024-06-18 07:08:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 1360510976. Throughput: 0: 42702.3. Samples: 1360623100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 07:08:36,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 07:08:37,104][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083040_1360527360.pth... +[2024-06-18 07:08:37,155][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000082417_1350320128.pth +[2024-06-18 07:08:38,046][12883] Updated weights for policy 0, policy_version 83042 (0.0043) +[2024-06-18 07:08:41,995][12883] Updated weights for policy 0, policy_version 83052 (0.0042) +[2024-06-18 07:08:41,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42323.8, 300 sec: 42320.4). Total num frames: 1360723968. Throughput: 0: 42718.1. Samples: 1360873960. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 07:08:41,996][12645] Avg episode reward: [(0, '0.571')] +[2024-06-18 07:08:45,633][12883] Updated weights for policy 0, policy_version 83062 (0.0045) +[2024-06-18 07:08:46,792][12862] Signal inference workers to stop experience collection... (19700 times) +[2024-06-18 07:08:46,796][12862] Signal inference workers to resume experience collection... (19700 times) +[2024-06-18 07:08:46,816][12883] InferenceWorker_p0-w0: stopping experience collection (19700 times) +[2024-06-18 07:08:46,844][12883] InferenceWorker_p0-w0: resuming experience collection (19700 times) +[2024-06-18 07:08:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1360936960. Throughput: 0: 42642.2. Samples: 1361004860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 07:08:46,994][12645] Avg episode reward: [(0, '0.746')] +[2024-06-18 07:08:47,002][12862] Saving new best policy, reward=0.746! +[2024-06-18 07:08:49,838][12883] Updated weights for policy 0, policy_version 83072 (0.0043) +[2024-06-18 07:08:51,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1361149952. Throughput: 0: 42735.4. Samples: 1361262400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 07:08:51,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 07:08:53,880][12883] Updated weights for policy 0, policy_version 83082 (0.0029) +[2024-06-18 07:08:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 1361346560. Throughput: 0: 42585.4. Samples: 1361512240. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 07:08:56,994][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 07:08:57,447][12883] Updated weights for policy 0, policy_version 83092 (0.0045) +[2024-06-18 07:09:01,453][12883] Updated weights for policy 0, policy_version 83102 (0.0038) +[2024-06-18 07:09:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1361559552. Throughput: 0: 42597.8. Samples: 1361636020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 07:09:01,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 07:09:05,482][12883] Updated weights for policy 0, policy_version 83112 (0.0034) +[2024-06-18 07:09:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1361772544. Throughput: 0: 42722.8. Samples: 1361897720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 07:09:06,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 07:09:09,028][12883] Updated weights for policy 0, policy_version 83122 (0.0033) +[2024-06-18 07:09:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1361985536. Throughput: 0: 42485.0. Samples: 1362147140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 07:09:11,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 07:09:13,093][12883] Updated weights for policy 0, policy_version 83132 (0.0045) +[2024-06-18 07:09:16,650][12883] Updated weights for policy 0, policy_version 83142 (0.0028) +[2024-06-18 07:09:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 1362214912. Throughput: 0: 42558.2. Samples: 1362275860. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 07:09:16,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 07:09:20,855][12883] Updated weights for policy 0, policy_version 83152 (0.0033) +[2024-06-18 07:09:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1362411520. Throughput: 0: 42459.6. Samples: 1362533780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 07:09:21,996][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 07:09:24,205][12883] Updated weights for policy 0, policy_version 83162 (0.0037) +[2024-06-18 07:09:26,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 1362624512. Throughput: 0: 42358.9. Samples: 1362780020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 07:09:26,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 07:09:28,807][12883] Updated weights for policy 0, policy_version 83172 (0.0042) +[2024-06-18 07:09:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1362837504. Throughput: 0: 42312.9. Samples: 1362908940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 07:09:31,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 07:09:32,259][12883] Updated weights for policy 0, policy_version 83182 (0.0039) +[2024-06-18 07:09:36,497][12883] Updated weights for policy 0, policy_version 83192 (0.0041) +[2024-06-18 07:09:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 1363034112. Throughput: 0: 42320.1. Samples: 1363166800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 07:09:36,994][12645] Avg episode reward: [(0, '0.615')] +[2024-06-18 07:09:40,057][12883] Updated weights for policy 0, policy_version 83202 (0.0032) +[2024-06-18 07:09:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42600.0, 300 sec: 42487.4). Total num frames: 1363279872. Throughput: 0: 42308.5. Samples: 1363416120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 07:09:41,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 07:09:44,285][12883] Updated weights for policy 0, policy_version 83212 (0.0025) +[2024-06-18 07:09:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1363476480. Throughput: 0: 42511.9. Samples: 1363549060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 07:09:46,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 07:09:47,671][12883] Updated weights for policy 0, policy_version 83222 (0.0031) +[2024-06-18 07:09:51,963][12883] Updated weights for policy 0, policy_version 83232 (0.0030) +[2024-06-18 07:09:51,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1363673088. Throughput: 0: 42383.4. Samples: 1363804980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 07:09:51,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 07:09:55,145][12883] Updated weights for policy 0, policy_version 83242 (0.0044) +[2024-06-18 07:09:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1363902464. Throughput: 0: 42402.3. Samples: 1364055240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 07:09:56,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 07:09:59,835][12883] Updated weights for policy 0, policy_version 83252 (0.0023) +[2024-06-18 07:10:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1364115456. Throughput: 0: 42659.5. Samples: 1364195540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 07:10:01,996][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 07:10:02,688][12883] Updated weights for policy 0, policy_version 83262 (0.0031) +[2024-06-18 07:10:06,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42325.2, 300 sec: 42431.7). Total num frames: 1364312064. Throughput: 0: 42451.5. Samples: 1364444100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 07:10:06,994][12645] Avg episode reward: [(0, '0.630')] +[2024-06-18 07:10:07,433][12883] Updated weights for policy 0, policy_version 83272 (0.0033) +[2024-06-18 07:10:10,454][12883] Updated weights for policy 0, policy_version 83282 (0.0028) +[2024-06-18 07:10:12,004][12645] Fps is (10 sec: 44191.3, 60 sec: 42864.1, 300 sec: 42541.4). Total num frames: 1364557824. Throughput: 0: 42524.1. Samples: 1364694040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) +[2024-06-18 07:10:12,009][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 07:10:15,020][12883] Updated weights for policy 0, policy_version 83292 (0.0039) +[2024-06-18 07:10:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1364754432. Throughput: 0: 42718.6. Samples: 1364831280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) +[2024-06-18 07:10:16,998][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 07:10:18,003][12883] Updated weights for policy 0, policy_version 83302 (0.0032) +[2024-06-18 07:10:21,994][12645] Fps is (10 sec: 39362.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1364951040. Throughput: 0: 42570.2. Samples: 1365082460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) +[2024-06-18 07:10:21,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 07:10:22,596][12883] Updated weights for policy 0, policy_version 83312 (0.0036) +[2024-06-18 07:10:25,693][12883] Updated weights for policy 0, policy_version 83322 (0.0029) +[2024-06-18 07:10:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1365196800. Throughput: 0: 42603.0. Samples: 1365333260. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) +[2024-06-18 07:10:26,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 07:10:30,524][12883] Updated weights for policy 0, policy_version 83332 (0.0039) +[2024-06-18 07:10:30,995][12862] Signal inference workers to stop experience collection... (19750 times) +[2024-06-18 07:10:30,996][12862] Signal inference workers to resume experience collection... (19750 times) +[2024-06-18 07:10:31,020][12883] InferenceWorker_p0-w0: stopping experience collection (19750 times) +[2024-06-18 07:10:31,020][12883] InferenceWorker_p0-w0: resuming experience collection (19750 times) +[2024-06-18 07:10:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1365393408. Throughput: 0: 42724.1. Samples: 1365471640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) +[2024-06-18 07:10:31,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 07:10:33,232][12883] Updated weights for policy 0, policy_version 83342 (0.0040) +[2024-06-18 07:10:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1365606400. Throughput: 0: 42541.8. Samples: 1365719360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) +[2024-06-18 07:10:36,994][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 07:10:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083350_1365606400.pth... +[2024-06-18 07:10:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000082726_1355382784.pth +[2024-06-18 07:10:38,398][12883] Updated weights for policy 0, policy_version 83352 (0.0042) +[2024-06-18 07:10:41,283][12883] Updated weights for policy 0, policy_version 83362 (0.0035) +[2024-06-18 07:10:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1365835776. Throughput: 0: 42557.2. Samples: 1365970320. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) +[2024-06-18 07:10:41,994][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 07:10:46,141][12883] Updated weights for policy 0, policy_version 83372 (0.0023) +[2024-06-18 07:10:46,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1365999616. Throughput: 0: 42334.7. Samples: 1366100600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) +[2024-06-18 07:10:46,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 07:10:48,877][12883] Updated weights for policy 0, policy_version 83382 (0.0041) +[2024-06-18 07:10:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1366261760. Throughput: 0: 42530.7. Samples: 1366357980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) +[2024-06-18 07:10:51,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 07:10:53,886][12883] Updated weights for policy 0, policy_version 83392 (0.0036) +[2024-06-18 07:10:56,630][12883] Updated weights for policy 0, policy_version 83402 (0.0042) +[2024-06-18 07:10:56,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1366474752. Throughput: 0: 42642.7. Samples: 1366612520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) +[2024-06-18 07:10:56,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 07:11:01,514][12883] Updated weights for policy 0, policy_version 83412 (0.0033) +[2024-06-18 07:11:01,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1366638592. Throughput: 0: 42381.0. Samples: 1366738420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) +[2024-06-18 07:11:01,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 07:11:04,334][12883] Updated weights for policy 0, policy_version 83422 (0.0028) +[2024-06-18 07:11:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1366900736. Throughput: 0: 42518.1. Samples: 1366995780. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 07:11:06,995][12645] Avg episode reward: [(0, '0.102')] +[2024-06-18 07:11:09,285][12883] Updated weights for policy 0, policy_version 83432 (0.0041) +[2024-06-18 07:11:11,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42332.6, 300 sec: 42487.3). Total num frames: 1367097344. Throughput: 0: 42510.3. Samples: 1367246220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 07:11:11,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 07:11:12,217][12883] Updated weights for policy 0, policy_version 83442 (0.0032) +[2024-06-18 07:11:16,820][12883] Updated weights for policy 0, policy_version 83452 (0.0033) +[2024-06-18 07:11:16,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1367277568. Throughput: 0: 42258.6. Samples: 1367373280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 07:11:16,994][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 07:11:20,098][12883] Updated weights for policy 0, policy_version 83462 (0.0028) +[2024-06-18 07:11:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1367539712. Throughput: 0: 42518.7. Samples: 1367632700. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 07:11:21,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 07:11:24,375][12883] Updated weights for policy 0, policy_version 83472 (0.0035) +[2024-06-18 07:11:26,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1367736320. Throughput: 0: 42699.6. Samples: 1367891800. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 07:11:26,994][12645] Avg episode reward: [(0, '0.276')] +[2024-06-18 07:11:27,658][12883] Updated weights for policy 0, policy_version 83482 (0.0037) +[2024-06-18 07:11:31,871][12883] Updated weights for policy 0, policy_version 83492 (0.0037) +[2024-06-18 07:11:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1367932928. Throughput: 0: 42505.3. Samples: 1368013340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 07:11:31,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 07:11:35,411][12883] Updated weights for policy 0, policy_version 83502 (0.0041) +[2024-06-18 07:11:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1368178688. Throughput: 0: 42621.7. Samples: 1368275960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 07:11:36,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 07:11:40,026][12883] Updated weights for policy 0, policy_version 83512 (0.0033) +[2024-06-18 07:11:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1368358912. Throughput: 0: 42588.4. Samples: 1368529000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 07:11:41,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 07:11:43,098][12883] Updated weights for policy 0, policy_version 83522 (0.0032) +[2024-06-18 07:11:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1368571904. Throughput: 0: 42624.3. Samples: 1368656520. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 07:11:46,994][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 07:11:47,697][12883] Updated weights for policy 0, policy_version 83532 (0.0038) +[2024-06-18 07:11:47,863][12862] Signal inference workers to stop experience collection... 
(19800 times) +[2024-06-18 07:11:47,863][12862] Signal inference workers to resume experience collection... (19800 times) +[2024-06-18 07:11:47,897][12883] InferenceWorker_p0-w0: stopping experience collection (19800 times) +[2024-06-18 07:11:47,897][12883] InferenceWorker_p0-w0: resuming experience collection (19800 times) +[2024-06-18 07:11:50,534][12883] Updated weights for policy 0, policy_version 83542 (0.0029) +[2024-06-18 07:11:51,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1368817664. Throughput: 0: 42693.1. Samples: 1368916960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 07:11:51,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 07:11:55,207][12883] Updated weights for policy 0, policy_version 83552 (0.0029) +[2024-06-18 07:11:56,996][12645] Fps is (10 sec: 44228.8, 60 sec: 42324.0, 300 sec: 42487.0). Total num frames: 1369014272. Throughput: 0: 42904.9. Samples: 1369177020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:11:56,996][12645] Avg episode reward: [(0, '0.140')] +[2024-06-18 07:11:58,417][12883] Updated weights for policy 0, policy_version 83562 (0.0041) +[2024-06-18 07:12:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1369227264. Throughput: 0: 42788.4. Samples: 1369298760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:12:01,994][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 07:12:02,698][12883] Updated weights for policy 0, policy_version 83572 (0.0032) +[2024-06-18 07:12:06,087][12883] Updated weights for policy 0, policy_version 83582 (0.0032) +[2024-06-18 07:12:06,994][12645] Fps is (10 sec: 45883.2, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 1369473024. Throughput: 0: 42840.8. Samples: 1369560540. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:12:06,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 07:12:10,276][12883] Updated weights for policy 0, policy_version 83592 (0.0038) +[2024-06-18 07:12:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42376.7). Total num frames: 1369636864. Throughput: 0: 42723.1. Samples: 1369814340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:12:11,994][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 07:12:13,798][12883] Updated weights for policy 0, policy_version 83602 (0.0020) +[2024-06-18 07:12:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 1369866240. Throughput: 0: 42797.2. Samples: 1369939220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:12:16,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 07:12:18,233][12883] Updated weights for policy 0, policy_version 83612 (0.0039) +[2024-06-18 07:12:21,457][12883] Updated weights for policy 0, policy_version 83622 (0.0038) +[2024-06-18 07:12:21,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1370095616. Throughput: 0: 42603.2. Samples: 1370193100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:12:21,994][12645] Avg episode reward: [(0, '0.186')] +[2024-06-18 07:12:25,765][12883] Updated weights for policy 0, policy_version 83632 (0.0031) +[2024-06-18 07:12:26,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1370275840. Throughput: 0: 42843.6. Samples: 1370456960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:12:26,994][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 07:12:28,899][12883] Updated weights for policy 0, policy_version 83642 (0.0038) +[2024-06-18 07:12:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1370505216. Throughput: 0: 42745.3. Samples: 1370580060. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:12:32,000][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 07:12:33,263][12883] Updated weights for policy 0, policy_version 83652 (0.0034) +[2024-06-18 07:12:36,670][12883] Updated weights for policy 0, policy_version 83662 (0.0027) +[2024-06-18 07:12:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1370734592. Throughput: 0: 42652.0. Samples: 1370836300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:12:36,994][12645] Avg episode reward: [(0, '0.642')] +[2024-06-18 07:12:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083663_1370734592.pth... +[2024-06-18 07:12:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083040_1360527360.pth +[2024-06-18 07:12:40,995][12883] Updated weights for policy 0, policy_version 83672 (0.0025) +[2024-06-18 07:12:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1370931200. Throughput: 0: 42585.2. Samples: 1371093280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:12:41,994][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 07:12:44,425][12883] Updated weights for policy 0, policy_version 83682 (0.0034) +[2024-06-18 07:12:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1371144192. Throughput: 0: 42692.4. Samples: 1371219920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:12:46,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 07:12:48,621][12883] Updated weights for policy 0, policy_version 83692 (0.0040) +[2024-06-18 07:12:51,831][12883] Updated weights for policy 0, policy_version 83702 (0.0026) +[2024-06-18 07:12:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1371373568. Throughput: 0: 42635.6. Samples: 1371479140. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:12:51,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 07:12:56,070][12883] Updated weights for policy 0, policy_version 83712 (0.0036) +[2024-06-18 07:12:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42872.8, 300 sec: 42653.9). Total num frames: 1371586560. Throughput: 0: 42911.6. Samples: 1371745360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:12:56,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 07:12:59,694][12883] Updated weights for policy 0, policy_version 83722 (0.0024) +[2024-06-18 07:13:01,996][12645] Fps is (10 sec: 42589.6, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1371799552. Throughput: 0: 42935.8. Samples: 1371871420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:13:01,996][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 07:13:03,963][12883] Updated weights for policy 0, policy_version 83732 (0.0037) +[2024-06-18 07:13:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1372012544. Throughput: 0: 42998.2. Samples: 1372128020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:13:06,994][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 07:13:07,200][12883] Updated weights for policy 0, policy_version 83742 (0.0046) +[2024-06-18 07:13:07,675][12862] Signal inference workers to stop experience collection... (19850 times) +[2024-06-18 07:13:07,680][12862] Signal inference workers to resume experience collection... (19850 times) +[2024-06-18 07:13:07,711][12883] InferenceWorker_p0-w0: stopping experience collection (19850 times) +[2024-06-18 07:13:07,711][12883] InferenceWorker_p0-w0: resuming experience collection (19850 times) +[2024-06-18 07:13:11,860][12883] Updated weights for policy 0, policy_version 83752 (0.0031) +[2024-06-18 07:13:11,994][12645] Fps is (10 sec: 39330.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1372192768. 
Throughput: 0: 42969.4. Samples: 1372390580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:13:11,994][12645] Avg episode reward: [(0, '0.250')] +[2024-06-18 07:13:14,855][12883] Updated weights for policy 0, policy_version 83762 (0.0029) +[2024-06-18 07:13:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1372454912. Throughput: 0: 42844.9. Samples: 1372508080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:13:16,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 07:13:19,397][12883] Updated weights for policy 0, policy_version 83772 (0.0037) +[2024-06-18 07:13:21,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1372651520. Throughput: 0: 42883.6. Samples: 1372766060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:13:21,994][12645] Avg episode reward: [(0, '0.131')] +[2024-06-18 07:13:22,296][12883] Updated weights for policy 0, policy_version 83782 (0.0028) +[2024-06-18 07:13:26,846][12883] Updated weights for policy 0, policy_version 83792 (0.0044) +[2024-06-18 07:13:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1372848128. Throughput: 0: 42892.1. Samples: 1373023420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:13:26,994][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 07:13:30,067][12883] Updated weights for policy 0, policy_version 83802 (0.0023) +[2024-06-18 07:13:31,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1373093888. Throughput: 0: 42906.6. Samples: 1373150720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:13:31,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 07:13:34,310][12883] Updated weights for policy 0, policy_version 83812 (0.0039) +[2024-06-18 07:13:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 1373274112. 
Throughput: 0: 42919.7. Samples: 1373410520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 07:13:36,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 07:13:37,775][12883] Updated weights for policy 0, policy_version 83822 (0.0045) +[2024-06-18 07:13:41,850][12883] Updated weights for policy 0, policy_version 83832 (0.0031) +[2024-06-18 07:13:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1373519872. Throughput: 0: 42704.0. Samples: 1373667040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 07:13:41,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 07:13:45,374][12883] Updated weights for policy 0, policy_version 83842 (0.0028) +[2024-06-18 07:13:46,994][12645] Fps is (10 sec: 47513.6, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 1373749248. Throughput: 0: 42771.0. Samples: 1373796020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 07:13:46,994][12645] Avg episode reward: [(0, '0.045')] +[2024-06-18 07:13:49,443][12883] Updated weights for policy 0, policy_version 83852 (0.0052) +[2024-06-18 07:13:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1373929472. Throughput: 0: 42855.2. Samples: 1374056500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 07:13:51,994][12645] Avg episode reward: [(0, '0.093')] +[2024-06-18 07:13:53,045][12883] Updated weights for policy 0, policy_version 83862 (0.0032) +[2024-06-18 07:13:56,961][12883] Updated weights for policy 0, policy_version 83872 (0.0034) +[2024-06-18 07:13:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1374158848. Throughput: 0: 42770.2. Samples: 1374315240. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 07:13:56,994][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 07:14:00,907][12883] Updated weights for policy 0, policy_version 83882 (0.0030) +[2024-06-18 07:14:01,998][12645] Fps is (10 sec: 45854.2, 60 sec: 43142.8, 300 sec: 42764.4). Total num frames: 1374388224. Throughput: 0: 42913.9. Samples: 1374439400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 07:14:01,999][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 07:14:04,666][12883] Updated weights for policy 0, policy_version 83892 (0.0046) +[2024-06-18 07:14:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1374568448. Throughput: 0: 42964.0. Samples: 1374699440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 07:14:06,994][12645] Avg episode reward: [(0, '0.562')] +[2024-06-18 07:14:08,379][12883] Updated weights for policy 0, policy_version 83902 (0.0032) +[2024-06-18 07:14:11,994][12645] Fps is (10 sec: 40978.9, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 1374797824. Throughput: 0: 42861.3. Samples: 1374952180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 07:14:11,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 07:14:12,301][12883] Updated weights for policy 0, policy_version 83912 (0.0031) +[2024-06-18 07:14:16,237][12883] Updated weights for policy 0, policy_version 83922 (0.0034) +[2024-06-18 07:14:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1375010816. Throughput: 0: 42942.2. Samples: 1375083120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 07:14:16,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 07:14:20,062][12883] Updated weights for policy 0, policy_version 83932 (0.0036) +[2024-06-18 07:14:21,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1375223808. Throughput: 0: 42844.8. Samples: 1375338540. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 07:14:21,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 07:14:23,708][12883] Updated weights for policy 0, policy_version 83942 (0.0035) +[2024-06-18 07:14:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1375436800. Throughput: 0: 42896.9. Samples: 1375597400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:14:27,002][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 07:14:27,613][12883] Updated weights for policy 0, policy_version 83952 (0.0040) +[2024-06-18 07:14:31,179][12883] Updated weights for policy 0, policy_version 83962 (0.0054) +[2024-06-18 07:14:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1375649792. Throughput: 0: 42797.7. Samples: 1375721920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:14:31,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 07:14:35,194][12883] Updated weights for policy 0, policy_version 83972 (0.0035) +[2024-06-18 07:14:35,757][12862] Signal inference workers to stop experience collection... (19900 times) +[2024-06-18 07:14:35,758][12862] Signal inference workers to resume experience collection... (19900 times) +[2024-06-18 07:14:35,789][12883] InferenceWorker_p0-w0: stopping experience collection (19900 times) +[2024-06-18 07:14:35,789][12883] InferenceWorker_p0-w0: resuming experience collection (19900 times) +[2024-06-18 07:14:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1375879168. Throughput: 0: 42756.9. Samples: 1375980560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:14:36,994][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 07:14:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083977_1375879168.pth... 
+[2024-06-18 07:14:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083350_1365606400.pth +[2024-06-18 07:14:39,239][12883] Updated weights for policy 0, policy_version 83982 (0.0043) +[2024-06-18 07:14:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1376075776. Throughput: 0: 42669.3. Samples: 1376235360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:14:41,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 07:14:43,155][12883] Updated weights for policy 0, policy_version 83992 (0.0034) +[2024-06-18 07:14:46,737][12883] Updated weights for policy 0, policy_version 84002 (0.0052) +[2024-06-18 07:14:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1376288768. Throughput: 0: 42681.2. Samples: 1376359860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:14:46,994][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 07:14:50,676][12883] Updated weights for policy 0, policy_version 84012 (0.0029) +[2024-06-18 07:14:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1376501760. Throughput: 0: 42644.9. Samples: 1376618460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:14:51,994][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 07:14:54,704][12883] Updated weights for policy 0, policy_version 84022 (0.0036) +[2024-06-18 07:14:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1376714752. Throughput: 0: 42805.7. Samples: 1376878440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:14:56,994][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 07:14:58,425][12883] Updated weights for policy 0, policy_version 84032 (0.0041) +[2024-06-18 07:15:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42328.5, 300 sec: 42765.0). Total num frames: 1376927744. Throughput: 0: 42630.7. 
Samples: 1377001500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:15:01,994][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 07:15:02,323][12883] Updated weights for policy 0, policy_version 84042 (0.0023) +[2024-06-18 07:15:06,300][12883] Updated weights for policy 0, policy_version 84052 (0.0029) +[2024-06-18 07:15:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42711.0). Total num frames: 1377157120. Throughput: 0: 42629.8. Samples: 1377256880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:15:06,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 07:15:10,246][12883] Updated weights for policy 0, policy_version 84062 (0.0040) +[2024-06-18 07:15:11,999][12645] Fps is (10 sec: 42574.5, 60 sec: 42594.3, 300 sec: 42708.7). Total num frames: 1377353728. Throughput: 0: 42588.0. Samples: 1377514100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:15:12,000][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 07:15:13,914][12883] Updated weights for policy 0, policy_version 84072 (0.0033) +[2024-06-18 07:15:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1377566720. Throughput: 0: 42573.0. Samples: 1377637700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:15:16,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 07:15:17,890][12883] Updated weights for policy 0, policy_version 84082 (0.0037) +[2024-06-18 07:15:21,359][12883] Updated weights for policy 0, policy_version 84092 (0.0028) +[2024-06-18 07:15:21,994][12645] Fps is (10 sec: 42622.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1377779712. Throughput: 0: 42631.2. Samples: 1377898960. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 07:15:21,994][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 07:15:25,526][12883] Updated weights for policy 0, policy_version 84102 (0.0033) +[2024-06-18 07:15:26,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 1377976320. Throughput: 0: 42602.3. Samples: 1378152560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 07:15:26,997][12645] Avg episode reward: [(0, '0.107')] +[2024-06-18 07:15:29,321][12883] Updated weights for policy 0, policy_version 84112 (0.0028) +[2024-06-18 07:15:31,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1378222080. Throughput: 0: 42591.0. Samples: 1378276460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 07:15:31,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 07:15:33,109][12883] Updated weights for policy 0, policy_version 84122 (0.0038) +[2024-06-18 07:15:36,822][12883] Updated weights for policy 0, policy_version 84132 (0.0031) +[2024-06-18 07:15:36,994][12645] Fps is (10 sec: 44247.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1378418688. Throughput: 0: 42645.3. Samples: 1378537500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 07:15:36,994][12645] Avg episode reward: [(0, '0.517')] +[2024-06-18 07:15:40,730][12883] Updated weights for policy 0, policy_version 84142 (0.0034) +[2024-06-18 07:15:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1378615296. Throughput: 0: 42711.5. Samples: 1378800460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 07:15:41,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 07:15:44,272][12883] Updated weights for policy 0, policy_version 84152 (0.0045) +[2024-06-18 07:15:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1378861056. Throughput: 0: 42665.0. Samples: 1378921420. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 07:15:46,994][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 07:15:48,278][12883] Updated weights for policy 0, policy_version 84162 (0.0026) +[2024-06-18 07:15:52,000][12645] Fps is (10 sec: 45846.8, 60 sec: 42866.9, 300 sec: 42708.6). Total num frames: 1379074048. Throughput: 0: 42826.1. Samples: 1379184320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 07:15:52,001][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 07:15:52,001][12883] Updated weights for policy 0, policy_version 84172 (0.0039) +[2024-06-18 07:15:55,894][12883] Updated weights for policy 0, policy_version 84182 (0.0049) +[2024-06-18 07:15:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1379270656. Throughput: 0: 42871.6. Samples: 1379443080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 07:15:56,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 07:15:58,692][12862] Signal inference workers to stop experience collection... (19950 times) +[2024-06-18 07:15:58,692][12862] Signal inference workers to resume experience collection... (19950 times) +[2024-06-18 07:15:58,739][12883] InferenceWorker_p0-w0: stopping experience collection (19950 times) +[2024-06-18 07:15:58,739][12883] InferenceWorker_p0-w0: resuming experience collection (19950 times) +[2024-06-18 07:15:59,461][12883] Updated weights for policy 0, policy_version 84192 (0.0046) +[2024-06-18 07:16:01,994][12645] Fps is (10 sec: 42625.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1379500032. Throughput: 0: 42882.3. Samples: 1379567400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 07:16:01,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 07:16:03,404][12883] Updated weights for policy 0, policy_version 84202 (0.0032) +[2024-06-18 07:16:06,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1379713024. 
Throughput: 0: 42864.6. Samples: 1379827880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) +[2024-06-18 07:16:06,995][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 07:16:07,131][12883] Updated weights for policy 0, policy_version 84212 (0.0037) +[2024-06-18 07:16:11,348][12883] Updated weights for policy 0, policy_version 84222 (0.0044) +[2024-06-18 07:16:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42602.4, 300 sec: 42820.5). Total num frames: 1379909632. Throughput: 0: 42892.8. Samples: 1380082640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:16:11,994][12645] Avg episode reward: [(0, '0.413')] +[2024-06-18 07:16:15,313][12883] Updated weights for policy 0, policy_version 84232 (0.0035) +[2024-06-18 07:16:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1380122624. Throughput: 0: 42855.6. Samples: 1380204960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:16:16,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 07:16:18,989][12883] Updated weights for policy 0, policy_version 84242 (0.0031) +[2024-06-18 07:16:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1380352000. Throughput: 0: 42809.7. Samples: 1380463940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:16:21,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 07:16:22,878][12883] Updated weights for policy 0, policy_version 84252 (0.0045) +[2024-06-18 07:16:26,749][12883] Updated weights for policy 0, policy_version 84262 (0.0030) +[2024-06-18 07:16:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 1380564992. Throughput: 0: 42654.4. Samples: 1380719900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:16:26,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 07:16:30,465][12883] Updated weights for policy 0, policy_version 84272 (0.0049) +[2024-06-18 07:16:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1380777984. Throughput: 0: 42769.7. Samples: 1380846060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:16:31,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 07:16:34,229][12883] Updated weights for policy 0, policy_version 84282 (0.0029) +[2024-06-18 07:16:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1380974592. Throughput: 0: 42733.5. Samples: 1381107060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:16:36,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 07:16:37,127][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000084290_1381007360.pth... +[2024-06-18 07:16:37,171][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083663_1370734592.pth +[2024-06-18 07:16:38,053][12883] Updated weights for policy 0, policy_version 84292 (0.0028) +[2024-06-18 07:16:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1381203968. Throughput: 0: 42676.5. Samples: 1381363520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:16:41,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 07:16:42,244][12883] Updated weights for policy 0, policy_version 84303 (0.0031) +[2024-06-18 07:16:45,951][12883] Updated weights for policy 0, policy_version 84313 (0.0028) +[2024-06-18 07:16:46,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1381433344. Throughput: 0: 42800.3. Samples: 1381493420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:16:46,994][12645] Avg episode reward: [(0, '0.136')] +[2024-06-18 07:16:50,018][12883] Updated weights for policy 0, policy_version 84323 (0.0044) +[2024-06-18 07:16:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42602.9, 300 sec: 42765.3). Total num frames: 1381629952. Throughput: 0: 42757.9. Samples: 1381751980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:16:51,994][12645] Avg episode reward: [(0, '0.136')] +[2024-06-18 07:16:53,690][12883] Updated weights for policy 0, policy_version 84333 (0.0036) +[2024-06-18 07:16:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1381826560. Throughput: 0: 42750.7. Samples: 1382006420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:16:56,994][12645] Avg episode reward: [(0, '0.313')] +[2024-06-18 07:16:57,813][12883] Updated weights for policy 0, policy_version 84343 (0.0034) +[2024-06-18 07:17:01,211][12883] Updated weights for policy 0, policy_version 84353 (0.0040) +[2024-06-18 07:17:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1382072320. Throughput: 0: 42790.2. Samples: 1382130520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 07:17:01,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 07:17:05,358][12883] Updated weights for policy 0, policy_version 84363 (0.0031) +[2024-06-18 07:17:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1382252544. Throughput: 0: 42649.4. Samples: 1382383160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 07:17:06,994][12645] Avg episode reward: [(0, '0.253')] +[2024-06-18 07:17:09,364][12883] Updated weights for policy 0, policy_version 84373 (0.0022) +[2024-06-18 07:17:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1382465536. Throughput: 0: 42662.2. Samples: 1382639700. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 07:17:11,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 07:17:13,013][12883] Updated weights for policy 0, policy_version 84383 (0.0033) +[2024-06-18 07:17:14,528][12862] Signal inference workers to stop experience collection... (20000 times) +[2024-06-18 07:17:14,581][12883] InferenceWorker_p0-w0: stopping experience collection (20000 times) +[2024-06-18 07:17:14,649][12862] Signal inference workers to resume experience collection... (20000 times) +[2024-06-18 07:17:14,649][12883] InferenceWorker_p0-w0: resuming experience collection (20000 times) +[2024-06-18 07:17:16,928][12883] Updated weights for policy 0, policy_version 84393 (0.0038) +[2024-06-18 07:17:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1382694912. Throughput: 0: 42705.3. Samples: 1382767800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 07:17:16,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 07:17:20,823][12883] Updated weights for policy 0, policy_version 84403 (0.0046) +[2024-06-18 07:17:21,993][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 1382891520. Throughput: 0: 42519.7. Samples: 1383020440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 07:17:21,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 07:17:24,505][12883] Updated weights for policy 0, policy_version 84413 (0.0028) +[2024-06-18 07:17:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1383104512. Throughput: 0: 42407.0. Samples: 1383271840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 07:17:26,994][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 07:17:28,778][12883] Updated weights for policy 0, policy_version 84423 (0.0041) +[2024-06-18 07:17:31,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1383333888. 
Throughput: 0: 42403.2. Samples: 1383401560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 07:17:31,994][12645] Avg episode reward: [(0, '0.185')] +[2024-06-18 07:17:32,123][12883] Updated weights for policy 0, policy_version 84433 (0.0030) +[2024-06-18 07:17:36,563][12883] Updated weights for policy 0, policy_version 84443 (0.0038) +[2024-06-18 07:17:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1383514112. Throughput: 0: 42337.4. Samples: 1383657160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 07:17:36,994][12645] Avg episode reward: [(0, '0.600')] +[2024-06-18 07:17:39,781][12883] Updated weights for policy 0, policy_version 84453 (0.0039) +[2024-06-18 07:17:42,000][12645] Fps is (10 sec: 42571.8, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 1383759872. Throughput: 0: 42165.7. Samples: 1383904140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 07:17:42,001][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 07:17:44,591][12883] Updated weights for policy 0, policy_version 84463 (0.0024) +[2024-06-18 07:17:46,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 1383972864. Throughput: 0: 42485.9. Samples: 1384042380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 07:17:46,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 07:17:47,628][12883] Updated weights for policy 0, policy_version 84473 (0.0046) +[2024-06-18 07:17:51,994][12645] Fps is (10 sec: 39346.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1384153088. Throughput: 0: 42412.4. Samples: 1384291720. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 07:17:51,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 07:17:52,154][12883] Updated weights for policy 0, policy_version 84483 (0.0033) +[2024-06-18 07:17:55,148][12883] Updated weights for policy 0, policy_version 84493 (0.0033) +[2024-06-18 07:17:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 1384415232. Throughput: 0: 42352.5. Samples: 1384545560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 07:17:56,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 07:17:59,859][12883] Updated weights for policy 0, policy_version 84503 (0.0029) +[2024-06-18 07:18:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 1384595456. Throughput: 0: 42634.3. Samples: 1384686340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 07:18:01,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 07:18:02,820][12883] Updated weights for policy 0, policy_version 84513 (0.0041) +[2024-06-18 07:18:06,994][12645] Fps is (10 sec: 37682.6, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 1384792064. Throughput: 0: 42374.0. Samples: 1384927280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 07:18:06,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 07:18:07,533][12883] Updated weights for policy 0, policy_version 84523 (0.0041) +[2024-06-18 07:18:10,389][12883] Updated weights for policy 0, policy_version 84533 (0.0031) +[2024-06-18 07:18:11,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1385054208. Throughput: 0: 42471.2. Samples: 1385183040. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 07:18:11,994][12645] Avg episode reward: [(0, '0.189')] +[2024-06-18 07:18:15,212][12883] Updated weights for policy 0, policy_version 84543 (0.0040) +[2024-06-18 07:18:16,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 1385218048. Throughput: 0: 42681.5. Samples: 1385322220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 07:18:16,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 07:18:17,980][12883] Updated weights for policy 0, policy_version 84553 (0.0042) +[2024-06-18 07:18:21,998][12645] Fps is (10 sec: 39304.9, 60 sec: 42595.3, 300 sec: 42708.9). Total num frames: 1385447424. Throughput: 0: 42531.5. Samples: 1385571260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 07:18:21,998][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 07:18:22,753][12883] Updated weights for policy 0, policy_version 84563 (0.0042) +[2024-06-18 07:18:25,582][12883] Updated weights for policy 0, policy_version 84573 (0.0040) +[2024-06-18 07:18:26,996][12645] Fps is (10 sec: 45864.2, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1385676800. Throughput: 0: 42788.2. Samples: 1385829440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 07:18:26,996][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 07:18:30,326][12883] Updated weights for policy 0, policy_version 84583 (0.0035) +[2024-06-18 07:18:31,994][12645] Fps is (10 sec: 42616.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1385873408. Throughput: 0: 42625.7. Samples: 1385960540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 07:18:31,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 07:18:33,255][12883] Updated weights for policy 0, policy_version 84593 (0.0035) +[2024-06-18 07:18:34,153][12862] Signal inference workers to stop experience collection... 
(20050 times) +[2024-06-18 07:18:34,153][12862] Signal inference workers to resume experience collection... (20050 times) +[2024-06-18 07:18:34,166][12883] InferenceWorker_p0-w0: stopping experience collection (20050 times) +[2024-06-18 07:18:34,167][12883] InferenceWorker_p0-w0: resuming experience collection (20050 times) +[2024-06-18 07:18:36,994][12645] Fps is (10 sec: 42608.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1386102784. Throughput: 0: 42664.1. Samples: 1386211600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 07:18:36,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 07:18:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000084601_1386102784.pth... +[2024-06-18 07:18:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083977_1375879168.pth +[2024-06-18 07:18:38,117][12883] Updated weights for policy 0, policy_version 84603 (0.0036) +[2024-06-18 07:18:40,946][12883] Updated weights for policy 0, policy_version 84613 (0.0035) +[2024-06-18 07:18:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42602.8, 300 sec: 42598.4). Total num frames: 1386315776. Throughput: 0: 42731.0. Samples: 1386468460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 07:18:41,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 07:18:45,628][12883] Updated weights for policy 0, policy_version 84623 (0.0034) +[2024-06-18 07:18:46,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1386496000. Throughput: 0: 42476.9. Samples: 1386597800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) +[2024-06-18 07:18:46,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 07:18:48,827][12883] Updated weights for policy 0, policy_version 84633 (0.0032) +[2024-06-18 07:18:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1386741760. Throughput: 0: 42859.1. Samples: 1386855940. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) +[2024-06-18 07:18:51,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 07:18:53,136][12883] Updated weights for policy 0, policy_version 84643 (0.0034) +[2024-06-18 07:18:56,570][12883] Updated weights for policy 0, policy_version 84653 (0.0027) +[2024-06-18 07:18:56,998][12645] Fps is (10 sec: 45853.2, 60 sec: 42322.0, 300 sec: 42598.4). Total num frames: 1386954752. Throughput: 0: 42776.0. Samples: 1387108160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) +[2024-06-18 07:18:56,999][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 07:19:00,812][12883] Updated weights for policy 0, policy_version 84663 (0.0045) +[2024-06-18 07:19:01,994][12645] Fps is (10 sec: 37683.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1387118592. Throughput: 0: 42567.1. Samples: 1387237740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) +[2024-06-18 07:19:01,994][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 07:19:04,359][12883] Updated weights for policy 0, policy_version 84673 (0.0035) +[2024-06-18 07:19:06,994][12645] Fps is (10 sec: 44257.3, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1387397120. Throughput: 0: 42678.2. Samples: 1387491600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) +[2024-06-18 07:19:06,994][12645] Avg episode reward: [(0, '0.253')] +[2024-06-18 07:19:08,507][12883] Updated weights for policy 0, policy_version 84683 (0.0032) +[2024-06-18 07:19:11,994][12645] Fps is (10 sec: 47512.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1387593728. Throughput: 0: 42719.4. Samples: 1387751720. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) +[2024-06-18 07:19:11,995][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 07:19:12,530][12883] Updated weights for policy 0, policy_version 84693 (0.0037) +[2024-06-18 07:19:16,369][12883] Updated weights for policy 0, policy_version 84703 (0.0027) +[2024-06-18 07:19:16,994][12645] Fps is (10 sec: 37683.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1387773952. Throughput: 0: 42537.4. Samples: 1387874720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) +[2024-06-18 07:19:16,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 07:19:19,883][12883] Updated weights for policy 0, policy_version 84713 (0.0029) +[2024-06-18 07:19:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43147.5, 300 sec: 42709.5). Total num frames: 1388036096. Throughput: 0: 42832.4. Samples: 1388139060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) +[2024-06-18 07:19:21,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 07:19:23,873][12883] Updated weights for policy 0, policy_version 84723 (0.0039) +[2024-06-18 07:19:26,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 1388232704. Throughput: 0: 42765.7. Samples: 1388392920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) +[2024-06-18 07:19:26,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 07:19:27,487][12883] Updated weights for policy 0, policy_version 84733 (0.0027) +[2024-06-18 07:19:31,487][12883] Updated weights for policy 0, policy_version 84743 (0.0025) +[2024-06-18 07:19:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1388429312. Throughput: 0: 42710.1. Samples: 1388519760. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) +[2024-06-18 07:19:31,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 07:19:34,981][12883] Updated weights for policy 0, policy_version 84753 (0.0037) +[2024-06-18 07:19:35,428][12862] Signal inference workers to stop experience collection... (20100 times) +[2024-06-18 07:19:35,428][12862] Signal inference workers to resume experience collection... (20100 times) +[2024-06-18 07:19:35,447][12883] InferenceWorker_p0-w0: stopping experience collection (20100 times) +[2024-06-18 07:19:35,448][12883] InferenceWorker_p0-w0: resuming experience collection (20100 times) +[2024-06-18 07:19:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1388675072. Throughput: 0: 42726.2. Samples: 1388778620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) +[2024-06-18 07:19:36,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 07:19:39,079][12883] Updated weights for policy 0, policy_version 84763 (0.0029) +[2024-06-18 07:19:41,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1388888064. Throughput: 0: 42930.7. Samples: 1389039840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) +[2024-06-18 07:19:41,994][12645] Avg episode reward: [(0, '0.397')] +[2024-06-18 07:19:42,639][12883] Updated weights for policy 0, policy_version 84773 (0.0041) +[2024-06-18 07:19:46,859][12883] Updated weights for policy 0, policy_version 84783 (0.0038) +[2024-06-18 07:19:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1389084672. Throughput: 0: 42837.2. Samples: 1389165420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) +[2024-06-18 07:19:46,994][12645] Avg episode reward: [(0, '0.486')] +[2024-06-18 07:19:50,241][12883] Updated weights for policy 0, policy_version 84793 (0.0039) +[2024-06-18 07:19:51,996][12645] Fps is (10 sec: 44226.9, 60 sec: 43143.0, 300 sec: 42764.7). Total num frames: 1389330432. 
Throughput: 0: 43088.6. Samples: 1389430680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) +[2024-06-18 07:19:51,996][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 07:19:54,649][12883] Updated weights for policy 0, policy_version 84803 (0.0033) +[2024-06-18 07:19:56,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42874.9, 300 sec: 42709.5). Total num frames: 1389527040. Throughput: 0: 43188.2. Samples: 1389695180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) +[2024-06-18 07:19:56,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 07:19:57,825][12883] Updated weights for policy 0, policy_version 84813 (0.0038) +[2024-06-18 07:20:01,994][12645] Fps is (10 sec: 39330.2, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 1389723648. Throughput: 0: 43140.3. Samples: 1389816040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) +[2024-06-18 07:20:01,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 07:20:02,064][12883] Updated weights for policy 0, policy_version 84823 (0.0044) +[2024-06-18 07:20:05,251][12883] Updated weights for policy 0, policy_version 84833 (0.0039) +[2024-06-18 07:20:06,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.8). Total num frames: 1389969408. Throughput: 0: 43031.2. Samples: 1390075460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) +[2024-06-18 07:20:06,994][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 07:20:09,539][12883] Updated weights for policy 0, policy_version 84843 (0.0027) +[2024-06-18 07:20:11,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1390149632. Throughput: 0: 43138.4. Samples: 1390334140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) +[2024-06-18 07:20:11,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 07:20:12,878][12883] Updated weights for policy 0, policy_version 84853 (0.0041) +[2024-06-18 07:20:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1390379008. 
Throughput: 0: 43177.0. Samples: 1390462720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) +[2024-06-18 07:20:16,994][12645] Avg episode reward: [(0, '0.324')] +[2024-06-18 07:20:17,051][12883] Updated weights for policy 0, policy_version 84863 (0.0030) +[2024-06-18 07:20:20,476][12883] Updated weights for policy 0, policy_version 84873 (0.0048) +[2024-06-18 07:20:21,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 1390592000. Throughput: 0: 43014.2. Samples: 1390714260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) +[2024-06-18 07:20:21,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 07:20:24,572][12883] Updated weights for policy 0, policy_version 84883 (0.0038) +[2024-06-18 07:20:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1390821376. Throughput: 0: 43084.4. Samples: 1390978640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:20:26,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 07:20:28,331][12883] Updated weights for policy 0, policy_version 84893 (0.0036) +[2024-06-18 07:20:31,994][12645] Fps is (10 sec: 44237.6, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 1391034368. Throughput: 0: 43139.7. Samples: 1391106700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:20:31,994][12645] Avg episode reward: [(0, '0.435')] +[2024-06-18 07:20:32,120][12883] Updated weights for policy 0, policy_version 84903 (0.0041) +[2024-06-18 07:20:36,022][12883] Updated weights for policy 0, policy_version 84913 (0.0025) +[2024-06-18 07:20:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1391247360. Throughput: 0: 42927.4. Samples: 1391362320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:20:36,994][12645] Avg episode reward: [(0, '0.568')] +[2024-06-18 07:20:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000084915_1391247360.pth... 
+[2024-06-18 07:20:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000084290_1381007360.pth +[2024-06-18 07:20:39,548][12883] Updated weights for policy 0, policy_version 84923 (0.0041) +[2024-06-18 07:20:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1391443968. Throughput: 0: 42788.4. Samples: 1391620660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:20:41,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 07:20:43,617][12883] Updated weights for policy 0, policy_version 84933 (0.0034) +[2024-06-18 07:20:46,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 42765.9). Total num frames: 1391689728. Throughput: 0: 42900.6. Samples: 1391746560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:20:46,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 07:20:47,338][12883] Updated weights for policy 0, policy_version 84943 (0.0031) +[2024-06-18 07:20:51,138][12883] Updated weights for policy 0, policy_version 84953 (0.0029) +[2024-06-18 07:20:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 1391886336. Throughput: 0: 42742.2. Samples: 1391998860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:20:51,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 07:20:55,089][12883] Updated weights for policy 0, policy_version 84963 (0.0035) +[2024-06-18 07:20:56,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1392082944. Throughput: 0: 42847.4. Samples: 1392262280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:20:56,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 07:20:58,830][12883] Updated weights for policy 0, policy_version 84973 (0.0037) +[2024-06-18 07:21:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 1392328704. Throughput: 0: 42753.3. 
Samples: 1392386620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:21:01,994][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 07:21:02,497][12883] Updated weights for policy 0, policy_version 84983 (0.0046) +[2024-06-18 07:21:06,657][12883] Updated weights for policy 0, policy_version 84993 (0.0038) +[2024-06-18 07:21:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1392525312. Throughput: 0: 42817.0. Samples: 1392641020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:21:06,994][12645] Avg episode reward: [(0, '0.517')] +[2024-06-18 07:21:07,521][12862] Signal inference workers to stop experience collection... (20150 times) +[2024-06-18 07:21:07,528][12862] Signal inference workers to resume experience collection... (20150 times) +[2024-06-18 07:21:07,568][12883] InferenceWorker_p0-w0: stopping experience collection (20150 times) +[2024-06-18 07:21:07,568][12883] InferenceWorker_p0-w0: resuming experience collection (20150 times) +[2024-06-18 07:21:10,010][12883] Updated weights for policy 0, policy_version 85003 (0.0033) +[2024-06-18 07:21:11,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1392721920. Throughput: 0: 42744.5. Samples: 1392902140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:21:11,994][12645] Avg episode reward: [(0, '0.514')] +[2024-06-18 07:21:14,355][12883] Updated weights for policy 0, policy_version 85013 (0.0025) +[2024-06-18 07:21:16,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1392967680. Throughput: 0: 42723.3. Samples: 1393029260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:21:17,000][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 07:21:17,962][12883] Updated weights for policy 0, policy_version 85023 (0.0027) +[2024-06-18 07:21:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.6, 300 sec: 42709.5). 
Total num frames: 1393164288. Throughput: 0: 42816.1. Samples: 1393289040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 07:21:21,994][12645] Avg episode reward: [(0, '0.359')] +[2024-06-18 07:21:22,036][12883] Updated weights for policy 0, policy_version 85033 (0.0034) +[2024-06-18 07:21:25,511][12883] Updated weights for policy 0, policy_version 85043 (0.0037) +[2024-06-18 07:21:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1393377280. Throughput: 0: 42780.9. Samples: 1393545800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 07:21:26,994][12645] Avg episode reward: [(0, '0.182')] +[2024-06-18 07:21:29,618][12883] Updated weights for policy 0, policy_version 85053 (0.0031) +[2024-06-18 07:21:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1393606656. Throughput: 0: 42822.1. Samples: 1393673560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 07:21:31,994][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 07:21:33,115][12883] Updated weights for policy 0, policy_version 85063 (0.0045) +[2024-06-18 07:21:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1393803264. Throughput: 0: 42994.3. Samples: 1393933600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 07:21:36,994][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 07:21:37,259][12883] Updated weights for policy 0, policy_version 85073 (0.0038) +[2024-06-18 07:21:41,173][12883] Updated weights for policy 0, policy_version 85083 (0.0034) +[2024-06-18 07:21:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1394016256. Throughput: 0: 42728.6. Samples: 1394185060. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 07:21:41,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 07:21:44,865][12883] Updated weights for policy 0, policy_version 85093 (0.0026) +[2024-06-18 07:21:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1394245632. Throughput: 0: 42887.1. Samples: 1394316540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 07:21:46,994][12645] Avg episode reward: [(0, '0.265')] +[2024-06-18 07:21:48,561][12883] Updated weights for policy 0, policy_version 85103 (0.0028) +[2024-06-18 07:21:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1394442240. Throughput: 0: 43025.8. Samples: 1394577180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 07:21:51,994][12645] Avg episode reward: [(0, '0.510')] +[2024-06-18 07:21:52,369][12883] Updated weights for policy 0, policy_version 85113 (0.0031) +[2024-06-18 07:21:56,055][12883] Updated weights for policy 0, policy_version 85123 (0.0041) +[2024-06-18 07:21:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1394671616. Throughput: 0: 42725.2. Samples: 1394824780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 07:21:56,996][12645] Avg episode reward: [(0, '0.494')] +[2024-06-18 07:21:59,953][12883] Updated weights for policy 0, policy_version 85133 (0.0032) +[2024-06-18 07:22:01,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42050.7, 300 sec: 42709.2). Total num frames: 1394851840. Throughput: 0: 42777.2. Samples: 1394954320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 07:22:01,996][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 07:22:03,601][12883] Updated weights for policy 0, policy_version 85143 (0.0025) +[2024-06-18 07:22:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1395097600. Throughput: 0: 42864.9. Samples: 1395217960. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 07:22:06,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 07:22:07,577][12883] Updated weights for policy 0, policy_version 85153 (0.0047) +[2024-06-18 07:22:11,249][12883] Updated weights for policy 0, policy_version 85163 (0.0026) +[2024-06-18 07:22:11,995][12645] Fps is (10 sec: 45877.8, 60 sec: 43143.3, 300 sec: 42764.8). Total num frames: 1395310592. Throughput: 0: 42686.0. Samples: 1395466740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 07:22:11,996][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 07:22:15,455][12883] Updated weights for policy 0, policy_version 85173 (0.0037) +[2024-06-18 07:22:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1395507200. Throughput: 0: 42776.0. Samples: 1395598480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 07:22:16,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 07:22:18,750][12883] Updated weights for policy 0, policy_version 85183 (0.0033) +[2024-06-18 07:22:21,081][12862] Signal inference workers to stop experience collection... (20200 times) +[2024-06-18 07:22:21,127][12883] InferenceWorker_p0-w0: stopping experience collection (20200 times) +[2024-06-18 07:22:21,135][12862] Signal inference workers to resume experience collection... (20200 times) +[2024-06-18 07:22:21,144][12883] InferenceWorker_p0-w0: resuming experience collection (20200 times) +[2024-06-18 07:22:21,994][12645] Fps is (10 sec: 42605.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1395736576. Throughput: 0: 42833.4. Samples: 1395861100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 07:22:21,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 07:22:23,020][12883] Updated weights for policy 0, policy_version 85193 (0.0030) +[2024-06-18 07:22:26,520][12883] Updated weights for policy 0, policy_version 85203 (0.0039) +[2024-06-18 07:22:26,996][12645] Fps is (10 sec: 45865.3, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 1395965952. Throughput: 0: 42865.8. Samples: 1396114120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 07:22:26,996][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 07:22:30,878][12883] Updated weights for policy 0, policy_version 85213 (0.0026) +[2024-06-18 07:22:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1396162560. Throughput: 0: 42890.2. Samples: 1396246600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 07:22:31,994][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 07:22:34,644][12883] Updated weights for policy 0, policy_version 85223 (0.0027) +[2024-06-18 07:22:36,994][12645] Fps is (10 sec: 42607.9, 60 sec: 43144.5, 300 sec: 42821.5). Total num frames: 1396391936. Throughput: 0: 42852.8. Samples: 1396505560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 07:22:36,994][12645] Avg episode reward: [(0, '0.105')] +[2024-06-18 07:22:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000085230_1396408320.pth... +[2024-06-18 07:22:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000084601_1386102784.pth +[2024-06-18 07:22:38,443][12883] Updated weights for policy 0, policy_version 85233 (0.0030) +[2024-06-18 07:22:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 1396604928. Throughput: 0: 43078.7. Samples: 1396763320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 07:22:41,994][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 07:22:42,362][12883] Updated weights for policy 0, policy_version 85243 (0.0043) +[2024-06-18 07:22:45,935][12883] Updated weights for policy 0, policy_version 85253 (0.0038) +[2024-06-18 07:22:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1396785152. Throughput: 0: 42916.8. Samples: 1396885480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 07:22:46,995][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 07:22:49,715][12883] Updated weights for policy 0, policy_version 85263 (0.0034) +[2024-06-18 07:22:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1397030912. Throughput: 0: 42868.3. Samples: 1397147040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 07:22:51,996][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 07:22:53,445][12883] Updated weights for policy 0, policy_version 85273 (0.0044) +[2024-06-18 07:22:56,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1397243904. Throughput: 0: 42960.1. Samples: 1397399880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 07:22:56,994][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 07:22:57,466][12883] Updated weights for policy 0, policy_version 85283 (0.0022) +[2024-06-18 07:23:01,208][12883] Updated weights for policy 0, policy_version 85293 (0.0027) +[2024-06-18 07:23:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 1397440512. Throughput: 0: 42972.5. Samples: 1397532240. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 07:23:01,994][12645] Avg episode reward: [(0, '0.214')] +[2024-06-18 07:23:05,851][12883] Updated weights for policy 0, policy_version 85303 (0.0033) +[2024-06-18 07:23:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1397653504. Throughput: 0: 42961.7. Samples: 1397794380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:23:06,994][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 07:23:08,855][12883] Updated weights for policy 0, policy_version 85313 (0.0035) +[2024-06-18 07:23:11,994][12645] Fps is (10 sec: 45875.6, 60 sec: 43145.8, 300 sec: 42987.2). Total num frames: 1397899264. Throughput: 0: 42829.8. Samples: 1398041360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:23:11,994][12645] Avg episode reward: [(0, '0.161')] +[2024-06-18 07:23:13,475][12883] Updated weights for policy 0, policy_version 85323 (0.0038) +[2024-06-18 07:23:16,478][12883] Updated weights for policy 0, policy_version 85333 (0.0035) +[2024-06-18 07:23:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.7). Total num frames: 1398095872. Throughput: 0: 42874.6. Samples: 1398175960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:23:16,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 07:23:20,991][12883] Updated weights for policy 0, policy_version 85343 (0.0041) +[2024-06-18 07:23:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 1398308864. Throughput: 0: 42782.7. Samples: 1398430780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:23:21,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 07:23:24,163][12883] Updated weights for policy 0, policy_version 85353 (0.0036) +[2024-06-18 07:23:26,637][12862] Signal inference workers to stop experience collection... 
(20250 times) +[2024-06-18 07:23:26,637][12862] Signal inference workers to resume experience collection... (20250 times) +[2024-06-18 07:23:26,682][12883] InferenceWorker_p0-w0: stopping experience collection (20250 times) +[2024-06-18 07:23:26,682][12883] InferenceWorker_p0-w0: resuming experience collection (20250 times) +[2024-06-18 07:23:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42873.0, 300 sec: 42931.6). Total num frames: 1398538240. Throughput: 0: 42755.0. Samples: 1398687300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:23:26,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 07:23:28,614][12883] Updated weights for policy 0, policy_version 85363 (0.0039) +[2024-06-18 07:23:31,656][12883] Updated weights for policy 0, policy_version 85373 (0.0039) +[2024-06-18 07:23:31,994][12645] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1398751232. Throughput: 0: 42835.5. Samples: 1398813080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:23:31,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 07:23:36,211][12883] Updated weights for policy 0, policy_version 85383 (0.0031) +[2024-06-18 07:23:36,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1398947840. Throughput: 0: 42840.2. Samples: 1399074840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:23:36,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 07:23:39,161][12883] Updated weights for policy 0, policy_version 85393 (0.0040) +[2024-06-18 07:23:41,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1399160832. Throughput: 0: 42797.0. Samples: 1399325740. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:23:41,994][12645] Avg episode reward: [(0, '0.241')] +[2024-06-18 07:23:43,944][12883] Updated weights for policy 0, policy_version 85403 (0.0029) +[2024-06-18 07:23:46,755][12883] Updated weights for policy 0, policy_version 85413 (0.0038) +[2024-06-18 07:23:46,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 1399406592. Throughput: 0: 42705.7. Samples: 1399454000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:23:46,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 07:23:51,519][12883] Updated weights for policy 0, policy_version 85423 (0.0031) +[2024-06-18 07:23:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.7). Total num frames: 1399570432. Throughput: 0: 42431.1. Samples: 1399703780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:23:51,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 07:23:55,081][12883] Updated weights for policy 0, policy_version 85433 (0.0044) +[2024-06-18 07:23:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 1399799808. Throughput: 0: 42605.7. Samples: 1399958620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 07:23:56,994][12645] Avg episode reward: [(0, '0.646')] +[2024-06-18 07:23:59,388][12883] Updated weights for policy 0, policy_version 85443 (0.0035) +[2024-06-18 07:24:01,994][12645] Fps is (10 sec: 47513.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 1400045568. Throughput: 0: 42454.6. Samples: 1400086420. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 07:24:01,994][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 07:24:02,773][12883] Updated weights for policy 0, policy_version 85453 (0.0030) +[2024-06-18 07:24:06,919][12883] Updated weights for policy 0, policy_version 85463 (0.0040) +[2024-06-18 07:24:06,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 1400225792. Throughput: 0: 42470.7. Samples: 1400342060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 07:24:06,996][12645] Avg episode reward: [(0, '0.530')] +[2024-06-18 07:24:10,510][12883] Updated weights for policy 0, policy_version 85473 (0.0047) +[2024-06-18 07:24:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.2, 300 sec: 42931.6). Total num frames: 1400438784. Throughput: 0: 42367.2. Samples: 1400593820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 07:24:11,994][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 07:24:14,674][12883] Updated weights for policy 0, policy_version 85483 (0.0043) +[2024-06-18 07:24:16,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1400668160. Throughput: 0: 42531.2. Samples: 1400726980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 07:24:16,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 07:24:18,159][12883] Updated weights for policy 0, policy_version 85493 (0.0027) +[2024-06-18 07:24:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1400864768. Throughput: 0: 42363.3. Samples: 1400981200. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 07:24:21,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 07:24:22,834][12883] Updated weights for policy 0, policy_version 85503 (0.0030) +[2024-06-18 07:24:26,244][12883] Updated weights for policy 0, policy_version 85513 (0.0041) +[2024-06-18 07:24:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1401077760. Throughput: 0: 42297.8. Samples: 1401229140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 07:24:26,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 07:24:30,253][12883] Updated weights for policy 0, policy_version 85523 (0.0049) +[2024-06-18 07:24:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1401290752. Throughput: 0: 42357.7. Samples: 1401360100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 07:24:31,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 07:24:33,728][12883] Updated weights for policy 0, policy_version 85533 (0.0044) +[2024-06-18 07:24:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1401503744. Throughput: 0: 42566.7. Samples: 1401619280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 07:24:36,994][12645] Avg episode reward: [(0, '0.046')] +[2024-06-18 07:24:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000085542_1401520128.pth... +[2024-06-18 07:24:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000084915_1391247360.pth +[2024-06-18 07:24:37,763][12883] Updated weights for policy 0, policy_version 85543 (0.0030) +[2024-06-18 07:24:39,648][12862] Signal inference workers to stop experience collection... (20300 times) +[2024-06-18 07:24:39,703][12862] Signal inference workers to resume experience collection... 
(20300 times) +[2024-06-18 07:24:39,703][12883] InferenceWorker_p0-w0: stopping experience collection (20300 times) +[2024-06-18 07:24:39,721][12883] InferenceWorker_p0-w0: resuming experience collection (20300 times) +[2024-06-18 07:24:41,308][12883] Updated weights for policy 0, policy_version 85553 (0.0044) +[2024-06-18 07:24:41,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1401733120. Throughput: 0: 42549.8. Samples: 1401873360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) +[2024-06-18 07:24:41,994][12645] Avg episode reward: [(0, '0.168')] +[2024-06-18 07:24:45,356][12883] Updated weights for policy 0, policy_version 85563 (0.0034) +[2024-06-18 07:24:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41506.2, 300 sec: 42598.7). Total num frames: 1401896960. Throughput: 0: 42672.6. Samples: 1402006680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 07:24:46,994][12645] Avg episode reward: [(0, '0.178')] +[2024-06-18 07:24:48,784][12883] Updated weights for policy 0, policy_version 85573 (0.0023) +[2024-06-18 07:24:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1402159104. Throughput: 0: 42742.6. Samples: 1402265380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 07:24:51,994][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 07:24:53,061][12883] Updated weights for policy 0, policy_version 85583 (0.0036) +[2024-06-18 07:24:56,570][12883] Updated weights for policy 0, policy_version 85593 (0.0028) +[2024-06-18 07:24:56,994][12645] Fps is (10 sec: 49151.3, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1402388480. Throughput: 0: 42822.2. Samples: 1402520820. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 07:24:56,994][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 07:25:00,692][12883] Updated weights for policy 0, policy_version 85603 (0.0030) +[2024-06-18 07:25:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 1402552320. Throughput: 0: 42710.7. Samples: 1402648960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 07:25:01,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 07:25:03,971][12883] Updated weights for policy 0, policy_version 85613 (0.0029) +[2024-06-18 07:25:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42873.0, 300 sec: 42876.1). Total num frames: 1402798080. Throughput: 0: 42717.8. Samples: 1402903500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 07:25:06,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 07:25:08,201][12883] Updated weights for policy 0, policy_version 85623 (0.0043) +[2024-06-18 07:25:11,871][12883] Updated weights for policy 0, policy_version 85633 (0.0028) +[2024-06-18 07:25:11,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1403011072. Throughput: 0: 43101.0. Samples: 1403168680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 07:25:11,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 07:25:15,686][12883] Updated weights for policy 0, policy_version 85643 (0.0028) +[2024-06-18 07:25:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1403207680. Throughput: 0: 42995.6. Samples: 1403294900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 07:25:16,994][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 07:25:19,489][12883] Updated weights for policy 0, policy_version 85653 (0.0033) +[2024-06-18 07:25:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 1403453440. Throughput: 0: 42829.8. Samples: 1403546620. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 07:25:21,994][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 07:25:23,262][12883] Updated weights for policy 0, policy_version 85663 (0.0040) +[2024-06-18 07:25:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1403650048. Throughput: 0: 43024.0. Samples: 1403809440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 07:25:26,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 07:25:27,122][12883] Updated weights for policy 0, policy_version 85673 (0.0029) +[2024-06-18 07:25:30,858][12883] Updated weights for policy 0, policy_version 85683 (0.0046) +[2024-06-18 07:25:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1403863040. Throughput: 0: 42811.1. Samples: 1403933180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 07:25:31,994][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 07:25:34,813][12883] Updated weights for policy 0, policy_version 85693 (0.0031) +[2024-06-18 07:25:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1404076032. Throughput: 0: 42768.5. Samples: 1404189960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 07:25:36,994][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 07:25:38,682][12883] Updated weights for policy 0, policy_version 85703 (0.0043) +[2024-06-18 07:25:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1404272640. Throughput: 0: 42801.8. Samples: 1404446900. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:25:41,994][12645] Avg episode reward: [(0, '0.096')] +[2024-06-18 07:25:42,722][12883] Updated weights for policy 0, policy_version 85713 (0.0033) +[2024-06-18 07:25:46,277][12883] Updated weights for policy 0, policy_version 85723 (0.0035) +[2024-06-18 07:25:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1404502016. Throughput: 0: 42726.7. Samples: 1404571660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:25:46,994][12645] Avg episode reward: [(0, '0.109')] +[2024-06-18 07:25:50,247][12862] Signal inference workers to stop experience collection... (20350 times) +[2024-06-18 07:25:50,247][12862] Signal inference workers to resume experience collection... (20350 times) +[2024-06-18 07:25:50,262][12883] InferenceWorker_p0-w0: stopping experience collection (20350 times) +[2024-06-18 07:25:50,287][12883] InferenceWorker_p0-w0: resuming experience collection (20350 times) +[2024-06-18 07:25:50,393][12883] Updated weights for policy 0, policy_version 85733 (0.0032) +[2024-06-18 07:25:51,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1404715008. Throughput: 0: 42788.6. Samples: 1404828980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:25:51,994][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 07:25:54,346][12883] Updated weights for policy 0, policy_version 85743 (0.0032) +[2024-06-18 07:25:56,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1404928000. Throughput: 0: 42502.4. Samples: 1405081300. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:25:56,994][12645] Avg episode reward: [(0, '0.190')] +[2024-06-18 07:25:58,250][12883] Updated weights for policy 0, policy_version 85753 (0.0027) +[2024-06-18 07:26:01,773][12883] Updated weights for policy 0, policy_version 85763 (0.0031) +[2024-06-18 07:26:01,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1405140992. Throughput: 0: 42519.6. Samples: 1405208280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:26:01,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 07:26:05,859][12883] Updated weights for policy 0, policy_version 85773 (0.0036) +[2024-06-18 07:26:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1405370368. Throughput: 0: 42794.6. Samples: 1405472380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:26:06,994][12645] Avg episode reward: [(0, '0.628')] +[2024-06-18 07:26:09,724][12883] Updated weights for policy 0, policy_version 85783 (0.0047) +[2024-06-18 07:26:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1405583360. Throughput: 0: 42619.6. Samples: 1405727320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:26:11,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 07:26:13,757][12883] Updated weights for policy 0, policy_version 85793 (0.0041) +[2024-06-18 07:26:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1405779968. Throughput: 0: 42656.4. Samples: 1405852720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:26:16,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 07:26:17,201][12883] Updated weights for policy 0, policy_version 85803 (0.0043) +[2024-06-18 07:26:21,444][12883] Updated weights for policy 0, policy_version 85813 (0.0038) +[2024-06-18 07:26:21,997][12645] Fps is (10 sec: 40945.2, 60 sec: 42322.7, 300 sec: 42764.5). Total num frames: 1405992960. Throughput: 0: 42609.4. Samples: 1406107540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:26:21,998][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 07:26:24,654][12883] Updated weights for policy 0, policy_version 85823 (0.0037) +[2024-06-18 07:26:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1406205952. Throughput: 0: 42518.6. Samples: 1406360240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:26:26,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 07:26:29,138][12883] Updated weights for policy 0, policy_version 85833 (0.0037) +[2024-06-18 07:26:31,994][12645] Fps is (10 sec: 42614.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1406418944. Throughput: 0: 42696.9. Samples: 1406493020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 07:26:31,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 07:26:32,991][12883] Updated weights for policy 0, policy_version 85843 (0.0038) +[2024-06-18 07:26:36,994][12645] Fps is (10 sec: 37683.7, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 1406582784. Throughput: 0: 42477.3. Samples: 1406740460. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 07:26:36,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 07:26:37,274][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000085853_1406615552.pth... 
+[2024-06-18 07:26:37,278][12883] Updated weights for policy 0, policy_version 85853 (0.0026) +[2024-06-18 07:26:37,334][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000085230_1396408320.pth +[2024-06-18 07:26:40,580][12883] Updated weights for policy 0, policy_version 85863 (0.0026) +[2024-06-18 07:26:41,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1406844928. Throughput: 0: 42524.5. Samples: 1406994900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 07:26:41,994][12645] Avg episode reward: [(0, '0.561')] +[2024-06-18 07:26:44,766][12883] Updated weights for policy 0, policy_version 85873 (0.0027) +[2024-06-18 07:26:46,994][12645] Fps is (10 sec: 47512.3, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 1407057920. Throughput: 0: 42769.6. Samples: 1407132920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 07:26:46,995][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 07:26:48,116][12883] Updated weights for policy 0, policy_version 85883 (0.0037) +[2024-06-18 07:26:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 1407238144. Throughput: 0: 42387.1. Samples: 1407379800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 07:26:51,994][12645] Avg episode reward: [(0, '0.302')] +[2024-06-18 07:26:52,335][12883] Updated weights for policy 0, policy_version 85893 (0.0028) +[2024-06-18 07:26:55,801][12883] Updated weights for policy 0, policy_version 85903 (0.0036) +[2024-06-18 07:26:56,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 1407483904. Throughput: 0: 42522.2. Samples: 1407640820. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 07:26:56,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 07:26:59,863][12883] Updated weights for policy 0, policy_version 85913 (0.0032) +[2024-06-18 07:27:01,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1407696896. Throughput: 0: 42575.6. Samples: 1407768620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 07:27:01,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 07:27:03,335][12883] Updated weights for policy 0, policy_version 85923 (0.0031) +[2024-06-18 07:27:04,366][12862] Signal inference workers to stop experience collection... (20400 times) +[2024-06-18 07:27:04,422][12883] InferenceWorker_p0-w0: stopping experience collection (20400 times) +[2024-06-18 07:27:04,431][12862] Signal inference workers to resume experience collection... (20400 times) +[2024-06-18 07:27:04,432][12883] InferenceWorker_p0-w0: resuming experience collection (20400 times) +[2024-06-18 07:27:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42654.2). Total num frames: 1407893504. Throughput: 0: 42510.0. Samples: 1408020340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 07:27:06,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 07:27:07,203][12883] Updated weights for policy 0, policy_version 85933 (0.0044) +[2024-06-18 07:27:11,028][12883] Updated weights for policy 0, policy_version 85943 (0.0039) +[2024-06-18 07:27:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1408122880. Throughput: 0: 42761.4. Samples: 1408284500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 07:27:11,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 07:27:15,099][12883] Updated weights for policy 0, policy_version 85953 (0.0032) +[2024-06-18 07:27:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1408335872. 
Throughput: 0: 42631.0. Samples: 1408411420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) +[2024-06-18 07:27:16,994][12645] Avg episode reward: [(0, '0.222')] +[2024-06-18 07:27:18,895][12883] Updated weights for policy 0, policy_version 85963 (0.0035) +[2024-06-18 07:27:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42601.0, 300 sec: 42654.3). Total num frames: 1408548864. Throughput: 0: 42631.1. Samples: 1408658860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:27:21,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 07:27:22,864][12883] Updated weights for policy 0, policy_version 85973 (0.0050) +[2024-06-18 07:27:26,578][12883] Updated weights for policy 0, policy_version 85983 (0.0034) +[2024-06-18 07:27:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1408778240. Throughput: 0: 42802.4. Samples: 1408921000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:27:26,994][12645] Avg episode reward: [(0, '0.637')] +[2024-06-18 07:27:30,392][12883] Updated weights for policy 0, policy_version 85993 (0.0042) +[2024-06-18 07:27:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1408991232. Throughput: 0: 42511.7. Samples: 1409045940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:27:31,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 07:27:34,199][12883] Updated weights for policy 0, policy_version 86003 (0.0021) +[2024-06-18 07:27:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 1409187840. Throughput: 0: 42590.2. Samples: 1409296360. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:27:36,994][12645] Avg episode reward: [(0, '0.243')] +[2024-06-18 07:27:38,399][12883] Updated weights for policy 0, policy_version 86013 (0.0033) +[2024-06-18 07:27:41,744][12883] Updated weights for policy 0, policy_version 86023 (0.0028) +[2024-06-18 07:27:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1409400832. Throughput: 0: 42581.4. Samples: 1409556980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:27:41,994][12645] Avg episode reward: [(0, '0.235')] +[2024-06-18 07:27:46,076][12883] Updated weights for policy 0, policy_version 86033 (0.0058) +[2024-06-18 07:27:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 1409613824. Throughput: 0: 42539.6. Samples: 1409682900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:27:46,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 07:27:49,463][12883] Updated weights for policy 0, policy_version 86043 (0.0024) +[2024-06-18 07:27:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 1409843200. Throughput: 0: 42562.3. Samples: 1409935640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:27:51,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 07:27:53,707][12883] Updated weights for policy 0, policy_version 86053 (0.0028) +[2024-06-18 07:27:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1410039808. Throughput: 0: 42384.0. Samples: 1410191780. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:27:56,994][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 07:27:57,135][12883] Updated weights for policy 0, policy_version 86063 (0.0040) +[2024-06-18 07:28:01,249][12883] Updated weights for policy 0, policy_version 86073 (0.0029) +[2024-06-18 07:28:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1410252800. Throughput: 0: 42387.6. Samples: 1410318860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:28:01,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 07:28:05,021][12883] Updated weights for policy 0, policy_version 86083 (0.0024) +[2024-06-18 07:28:06,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1410465792. Throughput: 0: 42618.8. Samples: 1410576700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:28:06,994][12645] Avg episode reward: [(0, '0.228')] +[2024-06-18 07:28:08,992][12883] Updated weights for policy 0, policy_version 86093 (0.0043) +[2024-06-18 07:28:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1410678784. Throughput: 0: 42455.6. Samples: 1410831500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:28:11,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 07:28:12,616][12883] Updated weights for policy 0, policy_version 86103 (0.0029) +[2024-06-18 07:28:16,534][12883] Updated weights for policy 0, policy_version 86113 (0.0044) +[2024-06-18 07:28:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1410875392. Throughput: 0: 42504.6. Samples: 1410958640. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 07:28:16,994][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 07:28:20,510][12883] Updated weights for policy 0, policy_version 86123 (0.0050) +[2024-06-18 07:28:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1411104768. Throughput: 0: 42647.7. Samples: 1411215500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 07:28:21,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 07:28:24,211][12883] Updated weights for policy 0, policy_version 86133 (0.0029) +[2024-06-18 07:28:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1411317760. Throughput: 0: 42591.9. Samples: 1411473620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 07:28:26,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 07:28:28,108][12883] Updated weights for policy 0, policy_version 86143 (0.0034) +[2024-06-18 07:28:31,808][12883] Updated weights for policy 0, policy_version 86153 (0.0023) +[2024-06-18 07:28:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1411530752. Throughput: 0: 42602.3. Samples: 1411600000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 07:28:31,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 07:28:35,651][12883] Updated weights for policy 0, policy_version 86163 (0.0033) +[2024-06-18 07:28:36,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 1411743744. Throughput: 0: 42725.8. Samples: 1411858400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 07:28:36,997][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 07:28:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000086166_1411743744.pth... 
+[2024-06-18 07:28:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000085542_1401520128.pth +[2024-06-18 07:28:39,482][12883] Updated weights for policy 0, policy_version 86173 (0.0029) +[2024-06-18 07:28:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1411973120. Throughput: 0: 42659.7. Samples: 1412111460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 07:28:41,994][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 07:28:43,364][12883] Updated weights for policy 0, policy_version 86183 (0.0031) +[2024-06-18 07:28:46,994][12645] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1412169728. Throughput: 0: 42652.0. Samples: 1412238200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 07:28:46,994][12645] Avg episode reward: [(0, '0.435')] +[2024-06-18 07:28:47,119][12883] Updated weights for policy 0, policy_version 86193 (0.0029) +[2024-06-18 07:28:49,034][12862] Signal inference workers to stop experience collection... (20450 times) +[2024-06-18 07:28:49,034][12862] Signal inference workers to resume experience collection... (20450 times) +[2024-06-18 07:28:49,053][12883] InferenceWorker_p0-w0: stopping experience collection (20450 times) +[2024-06-18 07:28:49,053][12883] InferenceWorker_p0-w0: resuming experience collection (20450 times) +[2024-06-18 07:28:50,902][12883] Updated weights for policy 0, policy_version 86203 (0.0032) +[2024-06-18 07:28:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1412366336. Throughput: 0: 42547.5. Samples: 1412491340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 07:28:51,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 07:28:55,119][12883] Updated weights for policy 0, policy_version 86213 (0.0029) +[2024-06-18 07:28:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). 
Total num frames: 1412595712. Throughput: 0: 42519.6. Samples: 1412744880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 07:28:56,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 07:28:58,695][12883] Updated weights for policy 0, policy_version 86223 (0.0039) +[2024-06-18 07:29:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 1412808704. Throughput: 0: 42579.9. Samples: 1412874740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) +[2024-06-18 07:29:01,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 07:29:02,850][12883] Updated weights for policy 0, policy_version 86233 (0.0037) +[2024-06-18 07:29:06,650][12883] Updated weights for policy 0, policy_version 86243 (0.0033) +[2024-06-18 07:29:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1413021696. Throughput: 0: 42437.7. Samples: 1413125200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:29:06,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 07:29:10,621][12883] Updated weights for policy 0, policy_version 86253 (0.0028) +[2024-06-18 07:29:11,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1413234688. Throughput: 0: 42520.2. Samples: 1413387120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:29:11,996][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 07:29:14,296][12883] Updated weights for policy 0, policy_version 86263 (0.0024) +[2024-06-18 07:29:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1413447680. Throughput: 0: 42519.5. Samples: 1413513380. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:29:16,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 07:29:18,091][12883] Updated weights for policy 0, policy_version 86273 (0.0026) +[2024-06-18 07:29:21,831][12883] Updated weights for policy 0, policy_version 86283 (0.0036) +[2024-06-18 07:29:21,994][12645] Fps is (10 sec: 42607.4, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1413660672. Throughput: 0: 42529.6. Samples: 1413772140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:29:21,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 07:29:25,662][12883] Updated weights for policy 0, policy_version 86293 (0.0024) +[2024-06-18 07:29:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1413873664. Throughput: 0: 42719.9. Samples: 1414033860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:29:26,994][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 07:29:29,359][12883] Updated weights for policy 0, policy_version 86303 (0.0031) +[2024-06-18 07:29:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1414070272. Throughput: 0: 42649.7. Samples: 1414157440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:29:31,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 07:29:33,446][12883] Updated weights for policy 0, policy_version 86313 (0.0036) +[2024-06-18 07:29:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 1414299648. Throughput: 0: 42636.9. Samples: 1414410000. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:29:36,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 07:29:37,024][12883] Updated weights for policy 0, policy_version 86323 (0.0031) +[2024-06-18 07:29:41,339][12883] Updated weights for policy 0, policy_version 86333 (0.0037) +[2024-06-18 07:29:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1414496256. Throughput: 0: 42795.0. Samples: 1414670660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:29:41,994][12645] Avg episode reward: [(0, '0.129')] +[2024-06-18 07:29:44,571][12883] Updated weights for policy 0, policy_version 86343 (0.0041) +[2024-06-18 07:29:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1414709248. Throughput: 0: 42640.5. Samples: 1414793560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:29:46,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 07:29:49,159][12883] Updated weights for policy 0, policy_version 86353 (0.0023) +[2024-06-18 07:29:51,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 1414955008. Throughput: 0: 42693.2. Samples: 1415046400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:29:51,994][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 07:29:52,274][12883] Updated weights for policy 0, policy_version 86363 (0.0036) +[2024-06-18 07:29:56,798][12883] Updated weights for policy 0, policy_version 86373 (0.0027) +[2024-06-18 07:29:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1415135232. Throughput: 0: 42766.5. Samples: 1415311520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 07:29:56,994][12645] Avg episode reward: [(0, '0.248')] +[2024-06-18 07:30:00,578][12883] Updated weights for policy 0, policy_version 86383 (0.0036) +[2024-06-18 07:30:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1415348224. Throughput: 0: 42620.9. Samples: 1415431320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 07:30:01,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 07:30:04,586][12883] Updated weights for policy 0, policy_version 86393 (0.0051) +[2024-06-18 07:30:06,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1415593984. Throughput: 0: 42560.1. Samples: 1415687340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 07:30:06,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 07:30:08,365][12883] Updated weights for policy 0, policy_version 86403 (0.0030) +[2024-06-18 07:30:11,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42325.3, 300 sec: 42598.1). Total num frames: 1415774208. Throughput: 0: 42536.6. Samples: 1415948100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 07:30:11,996][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 07:30:12,129][12883] Updated weights for policy 0, policy_version 86413 (0.0038) +[2024-06-18 07:30:16,091][12883] Updated weights for policy 0, policy_version 86423 (0.0029) +[2024-06-18 07:30:16,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1415987200. Throughput: 0: 42456.4. Samples: 1416067980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 07:30:16,994][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 07:30:19,919][12883] Updated weights for policy 0, policy_version 86433 (0.0023) +[2024-06-18 07:30:21,546][12862] Signal inference workers to stop experience collection... 
(20500 times) +[2024-06-18 07:30:21,547][12862] Signal inference workers to resume experience collection... (20500 times) +[2024-06-18 07:30:21,570][12883] InferenceWorker_p0-w0: stopping experience collection (20500 times) +[2024-06-18 07:30:21,570][12883] InferenceWorker_p0-w0: resuming experience collection (20500 times) +[2024-06-18 07:30:21,994][12645] Fps is (10 sec: 45885.8, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1416232960. Throughput: 0: 42490.7. Samples: 1416322080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 07:30:21,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 07:30:23,587][12883] Updated weights for policy 0, policy_version 86443 (0.0033) +[2024-06-18 07:30:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1416413184. Throughput: 0: 42511.5. Samples: 1416583680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 07:30:26,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 07:30:27,463][12883] Updated weights for policy 0, policy_version 86453 (0.0030) +[2024-06-18 07:30:31,287][12883] Updated weights for policy 0, policy_version 86463 (0.0039) +[2024-06-18 07:30:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1416626176. Throughput: 0: 42403.2. Samples: 1416701700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 07:30:31,994][12645] Avg episode reward: [(0, '0.494')] +[2024-06-18 07:30:35,005][12883] Updated weights for policy 0, policy_version 86473 (0.0040) +[2024-06-18 07:30:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1416871936. Throughput: 0: 42663.6. Samples: 1416966260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 07:30:36,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 07:30:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000086479_1416871936.pth... 
+[2024-06-18 07:30:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000085853_1406615552.pth +[2024-06-18 07:30:38,872][12883] Updated weights for policy 0, policy_version 86483 (0.0032) +[2024-06-18 07:30:41,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1417052160. Throughput: 0: 42516.4. Samples: 1417224760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 07:30:41,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 07:30:42,741][12883] Updated weights for policy 0, policy_version 86493 (0.0031) +[2024-06-18 07:30:46,679][12883] Updated weights for policy 0, policy_version 86503 (0.0033) +[2024-06-18 07:30:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1417265152. Throughput: 0: 42639.5. Samples: 1417350100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 07:30:46,994][12645] Avg episode reward: [(0, '0.236')] +[2024-06-18 07:30:50,335][12883] Updated weights for policy 0, policy_version 86513 (0.0036) +[2024-06-18 07:30:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1417494528. Throughput: 0: 42714.1. Samples: 1417609480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:30:51,994][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 07:30:54,178][12883] Updated weights for policy 0, policy_version 86523 (0.0028) +[2024-06-18 07:30:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1417707520. Throughput: 0: 42710.0. Samples: 1417869960. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:30:56,994][12645] Avg episode reward: [(0, '0.217')] +[2024-06-18 07:30:57,799][12883] Updated weights for policy 0, policy_version 86533 (0.0027) +[2024-06-18 07:31:01,732][12883] Updated weights for policy 0, policy_version 86543 (0.0028) +[2024-06-18 07:31:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1417920512. Throughput: 0: 42847.9. Samples: 1417996140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:31:01,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 07:31:05,406][12883] Updated weights for policy 0, policy_version 86553 (0.0046) +[2024-06-18 07:31:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1418133504. Throughput: 0: 42940.3. Samples: 1418254400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:31:06,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 07:31:09,634][12883] Updated weights for policy 0, policy_version 86563 (0.0033) +[2024-06-18 07:31:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42873.0, 300 sec: 42598.4). Total num frames: 1418346496. Throughput: 0: 42910.1. Samples: 1418514640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:31:11,994][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 07:31:12,894][12883] Updated weights for policy 0, policy_version 86573 (0.0027) +[2024-06-18 07:31:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.9). Total num frames: 1418559488. Throughput: 0: 43137.3. Samples: 1418642880. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:31:16,994][12645] Avg episode reward: [(0, '0.644')] +[2024-06-18 07:31:17,123][12883] Updated weights for policy 0, policy_version 86583 (0.0039) +[2024-06-18 07:31:20,361][12883] Updated weights for policy 0, policy_version 86593 (0.0041) +[2024-06-18 07:31:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1418772480. Throughput: 0: 42872.0. Samples: 1418895500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:31:21,994][12645] Avg episode reward: [(0, '0.605')] +[2024-06-18 07:31:24,823][12883] Updated weights for policy 0, policy_version 86603 (0.0024) +[2024-06-18 07:31:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1419001856. Throughput: 0: 42951.2. Samples: 1419157560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:31:26,994][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 07:31:28,276][12883] Updated weights for policy 0, policy_version 86613 (0.0029) +[2024-06-18 07:31:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1419198464. Throughput: 0: 43008.1. Samples: 1419285460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:31:31,994][12645] Avg episode reward: [(0, '0.213')] +[2024-06-18 07:31:32,746][12883] Updated weights for policy 0, policy_version 86623 (0.0050) +[2024-06-18 07:31:35,774][12883] Updated weights for policy 0, policy_version 86633 (0.0034) +[2024-06-18 07:31:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1419427840. Throughput: 0: 42861.8. Samples: 1419538260. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:31:36,996][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 07:31:40,363][12883] Updated weights for policy 0, policy_version 86643 (0.0042) +[2024-06-18 07:31:40,368][12862] Signal inference workers to stop experience collection... (20550 times) +[2024-06-18 07:31:40,369][12862] Signal inference workers to resume experience collection... (20550 times) +[2024-06-18 07:31:40,389][12883] InferenceWorker_p0-w0: stopping experience collection (20550 times) +[2024-06-18 07:31:40,389][12883] InferenceWorker_p0-w0: resuming experience collection (20550 times) +[2024-06-18 07:31:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1419640832. Throughput: 0: 42994.8. Samples: 1419804720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:31:41,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 07:31:43,336][12883] Updated weights for policy 0, policy_version 86653 (0.0031) +[2024-06-18 07:31:46,996][12645] Fps is (10 sec: 42589.0, 60 sec: 43143.0, 300 sec: 42764.7). Total num frames: 1419853824. Throughput: 0: 42923.7. Samples: 1419927800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 07:31:46,996][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 07:31:47,783][12883] Updated weights for policy 0, policy_version 86663 (0.0028) +[2024-06-18 07:31:50,925][12883] Updated weights for policy 0, policy_version 86673 (0.0047) +[2024-06-18 07:31:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1420083200. Throughput: 0: 42818.7. Samples: 1420181240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 07:31:51,994][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 07:31:55,335][12883] Updated weights for policy 0, policy_version 86683 (0.0041) +[2024-06-18 07:31:56,997][12645] Fps is (10 sec: 42592.9, 60 sec: 42869.0, 300 sec: 42653.4). Total num frames: 1420279808. 
Throughput: 0: 42878.5. Samples: 1420444320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 07:31:56,998][12645] Avg episode reward: [(0, '0.213')] +[2024-06-18 07:31:58,538][12883] Updated weights for policy 0, policy_version 86693 (0.0034) +[2024-06-18 07:32:01,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 1420492800. Throughput: 0: 42824.5. Samples: 1420570080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 07:32:02,005][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 07:32:02,795][12883] Updated weights for policy 0, policy_version 86703 (0.0044) +[2024-06-18 07:32:06,149][12883] Updated weights for policy 0, policy_version 86713 (0.0039) +[2024-06-18 07:32:06,994][12645] Fps is (10 sec: 45890.8, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1420738560. Throughput: 0: 43060.4. Samples: 1420833220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 07:32:06,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 07:32:10,697][12883] Updated weights for policy 0, policy_version 86723 (0.0034) +[2024-06-18 07:32:11,999][12645] Fps is (10 sec: 42583.8, 60 sec: 42867.5, 300 sec: 42653.1). Total num frames: 1420918784. Throughput: 0: 43033.7. Samples: 1421094320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 07:32:12,000][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 07:32:14,050][12883] Updated weights for policy 0, policy_version 86733 (0.0039) +[2024-06-18 07:32:16,994][12645] Fps is (10 sec: 40960.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1421148160. Throughput: 0: 42906.3. Samples: 1421216240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 07:32:16,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 07:32:18,418][12883] Updated weights for policy 0, policy_version 86743 (0.0027) +[2024-06-18 07:32:21,534][12883] Updated weights for policy 0, policy_version 86753 (0.0030) +[2024-06-18 07:32:21,994][12645] Fps is (10 sec: 47540.4, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 1421393920. Throughput: 0: 43126.7. Samples: 1421478960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 07:32:21,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 07:32:25,725][12883] Updated weights for policy 0, policy_version 86763 (0.0038) +[2024-06-18 07:32:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1421574144. Throughput: 0: 43131.7. Samples: 1421745640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 07:32:26,994][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 07:32:29,225][12883] Updated weights for policy 0, policy_version 86773 (0.0046) +[2024-06-18 07:32:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1421803520. Throughput: 0: 43110.6. Samples: 1421867680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 07:32:31,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 07:32:33,446][12883] Updated weights for policy 0, policy_version 86783 (0.0030) +[2024-06-18 07:32:36,685][12883] Updated weights for policy 0, policy_version 86793 (0.0042) +[2024-06-18 07:32:36,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1422032896. Throughput: 0: 43229.0. Samples: 1422126540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:32:36,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 07:32:37,112][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000086795_1422049280.pth... 
+[2024-06-18 07:32:37,162][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000086166_1411743744.pth +[2024-06-18 07:32:40,961][12883] Updated weights for policy 0, policy_version 86803 (0.0034) +[2024-06-18 07:32:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1422229504. Throughput: 0: 43252.3. Samples: 1422390520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:32:41,994][12645] Avg episode reward: [(0, '0.324')] +[2024-06-18 07:32:44,453][12883] Updated weights for policy 0, policy_version 86813 (0.0039) +[2024-06-18 07:32:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 43146.1, 300 sec: 42709.5). Total num frames: 1422442496. Throughput: 0: 43109.7. Samples: 1422509920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:32:46,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 07:32:48,639][12883] Updated weights for policy 0, policy_version 86823 (0.0033) +[2024-06-18 07:32:51,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 1422655488. Throughput: 0: 43150.0. Samples: 1422775060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:32:51,996][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 07:32:52,023][12883] Updated weights for policy 0, policy_version 86833 (0.0029) +[2024-06-18 07:32:56,111][12883] Updated weights for policy 0, policy_version 86843 (0.0035) +[2024-06-18 07:32:56,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43420.2, 300 sec: 42820.6). Total num frames: 1422884864. Throughput: 0: 42941.0. Samples: 1423026420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:32:56,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 07:32:59,922][12883] Updated weights for policy 0, policy_version 86853 (0.0037) +[2024-06-18 07:33:01,994][12645] Fps is (10 sec: 42607.8, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 1423081472. Throughput: 0: 43227.0. 
Samples: 1423161460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:33:01,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 07:33:02,567][12862] Signal inference workers to stop experience collection... (20600 times) +[2024-06-18 07:33:02,568][12862] Signal inference workers to resume experience collection... (20600 times) +[2024-06-18 07:33:02,583][12883] InferenceWorker_p0-w0: stopping experience collection (20600 times) +[2024-06-18 07:33:02,611][12883] InferenceWorker_p0-w0: resuming experience collection (20600 times) +[2024-06-18 07:33:03,666][12883] Updated weights for policy 0, policy_version 86863 (0.0032) +[2024-06-18 07:33:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 1423294464. Throughput: 0: 43215.2. Samples: 1423423640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:33:06,994][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 07:33:07,310][12883] Updated weights for policy 0, policy_version 86873 (0.0032) +[2024-06-18 07:33:11,141][12883] Updated weights for policy 0, policy_version 86883 (0.0047) +[2024-06-18 07:33:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43148.6, 300 sec: 42820.5). Total num frames: 1423507456. Throughput: 0: 42939.0. Samples: 1423677900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:33:11,994][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 07:33:14,817][12883] Updated weights for policy 0, policy_version 86893 (0.0030) +[2024-06-18 07:33:16,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 1423736832. Throughput: 0: 43069.7. Samples: 1423805820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:33:16,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 07:33:19,157][12883] Updated weights for policy 0, policy_version 86903 (0.0036) +[2024-06-18 07:33:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). 
Total num frames: 1423949824. Throughput: 0: 43119.1. Samples: 1424066900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:33:21,994][12645] Avg episode reward: [(0, '0.296')] +[2024-06-18 07:33:22,298][12883] Updated weights for policy 0, policy_version 86913 (0.0036) +[2024-06-18 07:33:26,641][12883] Updated weights for policy 0, policy_version 86923 (0.0028) +[2024-06-18 07:33:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1424146432. Throughput: 0: 42973.8. Samples: 1424324340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:33:26,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 07:33:30,391][12883] Updated weights for policy 0, policy_version 86933 (0.0024) +[2024-06-18 07:33:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 1424375808. Throughput: 0: 43102.6. Samples: 1424449540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:33:31,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 07:33:34,196][12883] Updated weights for policy 0, policy_version 86943 (0.0031) +[2024-06-18 07:33:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1424588800. Throughput: 0: 42954.7. Samples: 1424707920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:33:36,994][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 07:33:37,904][12883] Updated weights for policy 0, policy_version 86953 (0.0038) +[2024-06-18 07:33:41,969][12883] Updated weights for policy 0, policy_version 86963 (0.0033) +[2024-06-18 07:33:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1424801792. Throughput: 0: 43155.0. Samples: 1424968400. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:33:41,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 07:33:45,438][12883] Updated weights for policy 0, policy_version 86973 (0.0034) +[2024-06-18 07:33:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1425014784. Throughput: 0: 42884.5. Samples: 1425091260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:33:46,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 07:33:49,922][12883] Updated weights for policy 0, policy_version 86983 (0.0041) +[2024-06-18 07:33:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 1425244160. Throughput: 0: 42808.4. Samples: 1425350020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:33:51,994][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 07:33:53,023][12883] Updated weights for policy 0, policy_version 86993 (0.0032) +[2024-06-18 07:33:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1425424384. Throughput: 0: 42755.2. Samples: 1425601880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:33:56,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 07:33:57,540][12883] Updated weights for policy 0, policy_version 87003 (0.0037) +[2024-06-18 07:34:00,613][12883] Updated weights for policy 0, policy_version 87013 (0.0033) +[2024-06-18 07:34:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1425653760. Throughput: 0: 42706.3. Samples: 1425727600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:34:01,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 07:34:05,060][12883] Updated weights for policy 0, policy_version 87023 (0.0031) +[2024-06-18 07:34:06,996][12645] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42820.5). Total num frames: 1425866752. Throughput: 0: 42766.7. Samples: 1425991500. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:34:06,996][12645] Avg episode reward: [(0, '0.412')] +[2024-06-18 07:34:08,117][12883] Updated weights for policy 0, policy_version 87033 (0.0034) +[2024-06-18 07:34:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1426079744. Throughput: 0: 42722.3. Samples: 1426246840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:34:11,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 07:34:12,644][12883] Updated weights for policy 0, policy_version 87043 (0.0031) +[2024-06-18 07:34:16,114][12883] Updated weights for policy 0, policy_version 87053 (0.0021) +[2024-06-18 07:34:16,994][12645] Fps is (10 sec: 44246.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1426309120. Throughput: 0: 42904.9. Samples: 1426380260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:34:16,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 07:34:20,262][12883] Updated weights for policy 0, policy_version 87063 (0.0037) +[2024-06-18 07:34:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1426522112. Throughput: 0: 42887.5. Samples: 1426637860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 07:34:21,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 07:34:24,037][12883] Updated weights for policy 0, policy_version 87073 (0.0042) +[2024-06-18 07:34:24,405][12862] Signal inference workers to stop experience collection... (20650 times) +[2024-06-18 07:34:24,460][12883] InferenceWorker_p0-w0: stopping experience collection (20650 times) +[2024-06-18 07:34:24,520][12862] Signal inference workers to resume experience collection... (20650 times) +[2024-06-18 07:34:24,520][12883] InferenceWorker_p0-w0: resuming experience collection (20650 times) +[2024-06-18 07:34:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1426735104. 
Throughput: 0: 42864.8. Samples: 1426897320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 07:34:26,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 07:34:27,798][12883] Updated weights for policy 0, policy_version 87083 (0.0026) +[2024-06-18 07:34:31,437][12883] Updated weights for policy 0, policy_version 87093 (0.0031) +[2024-06-18 07:34:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1426948096. Throughput: 0: 43016.0. Samples: 1427026980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 07:34:31,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 07:34:35,345][12883] Updated weights for policy 0, policy_version 87103 (0.0038) +[2024-06-18 07:34:37,000][12645] Fps is (10 sec: 44209.3, 60 sec: 43139.9, 300 sec: 42986.2). Total num frames: 1427177472. Throughput: 0: 43049.9. Samples: 1427287540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 07:34:37,001][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 07:34:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000087108_1427177472.pth... +[2024-06-18 07:34:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000086479_1416871936.pth +[2024-06-18 07:34:38,939][12883] Updated weights for policy 0, policy_version 87113 (0.0048) +[2024-06-18 07:34:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1427390464. Throughput: 0: 43133.7. Samples: 1427542900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 07:34:41,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 07:34:42,871][12883] Updated weights for policy 0, policy_version 87123 (0.0042) +[2024-06-18 07:34:46,457][12883] Updated weights for policy 0, policy_version 87133 (0.0042) +[2024-06-18 07:34:46,994][12645] Fps is (10 sec: 40985.4, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 1427587072. Throughput: 0: 43143.5. 
Samples: 1427669060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 07:34:47,000][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 07:34:50,419][12883] Updated weights for policy 0, policy_version 87143 (0.0034) +[2024-06-18 07:34:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 1427816448. Throughput: 0: 43048.3. Samples: 1427928580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 07:34:51,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 07:34:54,008][12883] Updated weights for policy 0, policy_version 87153 (0.0029) +[2024-06-18 07:34:56,994][12645] Fps is (10 sec: 44237.7, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 1428029440. Throughput: 0: 43028.5. Samples: 1428183120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 07:34:56,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 07:34:58,308][12883] Updated weights for policy 0, policy_version 87163 (0.0027) +[2024-06-18 07:35:01,571][12883] Updated weights for policy 0, policy_version 87173 (0.0033) +[2024-06-18 07:35:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1428242432. Throughput: 0: 43076.9. Samples: 1428318720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 07:35:01,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 07:35:05,873][12883] Updated weights for policy 0, policy_version 87183 (0.0035) +[2024-06-18 07:35:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43146.1, 300 sec: 42987.5). Total num frames: 1428455424. Throughput: 0: 42957.2. Samples: 1428570940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 07:35:06,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 07:35:09,714][12883] Updated weights for policy 0, policy_version 87193 (0.0026) +[2024-06-18 07:35:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1428652032. Throughput: 0: 42885.5. 
Samples: 1428827160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 07:35:11,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 07:35:13,458][12883] Updated weights for policy 0, policy_version 87203 (0.0035) +[2024-06-18 07:35:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 1428865024. Throughput: 0: 42732.4. Samples: 1428949940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:35:16,994][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 07:35:17,276][12883] Updated weights for policy 0, policy_version 87213 (0.0044) +[2024-06-18 07:35:21,073][12883] Updated weights for policy 0, policy_version 87223 (0.0037) +[2024-06-18 07:35:21,994][12645] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 1429110784. Throughput: 0: 42721.9. Samples: 1429209760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:35:21,994][12645] Avg episode reward: [(0, '0.503')] +[2024-06-18 07:35:25,030][12883] Updated weights for policy 0, policy_version 87233 (0.0037) +[2024-06-18 07:35:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1429291008. Throughput: 0: 42692.8. Samples: 1429464080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:35:26,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 07:35:28,707][12883] Updated weights for policy 0, policy_version 87243 (0.0036) +[2024-06-18 07:35:31,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1429520384. Throughput: 0: 42695.3. Samples: 1429590340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:35:31,994][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 07:35:32,595][12883] Updated weights for policy 0, policy_version 87253 (0.0038) +[2024-06-18 07:35:36,369][12883] Updated weights for policy 0, policy_version 87263 (0.0029) +[2024-06-18 07:35:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42329.7, 300 sec: 42931.6). Total num frames: 1429716992. Throughput: 0: 42660.4. Samples: 1429848300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:35:36,995][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 07:35:40,721][12883] Updated weights for policy 0, policy_version 87273 (0.0028) +[2024-06-18 07:35:40,994][12862] Signal inference workers to stop experience collection... (20700 times) +[2024-06-18 07:35:40,994][12862] Signal inference workers to resume experience collection... (20700 times) +[2024-06-18 07:35:41,034][12883] InferenceWorker_p0-w0: stopping experience collection (20700 times) +[2024-06-18 07:35:41,034][12883] InferenceWorker_p0-w0: resuming experience collection (20700 times) +[2024-06-18 07:35:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 1429946368. Throughput: 0: 42591.0. Samples: 1430099720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:35:41,994][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 07:35:44,489][12883] Updated weights for policy 0, policy_version 87283 (0.0042) +[2024-06-18 07:35:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1430159360. Throughput: 0: 42417.8. Samples: 1430227520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:35:46,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 07:35:48,525][12883] Updated weights for policy 0, policy_version 87293 (0.0042) +[2024-06-18 07:35:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1430355968. 
Throughput: 0: 42340.5. Samples: 1430476260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:35:51,996][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 07:35:52,307][12883] Updated weights for policy 0, policy_version 87303 (0.0045) +[2024-06-18 07:35:56,233][12883] Updated weights for policy 0, policy_version 87313 (0.0035) +[2024-06-18 07:35:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1430568960. Throughput: 0: 42175.1. Samples: 1430725040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:35:56,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 07:36:00,024][12883] Updated weights for policy 0, policy_version 87323 (0.0039) +[2024-06-18 07:36:01,999][12645] Fps is (10 sec: 42575.3, 60 sec: 42321.5, 300 sec: 42875.3). Total num frames: 1430781952. Throughput: 0: 42280.2. Samples: 1430852780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:36:01,999][12645] Avg episode reward: [(0, '0.603')] +[2024-06-18 07:36:03,948][12883] Updated weights for policy 0, policy_version 87333 (0.0037) +[2024-06-18 07:36:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 1430978560. Throughput: 0: 42084.6. Samples: 1431103560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 07:36:06,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 07:36:07,751][12883] Updated weights for policy 0, policy_version 87343 (0.0024) +[2024-06-18 07:36:11,483][12883] Updated weights for policy 0, policy_version 87353 (0.0034) +[2024-06-18 07:36:11,994][12645] Fps is (10 sec: 40982.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1431191552. Throughput: 0: 42050.8. Samples: 1431356360. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 07:36:11,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 07:36:15,397][12883] Updated weights for policy 0, policy_version 87363 (0.0033) +[2024-06-18 07:36:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1431420928. Throughput: 0: 42198.6. Samples: 1431489280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 07:36:16,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 07:36:19,113][12883] Updated weights for policy 0, policy_version 87373 (0.0037) +[2024-06-18 07:36:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 42765.0). Total num frames: 1431617536. Throughput: 0: 42098.8. Samples: 1431742740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 07:36:21,994][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 07:36:23,027][12883] Updated weights for policy 0, policy_version 87383 (0.0035) +[2024-06-18 07:36:26,859][12883] Updated weights for policy 0, policy_version 87393 (0.0036) +[2024-06-18 07:36:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1431846912. Throughput: 0: 42184.5. Samples: 1431998020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 07:36:26,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 07:36:30,511][12883] Updated weights for policy 0, policy_version 87403 (0.0035) +[2024-06-18 07:36:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1432059904. Throughput: 0: 42210.7. Samples: 1432127000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 07:36:31,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 07:36:34,716][12883] Updated weights for policy 0, policy_version 87413 (0.0040) +[2024-06-18 07:36:36,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1432240128. Throughput: 0: 42362.2. Samples: 1432382560. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 07:36:36,998][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 07:36:37,053][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000087418_1432256512.pth... +[2024-06-18 07:36:37,114][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000086795_1422049280.pth +[2024-06-18 07:36:38,370][12883] Updated weights for policy 0, policy_version 87423 (0.0037) +[2024-06-18 07:36:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42765.4). Total num frames: 1432469504. Throughput: 0: 42461.8. Samples: 1432635820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 07:36:41,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 07:36:42,348][12883] Updated weights for policy 0, policy_version 87433 (0.0028) +[2024-06-18 07:36:46,082][12883] Updated weights for policy 0, policy_version 87443 (0.0037) +[2024-06-18 07:36:46,994][12645] Fps is (10 sec: 47514.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1432715264. Throughput: 0: 42642.1. Samples: 1432771440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 07:36:46,994][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 07:36:49,837][12883] Updated weights for policy 0, policy_version 87453 (0.0037) +[2024-06-18 07:36:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42710.0). Total num frames: 1432879104. Throughput: 0: 42608.4. Samples: 1433020940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 07:36:51,994][12645] Avg episode reward: [(0, '0.324')] +[2024-06-18 07:36:53,790][12883] Updated weights for policy 0, policy_version 87463 (0.0029) +[2024-06-18 07:36:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 1433124864. Throughput: 0: 42673.3. Samples: 1433276660. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 07:36:56,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 07:36:57,415][12883] Updated weights for policy 0, policy_version 87473 (0.0040) +[2024-06-18 07:37:01,387][12883] Updated weights for policy 0, policy_version 87483 (0.0049) +[2024-06-18 07:37:01,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42875.3, 300 sec: 42765.0). Total num frames: 1433354240. Throughput: 0: 42748.5. Samples: 1433412960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:37:01,994][12645] Avg episode reward: [(0, '0.380')] +[2024-06-18 07:37:05,558][12883] Updated weights for policy 0, policy_version 87493 (0.0046) +[2024-06-18 07:37:06,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42710.3). Total num frames: 1433518080. Throughput: 0: 42579.2. Samples: 1433658800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:37:06,994][12645] Avg episode reward: [(0, '0.510')] +[2024-06-18 07:37:09,158][12883] Updated weights for policy 0, policy_version 87503 (0.0027) +[2024-06-18 07:37:11,995][12645] Fps is (10 sec: 39314.7, 60 sec: 42597.1, 300 sec: 42709.2). Total num frames: 1433747456. Throughput: 0: 42490.2. Samples: 1433910160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:37:11,996][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 07:37:13,179][12883] Updated weights for policy 0, policy_version 87513 (0.0029) +[2024-06-18 07:37:16,756][12883] Updated weights for policy 0, policy_version 87523 (0.0035) +[2024-06-18 07:37:16,959][12862] Signal inference workers to stop experience collection... (20750 times) +[2024-06-18 07:37:16,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1433976832. Throughput: 0: 42590.7. Samples: 1434043580. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:37:16,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 07:37:16,997][12883] InferenceWorker_p0-w0: stopping experience collection (20750 times) +[2024-06-18 07:37:17,021][12862] Signal inference workers to resume experience collection... (20750 times) +[2024-06-18 07:37:17,023][12883] InferenceWorker_p0-w0: resuming experience collection (20750 times) +[2024-06-18 07:37:20,910][12883] Updated weights for policy 0, policy_version 87533 (0.0037) +[2024-06-18 07:37:21,994][12645] Fps is (10 sec: 39328.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1434140672. Throughput: 0: 42456.9. Samples: 1434293120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:37:21,994][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 07:37:24,590][12883] Updated weights for policy 0, policy_version 87543 (0.0036) +[2024-06-18 07:37:26,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 1434402816. Throughput: 0: 42336.0. Samples: 1434541040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:37:26,996][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 07:37:28,678][12883] Updated weights for policy 0, policy_version 87553 (0.0038) +[2024-06-18 07:37:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1434599424. Throughput: 0: 42393.2. Samples: 1434679140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:37:31,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 07:37:32,473][12883] Updated weights for policy 0, policy_version 87563 (0.0037) +[2024-06-18 07:37:36,994][12645] Fps is (10 sec: 37692.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1434779648. Throughput: 0: 42229.0. Samples: 1434921240. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:37:36,994][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 07:37:37,020][12883] Updated weights for policy 0, policy_version 87573 (0.0034) +[2024-06-18 07:37:40,218][12883] Updated weights for policy 0, policy_version 87583 (0.0032) +[2024-06-18 07:37:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1435041792. Throughput: 0: 42200.1. Samples: 1435175660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:37:41,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 07:37:44,659][12883] Updated weights for policy 0, policy_version 87593 (0.0036) +[2024-06-18 07:37:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 42543.2). Total num frames: 1435205632. Throughput: 0: 42051.2. Samples: 1435305260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:37:46,994][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 07:37:48,115][12883] Updated weights for policy 0, policy_version 87603 (0.0035) +[2024-06-18 07:37:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1435435008. Throughput: 0: 42058.7. Samples: 1435551440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 07:37:51,994][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 07:37:52,556][12883] Updated weights for policy 0, policy_version 87613 (0.0045) +[2024-06-18 07:37:55,825][12883] Updated weights for policy 0, policy_version 87623 (0.0024) +[2024-06-18 07:37:56,994][12645] Fps is (10 sec: 49151.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1435697152. Throughput: 0: 42266.5. Samples: 1435812080. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 07:37:56,994][12645] Avg episode reward: [(0, '0.228')] +[2024-06-18 07:38:00,353][12883] Updated weights for policy 0, policy_version 87633 (0.0034) +[2024-06-18 07:38:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 1435860992. Throughput: 0: 42262.7. Samples: 1435945400. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 07:38:01,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 07:38:03,461][12883] Updated weights for policy 0, policy_version 87643 (0.0030) +[2024-06-18 07:38:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1436090368. Throughput: 0: 42247.1. Samples: 1436194240. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 07:38:06,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 07:38:08,023][12883] Updated weights for policy 0, policy_version 87653 (0.0033) +[2024-06-18 07:38:11,321][12883] Updated weights for policy 0, policy_version 87663 (0.0030) +[2024-06-18 07:38:11,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42872.7, 300 sec: 42653.9). Total num frames: 1436319744. Throughput: 0: 42506.5. Samples: 1436453740. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 07:38:11,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 07:38:15,628][12883] Updated weights for policy 0, policy_version 87673 (0.0037) +[2024-06-18 07:38:16,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 1436483584. Throughput: 0: 42233.0. Samples: 1436579620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 07:38:16,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 07:38:19,074][12883] Updated weights for policy 0, policy_version 87683 (0.0036) +[2024-06-18 07:38:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1436729344. Throughput: 0: 42362.3. Samples: 1436827540. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 07:38:21,994][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 07:38:23,426][12883] Updated weights for policy 0, policy_version 87693 (0.0038) +[2024-06-18 07:38:26,794][12883] Updated weights for policy 0, policy_version 87703 (0.0049) +[2024-06-18 07:38:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 1436925952. Throughput: 0: 42619.2. Samples: 1437093520. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 07:38:26,994][12645] Avg episode reward: [(0, '0.252')] +[2024-06-18 07:38:30,889][12883] Updated weights for policy 0, policy_version 87713 (0.0042) +[2024-06-18 07:38:31,994][12645] Fps is (10 sec: 40959.0, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1437138944. Throughput: 0: 42422.5. Samples: 1437214280. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 07:38:31,994][12645] Avg episode reward: [(0, '0.276')] +[2024-06-18 07:38:34,751][12883] Updated weights for policy 0, policy_version 87723 (0.0036) +[2024-06-18 07:38:34,781][12862] Signal inference workers to stop experience collection... (20800 times) +[2024-06-18 07:38:34,782][12862] Signal inference workers to resume experience collection... (20800 times) +[2024-06-18 07:38:34,828][12883] InferenceWorker_p0-w0: stopping experience collection (20800 times) +[2024-06-18 07:38:34,828][12883] InferenceWorker_p0-w0: resuming experience collection (20800 times) +[2024-06-18 07:38:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 1437384704. Throughput: 0: 42589.3. Samples: 1437467960. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 07:38:36,994][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 07:38:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000087731_1437384704.pth... 
+[2024-06-18 07:38:37,070][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000087108_1427177472.pth +[2024-06-18 07:38:38,555][12883] Updated weights for policy 0, policy_version 87733 (0.0037) +[2024-06-18 07:38:41,994][12645] Fps is (10 sec: 39322.6, 60 sec: 41506.2, 300 sec: 42431.8). Total num frames: 1437532160. Throughput: 0: 42618.4. Samples: 1437729900. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 07:38:41,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 07:38:42,548][12883] Updated weights for policy 0, policy_version 87743 (0.0035) +[2024-06-18 07:38:46,134][12883] Updated weights for policy 0, policy_version 87753 (0.0043) +[2024-06-18 07:38:46,994][12645] Fps is (10 sec: 36044.6, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1437745152. Throughput: 0: 42216.4. Samples: 1437845140. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 07:38:46,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 07:38:50,070][12883] Updated weights for policy 0, policy_version 87763 (0.0041) +[2024-06-18 07:38:52,000][12645] Fps is (10 sec: 49120.8, 60 sec: 43140.0, 300 sec: 42708.6). Total num frames: 1438023680. Throughput: 0: 42479.9. Samples: 1438106100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) +[2024-06-18 07:38:52,000][12645] Avg episode reward: [(0, '0.214')] +[2024-06-18 07:38:53,999][12883] Updated weights for policy 0, policy_version 87773 (0.0023) +[2024-06-18 07:38:56,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41506.3, 300 sec: 42487.3). Total num frames: 1438187520. Throughput: 0: 42488.6. Samples: 1438365720. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) +[2024-06-18 07:38:56,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 07:38:57,725][12883] Updated weights for policy 0, policy_version 87783 (0.0036) +[2024-06-18 07:39:01,911][12883] Updated weights for policy 0, policy_version 87793 (0.0037) +[2024-06-18 07:39:01,994][12645] Fps is (10 sec: 37707.2, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 1438400512. Throughput: 0: 42376.9. Samples: 1438486580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) +[2024-06-18 07:39:01,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 07:39:05,603][12883] Updated weights for policy 0, policy_version 87803 (0.0034) +[2024-06-18 07:39:06,994][12645] Fps is (10 sec: 49151.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1438679040. Throughput: 0: 42729.7. Samples: 1438750380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) +[2024-06-18 07:39:06,998][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 07:39:09,535][12883] Updated weights for policy 0, policy_version 87813 (0.0025) +[2024-06-18 07:39:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1438842880. Throughput: 0: 42381.3. Samples: 1439000680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) +[2024-06-18 07:39:11,994][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 07:39:13,197][12883] Updated weights for policy 0, policy_version 87823 (0.0029) +[2024-06-18 07:39:16,994][12645] Fps is (10 sec: 36044.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1439039488. Throughput: 0: 42482.8. Samples: 1439126000. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) +[2024-06-18 07:39:16,994][12645] Avg episode reward: [(0, '0.190')] +[2024-06-18 07:39:17,178][12883] Updated weights for policy 0, policy_version 87833 (0.0037) +[2024-06-18 07:39:20,761][12883] Updated weights for policy 0, policy_version 87843 (0.0028) +[2024-06-18 07:39:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1439285248. Throughput: 0: 42665.3. Samples: 1439387900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) +[2024-06-18 07:39:21,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 07:39:24,938][12883] Updated weights for policy 0, policy_version 87853 (0.0035) +[2024-06-18 07:39:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1439481856. Throughput: 0: 42628.4. Samples: 1439648180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) +[2024-06-18 07:39:26,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 07:39:28,670][12883] Updated weights for policy 0, policy_version 87863 (0.0038) +[2024-06-18 07:39:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42432.7). Total num frames: 1439694848. Throughput: 0: 42776.5. Samples: 1439770080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) +[2024-06-18 07:39:31,994][12645] Avg episode reward: [(0, '0.438')] +[2024-06-18 07:39:32,460][12883] Updated weights for policy 0, policy_version 87873 (0.0036) +[2024-06-18 07:39:36,257][12883] Updated weights for policy 0, policy_version 87883 (0.0028) +[2024-06-18 07:39:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1439924224. Throughput: 0: 42776.0. Samples: 1440030760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) +[2024-06-18 07:39:36,995][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 07:39:37,187][12862] Signal inference workers to stop experience collection... 
(20850 times) +[2024-06-18 07:39:37,192][12862] Signal inference workers to resume experience collection... (20850 times) +[2024-06-18 07:39:37,239][12883] InferenceWorker_p0-w0: stopping experience collection (20850 times) +[2024-06-18 07:39:37,239][12883] InferenceWorker_p0-w0: resuming experience collection (20850 times) +[2024-06-18 07:39:40,386][12883] Updated weights for policy 0, policy_version 87893 (0.0035) +[2024-06-18 07:39:41,996][12645] Fps is (10 sec: 42588.8, 60 sec: 43142.8, 300 sec: 42487.0). Total num frames: 1440120832. Throughput: 0: 42648.9. Samples: 1440285020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 07:39:41,996][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 07:39:43,788][12883] Updated weights for policy 0, policy_version 87903 (0.0043) +[2024-06-18 07:39:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42431.8). Total num frames: 1440333824. Throughput: 0: 42642.5. Samples: 1440405500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 07:39:46,994][12645] Avg episode reward: [(0, '0.274')] +[2024-06-18 07:39:47,797][12883] Updated weights for policy 0, policy_version 87913 (0.0032) +[2024-06-18 07:39:51,370][12883] Updated weights for policy 0, policy_version 87923 (0.0036) +[2024-06-18 07:39:51,994][12645] Fps is (10 sec: 44247.2, 60 sec: 42329.8, 300 sec: 42487.3). Total num frames: 1440563200. Throughput: 0: 42557.0. Samples: 1440665440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 07:39:51,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 07:39:55,455][12883] Updated weights for policy 0, policy_version 87933 (0.0036) +[2024-06-18 07:39:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1440743424. Throughput: 0: 42677.3. Samples: 1440921160. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 07:39:56,995][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 07:39:59,031][12883] Updated weights for policy 0, policy_version 87943 (0.0033) +[2024-06-18 07:40:01,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 1440972800. Throughput: 0: 42523.1. Samples: 1441039540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 07:40:01,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 07:40:03,124][12883] Updated weights for policy 0, policy_version 87953 (0.0041) +[2024-06-18 07:40:06,610][12883] Updated weights for policy 0, policy_version 87963 (0.0041) +[2024-06-18 07:40:06,994][12645] Fps is (10 sec: 45873.6, 60 sec: 42052.0, 300 sec: 42542.8). Total num frames: 1441202176. Throughput: 0: 42632.1. Samples: 1441306360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 07:40:06,994][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 07:40:11,055][12883] Updated weights for policy 0, policy_version 87973 (0.0033) +[2024-06-18 07:40:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1441382400. Throughput: 0: 42569.8. Samples: 1441563820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 07:40:11,994][12645] Avg episode reward: [(0, '0.473')] +[2024-06-18 07:40:14,233][12883] Updated weights for policy 0, policy_version 87983 (0.0034) +[2024-06-18 07:40:16,994][12645] Fps is (10 sec: 39323.0, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1441595392. Throughput: 0: 42388.8. Samples: 1441677580. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 07:40:16,994][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 07:40:18,941][12883] Updated weights for policy 0, policy_version 87993 (0.0036) +[2024-06-18 07:40:21,904][12883] Updated weights for policy 0, policy_version 88003 (0.0041) +[2024-06-18 07:40:21,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1441841152. Throughput: 0: 42439.2. Samples: 1441940520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 07:40:21,995][12645] Avg episode reward: [(0, '0.401')] +[2024-06-18 07:40:26,438][12883] Updated weights for policy 0, policy_version 88013 (0.0033) +[2024-06-18 07:40:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1442037760. Throughput: 0: 42555.4. Samples: 1442199920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 07:40:26,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 07:40:29,461][12883] Updated weights for policy 0, policy_version 88023 (0.0032) +[2024-06-18 07:40:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1442250752. Throughput: 0: 42538.7. Samples: 1442319740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 07:40:31,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 07:40:33,995][12883] Updated weights for policy 0, policy_version 88033 (0.0045) +[2024-06-18 07:40:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1442480128. Throughput: 0: 42616.6. Samples: 1442583200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:40:36,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 07:40:37,051][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088043_1442496512.pth... 
+[2024-06-18 07:40:37,054][12883] Updated weights for policy 0, policy_version 88043 (0.0033) +[2024-06-18 07:40:37,107][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000087418_1432256512.pth +[2024-06-18 07:40:41,591][12883] Updated weights for policy 0, policy_version 88053 (0.0040) +[2024-06-18 07:40:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42599.9, 300 sec: 42431.8). Total num frames: 1442676736. Throughput: 0: 42539.1. Samples: 1442835420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:40:41,994][12645] Avg episode reward: [(0, '0.674')] +[2024-06-18 07:40:45,259][12883] Updated weights for policy 0, policy_version 88063 (0.0035) +[2024-06-18 07:40:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1442906112. Throughput: 0: 42737.7. Samples: 1442962740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:40:46,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 07:40:49,205][12883] Updated weights for policy 0, policy_version 88073 (0.0042) +[2024-06-18 07:40:52,000][12645] Fps is (10 sec: 42572.2, 60 sec: 42320.9, 300 sec: 42486.4). Total num frames: 1443102720. Throughput: 0: 42499.8. Samples: 1443219100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:40:52,000][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 07:40:52,771][12883] Updated weights for policy 0, policy_version 88083 (0.0030) +[2024-06-18 07:40:56,684][12883] Updated weights for policy 0, policy_version 88093 (0.0044) +[2024-06-18 07:40:56,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42488.1). Total num frames: 1443315712. Throughput: 0: 42368.0. Samples: 1443470380. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:40:56,994][12645] Avg episode reward: [(0, '0.152')] +[2024-06-18 07:41:00,486][12883] Updated weights for policy 0, policy_version 88103 (0.0030) +[2024-06-18 07:41:01,994][12645] Fps is (10 sec: 42624.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1443528704. Throughput: 0: 42670.2. Samples: 1443597740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:41:01,994][12645] Avg episode reward: [(0, '0.174')] +[2024-06-18 07:41:04,232][12883] Updated weights for policy 0, policy_version 88113 (0.0037) +[2024-06-18 07:41:05,274][12862] Signal inference workers to stop experience collection... (20900 times) +[2024-06-18 07:41:05,328][12862] Signal inference workers to resume experience collection... (20900 times) +[2024-06-18 07:41:05,329][12883] InferenceWorker_p0-w0: stopping experience collection (20900 times) +[2024-06-18 07:41:05,344][12883] InferenceWorker_p0-w0: resuming experience collection (20900 times) +[2024-06-18 07:41:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.7, 300 sec: 42542.9). Total num frames: 1443741696. Throughput: 0: 42587.7. Samples: 1443856960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:41:06,994][12645] Avg episode reward: [(0, '0.178')] +[2024-06-18 07:41:08,145][12883] Updated weights for policy 0, policy_version 88123 (0.0035) +[2024-06-18 07:41:11,999][12645] Fps is (10 sec: 42576.6, 60 sec: 42867.8, 300 sec: 42486.6). Total num frames: 1443954688. Throughput: 0: 42452.1. Samples: 1444110480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:41:11,999][12645] Avg episode reward: [(0, '0.682')] +[2024-06-18 07:41:12,128][12883] Updated weights for policy 0, policy_version 88133 (0.0026) +[2024-06-18 07:41:16,112][12883] Updated weights for policy 0, policy_version 88143 (0.0038) +[2024-06-18 07:41:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1444167680. 
Throughput: 0: 42683.6. Samples: 1444240500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:41:16,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 07:41:19,801][12883] Updated weights for policy 0, policy_version 88153 (0.0040) +[2024-06-18 07:41:21,994][12645] Fps is (10 sec: 40980.6, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1444364288. Throughput: 0: 42533.4. Samples: 1444497200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:41:21,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 07:41:23,553][12883] Updated weights for policy 0, policy_version 88163 (0.0036) +[2024-06-18 07:41:26,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1444593664. Throughput: 0: 42529.8. Samples: 1444749260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 07:41:26,994][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 07:41:27,508][12883] Updated weights for policy 0, policy_version 88173 (0.0029) +[2024-06-18 07:41:31,303][12883] Updated weights for policy 0, policy_version 88183 (0.0035) +[2024-06-18 07:41:31,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1444823040. Throughput: 0: 42665.5. Samples: 1444882680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 07:41:31,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 07:41:35,148][12883] Updated weights for policy 0, policy_version 88193 (0.0021) +[2024-06-18 07:41:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.5, 300 sec: 42487.3). Total num frames: 1445003264. Throughput: 0: 42562.4. Samples: 1445134140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 07:41:36,994][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 07:41:39,054][12883] Updated weights for policy 0, policy_version 88203 (0.0029) +[2024-06-18 07:41:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1445232640. 
Throughput: 0: 42495.5. Samples: 1445382680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 07:41:41,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 07:41:42,976][12883] Updated weights for policy 0, policy_version 88213 (0.0031) +[2024-06-18 07:41:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1445429248. Throughput: 0: 42688.0. Samples: 1445518700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 07:41:46,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 07:41:47,025][12883] Updated weights for policy 0, policy_version 88223 (0.0025) +[2024-06-18 07:41:50,689][12883] Updated weights for policy 0, policy_version 88233 (0.0033) +[2024-06-18 07:41:51,994][12645] Fps is (10 sec: 37683.4, 60 sec: 41783.6, 300 sec: 42320.7). Total num frames: 1445609472. Throughput: 0: 42410.7. Samples: 1445765440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 07:41:51,994][12645] Avg episode reward: [(0, '0.605')] +[2024-06-18 07:41:54,657][12883] Updated weights for policy 0, policy_version 88243 (0.0031) +[2024-06-18 07:41:56,996][12645] Fps is (10 sec: 45864.6, 60 sec: 42869.8, 300 sec: 42487.0). Total num frames: 1445888000. Throughput: 0: 42399.6. Samples: 1446018340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 07:41:56,997][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 07:41:58,300][12883] Updated weights for policy 0, policy_version 88253 (0.0038) +[2024-06-18 07:42:01,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1446084608. Throughput: 0: 42561.4. Samples: 1446155760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 07:42:01,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 07:42:02,319][12883] Updated weights for policy 0, policy_version 88263 (0.0037) +[2024-06-18 07:42:05,881][12883] Updated weights for policy 0, policy_version 88273 (0.0042) +[2024-06-18 07:42:06,994][12645] Fps is (10 sec: 37691.5, 60 sec: 42052.1, 300 sec: 42432.0). Total num frames: 1446264832. Throughput: 0: 42342.2. Samples: 1446402600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 07:42:06,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 07:42:09,947][12883] Updated weights for policy 0, policy_version 88283 (0.0021) +[2024-06-18 07:42:11,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42875.1, 300 sec: 42542.8). Total num frames: 1446526976. Throughput: 0: 42430.2. Samples: 1446658620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 07:42:11,994][12645] Avg episode reward: [(0, '0.280')] +[2024-06-18 07:42:13,414][12883] Updated weights for policy 0, policy_version 88293 (0.0030) +[2024-06-18 07:42:15,551][12862] Signal inference workers to stop experience collection... (20950 times) +[2024-06-18 07:42:15,551][12862] Signal inference workers to resume experience collection... (20950 times) +[2024-06-18 07:42:15,571][12883] InferenceWorker_p0-w0: stopping experience collection (20950 times) +[2024-06-18 07:42:15,571][12883] InferenceWorker_p0-w0: resuming experience collection (20950 times) +[2024-06-18 07:42:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1446707200. Throughput: 0: 42426.5. Samples: 1446791880. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 07:42:16,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 07:42:17,758][12883] Updated weights for policy 0, policy_version 88303 (0.0027) +[2024-06-18 07:42:21,577][12883] Updated weights for policy 0, policy_version 88313 (0.0034) +[2024-06-18 07:42:21,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42432.1). Total num frames: 1446920192. Throughput: 0: 42321.8. Samples: 1447038620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 07:42:21,994][12645] Avg episode reward: [(0, '0.097')] +[2024-06-18 07:42:25,823][12883] Updated weights for policy 0, policy_version 88323 (0.0041) +[2024-06-18 07:42:26,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1447165952. Throughput: 0: 42409.8. Samples: 1447291120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 07:42:26,994][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 07:42:29,204][12883] Updated weights for policy 0, policy_version 88333 (0.0039) +[2024-06-18 07:42:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1447362560. Throughput: 0: 42296.5. Samples: 1447422040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 07:42:31,994][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 07:42:33,631][12883] Updated weights for policy 0, policy_version 88343 (0.0039) +[2024-06-18 07:42:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1447559168. Throughput: 0: 42443.5. Samples: 1447675400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 07:42:36,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 07:42:37,079][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088353_1447575552.pth... 
+[2024-06-18 07:42:37,098][12883] Updated weights for policy 0, policy_version 88353 (0.0041) +[2024-06-18 07:42:37,133][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000087731_1437384704.pth +[2024-06-18 07:42:41,314][12883] Updated weights for policy 0, policy_version 88363 (0.0032) +[2024-06-18 07:42:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1447788544. Throughput: 0: 42533.3. Samples: 1447932240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 07:42:41,994][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 07:42:44,863][12883] Updated weights for policy 0, policy_version 88373 (0.0045) +[2024-06-18 07:42:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1448001536. Throughput: 0: 42326.2. Samples: 1448060440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 07:42:46,994][12645] Avg episode reward: [(0, '0.594')] +[2024-06-18 07:42:48,975][12883] Updated weights for policy 0, policy_version 88383 (0.0037) +[2024-06-18 07:42:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43417.5, 300 sec: 42431.8). Total num frames: 1448214528. Throughput: 0: 42428.1. Samples: 1448311860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 07:42:51,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 07:42:52,422][12883] Updated weights for policy 0, policy_version 88393 (0.0036) +[2024-06-18 07:42:56,968][12883] Updated weights for policy 0, policy_version 88403 (0.0046) +[2024-06-18 07:42:56,994][12645] Fps is (10 sec: 39320.8, 60 sec: 41780.7, 300 sec: 42487.3). Total num frames: 1448394752. Throughput: 0: 42455.9. Samples: 1448569140. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 07:42:56,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 07:43:00,384][12883] Updated weights for policy 0, policy_version 88413 (0.0038) +[2024-06-18 07:43:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1448624128. Throughput: 0: 42221.0. Samples: 1448691820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 07:43:01,994][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 07:43:04,478][12883] Updated weights for policy 0, policy_version 88423 (0.0031) +[2024-06-18 07:43:06,994][12645] Fps is (10 sec: 45876.1, 60 sec: 43144.7, 300 sec: 42487.3). Total num frames: 1448853504. Throughput: 0: 42476.0. Samples: 1448950040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 07:43:06,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 07:43:08,179][12883] Updated weights for policy 0, policy_version 88433 (0.0034) +[2024-06-18 07:43:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 1449033728. Throughput: 0: 42579.1. Samples: 1449207180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 07:43:11,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 07:43:12,142][12883] Updated weights for policy 0, policy_version 88443 (0.0036) +[2024-06-18 07:43:16,053][12883] Updated weights for policy 0, policy_version 88453 (0.0042) +[2024-06-18 07:43:16,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1449246720. Throughput: 0: 42326.6. Samples: 1449326740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 07:43:16,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 07:43:20,196][12883] Updated weights for policy 0, policy_version 88463 (0.0037) +[2024-06-18 07:43:21,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1449492480. Throughput: 0: 42454.3. Samples: 1449585840. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 07:43:21,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 07:43:23,783][12883] Updated weights for policy 0, policy_version 88473 (0.0032) +[2024-06-18 07:43:26,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41779.2, 300 sec: 42487.4). Total num frames: 1449672704. Throughput: 0: 42512.5. Samples: 1449845300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 07:43:26,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 07:43:27,596][12883] Updated weights for policy 0, policy_version 88483 (0.0026) +[2024-06-18 07:43:31,289][12883] Updated weights for policy 0, policy_version 88493 (0.0038) +[2024-06-18 07:43:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1449902080. Throughput: 0: 42456.3. Samples: 1449970980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 07:43:31,994][12645] Avg episode reward: [(0, '0.108')] +[2024-06-18 07:43:33,600][12862] Signal inference workers to stop experience collection... (21000 times) +[2024-06-18 07:43:33,600][12862] Signal inference workers to resume experience collection... (21000 times) +[2024-06-18 07:43:33,651][12883] InferenceWorker_p0-w0: stopping experience collection (21000 times) +[2024-06-18 07:43:33,652][12883] InferenceWorker_p0-w0: resuming experience collection (21000 times) +[2024-06-18 07:43:35,149][12883] Updated weights for policy 0, policy_version 88503 (0.0042) +[2024-06-18 07:43:36,994][12645] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 1450131456. Throughput: 0: 42712.8. Samples: 1450233940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 07:43:36,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 07:43:38,909][12883] Updated weights for policy 0, policy_version 88513 (0.0041) +[2024-06-18 07:43:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1450328064. 
Throughput: 0: 42609.8. Samples: 1450486580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 07:43:41,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 07:43:42,950][12883] Updated weights for policy 0, policy_version 88523 (0.0031) +[2024-06-18 07:43:46,573][12883] Updated weights for policy 0, policy_version 88533 (0.0042) +[2024-06-18 07:43:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42377.1). Total num frames: 1450524672. Throughput: 0: 42635.1. Samples: 1450610400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 07:43:46,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 07:43:50,635][12883] Updated weights for policy 0, policy_version 88543 (0.0037) +[2024-06-18 07:43:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1450770432. Throughput: 0: 42765.2. Samples: 1450874480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 07:43:51,994][12645] Avg episode reward: [(0, '0.234')] +[2024-06-18 07:43:54,265][12883] Updated weights for policy 0, policy_version 88553 (0.0038) +[2024-06-18 07:43:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1450967040. Throughput: 0: 42477.3. Samples: 1451118660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 07:43:56,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 07:43:58,382][12883] Updated weights for policy 0, policy_version 88563 (0.0028) +[2024-06-18 07:44:01,911][12883] Updated weights for policy 0, policy_version 88573 (0.0034) +[2024-06-18 07:44:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1451180032. Throughput: 0: 42659.1. Samples: 1451246400. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 07:44:01,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 07:44:06,051][12883] Updated weights for policy 0, policy_version 88583 (0.0035) +[2024-06-18 07:44:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1451393024. Throughput: 0: 42782.6. Samples: 1451511060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 07:44:06,994][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 07:44:09,481][12883] Updated weights for policy 0, policy_version 88593 (0.0041) +[2024-06-18 07:44:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1451606016. Throughput: 0: 42600.4. Samples: 1451762320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 07:44:11,994][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 07:44:13,725][12883] Updated weights for policy 0, policy_version 88603 (0.0028) +[2024-06-18 07:44:16,950][12883] Updated weights for policy 0, policy_version 88613 (0.0040) +[2024-06-18 07:44:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 1451835392. Throughput: 0: 42626.2. Samples: 1451889160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 07:44:16,994][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 07:44:21,262][12883] Updated weights for policy 0, policy_version 88623 (0.0035) +[2024-06-18 07:44:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1452032000. Throughput: 0: 42644.2. Samples: 1452152920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 07:44:21,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 07:44:24,813][12883] Updated weights for policy 0, policy_version 88633 (0.0027) +[2024-06-18 07:44:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1452244992. Throughput: 0: 42508.1. Samples: 1452399440. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 07:44:26,994][12645] Avg episode reward: [(0, '0.239')] +[2024-06-18 07:44:29,240][12883] Updated weights for policy 0, policy_version 88643 (0.0034) +[2024-06-18 07:44:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42487.4). Total num frames: 1452457984. Throughput: 0: 42602.3. Samples: 1452527500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 07:44:31,994][12645] Avg episode reward: [(0, '0.217')] +[2024-06-18 07:44:32,359][12883] Updated weights for policy 0, policy_version 88653 (0.0046) +[2024-06-18 07:44:36,838][12883] Updated weights for policy 0, policy_version 88663 (0.0041) +[2024-06-18 07:44:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42487.6). Total num frames: 1452654592. Throughput: 0: 42371.2. Samples: 1452781180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 07:44:36,994][12645] Avg episode reward: [(0, '0.170')] +[2024-06-18 07:44:37,140][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088665_1452687360.pth... +[2024-06-18 07:44:37,186][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088043_1442496512.pth +[2024-06-18 07:44:39,998][12883] Updated weights for policy 0, policy_version 88673 (0.0048) +[2024-06-18 07:44:41,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1452900352. Throughput: 0: 42619.0. Samples: 1453036520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 07:44:41,994][12645] Avg episode reward: [(0, '0.156')] +[2024-06-18 07:44:44,380][12883] Updated weights for policy 0, policy_version 88683 (0.0023) +[2024-06-18 07:44:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1453096960. Throughput: 0: 42568.5. Samples: 1453161980. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 07:44:46,994][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 07:44:47,798][12883] Updated weights for policy 0, policy_version 88693 (0.0037) +[2024-06-18 07:44:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 1453293568. Throughput: 0: 42421.2. Samples: 1453420020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 07:44:51,994][12645] Avg episode reward: [(0, '0.602')] +[2024-06-18 07:44:52,076][12883] Updated weights for policy 0, policy_version 88703 (0.0036) +[2024-06-18 07:44:55,775][12883] Updated weights for policy 0, policy_version 88713 (0.0028) +[2024-06-18 07:44:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1453522944. Throughput: 0: 42496.0. Samples: 1453674640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 07:44:56,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 07:44:57,620][12862] Signal inference workers to stop experience collection... (21050 times) +[2024-06-18 07:44:57,620][12862] Signal inference workers to resume experience collection... (21050 times) +[2024-06-18 07:44:57,652][12883] InferenceWorker_p0-w0: stopping experience collection (21050 times) +[2024-06-18 07:44:57,652][12883] InferenceWorker_p0-w0: resuming experience collection (21050 times) +[2024-06-18 07:44:59,688][12883] Updated weights for policy 0, policy_version 88723 (0.0030) +[2024-06-18 07:45:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1453719552. Throughput: 0: 42531.5. Samples: 1453803080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 07:45:01,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 07:45:03,251][12883] Updated weights for policy 0, policy_version 88733 (0.0022) +[2024-06-18 07:45:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1453916160. 
Throughput: 0: 42389.7. Samples: 1454060460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) +[2024-06-18 07:45:06,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 07:45:07,382][12883] Updated weights for policy 0, policy_version 88743 (0.0032) +[2024-06-18 07:45:10,845][12883] Updated weights for policy 0, policy_version 88753 (0.0030) +[2024-06-18 07:45:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1454145536. Throughput: 0: 42560.9. Samples: 1454314680. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) +[2024-06-18 07:45:11,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 07:45:15,140][12883] Updated weights for policy 0, policy_version 88763 (0.0039) +[2024-06-18 07:45:16,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1454374912. Throughput: 0: 42675.4. Samples: 1454447900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) +[2024-06-18 07:45:17,006][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 07:45:19,153][12883] Updated weights for policy 0, policy_version 88773 (0.0037) +[2024-06-18 07:45:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1454571520. Throughput: 0: 42610.2. Samples: 1454698640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) +[2024-06-18 07:45:21,994][12645] Avg episode reward: [(0, '0.171')] +[2024-06-18 07:45:22,645][12883] Updated weights for policy 0, policy_version 88783 (0.0031) +[2024-06-18 07:45:26,619][12883] Updated weights for policy 0, policy_version 88793 (0.0041) +[2024-06-18 07:45:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1454784512. Throughput: 0: 42757.5. Samples: 1454960600. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) +[2024-06-18 07:45:26,994][12645] Avg episode reward: [(0, '0.133')] +[2024-06-18 07:45:30,237][12883] Updated weights for policy 0, policy_version 88803 (0.0035) +[2024-06-18 07:45:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1455030272. Throughput: 0: 42861.7. Samples: 1455090760. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) +[2024-06-18 07:45:31,994][12645] Avg episode reward: [(0, '0.359')] +[2024-06-18 07:45:34,276][12883] Updated weights for policy 0, policy_version 88813 (0.0038) +[2024-06-18 07:45:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1455226880. Throughput: 0: 42820.6. Samples: 1455346940. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) +[2024-06-18 07:45:36,994][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 07:45:38,017][12883] Updated weights for policy 0, policy_version 88823 (0.0030) +[2024-06-18 07:45:41,826][12883] Updated weights for policy 0, policy_version 88833 (0.0040) +[2024-06-18 07:45:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1455439872. Throughput: 0: 42657.4. Samples: 1455594220. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) +[2024-06-18 07:45:41,994][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 07:45:45,846][12883] Updated weights for policy 0, policy_version 88843 (0.0031) +[2024-06-18 07:45:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42599.3). Total num frames: 1455669248. Throughput: 0: 42662.6. Samples: 1455722900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) +[2024-06-18 07:45:46,994][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 07:45:49,198][12883] Updated weights for policy 0, policy_version 88853 (0.0039) +[2024-06-18 07:45:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42542.8). Total num frames: 1455865856. Throughput: 0: 42743.1. Samples: 1455983900. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) +[2024-06-18 07:45:51,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 07:45:53,693][12883] Updated weights for policy 0, policy_version 88863 (0.0037) +[2024-06-18 07:45:56,798][12883] Updated weights for policy 0, policy_version 88873 (0.0042) +[2024-06-18 07:45:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1456095232. Throughput: 0: 42652.5. Samples: 1456234040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) +[2024-06-18 07:45:56,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 07:46:01,322][12883] Updated weights for policy 0, policy_version 88883 (0.0038) +[2024-06-18 07:46:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1456275456. Throughput: 0: 42589.4. Samples: 1456364420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) +[2024-06-18 07:46:01,994][12645] Avg episode reward: [(0, '0.615')] +[2024-06-18 07:46:04,699][12883] Updated weights for policy 0, policy_version 88893 (0.0023) +[2024-06-18 07:46:06,460][12862] Signal inference workers to stop experience collection... (21100 times) +[2024-06-18 07:46:06,511][12883] InferenceWorker_p0-w0: stopping experience collection (21100 times) +[2024-06-18 07:46:06,519][12862] Signal inference workers to resume experience collection... (21100 times) +[2024-06-18 07:46:06,526][12883] InferenceWorker_p0-w0: resuming experience collection (21100 times) +[2024-06-18 07:46:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42599.1). Total num frames: 1456521216. Throughput: 0: 42773.7. Samples: 1456623460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) +[2024-06-18 07:46:06,994][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 07:46:09,355][12883] Updated weights for policy 0, policy_version 88903 (0.0032) +[2024-06-18 07:46:11,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1456717824. 
Throughput: 0: 42534.5. Samples: 1456874660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) +[2024-06-18 07:46:11,995][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 07:46:12,265][12883] Updated weights for policy 0, policy_version 88913 (0.0027) +[2024-06-18 07:46:16,763][12883] Updated weights for policy 0, policy_version 88923 (0.0027) +[2024-06-18 07:46:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1456914432. Throughput: 0: 42624.9. Samples: 1457008880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) +[2024-06-18 07:46:16,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 07:46:19,801][12883] Updated weights for policy 0, policy_version 88933 (0.0035) +[2024-06-18 07:46:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1457127424. Throughput: 0: 42608.4. Samples: 1457264320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) +[2024-06-18 07:46:21,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 07:46:24,235][12883] Updated weights for policy 0, policy_version 88943 (0.0029) +[2024-06-18 07:46:26,994][12645] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 1457373184. Throughput: 0: 42718.9. Samples: 1457516580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) +[2024-06-18 07:46:26,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 07:46:27,796][12883] Updated weights for policy 0, policy_version 88953 (0.0028) +[2024-06-18 07:46:31,841][12883] Updated weights for policy 0, policy_version 88963 (0.0050) +[2024-06-18 07:46:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1457569792. Throughput: 0: 42815.3. Samples: 1457649580. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) +[2024-06-18 07:46:31,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 07:46:35,320][12883] Updated weights for policy 0, policy_version 88973 (0.0041) +[2024-06-18 07:46:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1457766400. Throughput: 0: 42556.5. Samples: 1457898940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) +[2024-06-18 07:46:36,994][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 07:46:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088975_1457766400.pth... +[2024-06-18 07:46:37,080][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088353_1447575552.pth +[2024-06-18 07:46:39,807][12883] Updated weights for policy 0, policy_version 88983 (0.0037) +[2024-06-18 07:46:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1457979392. Throughput: 0: 42608.4. Samples: 1458151420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) +[2024-06-18 07:46:41,994][12645] Avg episode reward: [(0, '0.283')] +[2024-06-18 07:46:42,894][12883] Updated weights for policy 0, policy_version 88993 (0.0032) +[2024-06-18 07:46:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1458192384. Throughput: 0: 42576.4. Samples: 1458280360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) +[2024-06-18 07:46:46,994][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 07:46:47,676][12883] Updated weights for policy 0, policy_version 89003 (0.0032) +[2024-06-18 07:46:50,719][12883] Updated weights for policy 0, policy_version 89013 (0.0042) +[2024-06-18 07:46:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42432.1). Total num frames: 1458405376. Throughput: 0: 42332.0. Samples: 1458528400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:46:51,994][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 07:46:55,195][12883] Updated weights for policy 0, policy_version 89023 (0.0027) +[2024-06-18 07:46:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1458618368. Throughput: 0: 42518.7. Samples: 1458788000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:46:57,000][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 07:46:58,751][12883] Updated weights for policy 0, policy_version 89033 (0.0036) +[2024-06-18 07:47:01,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1458847744. Throughput: 0: 42397.6. Samples: 1458916780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:47:01,995][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 07:47:02,754][12883] Updated weights for policy 0, policy_version 89043 (0.0025) +[2024-06-18 07:47:06,610][12883] Updated weights for policy 0, policy_version 89053 (0.0025) +[2024-06-18 07:47:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1459044352. Throughput: 0: 42389.3. Samples: 1459171840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:47:06,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 07:47:07,642][12862] Signal inference workers to stop experience collection... (21150 times) +[2024-06-18 07:47:07,693][12883] InferenceWorker_p0-w0: stopping experience collection (21150 times) +[2024-06-18 07:47:07,752][12862] Signal inference workers to resume experience collection... (21150 times) +[2024-06-18 07:47:07,752][12883] InferenceWorker_p0-w0: resuming experience collection (21150 times) +[2024-06-18 07:47:10,589][12883] Updated weights for policy 0, policy_version 89063 (0.0026) +[2024-06-18 07:47:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1459273728. 
Throughput: 0: 42396.6. Samples: 1459424420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:47:11,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 07:47:14,250][12883] Updated weights for policy 0, policy_version 89073 (0.0030) +[2024-06-18 07:47:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1459486720. Throughput: 0: 42289.3. Samples: 1459552600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:47:16,994][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 07:47:18,142][12883] Updated weights for policy 0, policy_version 89083 (0.0040) +[2024-06-18 07:47:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1459683328. Throughput: 0: 42479.9. Samples: 1459810540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:47:21,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 07:47:22,234][12883] Updated weights for policy 0, policy_version 89093 (0.0042) +[2024-06-18 07:47:25,826][12883] Updated weights for policy 0, policy_version 89103 (0.0043) +[2024-06-18 07:47:26,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1459912704. Throughput: 0: 42382.5. Samples: 1460058640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:47:26,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 07:47:30,134][12883] Updated weights for policy 0, policy_version 89113 (0.0044) +[2024-06-18 07:47:31,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1460109312. Throughput: 0: 42468.1. Samples: 1460191420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:47:31,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 07:47:33,400][12883] Updated weights for policy 0, policy_version 89123 (0.0041) +[2024-06-18 07:47:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1460322304. 
Throughput: 0: 42565.3. Samples: 1460443840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:47:36,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 07:47:38,179][12883] Updated weights for policy 0, policy_version 89133 (0.0034) +[2024-06-18 07:47:41,468][12883] Updated weights for policy 0, policy_version 89143 (0.0030) +[2024-06-18 07:47:41,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1460551680. Throughput: 0: 42308.5. Samples: 1460691880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 07:47:41,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 07:47:45,890][12883] Updated weights for policy 0, policy_version 89153 (0.0028) +[2024-06-18 07:47:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1460764672. Throughput: 0: 42385.8. Samples: 1460824140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 07:47:46,994][12645] Avg episode reward: [(0, '0.179')] +[2024-06-18 07:47:49,051][12883] Updated weights for policy 0, policy_version 89163 (0.0040) +[2024-06-18 07:47:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1460961280. Throughput: 0: 42382.2. Samples: 1461079040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 07:47:51,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 07:47:53,637][12883] Updated weights for policy 0, policy_version 89173 (0.0039) +[2024-06-18 07:47:56,678][12883] Updated weights for policy 0, policy_version 89183 (0.0027) +[2024-06-18 07:47:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1461207040. Throughput: 0: 42389.8. Samples: 1461331960. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 07:47:57,004][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 07:48:01,363][12883] Updated weights for policy 0, policy_version 89193 (0.0035) +[2024-06-18 07:48:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1461387264. Throughput: 0: 42377.7. Samples: 1461459600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 07:48:01,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 07:48:04,305][12883] Updated weights for policy 0, policy_version 89203 (0.0033) +[2024-06-18 07:48:06,994][12645] Fps is (10 sec: 36044.1, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 1461567488. Throughput: 0: 42213.6. Samples: 1461710160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 07:48:06,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 07:48:09,085][12883] Updated weights for policy 0, policy_version 89213 (0.0043) +[2024-06-18 07:48:10,566][12862] Signal inference workers to stop experience collection... (21200 times) +[2024-06-18 07:48:10,612][12883] InferenceWorker_p0-w0: stopping experience collection (21200 times) +[2024-06-18 07:48:10,623][12862] Signal inference workers to resume experience collection... (21200 times) +[2024-06-18 07:48:10,636][12883] InferenceWorker_p0-w0: resuming experience collection (21200 times) +[2024-06-18 07:48:11,835][12883] Updated weights for policy 0, policy_version 89223 (0.0030) +[2024-06-18 07:48:11,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 1461829632. Throughput: 0: 42211.8. Samples: 1461958260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 07:48:11,996][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 07:48:16,574][12883] Updated weights for policy 0, policy_version 89233 (0.0044) +[2024-06-18 07:48:16,994][12645] Fps is (10 sec: 42599.4, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1461993472. 
Throughput: 0: 42371.4. Samples: 1462098140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 07:48:16,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 07:48:19,403][12883] Updated weights for policy 0, policy_version 89243 (0.0035) +[2024-06-18 07:48:21,994][12645] Fps is (10 sec: 37691.2, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1462206464. Throughput: 0: 42246.1. Samples: 1462344920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 07:48:21,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 07:48:24,340][12883] Updated weights for policy 0, policy_version 89253 (0.0032) +[2024-06-18 07:48:26,973][12883] Updated weights for policy 0, policy_version 89263 (0.0036) +[2024-06-18 07:48:26,996][12645] Fps is (10 sec: 49141.0, 60 sec: 42870.0, 300 sec: 42653.6). Total num frames: 1462484992. Throughput: 0: 42342.8. Samples: 1462597400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 07:48:26,997][12645] Avg episode reward: [(0, '0.059')] +[2024-06-18 07:48:31,866][12883] Updated weights for policy 0, policy_version 89273 (0.0033) +[2024-06-18 07:48:31,996][12645] Fps is (10 sec: 44227.6, 60 sec: 42323.7, 300 sec: 42431.5). Total num frames: 1462648832. Throughput: 0: 42541.5. Samples: 1462738600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 07:48:31,996][12645] Avg episode reward: [(0, '0.236')] +[2024-06-18 07:48:34,835][12883] Updated weights for policy 0, policy_version 89283 (0.0032) +[2024-06-18 07:48:36,994][12645] Fps is (10 sec: 37691.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1462861824. Throughput: 0: 42444.0. Samples: 1462989020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 07:48:36,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 07:48:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000089286_1462861824.pth... 
+[2024-06-18 07:48:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088665_1452687360.pth +[2024-06-18 07:48:39,545][12883] Updated weights for policy 0, policy_version 89293 (0.0029) +[2024-06-18 07:48:41,993][12645] Fps is (10 sec: 45886.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1463107584. Throughput: 0: 42362.9. Samples: 1463238280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 07:48:41,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 07:48:42,523][12883] Updated weights for policy 0, policy_version 89303 (0.0034) +[2024-06-18 07:48:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42376.3). Total num frames: 1463271424. Throughput: 0: 42499.1. Samples: 1463372060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 07:48:46,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 07:48:47,573][12883] Updated weights for policy 0, policy_version 89313 (0.0030) +[2024-06-18 07:48:50,551][12883] Updated weights for policy 0, policy_version 89323 (0.0027) +[2024-06-18 07:48:52,000][12645] Fps is (10 sec: 40933.9, 60 sec: 42594.0, 300 sec: 42542.0). Total num frames: 1463517184. Throughput: 0: 42537.0. Samples: 1463624580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 07:48:52,000][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 07:48:55,233][12883] Updated weights for policy 0, policy_version 89333 (0.0049) +[2024-06-18 07:48:56,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1463746560. Throughput: 0: 42575.1. Samples: 1463874040. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 07:48:56,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 07:48:58,256][12883] Updated weights for policy 0, policy_version 89343 (0.0037) +[2024-06-18 07:49:01,994][12645] Fps is (10 sec: 39345.6, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1463910400. Throughput: 0: 42316.3. 
Samples: 1464002380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 07:49:01,994][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 07:49:02,848][12883] Updated weights for policy 0, policy_version 89353 (0.0050) +[2024-06-18 07:49:05,891][12883] Updated weights for policy 0, policy_version 89363 (0.0032) +[2024-06-18 07:49:06,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 1464139776. Throughput: 0: 42458.7. Samples: 1464255560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 07:49:06,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 07:49:10,516][12883] Updated weights for policy 0, policy_version 89373 (0.0028) +[2024-06-18 07:49:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 1464369152. Throughput: 0: 42530.5. Samples: 1464511180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 07:49:11,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 07:49:12,290][12862] Signal inference workers to stop experience collection... (21250 times) +[2024-06-18 07:49:12,290][12862] Signal inference workers to resume experience collection... (21250 times) +[2024-06-18 07:49:12,307][12883] InferenceWorker_p0-w0: stopping experience collection (21250 times) +[2024-06-18 07:49:12,317][12883] InferenceWorker_p0-w0: resuming experience collection (21250 times) +[2024-06-18 07:49:13,768][12883] Updated weights for policy 0, policy_version 89383 (0.0032) +[2024-06-18 07:49:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1464549376. Throughput: 0: 42330.6. Samples: 1464643380. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 07:49:16,994][12645] Avg episode reward: [(0, '0.520')] +[2024-06-18 07:49:18,054][12883] Updated weights for policy 0, policy_version 89393 (0.0036) +[2024-06-18 07:49:21,233][12883] Updated weights for policy 0, policy_version 89403 (0.0043) +[2024-06-18 07:49:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 1464778752. Throughput: 0: 42395.6. Samples: 1464896820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 07:49:21,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 07:49:25,539][12883] Updated weights for policy 0, policy_version 89413 (0.0037) +[2024-06-18 07:49:26,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 1465024512. Throughput: 0: 42575.4. Samples: 1465154180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 07:49:26,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 07:49:28,744][12883] Updated weights for policy 0, policy_version 89423 (0.0042) +[2024-06-18 07:49:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 1465188352. Throughput: 0: 42561.8. Samples: 1465287340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) +[2024-06-18 07:49:31,994][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 07:49:33,276][12883] Updated weights for policy 0, policy_version 89433 (0.0039) +[2024-06-18 07:49:36,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1465417728. Throughput: 0: 42580.0. Samples: 1465540420. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) +[2024-06-18 07:49:36,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 07:49:37,122][12883] Updated weights for policy 0, policy_version 89443 (0.0024) +[2024-06-18 07:49:40,816][12883] Updated weights for policy 0, policy_version 89453 (0.0038) +[2024-06-18 07:49:41,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1465663488. Throughput: 0: 42800.8. Samples: 1465800080. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) +[2024-06-18 07:49:41,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 07:49:44,630][12883] Updated weights for policy 0, policy_version 89463 (0.0040) +[2024-06-18 07:49:46,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42487.4). Total num frames: 1465827328. Throughput: 0: 42772.2. Samples: 1465927120. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) +[2024-06-18 07:49:46,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 07:49:48,380][12883] Updated weights for policy 0, policy_version 89473 (0.0026) +[2024-06-18 07:49:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42329.6, 300 sec: 42487.3). Total num frames: 1466056704. Throughput: 0: 42686.6. Samples: 1466176460. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) +[2024-06-18 07:49:51,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 07:49:52,287][12883] Updated weights for policy 0, policy_version 89483 (0.0032) +[2024-06-18 07:49:55,968][12883] Updated weights for policy 0, policy_version 89493 (0.0030) +[2024-06-18 07:49:56,994][12645] Fps is (10 sec: 47513.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1466302464. Throughput: 0: 42816.5. Samples: 1466437920. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) +[2024-06-18 07:49:56,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 07:49:59,786][12883] Updated weights for policy 0, policy_version 89503 (0.0035) +[2024-06-18 07:50:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1466482688. Throughput: 0: 42793.3. Samples: 1466569080. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) +[2024-06-18 07:50:01,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 07:50:03,488][12883] Updated weights for policy 0, policy_version 89513 (0.0035) +[2024-06-18 07:50:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1466712064. Throughput: 0: 42814.9. Samples: 1466823500. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) +[2024-06-18 07:50:06,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 07:50:07,414][12883] Updated weights for policy 0, policy_version 89523 (0.0026) +[2024-06-18 07:50:11,332][12883] Updated weights for policy 0, policy_version 89533 (0.0043) +[2024-06-18 07:50:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1466925056. Throughput: 0: 42874.1. Samples: 1467083520. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) +[2024-06-18 07:50:11,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 07:50:15,033][12883] Updated weights for policy 0, policy_version 89543 (0.0034) +[2024-06-18 07:50:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1467138048. Throughput: 0: 42851.6. Samples: 1467215660. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) +[2024-06-18 07:50:16,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 07:50:18,939][12883] Updated weights for policy 0, policy_version 89553 (0.0027) +[2024-06-18 07:50:21,998][12645] Fps is (10 sec: 44216.6, 60 sec: 43141.2, 300 sec: 42653.3). Total num frames: 1467367424. Throughput: 0: 42788.2. Samples: 1467466080. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) +[2024-06-18 07:50:21,999][12645] Avg episode reward: [(0, '0.213')] +[2024-06-18 07:50:22,650][12883] Updated weights for policy 0, policy_version 89563 (0.0033) +[2024-06-18 07:50:26,664][12862] Signal inference workers to stop experience collection... (21300 times) +[2024-06-18 07:50:26,664][12862] Signal inference workers to resume experience collection... (21300 times) +[2024-06-18 07:50:26,680][12883] InferenceWorker_p0-w0: stopping experience collection (21300 times) +[2024-06-18 07:50:26,680][12883] InferenceWorker_p0-w0: resuming experience collection (21300 times) +[2024-06-18 07:50:26,812][12883] Updated weights for policy 0, policy_version 89573 (0.0029) +[2024-06-18 07:50:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1467564032. Throughput: 0: 42969.4. Samples: 1467733700. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) +[2024-06-18 07:50:26,994][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 07:50:30,264][12883] Updated weights for policy 0, policy_version 89583 (0.0027) +[2024-06-18 07:50:31,994][12645] Fps is (10 sec: 40978.5, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 1467777024. Throughput: 0: 42740.7. Samples: 1467850460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 07:50:31,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 07:50:34,481][12883] Updated weights for policy 0, policy_version 89593 (0.0048) +[2024-06-18 07:50:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1467990016. Throughput: 0: 42960.9. Samples: 1468109700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 07:50:37,003][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 07:50:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000089599_1467990016.pth... 
+[2024-06-18 07:50:37,080][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088975_1457766400.pth +[2024-06-18 07:50:37,902][12883] Updated weights for policy 0, policy_version 89603 (0.0031) +[2024-06-18 07:50:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1468186624. Throughput: 0: 42858.7. Samples: 1468366560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 07:50:41,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 07:50:42,375][12883] Updated weights for policy 0, policy_version 89613 (0.0034) +[2024-06-18 07:50:45,605][12883] Updated weights for policy 0, policy_version 89623 (0.0039) +[2024-06-18 07:50:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 1468432384. Throughput: 0: 42833.8. Samples: 1468496600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 07:50:47,000][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 07:50:50,019][12883] Updated weights for policy 0, policy_version 89633 (0.0032) +[2024-06-18 07:50:51,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 1468645376. Throughput: 0: 42934.7. Samples: 1468755560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 07:50:51,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 07:50:53,543][12883] Updated weights for policy 0, policy_version 89643 (0.0023) +[2024-06-18 07:50:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1468841984. Throughput: 0: 42871.6. Samples: 1469012740. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 07:50:56,994][12645] Avg episode reward: [(0, '0.222')] +[2024-06-18 07:50:57,611][12883] Updated weights for policy 0, policy_version 89653 (0.0037) +[2024-06-18 07:51:01,088][12883] Updated weights for policy 0, policy_version 89663 (0.0043) +[2024-06-18 07:51:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 1469087744. Throughput: 0: 42724.0. Samples: 1469138240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 07:51:01,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 07:51:05,547][12883] Updated weights for policy 0, policy_version 89673 (0.0032) +[2024-06-18 07:51:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1469284352. Throughput: 0: 42923.9. Samples: 1469397460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 07:51:06,994][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 07:51:08,720][12883] Updated weights for policy 0, policy_version 89683 (0.0050) +[2024-06-18 07:51:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1469497344. Throughput: 0: 42635.1. Samples: 1469652280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 07:51:11,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 07:51:13,193][12883] Updated weights for policy 0, policy_version 89693 (0.0037) +[2024-06-18 07:51:16,389][12883] Updated weights for policy 0, policy_version 89703 (0.0028) +[2024-06-18 07:51:16,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1469726720. Throughput: 0: 42895.3. Samples: 1469780740. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 07:51:16,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 07:51:20,796][12883] Updated weights for policy 0, policy_version 89713 (0.0043) +[2024-06-18 07:51:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42601.7, 300 sec: 42542.9). Total num frames: 1469923328. Throughput: 0: 42999.6. Samples: 1470044680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:51:21,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 07:51:24,069][12883] Updated weights for policy 0, policy_version 89723 (0.0033) +[2024-06-18 07:51:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1470152704. Throughput: 0: 42812.1. Samples: 1470293100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:51:26,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 07:51:28,380][12883] Updated weights for policy 0, policy_version 89733 (0.0045) +[2024-06-18 07:51:31,483][12883] Updated weights for policy 0, policy_version 89743 (0.0029) +[2024-06-18 07:51:31,996][12645] Fps is (10 sec: 42588.3, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1470349312. Throughput: 0: 42860.0. Samples: 1470425400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:51:31,997][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 07:51:35,837][12883] Updated weights for policy 0, policy_version 89753 (0.0034) +[2024-06-18 07:51:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1470562304. Throughput: 0: 42926.7. Samples: 1470687260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:51:36,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 07:51:38,470][12862] Signal inference workers to stop experience collection... (21350 times) +[2024-06-18 07:51:38,471][12862] Signal inference workers to resume experience collection... 
(21350 times) +[2024-06-18 07:51:38,517][12883] InferenceWorker_p0-w0: stopping experience collection (21350 times) +[2024-06-18 07:51:38,517][12883] InferenceWorker_p0-w0: resuming experience collection (21350 times) +[2024-06-18 07:51:39,034][12883] Updated weights for policy 0, policy_version 89763 (0.0033) +[2024-06-18 07:51:41,994][12645] Fps is (10 sec: 45885.3, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 1470808064. Throughput: 0: 42699.4. Samples: 1470934220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:51:41,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 07:51:43,872][12883] Updated weights for policy 0, policy_version 89773 (0.0028) +[2024-06-18 07:51:46,551][12883] Updated weights for policy 0, policy_version 89783 (0.0030) +[2024-06-18 07:51:46,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 1471004672. Throughput: 0: 42974.3. Samples: 1471072180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:51:46,996][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 07:51:51,458][12883] Updated weights for policy 0, policy_version 89793 (0.0045) +[2024-06-18 07:51:51,994][12645] Fps is (10 sec: 36044.9, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1471168512. Throughput: 0: 42892.4. Samples: 1471327620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:51:51,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 07:51:54,554][12883] Updated weights for policy 0, policy_version 89803 (0.0026) +[2024-06-18 07:51:56,994][12645] Fps is (10 sec: 44246.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1471447040. Throughput: 0: 42721.7. Samples: 1471574760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:51:56,999][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 07:51:59,032][12883] Updated weights for policy 0, policy_version 89813 (0.0032) +[2024-06-18 07:52:01,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1471627264. Throughput: 0: 42935.5. Samples: 1471712840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:52:01,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 07:52:02,553][12883] Updated weights for policy 0, policy_version 89823 (0.0040) +[2024-06-18 07:52:06,636][12883] Updated weights for policy 0, policy_version 89833 (0.0028) +[2024-06-18 07:52:06,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1471823872. Throughput: 0: 42617.4. Samples: 1471962460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:52:06,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 07:52:09,994][12883] Updated weights for policy 0, policy_version 89843 (0.0034) +[2024-06-18 07:52:11,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1472086016. Throughput: 0: 42855.1. Samples: 1472221580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:52:11,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 07:52:14,164][12883] Updated weights for policy 0, policy_version 89853 (0.0031) +[2024-06-18 07:52:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1472282624. Throughput: 0: 43023.6. Samples: 1472361360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:52:16,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 07:52:17,441][12883] Updated weights for policy 0, policy_version 89863 (0.0032) +[2024-06-18 07:52:21,732][12883] Updated weights for policy 0, policy_version 89873 (0.0033) +[2024-06-18 07:52:21,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1472479232. Throughput: 0: 42693.6. Samples: 1472608480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:52:21,994][12645] Avg episode reward: [(0, '0.587')] +[2024-06-18 07:52:24,873][12883] Updated weights for policy 0, policy_version 89883 (0.0048) +[2024-06-18 07:52:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1472708608. Throughput: 0: 42910.8. Samples: 1472865200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:52:26,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 07:52:30,237][12883] Updated weights for policy 0, policy_version 89893 (0.0035) +[2024-06-18 07:52:31,993][12645] Fps is (10 sec: 44237.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1472921600. Throughput: 0: 42772.5. Samples: 1472996840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:52:31,994][12645] Avg episode reward: [(0, '0.562')] +[2024-06-18 07:52:32,456][12883] Updated weights for policy 0, policy_version 89903 (0.0039) +[2024-06-18 07:52:34,978][12862] Signal inference workers to stop experience collection... (21400 times) +[2024-06-18 07:52:34,984][12862] Signal inference workers to resume experience collection... (21400 times) +[2024-06-18 07:52:35,024][12883] InferenceWorker_p0-w0: stopping experience collection (21400 times) +[2024-06-18 07:52:35,024][12883] InferenceWorker_p0-w0: resuming experience collection (21400 times) +[2024-06-18 07:52:36,994][12645] Fps is (10 sec: 40958.9, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 1473118208. 
Throughput: 0: 42567.5. Samples: 1473243160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:52:36,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 07:52:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000089912_1473118208.pth... +[2024-06-18 07:52:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000089286_1462861824.pth +[2024-06-18 07:52:37,807][12883] Updated weights for policy 0, policy_version 89913 (0.0039) +[2024-06-18 07:52:40,234][12883] Updated weights for policy 0, policy_version 89923 (0.0049) +[2024-06-18 07:52:41,994][12645] Fps is (10 sec: 40958.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1473331200. Throughput: 0: 42865.2. Samples: 1473503700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:52:41,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 07:52:45,359][12883] Updated weights for policy 0, policy_version 89933 (0.0032) +[2024-06-18 07:52:46,996][12645] Fps is (10 sec: 45866.0, 60 sec: 42871.5, 300 sec: 42764.7). Total num frames: 1473576960. Throughput: 0: 42610.4. Samples: 1473630400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:52:46,996][12645] Avg episode reward: [(0, '0.248')] +[2024-06-18 07:52:47,850][12883] Updated weights for policy 0, policy_version 89943 (0.0037) +[2024-06-18 07:52:51,994][12645] Fps is (10 sec: 44235.8, 60 sec: 43417.4, 300 sec: 42598.4). Total num frames: 1473773568. Throughput: 0: 42752.0. Samples: 1473886320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:52:51,995][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 07:52:52,849][12883] Updated weights for policy 0, policy_version 89953 (0.0041) +[2024-06-18 07:52:55,805][12883] Updated weights for policy 0, policy_version 89963 (0.0023) +[2024-06-18 07:52:56,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1473986560. Throughput: 0: 42714.1. 
Samples: 1474143720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:52:56,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 07:53:00,314][12883] Updated weights for policy 0, policy_version 89973 (0.0034) +[2024-06-18 07:53:01,994][12645] Fps is (10 sec: 42599.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1474199552. Throughput: 0: 42418.2. Samples: 1474270180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:53:01,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 07:53:03,515][12883] Updated weights for policy 0, policy_version 89983 (0.0037) +[2024-06-18 07:53:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 1474396160. Throughput: 0: 42630.8. Samples: 1474526860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 07:53:06,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 07:53:07,920][12883] Updated weights for policy 0, policy_version 89993 (0.0038) +[2024-06-18 07:53:11,074][12883] Updated weights for policy 0, policy_version 90003 (0.0054) +[2024-06-18 07:53:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1474625536. Throughput: 0: 42535.1. Samples: 1474779280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:53:11,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 07:53:15,820][12883] Updated weights for policy 0, policy_version 90013 (0.0036) +[2024-06-18 07:53:16,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 1474838528. Throughput: 0: 42521.1. Samples: 1474910300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:53:16,998][12645] Avg episode reward: [(0, '0.203')] +[2024-06-18 07:53:18,544][12883] Updated weights for policy 0, policy_version 90023 (0.0033) +[2024-06-18 07:53:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 1475035136. Throughput: 0: 42785.1. 
Samples: 1475168480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:53:21,994][12645] Avg episode reward: [(0, '0.203')] +[2024-06-18 07:53:23,419][12883] Updated weights for policy 0, policy_version 90033 (0.0036) +[2024-06-18 07:53:26,137][12883] Updated weights for policy 0, policy_version 90043 (0.0045) +[2024-06-18 07:53:26,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 1475264512. Throughput: 0: 42498.4. Samples: 1475416120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:53:26,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 07:53:30,938][12883] Updated weights for policy 0, policy_version 90053 (0.0033) +[2024-06-18 07:53:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1475461120. Throughput: 0: 42673.1. Samples: 1475550600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:53:31,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 07:53:34,317][12883] Updated weights for policy 0, policy_version 90063 (0.0032) +[2024-06-18 07:53:36,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42870.0, 300 sec: 42653.6). Total num frames: 1475690496. Throughput: 0: 42626.2. Samples: 1475804580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:53:36,996][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 07:53:38,782][12883] Updated weights for policy 0, policy_version 90073 (0.0028) +[2024-06-18 07:53:39,183][12862] Signal inference workers to stop experience collection... (21450 times) +[2024-06-18 07:53:39,240][12883] InferenceWorker_p0-w0: stopping experience collection (21450 times) +[2024-06-18 07:53:39,244][12862] Signal inference workers to resume experience collection... 
(21450 times) +[2024-06-18 07:53:39,254][12883] InferenceWorker_p0-w0: resuming experience collection (21450 times) +[2024-06-18 07:53:41,976][12883] Updated weights for policy 0, policy_version 90083 (0.0036) +[2024-06-18 07:53:41,994][12645] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1475919872. Throughput: 0: 42573.4. Samples: 1476059520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:53:41,994][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 07:53:46,542][12883] Updated weights for policy 0, policy_version 90093 (0.0032) +[2024-06-18 07:53:46,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42053.8, 300 sec: 42654.8). Total num frames: 1476100096. Throughput: 0: 42739.6. Samples: 1476193460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:53:46,994][12645] Avg episode reward: [(0, '0.431')] +[2024-06-18 07:53:49,693][12883] Updated weights for policy 0, policy_version 90103 (0.0027) +[2024-06-18 07:53:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 1476329472. Throughput: 0: 42607.0. Samples: 1476444180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:53:51,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 07:53:54,159][12883] Updated weights for policy 0, policy_version 90113 (0.0046) +[2024-06-18 07:53:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1476542464. Throughput: 0: 42623.0. Samples: 1476697320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:53:56,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 07:53:57,460][12883] Updated weights for policy 0, policy_version 90123 (0.0032) +[2024-06-18 07:54:01,671][12883] Updated weights for policy 0, policy_version 90133 (0.0033) +[2024-06-18 07:54:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1476739072. Throughput: 0: 42645.4. Samples: 1476829340. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 07:54:01,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 07:54:05,009][12883] Updated weights for policy 0, policy_version 90143 (0.0033) +[2024-06-18 07:54:07,000][12645] Fps is (10 sec: 44209.4, 60 sec: 43140.0, 300 sec: 42764.1). Total num frames: 1476984832. Throughput: 0: 42528.7. Samples: 1477082540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 07:54:07,000][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 07:54:09,205][12883] Updated weights for policy 0, policy_version 90153 (0.0032) +[2024-06-18 07:54:11,995][12645] Fps is (10 sec: 44230.8, 60 sec: 42597.4, 300 sec: 42820.4). Total num frames: 1477181440. Throughput: 0: 42815.5. Samples: 1477342880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 07:54:11,996][12645] Avg episode reward: [(0, '0.180')] +[2024-06-18 07:54:12,611][12883] Updated weights for policy 0, policy_version 90163 (0.0038) +[2024-06-18 07:54:16,822][12883] Updated weights for policy 0, policy_version 90173 (0.0043) +[2024-06-18 07:54:16,994][12645] Fps is (10 sec: 40986.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1477394432. Throughput: 0: 42574.8. Samples: 1477466460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 07:54:16,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 07:54:20,243][12883] Updated weights for policy 0, policy_version 90183 (0.0038) +[2024-06-18 07:54:21,994][12645] Fps is (10 sec: 44242.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1477623808. Throughput: 0: 42572.8. Samples: 1477720260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 07:54:21,994][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 07:54:24,699][12883] Updated weights for policy 0, policy_version 90193 (0.0047) +[2024-06-18 07:54:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1477836800. Throughput: 0: 42827.5. Samples: 1477986760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 07:54:26,998][12645] Avg episode reward: [(0, '0.252')] +[2024-06-18 07:54:27,798][12883] Updated weights for policy 0, policy_version 90203 (0.0037) +[2024-06-18 07:54:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1478017024. Throughput: 0: 42641.8. Samples: 1478112340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 07:54:31,994][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 07:54:32,424][12883] Updated weights for policy 0, policy_version 90213 (0.0031) +[2024-06-18 07:54:35,452][12883] Updated weights for policy 0, policy_version 90223 (0.0029) +[2024-06-18 07:54:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 1478279168. Throughput: 0: 42727.6. Samples: 1478366920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 07:54:36,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 07:54:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000090227_1478279168.pth... +[2024-06-18 07:54:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000089599_1467990016.pth +[2024-06-18 07:54:39,836][12883] Updated weights for policy 0, policy_version 90233 (0.0041) +[2024-06-18 07:54:41,682][12862] Signal inference workers to stop experience collection... (21500 times) +[2024-06-18 07:54:41,683][12862] Signal inference workers to resume experience collection... (21500 times) +[2024-06-18 07:54:41,727][12883] InferenceWorker_p0-w0: stopping experience collection (21500 times) +[2024-06-18 07:54:41,727][12883] InferenceWorker_p0-w0: resuming experience collection (21500 times) +[2024-06-18 07:54:41,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1478475776. Throughput: 0: 43161.4. Samples: 1478639580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 07:54:41,994][12645] Avg episode reward: [(0, '0.510')] +[2024-06-18 07:54:42,999][12883] Updated weights for policy 0, policy_version 90243 (0.0029) +[2024-06-18 07:54:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1478672384. Throughput: 0: 42813.2. Samples: 1478755940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 07:54:46,994][12645] Avg episode reward: [(0, '0.453')] +[2024-06-18 07:54:47,518][12883] Updated weights for policy 0, policy_version 90253 (0.0026) +[2024-06-18 07:54:50,542][12883] Updated weights for policy 0, policy_version 90263 (0.0038) +[2024-06-18 07:54:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1478918144. Throughput: 0: 42800.6. Samples: 1479008300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 07:54:51,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 07:54:55,235][12883] Updated weights for policy 0, policy_version 90273 (0.0042) +[2024-06-18 07:54:56,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1479081984. Throughput: 0: 42959.6. Samples: 1479276000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 07:54:56,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 07:54:58,440][12883] Updated weights for policy 0, policy_version 90283 (0.0044) +[2024-06-18 07:55:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1479294976. Throughput: 0: 42804.9. Samples: 1479392680. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 07:55:01,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 07:55:02,924][12883] Updated weights for policy 0, policy_version 90293 (0.0032) +[2024-06-18 07:55:06,029][12883] Updated weights for policy 0, policy_version 90303 (0.0027) +[2024-06-18 07:55:06,994][12645] Fps is (10 sec: 47513.0, 60 sec: 42875.9, 300 sec: 42820.6). Total num frames: 1479557120. Throughput: 0: 43013.3. Samples: 1479655860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 07:55:06,994][12645] Avg episode reward: [(0, '0.302')] +[2024-06-18 07:55:10,590][12883] Updated weights for policy 0, policy_version 90313 (0.0044) +[2024-06-18 07:55:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42599.4, 300 sec: 42709.5). Total num frames: 1479737344. Throughput: 0: 42906.7. Samples: 1479917560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 07:55:11,994][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 07:55:13,584][12883] Updated weights for policy 0, policy_version 90323 (0.0032) +[2024-06-18 07:55:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42654.6). Total num frames: 1479950336. Throughput: 0: 42712.4. Samples: 1480034400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 07:55:16,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 07:55:18,211][12883] Updated weights for policy 0, policy_version 90333 (0.0042) +[2024-06-18 07:55:21,195][12883] Updated weights for policy 0, policy_version 90343 (0.0030) +[2024-06-18 07:55:21,994][12645] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1480212480. Throughput: 0: 42971.2. Samples: 1480300620. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 07:55:21,994][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 07:55:25,934][12883] Updated weights for policy 0, policy_version 90353 (0.0042) +[2024-06-18 07:55:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1480392704. Throughput: 0: 42591.1. Samples: 1480556180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 07:55:26,994][12645] Avg episode reward: [(0, '0.637')] +[2024-06-18 07:55:29,026][12883] Updated weights for policy 0, policy_version 90363 (0.0034) +[2024-06-18 07:55:31,994][12645] Fps is (10 sec: 39321.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1480605696. Throughput: 0: 42773.8. Samples: 1480680760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 07:55:31,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 07:55:33,489][12883] Updated weights for policy 0, policy_version 90373 (0.0038) +[2024-06-18 07:55:36,756][12883] Updated weights for policy 0, policy_version 90383 (0.0040) +[2024-06-18 07:55:36,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1480835072. Throughput: 0: 43040.9. Samples: 1480945140. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 07:55:36,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 07:55:40,942][12883] Updated weights for policy 0, policy_version 90393 (0.0029) +[2024-06-18 07:55:41,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42323.8, 300 sec: 42653.6). Total num frames: 1481015296. Throughput: 0: 42930.3. Samples: 1481207960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 07:55:41,996][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 07:55:44,391][12862] Signal inference workers to stop experience collection... (21550 times) +[2024-06-18 07:55:44,392][12862] Signal inference workers to resume experience collection... 
(21550 times) +[2024-06-18 07:55:44,410][12883] InferenceWorker_p0-w0: stopping experience collection (21550 times) +[2024-06-18 07:55:44,410][12883] InferenceWorker_p0-w0: resuming experience collection (21550 times) +[2024-06-18 07:55:44,542][12883] Updated weights for policy 0, policy_version 90403 (0.0042) +[2024-06-18 07:55:46,996][12645] Fps is (10 sec: 42588.6, 60 sec: 43143.0, 300 sec: 42764.7). Total num frames: 1481261056. Throughput: 0: 42966.6. Samples: 1481326280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 07:55:46,996][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 07:55:48,606][12883] Updated weights for policy 0, policy_version 90413 (0.0023) +[2024-06-18 07:55:51,994][12645] Fps is (10 sec: 45885.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1481474048. Throughput: 0: 42965.4. Samples: 1481589300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) +[2024-06-18 07:55:51,994][12645] Avg episode reward: [(0, '0.215')] +[2024-06-18 07:55:52,022][12883] Updated weights for policy 0, policy_version 90423 (0.0035) +[2024-06-18 07:55:56,329][12883] Updated weights for policy 0, policy_version 90433 (0.0040) +[2024-06-18 07:55:56,994][12645] Fps is (10 sec: 40969.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1481670656. Throughput: 0: 42952.9. Samples: 1481850440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:55:56,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 07:55:59,539][12883] Updated weights for policy 0, policy_version 90443 (0.0033) +[2024-06-18 07:56:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43690.6, 300 sec: 42820.6). Total num frames: 1481916416. Throughput: 0: 43058.2. Samples: 1481972020. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:56:01,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 07:56:03,876][12883] Updated weights for policy 0, policy_version 90453 (0.0021) +[2024-06-18 07:56:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1482113024. Throughput: 0: 43049.3. Samples: 1482237840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:56:06,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 07:56:07,312][12883] Updated weights for policy 0, policy_version 90463 (0.0037) +[2024-06-18 07:56:11,474][12883] Updated weights for policy 0, policy_version 90473 (0.0042) +[2024-06-18 07:56:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1482309632. Throughput: 0: 42900.0. Samples: 1482486680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:56:11,994][12645] Avg episode reward: [(0, '0.145')] +[2024-06-18 07:56:14,978][12883] Updated weights for policy 0, policy_version 90483 (0.0034) +[2024-06-18 07:56:17,002][12645] Fps is (10 sec: 44199.7, 60 sec: 43411.6, 300 sec: 42819.3). Total num frames: 1482555392. Throughput: 0: 42983.6. Samples: 1482615380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:56:17,002][12645] Avg episode reward: [(0, '0.154')] +[2024-06-18 07:56:19,042][12883] Updated weights for policy 0, policy_version 90493 (0.0029) +[2024-06-18 07:56:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 1482719232. Throughput: 0: 42865.6. Samples: 1482874100. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:56:21,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 07:56:22,835][12883] Updated weights for policy 0, policy_version 90503 (0.0030) +[2024-06-18 07:56:26,639][12883] Updated weights for policy 0, policy_version 90513 (0.0026) +[2024-06-18 07:56:26,994][12645] Fps is (10 sec: 40994.6, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 1482964992. Throughput: 0: 42567.0. Samples: 1483123380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:56:26,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 07:56:30,558][12883] Updated weights for policy 0, policy_version 90523 (0.0030) +[2024-06-18 07:56:31,994][12645] Fps is (10 sec: 49153.1, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1483210752. Throughput: 0: 42908.9. Samples: 1483257080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:56:31,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 07:56:34,254][12883] Updated weights for policy 0, policy_version 90533 (0.0045) +[2024-06-18 07:56:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1483374592. Throughput: 0: 42671.5. Samples: 1483509520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:56:36,994][12645] Avg episode reward: [(0, '0.200')] +[2024-06-18 07:56:37,001][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000090538_1483374592.pth... +[2024-06-18 07:56:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000089912_1473118208.pth +[2024-06-18 07:56:38,190][12883] Updated weights for policy 0, policy_version 90543 (0.0043) +[2024-06-18 07:56:41,996][12645] Fps is (10 sec: 39312.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1483603968. Throughput: 0: 42640.1. Samples: 1483769340. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:56:41,997][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 07:56:42,227][12883] Updated weights for policy 0, policy_version 90553 (0.0033) +[2024-06-18 07:56:45,843][12883] Updated weights for policy 0, policy_version 90563 (0.0037) +[2024-06-18 07:56:46,994][12645] Fps is (10 sec: 47514.3, 60 sec: 43146.2, 300 sec: 42987.2). Total num frames: 1483849728. Throughput: 0: 42804.5. Samples: 1483898220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 07:56:46,994][12645] Avg episode reward: [(0, '0.259')] +[2024-06-18 07:56:49,838][12883] Updated weights for policy 0, policy_version 90573 (0.0034) +[2024-06-18 07:56:51,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1484029952. Throughput: 0: 42556.3. Samples: 1484152880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 07:56:51,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 07:56:53,589][12883] Updated weights for policy 0, policy_version 90583 (0.0034) +[2024-06-18 07:56:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1484242944. Throughput: 0: 42849.4. Samples: 1484414900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 07:56:56,994][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 07:56:57,344][12883] Updated weights for policy 0, policy_version 90593 (0.0043) +[2024-06-18 07:57:01,257][12883] Updated weights for policy 0, policy_version 90603 (0.0040) +[2024-06-18 07:57:01,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1484488704. Throughput: 0: 42870.1. Samples: 1484544180. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 07:57:01,994][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 07:57:05,030][12883] Updated weights for policy 0, policy_version 90613 (0.0039) +[2024-06-18 07:57:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1484652544. Throughput: 0: 42674.0. Samples: 1484794420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 07:57:06,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 07:57:07,656][12862] Signal inference workers to stop experience collection... (21600 times) +[2024-06-18 07:57:07,656][12862] Signal inference workers to resume experience collection... (21600 times) +[2024-06-18 07:57:07,682][12883] InferenceWorker_p0-w0: stopping experience collection (21600 times) +[2024-06-18 07:57:07,682][12883] InferenceWorker_p0-w0: resuming experience collection (21600 times) +[2024-06-18 07:57:09,061][12883] Updated weights for policy 0, policy_version 90623 (0.0036) +[2024-06-18 07:57:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1484881920. Throughput: 0: 42731.5. Samples: 1485046300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 07:57:11,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 07:57:12,623][12883] Updated weights for policy 0, policy_version 90633 (0.0039) +[2024-06-18 07:57:16,785][12883] Updated weights for policy 0, policy_version 90643 (0.0039) +[2024-06-18 07:57:16,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42604.3, 300 sec: 42820.6). Total num frames: 1485111296. Throughput: 0: 42752.8. Samples: 1485180960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 07:57:16,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 07:57:20,178][12883] Updated weights for policy 0, policy_version 90653 (0.0030) +[2024-06-18 07:57:21,996][12645] Fps is (10 sec: 42588.8, 60 sec: 43143.0, 300 sec: 42709.1). Total num frames: 1485307904. 
Throughput: 0: 42629.0. Samples: 1485427920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 07:57:21,997][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 07:57:24,456][12883] Updated weights for policy 0, policy_version 90663 (0.0038) +[2024-06-18 07:57:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1485537280. Throughput: 0: 42560.7. Samples: 1485684480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 07:57:26,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 07:57:27,836][12883] Updated weights for policy 0, policy_version 90673 (0.0025) +[2024-06-18 07:57:31,994][12645] Fps is (10 sec: 40969.6, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 1485717504. Throughput: 0: 42651.1. Samples: 1485817520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 07:57:31,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 07:57:32,126][12883] Updated weights for policy 0, policy_version 90683 (0.0033) +[2024-06-18 07:57:35,490][12883] Updated weights for policy 0, policy_version 90693 (0.0037) +[2024-06-18 07:57:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1485946880. Throughput: 0: 42452.5. Samples: 1486063240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 07:57:36,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 07:57:39,820][12883] Updated weights for policy 0, policy_version 90703 (0.0029) +[2024-06-18 07:57:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42873.1, 300 sec: 42709.8). Total num frames: 1486176256. Throughput: 0: 42364.9. Samples: 1486321320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 07:57:41,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 07:57:43,285][12883] Updated weights for policy 0, policy_version 90713 (0.0036) +[2024-06-18 07:57:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 1486356480. 
Throughput: 0: 42402.8. Samples: 1486452300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 07:57:46,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 07:57:47,555][12883] Updated weights for policy 0, policy_version 90723 (0.0033) +[2024-06-18 07:57:50,836][12883] Updated weights for policy 0, policy_version 90733 (0.0033) +[2024-06-18 07:57:51,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1486602240. Throughput: 0: 42397.2. Samples: 1486702300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 07:57:51,994][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 07:57:55,337][12883] Updated weights for policy 0, policy_version 90743 (0.0030) +[2024-06-18 07:57:56,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1486798848. Throughput: 0: 42540.4. Samples: 1486960620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 07:57:56,994][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 07:57:58,592][12883] Updated weights for policy 0, policy_version 90753 (0.0035) +[2024-06-18 07:58:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 1486995456. Throughput: 0: 42371.0. Samples: 1487087660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 07:58:02,008][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 07:58:02,936][12883] Updated weights for policy 0, policy_version 90763 (0.0027) +[2024-06-18 07:58:06,250][12883] Updated weights for policy 0, policy_version 90773 (0.0027) +[2024-06-18 07:58:06,996][12645] Fps is (10 sec: 44227.0, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 1487241216. Throughput: 0: 42578.2. Samples: 1487343940. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 07:58:06,996][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 07:58:10,575][12883] Updated weights for policy 0, policy_version 90783 (0.0028) +[2024-06-18 07:58:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1487437824. Throughput: 0: 42603.7. Samples: 1487601640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 07:58:11,994][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 07:58:14,134][12883] Updated weights for policy 0, policy_version 90793 (0.0034) +[2024-06-18 07:58:16,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1487650816. Throughput: 0: 42424.8. Samples: 1487726640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 07:58:16,994][12645] Avg episode reward: [(0, '0.431')] +[2024-06-18 07:58:17,970][12883] Updated weights for policy 0, policy_version 90803 (0.0026) +[2024-06-18 07:58:21,583][12862] Signal inference workers to stop experience collection... (21650 times) +[2024-06-18 07:58:21,583][12862] Signal inference workers to resume experience collection... (21650 times) +[2024-06-18 07:58:21,604][12883] InferenceWorker_p0-w0: stopping experience collection (21650 times) +[2024-06-18 07:58:21,604][12883] InferenceWorker_p0-w0: resuming experience collection (21650 times) +[2024-06-18 07:58:21,739][12883] Updated weights for policy 0, policy_version 90813 (0.0028) +[2024-06-18 07:58:21,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 1487896576. Throughput: 0: 42821.8. Samples: 1487990220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 07:58:21,994][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 07:58:25,597][12883] Updated weights for policy 0, policy_version 90823 (0.0033) +[2024-06-18 07:58:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1488076800. 
Throughput: 0: 42691.9. Samples: 1488242460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 07:58:26,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 07:58:29,368][12883] Updated weights for policy 0, policy_version 90833 (0.0044) +[2024-06-18 07:58:31,993][12645] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1488289792. Throughput: 0: 42587.6. Samples: 1488368740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 07:58:31,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 07:58:33,489][12883] Updated weights for policy 0, policy_version 90843 (0.0033) +[2024-06-18 07:58:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1488519168. Throughput: 0: 42875.6. Samples: 1488631700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 07:58:36,994][12645] Avg episode reward: [(0, '0.197')] +[2024-06-18 07:58:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000090853_1488535552.pth... +[2024-06-18 07:58:37,020][12883] Updated weights for policy 0, policy_version 90853 (0.0036) +[2024-06-18 07:58:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000090227_1478279168.pth +[2024-06-18 07:58:41,144][12883] Updated weights for policy 0, policy_version 90863 (0.0024) +[2024-06-18 07:58:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1488732160. Throughput: 0: 42733.4. Samples: 1488883620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 07:58:41,994][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 07:58:44,580][12883] Updated weights for policy 0, policy_version 90873 (0.0038) +[2024-06-18 07:58:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1488928768. Throughput: 0: 42734.7. Samples: 1489010720. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 07:58:46,994][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 07:58:48,630][12883] Updated weights for policy 0, policy_version 90883 (0.0036) +[2024-06-18 07:58:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1489158144. Throughput: 0: 42838.1. Samples: 1489271560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 07:58:51,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 07:58:52,502][12883] Updated weights for policy 0, policy_version 90893 (0.0039) +[2024-06-18 07:58:56,756][12883] Updated weights for policy 0, policy_version 90903 (0.0034) +[2024-06-18 07:58:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1489354752. Throughput: 0: 42782.2. Samples: 1489526840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 07:58:56,994][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 07:59:00,609][12883] Updated weights for policy 0, policy_version 90913 (0.0037) +[2024-06-18 07:59:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42710.4). Total num frames: 1489584128. Throughput: 0: 42755.6. Samples: 1489650640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 07:59:01,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 07:59:04,431][12883] Updated weights for policy 0, policy_version 90923 (0.0027) +[2024-06-18 07:59:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42600.0, 300 sec: 42765.2). Total num frames: 1489797120. Throughput: 0: 42678.7. Samples: 1489910760. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 07:59:06,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 07:59:08,340][12883] Updated weights for policy 0, policy_version 90933 (0.0039) +[2024-06-18 07:59:11,985][12883] Updated weights for policy 0, policy_version 90943 (0.0032) +[2024-06-18 07:59:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1490010112. Throughput: 0: 42762.2. Samples: 1490166760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 07:59:11,994][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 07:59:15,947][12883] Updated weights for policy 0, policy_version 90953 (0.0046) +[2024-06-18 07:59:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1490223104. Throughput: 0: 42695.3. Samples: 1490290040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 07:59:16,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 07:59:19,615][12883] Updated weights for policy 0, policy_version 90963 (0.0034) +[2024-06-18 07:59:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1490436096. Throughput: 0: 42658.3. Samples: 1490551320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 07:59:21,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 07:59:23,707][12883] Updated weights for policy 0, policy_version 90973 (0.0029) +[2024-06-18 07:59:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 1490649088. Throughput: 0: 42767.3. Samples: 1490808160. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 07:59:26,995][12645] Avg episode reward: [(0, '0.235')] +[2024-06-18 07:59:27,142][12883] Updated weights for policy 0, policy_version 90983 (0.0029) +[2024-06-18 07:59:31,399][12883] Updated weights for policy 0, policy_version 90993 (0.0034) +[2024-06-18 07:59:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1490862080. Throughput: 0: 42848.4. Samples: 1490938900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 07:59:31,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 07:59:35,239][12883] Updated weights for policy 0, policy_version 91003 (0.0036) +[2024-06-18 07:59:36,996][12645] Fps is (10 sec: 42589.7, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 1491075072. Throughput: 0: 42744.1. Samples: 1491195140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:59:36,997][12645] Avg episode reward: [(0, '0.643')] +[2024-06-18 07:59:39,187][12883] Updated weights for policy 0, policy_version 91013 (0.0041) +[2024-06-18 07:59:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1491288064. Throughput: 0: 42551.5. Samples: 1491441660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:59:41,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 07:59:42,915][12883] Updated weights for policy 0, policy_version 91023 (0.0029) +[2024-06-18 07:59:44,397][12862] Signal inference workers to stop experience collection... (21700 times) +[2024-06-18 07:59:44,397][12862] Signal inference workers to resume experience collection... 
(21700 times) +[2024-06-18 07:59:44,411][12883] InferenceWorker_p0-w0: stopping experience collection (21700 times) +[2024-06-18 07:59:44,411][12883] InferenceWorker_p0-w0: resuming experience collection (21700 times) +[2024-06-18 07:59:46,815][12883] Updated weights for policy 0, policy_version 91033 (0.0038) +[2024-06-18 07:59:46,994][12645] Fps is (10 sec: 40969.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1491484672. Throughput: 0: 42708.0. Samples: 1491572500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:59:46,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 07:59:50,501][12883] Updated weights for policy 0, policy_version 91043 (0.0040) +[2024-06-18 07:59:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1491714048. Throughput: 0: 42700.8. Samples: 1491832300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:59:51,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 07:59:54,462][12883] Updated weights for policy 0, policy_version 91053 (0.0034) +[2024-06-18 07:59:56,994][12645] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1491943424. Throughput: 0: 42660.8. Samples: 1492086500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 07:59:56,994][12645] Avg episode reward: [(0, '0.625')] +[2024-06-18 07:59:58,017][12883] Updated weights for policy 0, policy_version 91063 (0.0035) +[2024-06-18 08:00:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1492123648. Throughput: 0: 42858.8. Samples: 1492218680. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:00:01,994][12645] Avg episode reward: [(0, '0.486')] +[2024-06-18 08:00:02,055][12883] Updated weights for policy 0, policy_version 91073 (0.0028) +[2024-06-18 08:00:05,627][12883] Updated weights for policy 0, policy_version 91083 (0.0041) +[2024-06-18 08:00:06,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1492353024. Throughput: 0: 42806.3. Samples: 1492477600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:00:06,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 08:00:09,600][12883] Updated weights for policy 0, policy_version 91093 (0.0032) +[2024-06-18 08:00:11,995][12645] Fps is (10 sec: 45870.8, 60 sec: 42870.8, 300 sec: 42820.4). Total num frames: 1492582400. Throughput: 0: 42725.5. Samples: 1492730840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:00:11,995][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 08:00:13,263][12883] Updated weights for policy 0, policy_version 91103 (0.0030) +[2024-06-18 08:00:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 1492762624. Throughput: 0: 42730.3. Samples: 1492861760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:00:16,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 08:00:17,393][12883] Updated weights for policy 0, policy_version 91113 (0.0036) +[2024-06-18 08:00:21,587][12883] Updated weights for policy 0, policy_version 91123 (0.0036) +[2024-06-18 08:00:21,994][12645] Fps is (10 sec: 39325.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1492975616. Throughput: 0: 42776.1. Samples: 1493119960. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:00:21,994][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 08:00:24,961][12883] Updated weights for policy 0, policy_version 91133 (0.0028) +[2024-06-18 08:00:26,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1493221376. Throughput: 0: 42913.7. Samples: 1493372780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:00:26,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 08:00:29,313][12883] Updated weights for policy 0, policy_version 91143 (0.0044) +[2024-06-18 08:00:31,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1493434368. Throughput: 0: 42971.0. Samples: 1493506200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) +[2024-06-18 08:00:31,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 08:00:32,439][12883] Updated weights for policy 0, policy_version 91153 (0.0029) +[2024-06-18 08:00:36,873][12883] Updated weights for policy 0, policy_version 91163 (0.0044) +[2024-06-18 08:00:36,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42326.9, 300 sec: 42709.8). Total num frames: 1493614592. Throughput: 0: 42813.4. Samples: 1493758900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) +[2024-06-18 08:00:36,994][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 08:00:37,061][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000091164_1493630976.pth... +[2024-06-18 08:00:37,093][12862] Signal inference workers to stop experience collection... (21750 times) +[2024-06-18 08:00:37,093][12862] Signal inference workers to resume experience collection... 
(21750 times) +[2024-06-18 08:00:37,116][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000090538_1483374592.pth +[2024-06-18 08:00:37,118][12883] InferenceWorker_p0-w0: stopping experience collection (21750 times) +[2024-06-18 08:00:37,118][12883] InferenceWorker_p0-w0: resuming experience collection (21750 times) +[2024-06-18 08:00:39,927][12883] Updated weights for policy 0, policy_version 91173 (0.0033) +[2024-06-18 08:00:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 1493843968. Throughput: 0: 42979.3. Samples: 1494020560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) +[2024-06-18 08:00:41,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 08:00:44,373][12883] Updated weights for policy 0, policy_version 91183 (0.0044) +[2024-06-18 08:00:46,994][12645] Fps is (10 sec: 47512.9, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1494089728. Throughput: 0: 42990.5. Samples: 1494153260. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) +[2024-06-18 08:00:46,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 08:00:47,550][12883] Updated weights for policy 0, policy_version 91193 (0.0035) +[2024-06-18 08:00:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1494253568. Throughput: 0: 42857.8. Samples: 1494406200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) +[2024-06-18 08:00:51,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 08:00:52,039][12883] Updated weights for policy 0, policy_version 91203 (0.0031) +[2024-06-18 08:00:54,969][12883] Updated weights for policy 0, policy_version 91213 (0.0032) +[2024-06-18 08:00:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1494499328. Throughput: 0: 42935.1. Samples: 1494662880. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) +[2024-06-18 08:00:56,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 08:00:59,590][12883] Updated weights for policy 0, policy_version 91223 (0.0034) +[2024-06-18 08:01:01,994][12645] Fps is (10 sec: 47512.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1494728704. Throughput: 0: 43027.0. Samples: 1494797980. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) +[2024-06-18 08:01:01,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 08:01:02,658][12883] Updated weights for policy 0, policy_version 91233 (0.0022) +[2024-06-18 08:01:06,994][12645] Fps is (10 sec: 40958.0, 60 sec: 42598.0, 300 sec: 42709.4). Total num frames: 1494908928. Throughput: 0: 42863.4. Samples: 1495048840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) +[2024-06-18 08:01:06,994][12645] Avg episode reward: [(0, '0.180')] +[2024-06-18 08:01:07,091][12883] Updated weights for policy 0, policy_version 91243 (0.0022) +[2024-06-18 08:01:10,504][12883] Updated weights for policy 0, policy_version 91253 (0.0042) +[2024-06-18 08:01:11,996][12645] Fps is (10 sec: 39312.8, 60 sec: 42324.4, 300 sec: 42599.3). Total num frames: 1495121920. Throughput: 0: 42869.5. Samples: 1495302000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) +[2024-06-18 08:01:11,997][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 08:01:14,722][12883] Updated weights for policy 0, policy_version 91263 (0.0025) +[2024-06-18 08:01:16,996][12645] Fps is (10 sec: 45867.4, 60 sec: 43416.0, 300 sec: 42875.8). Total num frames: 1495367680. Throughput: 0: 42849.5. Samples: 1495434520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) +[2024-06-18 08:01:16,996][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 08:01:18,188][12883] Updated weights for policy 0, policy_version 91273 (0.0033) +[2024-06-18 08:01:21,996][12645] Fps is (10 sec: 44236.9, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 1495564288. Throughput: 0: 42778.3. Samples: 1495684020. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) +[2024-06-18 08:01:21,996][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 08:01:22,267][12883] Updated weights for policy 0, policy_version 91283 (0.0036) +[2024-06-18 08:01:25,365][12862] Signal inference workers to stop experience collection... (21800 times) +[2024-06-18 08:01:25,414][12883] InferenceWorker_p0-w0: stopping experience collection (21800 times) +[2024-06-18 08:01:25,476][12862] Signal inference workers to resume experience collection... (21800 times) +[2024-06-18 08:01:25,476][12883] InferenceWorker_p0-w0: resuming experience collection (21800 times) +[2024-06-18 08:01:25,606][12883] Updated weights for policy 0, policy_version 91293 (0.0030) +[2024-06-18 08:01:26,994][12645] Fps is (10 sec: 40969.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1495777280. Throughput: 0: 42725.8. Samples: 1495943220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 08:01:26,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 08:01:29,919][12883] Updated weights for policy 0, policy_version 91303 (0.0027) +[2024-06-18 08:01:31,994][12645] Fps is (10 sec: 45885.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1496023040. Throughput: 0: 42755.2. Samples: 1496077240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 08:01:31,994][12645] Avg episode reward: [(0, '0.234')] +[2024-06-18 08:01:33,655][12883] Updated weights for policy 0, policy_version 91313 (0.0043) +[2024-06-18 08:01:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 1496203264. Throughput: 0: 42680.8. Samples: 1496326840. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 08:01:36,994][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 08:01:37,709][12883] Updated weights for policy 0, policy_version 91323 (0.0051) +[2024-06-18 08:01:41,197][12883] Updated weights for policy 0, policy_version 91333 (0.0042) +[2024-06-18 08:01:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1496432640. Throughput: 0: 42500.8. Samples: 1496575420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 08:01:41,994][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 08:01:45,587][12883] Updated weights for policy 0, policy_version 91343 (0.0031) +[2024-06-18 08:01:46,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 1496645632. Throughput: 0: 42497.9. Samples: 1496710380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 08:01:46,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 08:01:48,720][12883] Updated weights for policy 0, policy_version 91353 (0.0031) +[2024-06-18 08:01:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1496825856. Throughput: 0: 42543.9. Samples: 1496963300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 08:01:51,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 08:01:53,073][12883] Updated weights for policy 0, policy_version 91363 (0.0042) +[2024-06-18 08:01:56,284][12883] Updated weights for policy 0, policy_version 91373 (0.0035) +[2024-06-18 08:01:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1497055232. Throughput: 0: 42710.6. Samples: 1497223880. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 08:01:56,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 08:02:00,668][12883] Updated weights for policy 0, policy_version 91383 (0.0032) +[2024-06-18 08:02:01,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1497284608. Throughput: 0: 42736.8. Samples: 1497357580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 08:02:01,994][12645] Avg episode reward: [(0, '0.623')] +[2024-06-18 08:02:03,986][12883] Updated weights for policy 0, policy_version 91393 (0.0032) +[2024-06-18 08:02:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.8, 300 sec: 42709.5). Total num frames: 1497481216. Throughput: 0: 42701.2. Samples: 1497605480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 08:02:06,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 08:02:08,701][12883] Updated weights for policy 0, policy_version 91403 (0.0037) +[2024-06-18 08:02:11,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42873.0, 300 sec: 42653.9). Total num frames: 1497694208. Throughput: 0: 42607.8. Samples: 1497860580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 08:02:11,994][12645] Avg episode reward: [(0, '0.530')] +[2024-06-18 08:02:12,215][12883] Updated weights for policy 0, policy_version 91413 (0.0045) +[2024-06-18 08:02:16,225][12883] Updated weights for policy 0, policy_version 91423 (0.0030) +[2024-06-18 08:02:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42600.0, 300 sec: 42765.4). Total num frames: 1497923584. Throughput: 0: 42442.3. Samples: 1497987140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 08:02:16,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 08:02:19,914][12883] Updated weights for policy 0, policy_version 91433 (0.0024) +[2024-06-18 08:02:21,994][12645] Fps is (10 sec: 44237.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1498136576. Throughput: 0: 42517.9. Samples: 1498240140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 08:02:21,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 08:02:23,827][12883] Updated weights for policy 0, policy_version 91443 (0.0043) +[2024-06-18 08:02:26,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 1498333184. Throughput: 0: 42711.1. Samples: 1498497420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 08:02:26,995][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 08:02:27,363][12883] Updated weights for policy 0, policy_version 91453 (0.0033) +[2024-06-18 08:02:31,322][12883] Updated weights for policy 0, policy_version 91463 (0.0039) +[2024-06-18 08:02:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1498562560. Throughput: 0: 42634.6. Samples: 1498628940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 08:02:31,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 08:02:34,830][12883] Updated weights for policy 0, policy_version 91473 (0.0033) +[2024-06-18 08:02:36,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1498775552. Throughput: 0: 42846.4. Samples: 1498891380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 08:02:36,994][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 08:02:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000091478_1498775552.pth... +[2024-06-18 08:02:37,088][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000090853_1488535552.pth +[2024-06-18 08:02:38,903][12883] Updated weights for policy 0, policy_version 91483 (0.0031) +[2024-06-18 08:02:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 1498972160. Throughput: 0: 42697.0. Samples: 1499145240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 08:02:41,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 08:02:42,446][12883] Updated weights for policy 0, policy_version 91493 (0.0028) +[2024-06-18 08:02:46,019][12862] Signal inference workers to stop experience collection... (21850 times) +[2024-06-18 08:02:46,024][12862] Signal inference workers to resume experience collection... (21850 times) +[2024-06-18 08:02:46,053][12883] InferenceWorker_p0-w0: stopping experience collection (21850 times) +[2024-06-18 08:02:46,053][12883] InferenceWorker_p0-w0: resuming experience collection (21850 times) +[2024-06-18 08:02:46,389][12883] Updated weights for policy 0, policy_version 91503 (0.0044) +[2024-06-18 08:02:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1499201536. Throughput: 0: 42456.8. Samples: 1499268140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 08:02:46,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 08:02:50,020][12883] Updated weights for policy 0, policy_version 91513 (0.0030) +[2024-06-18 08:02:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1499414528. Throughput: 0: 42760.5. Samples: 1499529700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 08:02:51,994][12645] Avg episode reward: [(0, '0.259')] +[2024-06-18 08:02:54,189][12883] Updated weights for policy 0, policy_version 91523 (0.0039) +[2024-06-18 08:02:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1499627520. Throughput: 0: 42761.4. Samples: 1499784840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 08:02:56,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 08:02:57,934][12883] Updated weights for policy 0, policy_version 91533 (0.0042) +[2024-06-18 08:03:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 1499807744. 
Throughput: 0: 42640.8. Samples: 1499905980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 08:03:01,994][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 08:03:02,245][12883] Updated weights for policy 0, policy_version 91543 (0.0035) +[2024-06-18 08:03:05,538][12883] Updated weights for policy 0, policy_version 91553 (0.0050) +[2024-06-18 08:03:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1500053504. Throughput: 0: 42685.6. Samples: 1500161000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 08:03:06,994][12645] Avg episode reward: [(0, '0.547')] +[2024-06-18 08:03:09,810][12883] Updated weights for policy 0, policy_version 91563 (0.0026) +[2024-06-18 08:03:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1500250112. Throughput: 0: 42753.0. Samples: 1500421300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 08:03:11,994][12645] Avg episode reward: [(0, '0.516')] +[2024-06-18 08:03:13,187][12883] Updated weights for policy 0, policy_version 91573 (0.0042) +[2024-06-18 08:03:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1500463104. Throughput: 0: 42698.7. Samples: 1500550380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:03:16,994][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 08:03:17,277][12883] Updated weights for policy 0, policy_version 91583 (0.0031) +[2024-06-18 08:03:21,129][12883] Updated weights for policy 0, policy_version 91593 (0.0021) +[2024-06-18 08:03:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 1500692480. Throughput: 0: 42633.6. Samples: 1500809900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:03:21,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 08:03:24,966][12883] Updated weights for policy 0, policy_version 91603 (0.0040) +[2024-06-18 08:03:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1500905472. Throughput: 0: 42719.9. Samples: 1501067640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:03:26,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 08:03:28,697][12883] Updated weights for policy 0, policy_version 91613 (0.0040) +[2024-06-18 08:03:31,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1501118464. Throughput: 0: 42801.4. Samples: 1501194200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:03:31,994][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 08:03:32,479][12883] Updated weights for policy 0, policy_version 91623 (0.0032) +[2024-06-18 08:03:36,467][12883] Updated weights for policy 0, policy_version 91633 (0.0045) +[2024-06-18 08:03:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1501331456. Throughput: 0: 42770.7. Samples: 1501454380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:03:36,994][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 08:03:39,848][12883] Updated weights for policy 0, policy_version 91643 (0.0036) +[2024-06-18 08:03:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1501528064. Throughput: 0: 42780.1. Samples: 1501709940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:03:41,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 08:03:44,163][12883] Updated weights for policy 0, policy_version 91653 (0.0026) +[2024-06-18 08:03:46,996][12645] Fps is (10 sec: 44228.4, 60 sec: 42870.1, 300 sec: 42764.7). Total num frames: 1501773824. Throughput: 0: 42984.9. Samples: 1501840380. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:03:46,996][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 08:03:47,328][12883] Updated weights for policy 0, policy_version 91663 (0.0027) +[2024-06-18 08:03:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1501954048. Throughput: 0: 42955.1. Samples: 1502093980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:03:51,994][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 08:03:52,034][12883] Updated weights for policy 0, policy_version 91673 (0.0033) +[2024-06-18 08:03:54,926][12883] Updated weights for policy 0, policy_version 91683 (0.0031) +[2024-06-18 08:03:56,994][12645] Fps is (10 sec: 40967.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1502183424. Throughput: 0: 42924.8. Samples: 1502352920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:03:56,998][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 08:03:59,626][12883] Updated weights for policy 0, policy_version 91693 (0.0025) +[2024-06-18 08:04:01,994][12645] Fps is (10 sec: 47514.2, 60 sec: 43690.8, 300 sec: 42820.6). Total num frames: 1502429184. Throughput: 0: 42997.3. Samples: 1502485260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:04:01,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 08:04:02,289][12883] Updated weights for policy 0, policy_version 91703 (0.0036) +[2024-06-18 08:04:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1502593024. Throughput: 0: 43056.2. Samples: 1502747420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:04:06,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 08:04:07,159][12883] Updated weights for policy 0, policy_version 91713 (0.0024) +[2024-06-18 08:04:07,513][12862] Signal inference workers to stop experience collection... 
(21900 times) +[2024-06-18 08:04:07,513][12862] Signal inference workers to resume experience collection... (21900 times) +[2024-06-18 08:04:07,556][12883] InferenceWorker_p0-w0: stopping experience collection (21900 times) +[2024-06-18 08:04:07,556][12883] InferenceWorker_p0-w0: resuming experience collection (21900 times) +[2024-06-18 08:04:09,820][12883] Updated weights for policy 0, policy_version 91723 (0.0040) +[2024-06-18 08:04:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1502822400. Throughput: 0: 42880.1. Samples: 1502997240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 08:04:11,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 08:04:14,582][12883] Updated weights for policy 0, policy_version 91733 (0.0041) +[2024-06-18 08:04:16,994][12645] Fps is (10 sec: 47512.6, 60 sec: 43417.4, 300 sec: 42820.5). Total num frames: 1503068160. Throughput: 0: 42918.9. Samples: 1503125560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 08:04:16,995][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 08:04:17,846][12883] Updated weights for policy 0, policy_version 91743 (0.0031) +[2024-06-18 08:04:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42765.1). Total num frames: 1503264768. Throughput: 0: 42997.4. Samples: 1503389260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 08:04:21,994][12645] Avg episode reward: [(0, '0.628')] +[2024-06-18 08:04:22,140][12883] Updated weights for policy 0, policy_version 91753 (0.0039) +[2024-06-18 08:04:25,547][12883] Updated weights for policy 0, policy_version 91763 (0.0028) +[2024-06-18 08:04:26,994][12645] Fps is (10 sec: 40961.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1503477760. Throughput: 0: 42934.8. Samples: 1503642000. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 08:04:26,994][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 08:04:29,608][12883] Updated weights for policy 0, policy_version 91773 (0.0026) +[2024-06-18 08:04:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 1503674368. Throughput: 0: 42965.4. Samples: 1503773740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 08:04:31,994][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 08:04:32,987][12883] Updated weights for policy 0, policy_version 91783 (0.0028) +[2024-06-18 08:04:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1503903744. Throughput: 0: 43063.6. Samples: 1504031840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 08:04:36,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 08:04:37,078][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000091792_1503920128.pth... +[2024-06-18 08:04:37,128][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000091164_1493630976.pth +[2024-06-18 08:04:37,326][12883] Updated weights for policy 0, policy_version 91793 (0.0037) +[2024-06-18 08:04:40,726][12883] Updated weights for policy 0, policy_version 91803 (0.0031) +[2024-06-18 08:04:41,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1504133120. Throughput: 0: 42915.6. Samples: 1504284120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 08:04:41,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 08:04:44,836][12883] Updated weights for policy 0, policy_version 91813 (0.0026) +[2024-06-18 08:04:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42599.7, 300 sec: 42765.0). Total num frames: 1504329728. Throughput: 0: 42872.3. Samples: 1504414520. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 08:04:46,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 08:04:49,008][12883] Updated weights for policy 0, policy_version 91823 (0.0041) +[2024-06-18 08:04:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1504559104. Throughput: 0: 42828.3. Samples: 1504674700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 08:04:51,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 08:04:52,830][12883] Updated weights for policy 0, policy_version 91833 (0.0043) +[2024-06-18 08:04:56,571][12883] Updated weights for policy 0, policy_version 91843 (0.0038) +[2024-06-18 08:04:56,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 1504772096. Throughput: 0: 42853.9. Samples: 1504925660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 08:04:56,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 08:05:00,287][12883] Updated weights for policy 0, policy_version 91853 (0.0031) +[2024-06-18 08:05:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1504985088. Throughput: 0: 42918.4. Samples: 1505056880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) +[2024-06-18 08:05:01,994][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 08:05:04,347][12883] Updated weights for policy 0, policy_version 91863 (0.0031) +[2024-06-18 08:05:06,997][12645] Fps is (10 sec: 40946.1, 60 sec: 43142.1, 300 sec: 42709.1). Total num frames: 1505181696. Throughput: 0: 42685.7. Samples: 1505310260. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 08:05:06,998][12645] Avg episode reward: [(0, '0.616')] +[2024-06-18 08:05:08,212][12883] Updated weights for policy 0, policy_version 91873 (0.0033) +[2024-06-18 08:05:11,936][12883] Updated weights for policy 0, policy_version 91883 (0.0035) +[2024-06-18 08:05:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1505411072. Throughput: 0: 42822.1. Samples: 1505569000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 08:05:11,994][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 08:05:15,781][12883] Updated weights for policy 0, policy_version 91893 (0.0040) +[2024-06-18 08:05:16,994][12645] Fps is (10 sec: 44251.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1505624064. Throughput: 0: 42711.5. Samples: 1505695760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 08:05:16,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 08:05:19,597][12883] Updated weights for policy 0, policy_version 91903 (0.0023) +[2024-06-18 08:05:22,000][12645] Fps is (10 sec: 40935.0, 60 sec: 42594.0, 300 sec: 42708.6). Total num frames: 1505820672. Throughput: 0: 42492.9. Samples: 1505944280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 08:05:22,000][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 08:05:23,382][12883] Updated weights for policy 0, policy_version 91913 (0.0028) +[2024-06-18 08:05:26,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 1506033664. Throughput: 0: 42753.7. Samples: 1506208040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 08:05:26,994][12645] Avg episode reward: [(0, '0.280')] +[2024-06-18 08:05:27,235][12883] Updated weights for policy 0, policy_version 91923 (0.0038) +[2024-06-18 08:05:30,472][12862] Signal inference workers to stop experience collection... 
(21950 times) +[2024-06-18 08:05:30,472][12862] Signal inference workers to resume experience collection... (21950 times) +[2024-06-18 08:05:30,520][12883] InferenceWorker_p0-w0: stopping experience collection (21950 times) +[2024-06-18 08:05:30,520][12883] InferenceWorker_p0-w0: resuming experience collection (21950 times) +[2024-06-18 08:05:31,113][12883] Updated weights for policy 0, policy_version 91933 (0.0038) +[2024-06-18 08:05:31,994][12645] Fps is (10 sec: 44263.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1506263040. Throughput: 0: 42647.6. Samples: 1506333660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 08:05:31,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 08:05:34,810][12883] Updated weights for policy 0, policy_version 91943 (0.0036) +[2024-06-18 08:05:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1506476032. Throughput: 0: 42407.5. Samples: 1506583040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 08:05:36,995][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 08:05:38,918][12883] Updated weights for policy 0, policy_version 91953 (0.0035) +[2024-06-18 08:05:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1506689024. Throughput: 0: 42727.5. Samples: 1506848400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 08:05:41,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 08:05:42,400][12883] Updated weights for policy 0, policy_version 91963 (0.0036) +[2024-06-18 08:05:46,437][12883] Updated weights for policy 0, policy_version 91973 (0.0029) +[2024-06-18 08:05:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1506885632. Throughput: 0: 42550.0. Samples: 1506971640. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 08:05:46,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 08:05:50,070][12883] Updated weights for policy 0, policy_version 91983 (0.0041) +[2024-06-18 08:05:51,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1507131392. Throughput: 0: 42648.3. Samples: 1507229300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 08:05:51,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 08:05:54,305][12883] Updated weights for policy 0, policy_version 91993 (0.0043) +[2024-06-18 08:05:56,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1507328000. Throughput: 0: 42651.5. Samples: 1507488320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 08:05:56,994][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 08:05:57,671][12883] Updated weights for policy 0, policy_version 92003 (0.0037) +[2024-06-18 08:06:01,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42325.3, 300 sec: 42765.1). Total num frames: 1507524608. Throughput: 0: 42496.5. Samples: 1507608100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 08:06:01,994][12645] Avg episode reward: [(0, '0.572')] +[2024-06-18 08:06:02,143][12883] Updated weights for policy 0, policy_version 92013 (0.0033) +[2024-06-18 08:06:05,600][12883] Updated weights for policy 0, policy_version 92023 (0.0038) +[2024-06-18 08:06:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42873.7, 300 sec: 42820.9). Total num frames: 1507753984. Throughput: 0: 42624.3. Samples: 1507862120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) +[2024-06-18 08:06:06,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 08:06:09,810][12883] Updated weights for policy 0, policy_version 92033 (0.0036) +[2024-06-18 08:06:11,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 1507966976. Throughput: 0: 42341.7. Samples: 1508113420. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) +[2024-06-18 08:06:11,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 08:06:13,456][12883] Updated weights for policy 0, policy_version 92043 (0.0037) +[2024-06-18 08:06:16,994][12645] Fps is (10 sec: 39322.5, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 1508147200. Throughput: 0: 42419.2. Samples: 1508242520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) +[2024-06-18 08:06:16,994][12645] Avg episode reward: [(0, '0.236')] +[2024-06-18 08:06:17,675][12883] Updated weights for policy 0, policy_version 92053 (0.0030) +[2024-06-18 08:06:21,252][12883] Updated weights for policy 0, policy_version 92063 (0.0032) +[2024-06-18 08:06:21,996][12645] Fps is (10 sec: 42589.9, 60 sec: 42874.3, 300 sec: 42764.7). Total num frames: 1508392960. Throughput: 0: 42666.5. Samples: 1508503120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) +[2024-06-18 08:06:21,996][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 08:06:25,131][12883] Updated weights for policy 0, policy_version 92073 (0.0029) +[2024-06-18 08:06:26,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1508605952. Throughput: 0: 42376.5. Samples: 1508755340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) +[2024-06-18 08:06:26,994][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 08:06:28,738][12883] Updated weights for policy 0, policy_version 92083 (0.0027) +[2024-06-18 08:06:31,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1508802560. Throughput: 0: 42581.2. Samples: 1508887780. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) +[2024-06-18 08:06:31,994][12645] Avg episode reward: [(0, '0.580')] +[2024-06-18 08:06:32,597][12883] Updated weights for policy 0, policy_version 92093 (0.0037) +[2024-06-18 08:06:36,281][12883] Updated weights for policy 0, policy_version 92103 (0.0041) +[2024-06-18 08:06:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1509031936. Throughput: 0: 42574.8. Samples: 1509145160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) +[2024-06-18 08:06:36,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 08:06:37,024][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000092105_1509048320.pth... +[2024-06-18 08:06:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000091478_1498775552.pth +[2024-06-18 08:06:40,358][12883] Updated weights for policy 0, policy_version 92113 (0.0024) +[2024-06-18 08:06:41,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1509228544. Throughput: 0: 42449.8. Samples: 1509398560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) +[2024-06-18 08:06:41,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 08:06:43,899][12883] Updated weights for policy 0, policy_version 92123 (0.0037) +[2024-06-18 08:06:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 1509441536. Throughput: 0: 42593.8. Samples: 1509524820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) +[2024-06-18 08:06:46,994][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 08:06:48,061][12862] Signal inference workers to stop experience collection... (22000 times) +[2024-06-18 08:06:48,062][12862] Signal inference workers to resume experience collection... 
(22000 times) +[2024-06-18 08:06:48,080][12883] InferenceWorker_p0-w0: stopping experience collection (22000 times) +[2024-06-18 08:06:48,110][12883] InferenceWorker_p0-w0: resuming experience collection (22000 times) +[2024-06-18 08:06:48,212][12883] Updated weights for policy 0, policy_version 92133 (0.0030) +[2024-06-18 08:06:51,790][12883] Updated weights for policy 0, policy_version 92143 (0.0053) +[2024-06-18 08:06:51,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 1509670912. Throughput: 0: 42641.1. Samples: 1509780960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) +[2024-06-18 08:06:51,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 08:06:55,765][12883] Updated weights for policy 0, policy_version 92153 (0.0027) +[2024-06-18 08:06:56,994][12645] Fps is (10 sec: 42597.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1509867520. Throughput: 0: 42706.7. Samples: 1510035220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) +[2024-06-18 08:06:56,995][12645] Avg episode reward: [(0, '0.535')] +[2024-06-18 08:06:59,330][12883] Updated weights for policy 0, policy_version 92163 (0.0037) +[2024-06-18 08:07:01,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1510080512. Throughput: 0: 42666.5. Samples: 1510162520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 08:07:01,994][12645] Avg episode reward: [(0, '0.637')] +[2024-06-18 08:07:03,512][12883] Updated weights for policy 0, policy_version 92173 (0.0050) +[2024-06-18 08:07:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1510309888. Throughput: 0: 42607.3. Samples: 1510420360. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 08:07:06,994][12645] Avg episode reward: [(0, '0.145')] +[2024-06-18 08:07:07,137][12883] Updated weights for policy 0, policy_version 92183 (0.0025) +[2024-06-18 08:07:11,178][12883] Updated weights for policy 0, policy_version 92193 (0.0031) +[2024-06-18 08:07:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1510522880. Throughput: 0: 42618.6. Samples: 1510673180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 08:07:11,994][12645] Avg episode reward: [(0, '0.185')] +[2024-06-18 08:07:14,573][12883] Updated weights for policy 0, policy_version 92203 (0.0032) +[2024-06-18 08:07:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 1510735872. Throughput: 0: 42571.4. Samples: 1510803500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 08:07:16,994][12645] Avg episode reward: [(0, '0.412')] +[2024-06-18 08:07:18,884][12883] Updated weights for policy 0, policy_version 92213 (0.0029) +[2024-06-18 08:07:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 1510948864. Throughput: 0: 42488.0. Samples: 1511057120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 08:07:21,994][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 08:07:22,498][12883] Updated weights for policy 0, policy_version 92223 (0.0044) +[2024-06-18 08:07:26,416][12883] Updated weights for policy 0, policy_version 92233 (0.0031) +[2024-06-18 08:07:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1511178240. Throughput: 0: 42654.1. Samples: 1511318000. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 08:07:26,994][12645] Avg episode reward: [(0, '0.182')] +[2024-06-18 08:07:30,148][12883] Updated weights for policy 0, policy_version 92243 (0.0032) +[2024-06-18 08:07:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1511358464. Throughput: 0: 42671.6. Samples: 1511445040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 08:07:31,994][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 08:07:34,141][12883] Updated weights for policy 0, policy_version 92253 (0.0033) +[2024-06-18 08:07:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1511604224. Throughput: 0: 42769.2. Samples: 1511705580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 08:07:36,994][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 08:07:37,715][12883] Updated weights for policy 0, policy_version 92263 (0.0033) +[2024-06-18 08:07:41,828][12883] Updated weights for policy 0, policy_version 92273 (0.0044) +[2024-06-18 08:07:41,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1511800832. Throughput: 0: 42798.3. Samples: 1511961140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 08:07:41,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 08:07:45,369][12883] Updated weights for policy 0, policy_version 92283 (0.0044) +[2024-06-18 08:07:46,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1511997440. Throughput: 0: 42682.1. Samples: 1512083220. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 08:07:46,995][12645] Avg episode reward: [(0, '0.324')] +[2024-06-18 08:07:49,430][12883] Updated weights for policy 0, policy_version 92293 (0.0043) +[2024-06-18 08:07:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1512243200. Throughput: 0: 42752.4. Samples: 1512344220. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 08:07:52,003][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 08:07:52,925][12883] Updated weights for policy 0, policy_version 92303 (0.0032) +[2024-06-18 08:07:56,994][12645] Fps is (10 sec: 44237.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1512439808. Throughput: 0: 42949.4. Samples: 1512605900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:07:56,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 08:07:57,083][12883] Updated weights for policy 0, policy_version 92313 (0.0027) +[2024-06-18 08:08:00,509][12883] Updated weights for policy 0, policy_version 92323 (0.0027) +[2024-06-18 08:08:01,996][12645] Fps is (10 sec: 40951.5, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 1512652800. Throughput: 0: 42753.1. Samples: 1512727480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:08:01,997][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 08:08:04,528][12883] Updated weights for policy 0, policy_version 92333 (0.0029) +[2024-06-18 08:08:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 1512865792. Throughput: 0: 42801.0. Samples: 1512983160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:08:06,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 08:08:07,217][12862] Signal inference workers to stop experience collection... (22050 times) +[2024-06-18 08:08:07,262][12883] InferenceWorker_p0-w0: stopping experience collection (22050 times) +[2024-06-18 08:08:07,267][12862] Signal inference workers to resume experience collection... (22050 times) +[2024-06-18 08:08:07,283][12883] InferenceWorker_p0-w0: resuming experience collection (22050 times) +[2024-06-18 08:08:08,008][12883] Updated weights for policy 0, policy_version 92343 (0.0032) +[2024-06-18 08:08:11,994][12645] Fps is (10 sec: 44246.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1513095168. 
Throughput: 0: 42753.8. Samples: 1513241920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:08:11,994][12645] Avg episode reward: [(0, '0.463')] +[2024-06-18 08:08:12,464][12883] Updated weights for policy 0, policy_version 92353 (0.0028) +[2024-06-18 08:08:15,872][12883] Updated weights for policy 0, policy_version 92363 (0.0040) +[2024-06-18 08:08:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1513308160. Throughput: 0: 42800.4. Samples: 1513371060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:08:16,994][12645] Avg episode reward: [(0, '0.431')] +[2024-06-18 08:08:19,877][12883] Updated weights for policy 0, policy_version 92373 (0.0030) +[2024-06-18 08:08:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1513504768. Throughput: 0: 42735.6. Samples: 1513628680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:08:21,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 08:08:23,728][12883] Updated weights for policy 0, policy_version 92383 (0.0030) +[2024-06-18 08:08:27,001][12645] Fps is (10 sec: 40930.1, 60 sec: 42320.3, 300 sec: 42708.4). Total num frames: 1513717760. Throughput: 0: 42877.6. Samples: 1513890940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:08:27,002][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 08:08:27,388][12883] Updated weights for policy 0, policy_version 92393 (0.0035) +[2024-06-18 08:08:31,269][12883] Updated weights for policy 0, policy_version 92403 (0.0035) +[2024-06-18 08:08:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 1513963520. Throughput: 0: 43059.2. Samples: 1514020880. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:08:31,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 08:08:34,962][12883] Updated weights for policy 0, policy_version 92413 (0.0031) +[2024-06-18 08:08:36,994][12645] Fps is (10 sec: 42628.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1514143744. Throughput: 0: 42789.8. Samples: 1514269760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:08:36,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 08:08:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000092416_1514143744.pth... +[2024-06-18 08:08:37,121][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000091792_1503920128.pth +[2024-06-18 08:08:39,110][12883] Updated weights for policy 0, policy_version 92423 (0.0036) +[2024-06-18 08:08:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42654.2). Total num frames: 1514356736. Throughput: 0: 42747.6. Samples: 1514529540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:08:41,994][12645] Avg episode reward: [(0, '0.548')] +[2024-06-18 08:08:42,571][12883] Updated weights for policy 0, policy_version 92433 (0.0042) +[2024-06-18 08:08:46,692][12883] Updated weights for policy 0, policy_version 92443 (0.0032) +[2024-06-18 08:08:46,994][12645] Fps is (10 sec: 44237.6, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 1514586112. Throughput: 0: 42846.6. Samples: 1514655480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:08:46,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 08:08:50,461][12883] Updated weights for policy 0, policy_version 92453 (0.0050) +[2024-06-18 08:08:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 1514782720. Throughput: 0: 42728.8. Samples: 1514905960. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 08:08:51,994][12645] Avg episode reward: [(0, '0.671')] +[2024-06-18 08:08:54,306][12883] Updated weights for policy 0, policy_version 92463 (0.0028) +[2024-06-18 08:08:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1514995712. Throughput: 0: 42820.1. Samples: 1515168820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 08:08:56,994][12645] Avg episode reward: [(0, '0.671')] +[2024-06-18 08:08:57,984][12883] Updated weights for policy 0, policy_version 92473 (0.0029) +[2024-06-18 08:09:01,906][12883] Updated weights for policy 0, policy_version 92483 (0.0027) +[2024-06-18 08:09:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 1515241472. Throughput: 0: 42931.5. Samples: 1515302980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 08:09:01,994][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 08:09:05,758][12883] Updated weights for policy 0, policy_version 92493 (0.0028) +[2024-06-18 08:09:06,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 1515438080. Throughput: 0: 42736.1. Samples: 1515551900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 08:09:06,996][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 08:09:09,767][12883] Updated weights for policy 0, policy_version 92503 (0.0035) +[2024-06-18 08:09:11,935][12862] Signal inference workers to stop experience collection... (22100 times) +[2024-06-18 08:09:11,972][12883] InferenceWorker_p0-w0: stopping experience collection (22100 times) +[2024-06-18 08:09:11,983][12862] Signal inference workers to resume experience collection... (22100 times) +[2024-06-18 08:09:11,991][12883] InferenceWorker_p0-w0: resuming experience collection (22100 times) +[2024-06-18 08:09:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1515651072. 
Throughput: 0: 42793.9. Samples: 1515816360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 08:09:11,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 08:09:13,182][12883] Updated weights for policy 0, policy_version 92513 (0.0030) +[2024-06-18 08:09:16,994][12645] Fps is (10 sec: 42607.2, 60 sec: 42598.2, 300 sec: 42709.4). Total num frames: 1515864064. Throughput: 0: 42689.2. Samples: 1515941900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 08:09:16,994][12645] Avg episode reward: [(0, '0.548')] +[2024-06-18 08:09:17,286][12883] Updated weights for policy 0, policy_version 92523 (0.0036) +[2024-06-18 08:09:20,878][12883] Updated weights for policy 0, policy_version 92533 (0.0027) +[2024-06-18 08:09:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1516093440. Throughput: 0: 42824.6. Samples: 1516196860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 08:09:21,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 08:09:24,918][12883] Updated weights for policy 0, policy_version 92543 (0.0029) +[2024-06-18 08:09:26,994][12645] Fps is (10 sec: 44237.7, 60 sec: 43149.8, 300 sec: 42820.6). Total num frames: 1516306432. Throughput: 0: 42835.9. Samples: 1516457160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 08:09:26,994][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 08:09:28,674][12883] Updated weights for policy 0, policy_version 92553 (0.0034) +[2024-06-18 08:09:31,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 1516519424. Throughput: 0: 42800.0. Samples: 1516581580. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 08:09:31,996][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 08:09:32,449][12883] Updated weights for policy 0, policy_version 92563 (0.0032) +[2024-06-18 08:09:36,586][12883] Updated weights for policy 0, policy_version 92573 (0.0029) +[2024-06-18 08:09:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 1516732416. Throughput: 0: 43088.5. Samples: 1516844940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 08:09:36,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 08:09:40,179][12883] Updated weights for policy 0, policy_version 92583 (0.0043) +[2024-06-18 08:09:41,994][12645] Fps is (10 sec: 42607.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1516945408. Throughput: 0: 42820.9. Samples: 1517095760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 08:09:41,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 08:09:44,424][12883] Updated weights for policy 0, policy_version 92593 (0.0033) +[2024-06-18 08:09:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1517158400. Throughput: 0: 42765.3. Samples: 1517227420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:09:46,994][12645] Avg episode reward: [(0, '0.287')] +[2024-06-18 08:09:47,795][12883] Updated weights for policy 0, policy_version 92603 (0.0049) +[2024-06-18 08:09:51,970][12883] Updated weights for policy 0, policy_version 92613 (0.0027) +[2024-06-18 08:09:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1517371392. Throughput: 0: 43031.0. Samples: 1517488200. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:09:51,994][12645] Avg episode reward: [(0, '0.401')] +[2024-06-18 08:09:55,541][12883] Updated weights for policy 0, policy_version 92623 (0.0029) +[2024-06-18 08:09:56,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43690.6, 300 sec: 42820.5). Total num frames: 1517617152. Throughput: 0: 42758.2. Samples: 1517740480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:09:56,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 08:09:59,532][12883] Updated weights for policy 0, policy_version 92633 (0.0039) +[2024-06-18 08:10:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.5). Total num frames: 1517797376. Throughput: 0: 43038.4. Samples: 1517878620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:10:01,994][12645] Avg episode reward: [(0, '0.209')] +[2024-06-18 08:10:03,058][12883] Updated weights for policy 0, policy_version 92643 (0.0026) +[2024-06-18 08:10:06,962][12883] Updated weights for policy 0, policy_version 92653 (0.0021) +[2024-06-18 08:10:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43146.0, 300 sec: 42765.0). Total num frames: 1518026752. Throughput: 0: 43090.9. Samples: 1518135960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:10:06,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 08:10:10,627][12883] Updated weights for policy 0, policy_version 92663 (0.0034) +[2024-06-18 08:10:11,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1518256128. Throughput: 0: 42917.7. Samples: 1518388460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:10:11,997][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 08:10:14,457][12883] Updated weights for policy 0, policy_version 92673 (0.0028) +[2024-06-18 08:10:16,994][12645] Fps is (10 sec: 40961.0, 60 sec: 42871.6, 300 sec: 42765.9). Total num frames: 1518436352. Throughput: 0: 43136.9. Samples: 1518522640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:10:16,994][12645] Avg episode reward: [(0, '0.235')] +[2024-06-18 08:10:18,157][12883] Updated weights for policy 0, policy_version 92683 (0.0033) +[2024-06-18 08:10:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1518649344. Throughput: 0: 42992.9. Samples: 1518779620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:10:21,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 08:10:22,507][12883] Updated weights for policy 0, policy_version 92693 (0.0035) +[2024-06-18 08:10:25,840][12883] Updated weights for policy 0, policy_version 92703 (0.0040) +[2024-06-18 08:10:26,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1518895104. Throughput: 0: 42883.0. Samples: 1519025500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:10:26,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 08:10:30,308][12883] Updated weights for policy 0, policy_version 92713 (0.0047) +[2024-06-18 08:10:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42326.9, 300 sec: 42654.0). Total num frames: 1519058944. Throughput: 0: 43010.2. Samples: 1519162880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:10:31,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 08:10:33,543][12883] Updated weights for policy 0, policy_version 92723 (0.0036) +[2024-06-18 08:10:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1519288320. Throughput: 0: 42752.5. Samples: 1519412060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:10:36,994][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 08:10:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000092730_1519288320.pth... 
+[2024-06-18 08:10:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000092105_1509048320.pth +[2024-06-18 08:10:38,127][12883] Updated weights for policy 0, policy_version 92733 (0.0036) +[2024-06-18 08:10:41,174][12862] Signal inference workers to stop experience collection... (22150 times) +[2024-06-18 08:10:41,175][12862] Signal inference workers to resume experience collection... (22150 times) +[2024-06-18 08:10:41,218][12883] InferenceWorker_p0-w0: stopping experience collection (22150 times) +[2024-06-18 08:10:41,218][12883] InferenceWorker_p0-w0: resuming experience collection (22150 times) +[2024-06-18 08:10:41,308][12883] Updated weights for policy 0, policy_version 92743 (0.0042) +[2024-06-18 08:10:41,994][12645] Fps is (10 sec: 49151.9, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 1519550464. Throughput: 0: 42705.9. Samples: 1519662240. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) +[2024-06-18 08:10:41,994][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 08:10:46,004][12883] Updated weights for policy 0, policy_version 92753 (0.0022) +[2024-06-18 08:10:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 1519714304. Throughput: 0: 42673.7. Samples: 1519798940. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) +[2024-06-18 08:10:46,995][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 08:10:48,864][12883] Updated weights for policy 0, policy_version 92763 (0.0033) +[2024-06-18 08:10:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1519943680. Throughput: 0: 42666.7. Samples: 1520055960. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) +[2024-06-18 08:10:51,994][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 08:10:53,656][12883] Updated weights for policy 0, policy_version 92773 (0.0027) +[2024-06-18 08:10:56,895][12883] Updated weights for policy 0, policy_version 92783 (0.0038) +[2024-06-18 08:10:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1520156672. Throughput: 0: 42656.0. Samples: 1520307980. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) +[2024-06-18 08:10:56,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 08:11:01,504][12883] Updated weights for policy 0, policy_version 92793 (0.0034) +[2024-06-18 08:11:01,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1520336896. Throughput: 0: 42333.3. Samples: 1520427640. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) +[2024-06-18 08:11:01,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 08:11:04,631][12883] Updated weights for policy 0, policy_version 92803 (0.0040) +[2024-06-18 08:11:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1520599040. Throughput: 0: 42225.3. Samples: 1520679760. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) +[2024-06-18 08:11:06,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 08:11:09,023][12883] Updated weights for policy 0, policy_version 92813 (0.0030) +[2024-06-18 08:11:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42820.5). Total num frames: 1520779264. Throughput: 0: 42713.0. Samples: 1520947580. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) +[2024-06-18 08:11:11,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 08:11:12,359][12883] Updated weights for policy 0, policy_version 92823 (0.0045) +[2024-06-18 08:11:16,625][12883] Updated weights for policy 0, policy_version 92833 (0.0033) +[2024-06-18 08:11:16,994][12645] Fps is (10 sec: 37682.9, 60 sec: 42325.2, 300 sec: 42654.2). Total num frames: 1520975872. Throughput: 0: 42352.4. Samples: 1521068740. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) +[2024-06-18 08:11:16,994][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 08:11:19,873][12883] Updated weights for policy 0, policy_version 92843 (0.0032) +[2024-06-18 08:11:21,994][12645] Fps is (10 sec: 47513.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 1521254400. Throughput: 0: 42480.4. Samples: 1521323680. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) +[2024-06-18 08:11:21,994][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 08:11:24,242][12883] Updated weights for policy 0, policy_version 92853 (0.0029) +[2024-06-18 08:11:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 1521418240. Throughput: 0: 42932.8. Samples: 1521594220. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) +[2024-06-18 08:11:26,994][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 08:11:27,445][12883] Updated weights for policy 0, policy_version 92863 (0.0035) +[2024-06-18 08:11:31,994][12645] Fps is (10 sec: 36045.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1521614848. Throughput: 0: 42519.7. Samples: 1521712320. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) +[2024-06-18 08:11:31,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 08:11:32,194][12883] Updated weights for policy 0, policy_version 92873 (0.0028) +[2024-06-18 08:11:34,929][12883] Updated weights for policy 0, policy_version 92883 (0.0035) +[2024-06-18 08:11:36,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1521876992. Throughput: 0: 42538.2. Samples: 1521970180. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) +[2024-06-18 08:11:36,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 08:11:39,878][12883] Updated weights for policy 0, policy_version 92893 (0.0038) +[2024-06-18 08:11:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41779.3, 300 sec: 42765.0). Total num frames: 1522057216. Throughput: 0: 42888.5. Samples: 1522237960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:11:41,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 08:11:42,694][12883] Updated weights for policy 0, policy_version 92903 (0.0029) +[2024-06-18 08:11:46,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1522253824. Throughput: 0: 42807.5. Samples: 1522353980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:11:46,994][12645] Avg episode reward: [(0, '0.222')] +[2024-06-18 08:11:47,448][12883] Updated weights for policy 0, policy_version 92913 (0.0029) +[2024-06-18 08:11:50,336][12862] Signal inference workers to stop experience collection... (22200 times) +[2024-06-18 08:11:50,389][12862] Signal inference workers to resume experience collection... 
(22200 times) +[2024-06-18 08:11:50,391][12883] InferenceWorker_p0-w0: stopping experience collection (22200 times) +[2024-06-18 08:11:50,419][12883] InferenceWorker_p0-w0: resuming experience collection (22200 times) +[2024-06-18 08:11:50,529][12883] Updated weights for policy 0, policy_version 92923 (0.0043) +[2024-06-18 08:11:51,994][12645] Fps is (10 sec: 47513.7, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 1522532352. Throughput: 0: 43027.6. Samples: 1522616000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:11:51,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 08:11:55,047][12883] Updated weights for policy 0, policy_version 92933 (0.0023) +[2024-06-18 08:11:56,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 1522712576. Throughput: 0: 42939.0. Samples: 1522879840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:11:56,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 08:11:58,151][12883] Updated weights for policy 0, policy_version 92943 (0.0025) +[2024-06-18 08:12:01,994][12645] Fps is (10 sec: 36044.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1522892800. Throughput: 0: 42828.4. Samples: 1522996020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:12:01,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 08:12:02,786][12883] Updated weights for policy 0, policy_version 92953 (0.0034) +[2024-06-18 08:12:05,679][12883] Updated weights for policy 0, policy_version 92963 (0.0028) +[2024-06-18 08:12:06,996][12645] Fps is (10 sec: 47503.4, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 1523187712. Throughput: 0: 43093.4. Samples: 1523262980. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:12:06,997][12645] Avg episode reward: [(0, '0.203')] +[2024-06-18 08:12:10,543][12883] Updated weights for policy 0, policy_version 92973 (0.0041) +[2024-06-18 08:12:11,996][12645] Fps is (10 sec: 45865.2, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 1523351552. Throughput: 0: 42897.5. Samples: 1523524700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:12:11,996][12645] Avg episode reward: [(0, '0.227')] +[2024-06-18 08:12:13,295][12883] Updated weights for policy 0, policy_version 92983 (0.0040) +[2024-06-18 08:12:16,994][12645] Fps is (10 sec: 36052.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1523548160. Throughput: 0: 42891.0. Samples: 1523642420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:12:16,994][12645] Avg episode reward: [(0, '0.219')] +[2024-06-18 08:12:18,065][12883] Updated weights for policy 0, policy_version 92993 (0.0033) +[2024-06-18 08:12:20,804][12883] Updated weights for policy 0, policy_version 93003 (0.0038) +[2024-06-18 08:12:21,996][12645] Fps is (10 sec: 49151.9, 60 sec: 43143.0, 300 sec: 42931.3). Total num frames: 1523843072. Throughput: 0: 43055.7. Samples: 1523907780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:12:21,997][12645] Avg episode reward: [(0, '0.115')] +[2024-06-18 08:12:25,671][12883] Updated weights for policy 0, policy_version 93013 (0.0032) +[2024-06-18 08:12:26,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1523990528. Throughput: 0: 42977.3. Samples: 1524171940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:12:26,994][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 08:12:28,445][12883] Updated weights for policy 0, policy_version 93023 (0.0028) +[2024-06-18 08:12:31,996][12645] Fps is (10 sec: 36044.7, 60 sec: 43142.8, 300 sec: 42709.2). Total num frames: 1524203520. Throughput: 0: 42784.0. Samples: 1524279360. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:12:31,997][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 08:12:33,554][12883] Updated weights for policy 0, policy_version 93033 (0.0051) +[2024-06-18 08:12:36,054][12883] Updated weights for policy 0, policy_version 93043 (0.0030) +[2024-06-18 08:12:36,994][12645] Fps is (10 sec: 47513.8, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 1524465664. Throughput: 0: 42893.8. Samples: 1524546220. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) +[2024-06-18 08:12:36,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 08:12:37,066][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093047_1524482048.pth... +[2024-06-18 08:12:37,134][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000092416_1514143744.pth +[2024-06-18 08:12:41,096][12883] Updated weights for policy 0, policy_version 93053 (0.0034) +[2024-06-18 08:12:41,996][12645] Fps is (10 sec: 40960.3, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 1524613120. Throughput: 0: 42875.8. Samples: 1524809340. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) +[2024-06-18 08:12:41,996][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 08:12:43,811][12883] Updated weights for policy 0, policy_version 93063 (0.0033) +[2024-06-18 08:12:46,994][12645] Fps is (10 sec: 39320.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1524858880. Throughput: 0: 42872.8. Samples: 1524925300. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) +[2024-06-18 08:12:46,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 08:12:48,661][12883] Updated weights for policy 0, policy_version 93073 (0.0033) +[2024-06-18 08:12:49,832][12862] Signal inference workers to stop experience collection... (22250 times) +[2024-06-18 08:12:49,833][12862] Signal inference workers to resume experience collection... 
(22250 times) +[2024-06-18 08:12:49,864][12883] InferenceWorker_p0-w0: stopping experience collection (22250 times) +[2024-06-18 08:12:49,864][12883] InferenceWorker_p0-w0: resuming experience collection (22250 times) +[2024-06-18 08:12:51,361][12883] Updated weights for policy 0, policy_version 93083 (0.0036) +[2024-06-18 08:12:51,994][12645] Fps is (10 sec: 49162.1, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 1525104640. Throughput: 0: 42938.9. Samples: 1525195140. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) +[2024-06-18 08:12:51,995][12645] Avg episode reward: [(0, '0.159')] +[2024-06-18 08:12:56,249][12883] Updated weights for policy 0, policy_version 93093 (0.0031) +[2024-06-18 08:12:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 1525252096. Throughput: 0: 42726.0. Samples: 1525447280. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) +[2024-06-18 08:12:56,994][12645] Avg episode reward: [(0, '0.599')] +[2024-06-18 08:12:59,316][12883] Updated weights for policy 0, policy_version 93103 (0.0031) +[2024-06-18 08:13:01,994][12645] Fps is (10 sec: 39322.5, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 1525497856. Throughput: 0: 42706.8. Samples: 1525564220. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) +[2024-06-18 08:13:01,994][12645] Avg episode reward: [(0, '0.527')] +[2024-06-18 08:13:03,805][12883] Updated weights for policy 0, policy_version 93113 (0.0023) +[2024-06-18 08:13:06,909][12883] Updated weights for policy 0, policy_version 93123 (0.0034) +[2024-06-18 08:13:06,994][12645] Fps is (10 sec: 47514.0, 60 sec: 42326.9, 300 sec: 42820.6). Total num frames: 1525727232. Throughput: 0: 42859.4. Samples: 1525836360. 
Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) +[2024-06-18 08:13:06,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 08:13:11,409][12883] Updated weights for policy 0, policy_version 93133 (0.0029) +[2024-06-18 08:13:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 1525891072. Throughput: 0: 42570.1. Samples: 1526087600. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) +[2024-06-18 08:13:11,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 08:13:14,442][12883] Updated weights for policy 0, policy_version 93143 (0.0036) +[2024-06-18 08:13:16,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1526153216. Throughput: 0: 42857.6. Samples: 1526207860. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) +[2024-06-18 08:13:16,995][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 08:13:19,063][12883] Updated weights for policy 0, policy_version 93153 (0.0032) +[2024-06-18 08:13:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 41780.8, 300 sec: 42821.6). Total num frames: 1526349824. Throughput: 0: 42897.2. Samples: 1526476600. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) +[2024-06-18 08:13:21,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 08:13:22,290][12883] Updated weights for policy 0, policy_version 93163 (0.0041) +[2024-06-18 08:13:26,677][12883] Updated weights for policy 0, policy_version 93173 (0.0027) +[2024-06-18 08:13:26,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1526546432. Throughput: 0: 42743.4. Samples: 1526732700. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) +[2024-06-18 08:13:26,994][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 08:13:29,788][12883] Updated weights for policy 0, policy_version 93183 (0.0041) +[2024-06-18 08:13:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43419.2, 300 sec: 42931.6). Total num frames: 1526808576. Throughput: 0: 42866.3. Samples: 1526854280. 
Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) +[2024-06-18 08:13:31,994][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 08:13:34,173][12883] Updated weights for policy 0, policy_version 93193 (0.0044) +[2024-06-18 08:13:36,996][12645] Fps is (10 sec: 45865.0, 60 sec: 42323.7, 300 sec: 42875.8). Total num frames: 1527005184. Throughput: 0: 42750.9. Samples: 1527119020. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) +[2024-06-18 08:13:36,997][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 08:13:37,411][12883] Updated weights for policy 0, policy_version 93203 (0.0031) +[2024-06-18 08:13:41,654][12883] Updated weights for policy 0, policy_version 93213 (0.0042) +[2024-06-18 08:13:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 1527201792. Throughput: 0: 42787.2. Samples: 1527372700. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) +[2024-06-18 08:13:41,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 08:13:45,175][12883] Updated weights for policy 0, policy_version 93223 (0.0028) +[2024-06-18 08:13:46,994][12645] Fps is (10 sec: 44246.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1527447552. Throughput: 0: 42951.4. Samples: 1527497040. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) +[2024-06-18 08:13:46,994][12645] Avg episode reward: [(0, '0.675')] +[2024-06-18 08:13:49,177][12883] Updated weights for policy 0, policy_version 93233 (0.0040) +[2024-06-18 08:13:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 42876.1). Total num frames: 1527644160. Throughput: 0: 42772.9. Samples: 1527761140. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) +[2024-06-18 08:13:51,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 08:13:52,722][12883] Updated weights for policy 0, policy_version 93243 (0.0032) +[2024-06-18 08:13:56,784][12862] Signal inference workers to stop experience collection... 
(22300 times) +[2024-06-18 08:13:56,785][12862] Signal inference workers to resume experience collection... (22300 times) +[2024-06-18 08:13:56,805][12883] InferenceWorker_p0-w0: stopping experience collection (22300 times) +[2024-06-18 08:13:56,805][12883] InferenceWorker_p0-w0: resuming experience collection (22300 times) +[2024-06-18 08:13:56,936][12883] Updated weights for policy 0, policy_version 93253 (0.0036) +[2024-06-18 08:13:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1527857152. Throughput: 0: 42783.0. Samples: 1528012840. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) +[2024-06-18 08:13:56,994][12645] Avg episode reward: [(0, '0.187')] +[2024-06-18 08:14:00,520][12883] Updated weights for policy 0, policy_version 93263 (0.0031) +[2024-06-18 08:14:02,000][12645] Fps is (10 sec: 42571.7, 60 sec: 42867.0, 300 sec: 42820.0). Total num frames: 1528070144. Throughput: 0: 42893.3. Samples: 1528138320. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) +[2024-06-18 08:14:02,000][12645] Avg episode reward: [(0, '0.187')] +[2024-06-18 08:14:04,584][12883] Updated weights for policy 0, policy_version 93273 (0.0041) +[2024-06-18 08:14:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1528283136. Throughput: 0: 42846.6. Samples: 1528404700. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) +[2024-06-18 08:14:06,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 08:14:08,210][12883] Updated weights for policy 0, policy_version 93283 (0.0032) +[2024-06-18 08:14:11,994][12645] Fps is (10 sec: 42624.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1528496128. Throughput: 0: 42524.4. Samples: 1528646300. 
Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) +[2024-06-18 08:14:11,994][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 08:14:12,158][12883] Updated weights for policy 0, policy_version 93293 (0.0039) +[2024-06-18 08:14:15,875][12883] Updated weights for policy 0, policy_version 93303 (0.0027) +[2024-06-18 08:14:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1528725504. Throughput: 0: 42766.8. Samples: 1528778780. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) +[2024-06-18 08:14:16,994][12645] Avg episode reward: [(0, '0.631')] +[2024-06-18 08:14:19,738][12883] Updated weights for policy 0, policy_version 93313 (0.0039) +[2024-06-18 08:14:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1528889344. Throughput: 0: 42536.3. Samples: 1529033060. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) +[2024-06-18 08:14:21,994][12645] Avg episode reward: [(0, '0.126')] +[2024-06-18 08:14:23,590][12883] Updated weights for policy 0, policy_version 93323 (0.0025) +[2024-06-18 08:14:26,994][12645] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 1529135104. Throughput: 0: 42514.6. Samples: 1529285860. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) +[2024-06-18 08:14:26,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 08:14:27,453][12883] Updated weights for policy 0, policy_version 93333 (0.0026) +[2024-06-18 08:14:31,158][12883] Updated weights for policy 0, policy_version 93343 (0.0035) +[2024-06-18 08:14:31,994][12645] Fps is (10 sec: 47514.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1529364480. Throughput: 0: 42695.7. Samples: 1529418340. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) +[2024-06-18 08:14:31,994][12645] Avg episode reward: [(0, '0.169')] +[2024-06-18 08:14:35,258][12883] Updated weights for policy 0, policy_version 93353 (0.0027) +[2024-06-18 08:14:36,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42325.3, 300 sec: 42709.2). Total num frames: 1529544704. Throughput: 0: 42527.2. Samples: 1529674960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) +[2024-06-18 08:14:36,997][12645] Avg episode reward: [(0, '0.205')] +[2024-06-18 08:14:37,151][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093357_1529561088.pth... +[2024-06-18 08:14:37,207][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000092730_1519288320.pth +[2024-06-18 08:14:38,715][12883] Updated weights for policy 0, policy_version 93363 (0.0032) +[2024-06-18 08:14:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1529774080. Throughput: 0: 42741.9. Samples: 1529936220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) +[2024-06-18 08:14:41,994][12645] Avg episode reward: [(0, '0.218')] +[2024-06-18 08:14:43,000][12883] Updated weights for policy 0, policy_version 93373 (0.0029) +[2024-06-18 08:14:46,260][12883] Updated weights for policy 0, policy_version 93383 (0.0038) +[2024-06-18 08:14:46,994][12645] Fps is (10 sec: 47523.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1530019840. Throughput: 0: 42847.2. Samples: 1530066180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) +[2024-06-18 08:14:46,994][12645] Avg episode reward: [(0, '0.296')] +[2024-06-18 08:14:50,705][12883] Updated weights for policy 0, policy_version 93393 (0.0031) +[2024-06-18 08:14:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1530183680. Throughput: 0: 42647.6. Samples: 1530323840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) +[2024-06-18 08:14:51,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 08:14:53,768][12883] Updated weights for policy 0, policy_version 93403 (0.0036) +[2024-06-18 08:14:56,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1530413056. Throughput: 0: 42994.1. Samples: 1530581040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) +[2024-06-18 08:14:57,000][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 08:14:58,486][12883] Updated weights for policy 0, policy_version 93413 (0.0037) +[2024-06-18 08:15:01,554][12883] Updated weights for policy 0, policy_version 93423 (0.0029) +[2024-06-18 08:15:01,994][12645] Fps is (10 sec: 47513.6, 60 sec: 43149.1, 300 sec: 42820.6). Total num frames: 1530658816. Throughput: 0: 42930.2. Samples: 1530710640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) +[2024-06-18 08:15:01,994][12645] Avg episode reward: [(0, '0.267')] +[2024-06-18 08:15:06,340][12883] Updated weights for policy 0, policy_version 93433 (0.0030) +[2024-06-18 08:15:06,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1530822656. Throughput: 0: 42957.0. Samples: 1530966120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) +[2024-06-18 08:15:06,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 08:15:07,744][12862] Signal inference workers to stop experience collection... (22350 times) +[2024-06-18 08:15:07,744][12862] Signal inference workers to resume experience collection... (22350 times) +[2024-06-18 08:15:07,763][12883] InferenceWorker_p0-w0: stopping experience collection (22350 times) +[2024-06-18 08:15:07,763][12883] InferenceWorker_p0-w0: resuming experience collection (22350 times) +[2024-06-18 08:15:09,093][12883] Updated weights for policy 0, policy_version 93443 (0.0034) +[2024-06-18 08:15:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1531052032. 
Throughput: 0: 43072.5. Samples: 1531224120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) +[2024-06-18 08:15:11,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 08:15:13,959][12883] Updated weights for policy 0, policy_version 93453 (0.0029) +[2024-06-18 08:15:16,721][12883] Updated weights for policy 0, policy_version 93463 (0.0035) +[2024-06-18 08:15:16,994][12645] Fps is (10 sec: 47512.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1531297792. Throughput: 0: 43136.3. Samples: 1531359480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) +[2024-06-18 08:15:16,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 08:15:21,831][12883] Updated weights for policy 0, policy_version 93473 (0.0032) +[2024-06-18 08:15:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1531461632. Throughput: 0: 43032.9. Samples: 1531611340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) +[2024-06-18 08:15:21,994][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 08:15:24,244][12883] Updated weights for policy 0, policy_version 93483 (0.0030) +[2024-06-18 08:15:26,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1531691008. Throughput: 0: 42834.6. Samples: 1531863780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:15:26,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 08:15:29,617][12883] Updated weights for policy 0, policy_version 93493 (0.0033) +[2024-06-18 08:15:31,867][12883] Updated weights for policy 0, policy_version 93503 (0.0029) +[2024-06-18 08:15:31,994][12645] Fps is (10 sec: 49152.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1531953152. Throughput: 0: 42940.6. Samples: 1531998500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:15:31,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 08:15:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 1532084224. 
Throughput: 0: 42723.0. Samples: 1532246380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:15:36,994][12645] Avg episode reward: [(0, '0.167')] +[2024-06-18 08:15:37,215][12883] Updated weights for policy 0, policy_version 93513 (0.0037) +[2024-06-18 08:15:39,631][12883] Updated weights for policy 0, policy_version 93523 (0.0044) +[2024-06-18 08:15:41,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1532329984. Throughput: 0: 42625.9. Samples: 1532499200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:15:41,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 08:15:44,771][12883] Updated weights for policy 0, policy_version 93533 (0.0028) +[2024-06-18 08:15:46,994][12645] Fps is (10 sec: 49152.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1532575744. Throughput: 0: 42719.9. Samples: 1532633040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:15:46,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 08:15:47,276][12883] Updated weights for policy 0, policy_version 93543 (0.0039) +[2024-06-18 08:15:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1532739584. Throughput: 0: 42548.7. Samples: 1532880820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:15:51,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 08:15:52,359][12883] Updated weights for policy 0, policy_version 93553 (0.0027) +[2024-06-18 08:15:55,214][12883] Updated weights for policy 0, policy_version 93563 (0.0038) +[2024-06-18 08:15:56,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1532985344. Throughput: 0: 42411.8. Samples: 1533132660. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:15:56,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 08:15:59,997][12883] Updated weights for policy 0, policy_version 93573 (0.0037) +[2024-06-18 08:16:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1533181952. Throughput: 0: 42359.5. Samples: 1533265660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:16:01,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 08:16:03,026][12883] Updated weights for policy 0, policy_version 93583 (0.0031) +[2024-06-18 08:16:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1533394944. Throughput: 0: 42186.6. Samples: 1533509740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:16:06,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 08:16:08,063][12883] Updated weights for policy 0, policy_version 93593 (0.0038) +[2024-06-18 08:16:10,434][12862] Signal inference workers to stop experience collection... (22400 times) +[2024-06-18 08:16:10,465][12883] InferenceWorker_p0-w0: stopping experience collection (22400 times) +[2024-06-18 08:16:10,490][12862] Signal inference workers to resume experience collection... (22400 times) +[2024-06-18 08:16:10,491][12883] InferenceWorker_p0-w0: resuming experience collection (22400 times) +[2024-06-18 08:16:10,811][12883] Updated weights for policy 0, policy_version 93603 (0.0028) +[2024-06-18 08:16:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1533624320. Throughput: 0: 42256.9. Samples: 1533765340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:16:11,996][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 08:16:15,569][12883] Updated weights for policy 0, policy_version 93613 (0.0036) +[2024-06-18 08:16:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 1533804544. 
Throughput: 0: 42189.2. Samples: 1533897020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:16:16,994][12645] Avg episode reward: [(0, '0.480')] +[2024-06-18 08:16:18,556][12883] Updated weights for policy 0, policy_version 93623 (0.0038) +[2024-06-18 08:16:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1534033920. Throughput: 0: 42316.9. Samples: 1534150640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:16:22,000][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 08:16:23,103][12883] Updated weights for policy 0, policy_version 93633 (0.0025) +[2024-06-18 08:16:26,448][12883] Updated weights for policy 0, policy_version 93643 (0.0028) +[2024-06-18 08:16:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1534246912. Throughput: 0: 42294.3. Samples: 1534402440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 08:16:26,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 08:16:30,676][12883] Updated weights for policy 0, policy_version 93653 (0.0027) +[2024-06-18 08:16:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 1534459904. Throughput: 0: 42244.1. Samples: 1534534020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 08:16:31,994][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 08:16:34,318][12883] Updated weights for policy 0, policy_version 93663 (0.0040) +[2024-06-18 08:16:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 1534689280. Throughput: 0: 42412.9. Samples: 1534789400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 08:16:36,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 08:16:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093670_1534689280.pth... 
+[2024-06-18 08:16:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093047_1524482048.pth +[2024-06-18 08:16:38,423][12883] Updated weights for policy 0, policy_version 93673 (0.0038) +[2024-06-18 08:16:41,975][12883] Updated weights for policy 0, policy_version 93683 (0.0039) +[2024-06-18 08:16:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1534902272. Throughput: 0: 42655.8. Samples: 1535052160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 08:16:41,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 08:16:46,050][12883] Updated weights for policy 0, policy_version 93693 (0.0041) +[2024-06-18 08:16:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 1535082496. Throughput: 0: 42377.3. Samples: 1535172640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 08:16:46,994][12645] Avg episode reward: [(0, '0.574')] +[2024-06-18 08:16:49,838][12883] Updated weights for policy 0, policy_version 93703 (0.0030) +[2024-06-18 08:16:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1535328256. Throughput: 0: 42614.7. Samples: 1535427400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 08:16:51,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 08:16:53,617][12883] Updated weights for policy 0, policy_version 93713 (0.0039) +[2024-06-18 08:16:56,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 1535524864. Throughput: 0: 42813.4. Samples: 1535691940. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 08:16:56,994][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 08:16:57,514][12883] Updated weights for policy 0, policy_version 93723 (0.0030) +[2024-06-18 08:17:01,078][12883] Updated weights for policy 0, policy_version 93733 (0.0028) +[2024-06-18 08:17:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 1535737856. Throughput: 0: 42754.8. Samples: 1535820980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 08:17:01,994][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 08:17:05,143][12883] Updated weights for policy 0, policy_version 93743 (0.0037) +[2024-06-18 08:17:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 1535950848. Throughput: 0: 42786.4. Samples: 1536076020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 08:17:06,994][12645] Avg episode reward: [(0, '0.463')] +[2024-06-18 08:17:08,626][12883] Updated weights for policy 0, policy_version 93753 (0.0025) +[2024-06-18 08:17:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1536180224. Throughput: 0: 42844.9. Samples: 1536330460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 08:17:11,994][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 08:17:12,748][12883] Updated weights for policy 0, policy_version 93763 (0.0029) +[2024-06-18 08:17:16,813][12883] Updated weights for policy 0, policy_version 93773 (0.0043) +[2024-06-18 08:17:16,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42869.9, 300 sec: 42487.3). Total num frames: 1536376832. Throughput: 0: 42775.2. Samples: 1536459000. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 08:17:16,996][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 08:17:20,287][12883] Updated weights for policy 0, policy_version 93783 (0.0044) +[2024-06-18 08:17:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1536589824. Throughput: 0: 42834.0. Samples: 1536716920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 08:17:21,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 08:17:24,296][12883] Updated weights for policy 0, policy_version 93793 (0.0029) +[2024-06-18 08:17:26,994][12645] Fps is (10 sec: 44246.2, 60 sec: 42871.3, 300 sec: 42765.3). Total num frames: 1536819200. Throughput: 0: 42709.6. Samples: 1536974100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 08:17:26,994][12645] Avg episode reward: [(0, '0.061')] +[2024-06-18 08:17:27,873][12883] Updated weights for policy 0, policy_version 93803 (0.0028) +[2024-06-18 08:17:31,750][12883] Updated weights for policy 0, policy_version 93813 (0.0031) +[2024-06-18 08:17:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1537032192. Throughput: 0: 42869.9. Samples: 1537101780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 08:17:31,994][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 08:17:35,886][12883] Updated weights for policy 0, policy_version 93823 (0.0019) +[2024-06-18 08:17:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42765.3). Total num frames: 1537228800. Throughput: 0: 42886.7. Samples: 1537357300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 08:17:36,994][12645] Avg episode reward: [(0, '0.635')] +[2024-06-18 08:17:37,615][12862] Signal inference workers to stop experience collection... (22450 times) +[2024-06-18 08:17:37,665][12862] Signal inference workers to resume experience collection... 
(22450 times) +[2024-06-18 08:17:37,666][12883] InferenceWorker_p0-w0: stopping experience collection (22450 times) +[2024-06-18 08:17:37,683][12883] InferenceWorker_p0-w0: resuming experience collection (22450 times) +[2024-06-18 08:17:39,120][12883] Updated weights for policy 0, policy_version 93833 (0.0032) +[2024-06-18 08:17:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1537458176. Throughput: 0: 42796.4. Samples: 1537617780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 08:17:41,994][12645] Avg episode reward: [(0, '0.510')] +[2024-06-18 08:17:43,491][12883] Updated weights for policy 0, policy_version 93843 (0.0042) +[2024-06-18 08:17:46,678][12883] Updated weights for policy 0, policy_version 93853 (0.0035) +[2024-06-18 08:17:46,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43417.8, 300 sec: 42654.0). Total num frames: 1537687552. Throughput: 0: 42703.6. Samples: 1537742640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 08:17:46,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 08:17:51,222][12883] Updated weights for policy 0, policy_version 93863 (0.0032) +[2024-06-18 08:17:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1537884160. Throughput: 0: 42732.3. Samples: 1537998980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 08:17:51,994][12645] Avg episode reward: [(0, '0.486')] +[2024-06-18 08:17:54,323][12883] Updated weights for policy 0, policy_version 93873 (0.0032) +[2024-06-18 08:17:56,994][12645] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1538097152. Throughput: 0: 42888.7. Samples: 1538260460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 08:17:56,994][12645] Avg episode reward: [(0, '0.486')] +[2024-06-18 08:17:58,650][12883] Updated weights for policy 0, policy_version 93883 (0.0040) +[2024-06-18 08:18:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1538326528. Throughput: 0: 42889.7. Samples: 1538388940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 08:18:01,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 08:18:02,042][12883] Updated weights for policy 0, policy_version 93893 (0.0036) +[2024-06-18 08:18:06,271][12883] Updated weights for policy 0, policy_version 93903 (0.0029) +[2024-06-18 08:18:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1538539520. Throughput: 0: 42950.1. Samples: 1538649680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 08:18:06,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 08:18:10,271][12883] Updated weights for policy 0, policy_version 93913 (0.0032) +[2024-06-18 08:18:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1538719744. Throughput: 0: 42986.4. Samples: 1538908480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 08:18:11,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 08:18:13,830][12883] Updated weights for policy 0, policy_version 93923 (0.0022) +[2024-06-18 08:18:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42873.0, 300 sec: 42709.5). Total num frames: 1538949120. Throughput: 0: 42934.5. Samples: 1539033840. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 08:18:16,994][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 08:18:17,968][12883] Updated weights for policy 0, policy_version 93933 (0.0035) +[2024-06-18 08:18:21,928][12883] Updated weights for policy 0, policy_version 93943 (0.0036) +[2024-06-18 08:18:21,994][12645] Fps is (10 sec: 44235.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1539162112. Throughput: 0: 42798.1. Samples: 1539283220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 08:18:21,995][12645] Avg episode reward: [(0, '0.533')] +[2024-06-18 08:18:25,671][12883] Updated weights for policy 0, policy_version 93953 (0.0025) +[2024-06-18 08:18:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1539358720. Throughput: 0: 42851.1. Samples: 1539546080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 08:18:26,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 08:18:29,549][12883] Updated weights for policy 0, policy_version 93963 (0.0026) +[2024-06-18 08:18:31,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 1539604480. Throughput: 0: 42903.0. Samples: 1539673280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 08:18:31,994][12645] Avg episode reward: [(0, '0.207')] +[2024-06-18 08:18:33,326][12883] Updated weights for policy 0, policy_version 93973 (0.0035) +[2024-06-18 08:18:36,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1539801088. Throughput: 0: 42744.3. Samples: 1539922480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 08:18:36,995][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 08:18:37,125][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093983_1539817472.pth... 
+[2024-06-18 08:18:37,132][12883] Updated weights for policy 0, policy_version 93983 (0.0036) +[2024-06-18 08:18:37,190][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093357_1529561088.pth +[2024-06-18 08:18:41,046][12883] Updated weights for policy 0, policy_version 93993 (0.0035) +[2024-06-18 08:18:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1539997696. Throughput: 0: 42673.5. Samples: 1540180760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 08:18:41,994][12645] Avg episode reward: [(0, '0.239')] +[2024-06-18 08:18:44,802][12883] Updated weights for policy 0, policy_version 94003 (0.0035) +[2024-06-18 08:18:46,994][12645] Fps is (10 sec: 44237.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1540243456. Throughput: 0: 42685.8. Samples: 1540309800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 08:18:46,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 08:18:48,695][12883] Updated weights for policy 0, policy_version 94013 (0.0038) +[2024-06-18 08:18:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1540440064. Throughput: 0: 42453.4. Samples: 1540560080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 08:18:52,000][12645] Avg episode reward: [(0, '0.259')] +[2024-06-18 08:18:52,433][12883] Updated weights for policy 0, policy_version 94023 (0.0036) +[2024-06-18 08:18:56,045][12862] Signal inference workers to stop experience collection... (22500 times) +[2024-06-18 08:18:56,045][12862] Signal inference workers to resume experience collection... 
(22500 times) +[2024-06-18 08:18:56,061][12883] InferenceWorker_p0-w0: stopping experience collection (22500 times) +[2024-06-18 08:18:56,061][12883] InferenceWorker_p0-w0: resuming experience collection (22500 times) +[2024-06-18 08:18:56,197][12883] Updated weights for policy 0, policy_version 94033 (0.0036) +[2024-06-18 08:18:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 1540653056. Throughput: 0: 42529.8. Samples: 1540822320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 08:18:56,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 08:18:59,877][12883] Updated weights for policy 0, policy_version 94043 (0.0029) +[2024-06-18 08:19:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1540866048. Throughput: 0: 42574.8. Samples: 1540949700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 08:19:01,994][12645] Avg episode reward: [(0, '0.380')] +[2024-06-18 08:19:03,842][12883] Updated weights for policy 0, policy_version 94053 (0.0027) +[2024-06-18 08:19:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1541095424. Throughput: 0: 42653.1. Samples: 1541202600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 08:19:06,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 08:19:07,496][12883] Updated weights for policy 0, policy_version 94063 (0.0028) +[2024-06-18 08:19:11,715][12883] Updated weights for policy 0, policy_version 94073 (0.0030) +[2024-06-18 08:19:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1541292032. Throughput: 0: 42630.3. Samples: 1541464440. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 08:19:11,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 08:19:15,220][12883] Updated weights for policy 0, policy_version 94083 (0.0048) +[2024-06-18 08:19:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1541505024. Throughput: 0: 42611.1. Samples: 1541590780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:19:16,994][12645] Avg episode reward: [(0, '0.666')] +[2024-06-18 08:19:19,829][12883] Updated weights for policy 0, policy_version 94093 (0.0036) +[2024-06-18 08:19:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1541734400. Throughput: 0: 42707.3. Samples: 1541844300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:19:21,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 08:19:22,783][12883] Updated weights for policy 0, policy_version 94103 (0.0038) +[2024-06-18 08:19:26,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1541931008. Throughput: 0: 42804.6. Samples: 1542106980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:19:26,995][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 08:19:27,219][12883] Updated weights for policy 0, policy_version 94113 (0.0034) +[2024-06-18 08:19:30,352][12883] Updated weights for policy 0, policy_version 94123 (0.0036) +[2024-06-18 08:19:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 1542160384. Throughput: 0: 42668.3. Samples: 1542229880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:19:31,994][12645] Avg episode reward: [(0, '0.189')] +[2024-06-18 08:19:34,841][12883] Updated weights for policy 0, policy_version 94133 (0.0036) +[2024-06-18 08:19:37,000][12645] Fps is (10 sec: 44210.1, 60 sec: 42867.1, 300 sec: 42708.6). Total num frames: 1542373376. Throughput: 0: 42939.3. Samples: 1542492620. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:19:37,000][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 08:19:37,868][12883] Updated weights for policy 0, policy_version 94143 (0.0024) +[2024-06-18 08:19:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 1542569984. Throughput: 0: 42761.2. Samples: 1542746580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:19:41,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 08:19:42,664][12883] Updated weights for policy 0, policy_version 94153 (0.0042) +[2024-06-18 08:19:45,482][12883] Updated weights for policy 0, policy_version 94163 (0.0025) +[2024-06-18 08:19:46,994][12645] Fps is (10 sec: 42625.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1542799360. Throughput: 0: 42741.3. Samples: 1542873060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:19:46,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 08:19:50,227][12883] Updated weights for policy 0, policy_version 94173 (0.0030) +[2024-06-18 08:19:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1543012352. Throughput: 0: 42978.5. Samples: 1543136640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:19:51,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 08:19:53,117][12883] Updated weights for policy 0, policy_version 94183 (0.0038) +[2024-06-18 08:19:56,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 1543225344. Throughput: 0: 42743.6. Samples: 1543388000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:19:57,005][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 08:19:58,125][12883] Updated weights for policy 0, policy_version 94193 (0.0039) +[2024-06-18 08:20:00,931][12883] Updated weights for policy 0, policy_version 94203 (0.0025) +[2024-06-18 08:20:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1543454720. Throughput: 0: 42727.6. Samples: 1543513520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:20:01,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 08:20:05,755][12883] Updated weights for policy 0, policy_version 94213 (0.0026) +[2024-06-18 08:20:06,994][12645] Fps is (10 sec: 44246.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1543667712. Throughput: 0: 43056.3. Samples: 1543781840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:20:06,994][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 08:20:08,379][12883] Updated weights for policy 0, policy_version 94223 (0.0042) +[2024-06-18 08:20:11,703][12862] Signal inference workers to stop experience collection... (22550 times) +[2024-06-18 08:20:11,703][12862] Signal inference workers to resume experience collection... (22550 times) +[2024-06-18 08:20:11,726][12883] InferenceWorker_p0-w0: stopping experience collection (22550 times) +[2024-06-18 08:20:11,726][12883] InferenceWorker_p0-w0: resuming experience collection (22550 times) +[2024-06-18 08:20:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1543864320. Throughput: 0: 42929.9. Samples: 1544038820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:20:11,994][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 08:20:13,383][12883] Updated weights for policy 0, policy_version 94233 (0.0031) +[2024-06-18 08:20:16,008][12883] Updated weights for policy 0, policy_version 94243 (0.0031) +[2024-06-18 08:20:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1544093696. Throughput: 0: 42823.6. Samples: 1544156940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:20:16,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 08:20:20,978][12883] Updated weights for policy 0, policy_version 94253 (0.0040) +[2024-06-18 08:20:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1544306688. Throughput: 0: 42721.5. Samples: 1544414820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:20:21,994][12645] Avg episode reward: [(0, '0.585')] +[2024-06-18 08:20:23,958][12883] Updated weights for policy 0, policy_version 94263 (0.0034) +[2024-06-18 08:20:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1544503296. Throughput: 0: 42697.9. Samples: 1544667980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:20:26,994][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 08:20:28,794][12883] Updated weights for policy 0, policy_version 94273 (0.0030) +[2024-06-18 08:20:31,530][12883] Updated weights for policy 0, policy_version 94283 (0.0037) +[2024-06-18 08:20:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1544732672. Throughput: 0: 42663.2. Samples: 1544792900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:20:31,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 08:20:36,262][12883] Updated weights for policy 0, policy_version 94293 (0.0041) +[2024-06-18 08:20:36,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42875.9, 300 sec: 42765.0). Total num frames: 1544945664. Throughput: 0: 42720.9. Samples: 1545059080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:20:36,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 08:20:37,137][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000094297_1544962048.pth... +[2024-06-18 08:20:37,200][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093670_1534689280.pth +[2024-06-18 08:20:39,052][12883] Updated weights for policy 0, policy_version 94303 (0.0036) +[2024-06-18 08:20:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1545142272. Throughput: 0: 42813.8. Samples: 1545314520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:20:41,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 08:20:43,831][12883] Updated weights for policy 0, policy_version 94313 (0.0034) +[2024-06-18 08:20:46,580][12883] Updated weights for policy 0, policy_version 94323 (0.0039) +[2024-06-18 08:20:46,995][12645] Fps is (10 sec: 44229.5, 60 sec: 43143.3, 300 sec: 42875.9). Total num frames: 1545388032. Throughput: 0: 42880.5. Samples: 1545443220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:20:46,996][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 08:20:51,537][12883] Updated weights for policy 0, policy_version 94333 (0.0046) +[2024-06-18 08:20:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1545568256. Throughput: 0: 42664.2. Samples: 1545701720. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:20:51,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 08:20:54,584][12883] Updated weights for policy 0, policy_version 94343 (0.0036) +[2024-06-18 08:20:56,994][12645] Fps is (10 sec: 39328.4, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1545781248. Throughput: 0: 42556.5. Samples: 1545953860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:20:56,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 08:20:59,197][12883] Updated weights for policy 0, policy_version 94353 (0.0039) +[2024-06-18 08:21:01,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1546027008. Throughput: 0: 42758.3. Samples: 1546081060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:21:01,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 08:21:02,189][12883] Updated weights for policy 0, policy_version 94363 (0.0036) +[2024-06-18 08:21:06,910][12883] Updated weights for policy 0, policy_version 94373 (0.0035) +[2024-06-18 08:21:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1546207232. Throughput: 0: 42644.8. Samples: 1546333840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:21:06,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 08:21:09,761][12883] Updated weights for policy 0, policy_version 94383 (0.0042) +[2024-06-18 08:21:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1546420224. Throughput: 0: 42684.9. Samples: 1546588800. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) +[2024-06-18 08:21:11,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 08:21:14,846][12883] Updated weights for policy 0, policy_version 94393 (0.0038) +[2024-06-18 08:21:16,994][12645] Fps is (10 sec: 47514.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1546682368. Throughput: 0: 42801.8. Samples: 1546718980. 
Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) +[2024-06-18 08:21:16,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 08:21:17,309][12883] Updated weights for policy 0, policy_version 94403 (0.0021) +[2024-06-18 08:21:20,349][12862] Signal inference workers to stop experience collection... (22600 times) +[2024-06-18 08:21:20,404][12883] InferenceWorker_p0-w0: stopping experience collection (22600 times) +[2024-06-18 08:21:20,463][12862] Signal inference workers to resume experience collection... (22600 times) +[2024-06-18 08:21:20,464][12883] InferenceWorker_p0-w0: resuming experience collection (22600 times) +[2024-06-18 08:21:21,994][12645] Fps is (10 sec: 39320.9, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 1546813440. Throughput: 0: 42452.9. Samples: 1546969460. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) +[2024-06-18 08:21:22,007][12645] Avg episode reward: [(0, '0.773')] +[2024-06-18 08:21:22,014][12862] Saving new best policy, reward=0.773! +[2024-06-18 08:21:22,504][12883] Updated weights for policy 0, policy_version 94413 (0.0026) +[2024-06-18 08:21:25,223][12883] Updated weights for policy 0, policy_version 94423 (0.0032) +[2024-06-18 08:21:26,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1547059200. Throughput: 0: 42411.5. Samples: 1547223040. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) +[2024-06-18 08:21:26,994][12645] Avg episode reward: [(0, '0.631')] +[2024-06-18 08:21:29,998][12883] Updated weights for policy 0, policy_version 94433 (0.0035) +[2024-06-18 08:21:31,994][12645] Fps is (10 sec: 49152.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1547304960. Throughput: 0: 42606.1. Samples: 1547360420. 
Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) +[2024-06-18 08:21:31,994][12645] Avg episode reward: [(0, '0.136')] +[2024-06-18 08:21:32,781][12883] Updated weights for policy 0, policy_version 94443 (0.0034) +[2024-06-18 08:21:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1547468800. Throughput: 0: 42586.9. Samples: 1547618140. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) +[2024-06-18 08:21:36,994][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 08:21:37,591][12883] Updated weights for policy 0, policy_version 94453 (0.0025) +[2024-06-18 08:21:40,319][12883] Updated weights for policy 0, policy_version 94463 (0.0032) +[2024-06-18 08:21:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 1547714560. Throughput: 0: 42457.7. Samples: 1547864460. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) +[2024-06-18 08:21:41,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 08:21:45,303][12883] Updated weights for policy 0, policy_version 94473 (0.0036) +[2024-06-18 08:21:46,994][12645] Fps is (10 sec: 49153.0, 60 sec: 42872.8, 300 sec: 42820.6). Total num frames: 1547960320. Throughput: 0: 42677.4. Samples: 1548001540. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) +[2024-06-18 08:21:46,994][12645] Avg episode reward: [(0, '0.267')] +[2024-06-18 08:21:47,712][12883] Updated weights for policy 0, policy_version 94483 (0.0040) +[2024-06-18 08:21:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 1548124160. Throughput: 0: 42760.0. Samples: 1548258040. 
Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) +[2024-06-18 08:21:51,995][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 08:21:52,891][12883] Updated weights for policy 0, policy_version 94493 (0.0040) +[2024-06-18 08:21:55,497][12883] Updated weights for policy 0, policy_version 94503 (0.0040) +[2024-06-18 08:21:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 1548369920. Throughput: 0: 42630.6. Samples: 1548507180. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) +[2024-06-18 08:21:56,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 08:22:00,657][12883] Updated weights for policy 0, policy_version 94513 (0.0045) +[2024-06-18 08:22:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1548566528. Throughput: 0: 42778.2. Samples: 1548644000. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) +[2024-06-18 08:22:01,999][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 08:22:03,725][12883] Updated weights for policy 0, policy_version 94523 (0.0038) +[2024-06-18 08:22:06,994][12645] Fps is (10 sec: 37682.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1548746752. Throughput: 0: 42791.5. Samples: 1548895080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:22:07,008][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 08:22:08,278][12883] Updated weights for policy 0, policy_version 94533 (0.0032) +[2024-06-18 08:22:11,357][12883] Updated weights for policy 0, policy_version 94543 (0.0035) +[2024-06-18 08:22:11,994][12645] Fps is (10 sec: 45874.4, 60 sec: 43417.5, 300 sec: 42876.4). Total num frames: 1549025280. Throughput: 0: 42753.6. Samples: 1549146960. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:22:11,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 08:22:16,061][12883] Updated weights for policy 0, policy_version 94553 (0.0028) +[2024-06-18 08:22:16,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 1549205504. Throughput: 0: 42784.0. Samples: 1549285700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:22:16,995][12645] Avg episode reward: [(0, '0.314')] +[2024-06-18 08:22:18,926][12883] Updated weights for policy 0, policy_version 94563 (0.0043) +[2024-06-18 08:22:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1549402112. Throughput: 0: 42600.5. Samples: 1549535160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:22:21,995][12645] Avg episode reward: [(0, '0.280')] +[2024-06-18 08:22:23,633][12883] Updated weights for policy 0, policy_version 94573 (0.0035) +[2024-06-18 08:22:26,403][12883] Updated weights for policy 0, policy_version 94583 (0.0054) +[2024-06-18 08:22:26,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 1549664256. Throughput: 0: 42644.4. Samples: 1549783460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:22:26,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 08:22:31,323][12883] Updated weights for policy 0, policy_version 94593 (0.0027) +[2024-06-18 08:22:31,772][12862] Signal inference workers to stop experience collection... (22650 times) +[2024-06-18 08:22:31,772][12862] Signal inference workers to resume experience collection... (22650 times) +[2024-06-18 08:22:31,824][12883] InferenceWorker_p0-w0: stopping experience collection (22650 times) +[2024-06-18 08:22:31,824][12883] InferenceWorker_p0-w0: resuming experience collection (22650 times) +[2024-06-18 08:22:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1549844480. 
Throughput: 0: 42708.3. Samples: 1549923420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:22:31,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 08:22:34,293][12883] Updated weights for policy 0, policy_version 94603 (0.0027) +[2024-06-18 08:22:36,998][12645] Fps is (10 sec: 39303.4, 60 sec: 43141.2, 300 sec: 42708.8). Total num frames: 1550057472. Throughput: 0: 42670.3. Samples: 1550178400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:22:36,999][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 08:22:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000094608_1550057472.pth... +[2024-06-18 08:22:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093983_1539817472.pth +[2024-06-18 08:22:38,782][12883] Updated weights for policy 0, policy_version 94613 (0.0042) +[2024-06-18 08:22:41,913][12883] Updated weights for policy 0, policy_version 94623 (0.0043) +[2024-06-18 08:22:42,000][12645] Fps is (10 sec: 45847.1, 60 sec: 43140.1, 300 sec: 42764.1). Total num frames: 1550303232. Throughput: 0: 42661.7. Samples: 1550427220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:22:42,001][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 08:22:46,423][12883] Updated weights for policy 0, policy_version 94633 (0.0033) +[2024-06-18 08:22:46,994][12645] Fps is (10 sec: 42618.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1550483456. Throughput: 0: 42608.4. Samples: 1550561380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:22:46,994][12645] Avg episode reward: [(0, '0.558')] +[2024-06-18 08:22:49,579][12883] Updated weights for policy 0, policy_version 94643 (0.0036) +[2024-06-18 08:22:51,994][12645] Fps is (10 sec: 37705.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1550680064. Throughput: 0: 42636.3. Samples: 1550813720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:22:51,994][12645] Avg episode reward: [(0, '0.259')] +[2024-06-18 08:22:54,425][12883] Updated weights for policy 0, policy_version 94653 (0.0027) +[2024-06-18 08:22:56,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1550942208. Throughput: 0: 42691.2. Samples: 1551068060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:22:56,994][12645] Avg episode reward: [(0, '0.119')] +[2024-06-18 08:22:57,550][12883] Updated weights for policy 0, policy_version 94663 (0.0044) +[2024-06-18 08:23:01,836][12883] Updated weights for policy 0, policy_version 94673 (0.0032) +[2024-06-18 08:23:01,994][12645] Fps is (10 sec: 44237.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1551122432. Throughput: 0: 42663.1. Samples: 1551205540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:23:01,994][12645] Avg episode reward: [(0, '0.109')] +[2024-06-18 08:23:05,033][12883] Updated weights for policy 0, policy_version 94683 (0.0036) +[2024-06-18 08:23:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1551335424. Throughput: 0: 42710.2. Samples: 1551457120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 08:23:06,994][12645] Avg episode reward: [(0, '0.167')] +[2024-06-18 08:23:09,735][12883] Updated weights for policy 0, policy_version 94693 (0.0035) +[2024-06-18 08:23:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1551564800. Throughput: 0: 42803.7. Samples: 1551709620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 08:23:11,994][12645] Avg episode reward: [(0, '0.197')] +[2024-06-18 08:23:12,598][12883] Updated weights for policy 0, policy_version 94703 (0.0042) +[2024-06-18 08:23:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1551761408. Throughput: 0: 42710.7. Samples: 1551845400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 08:23:16,994][12645] Avg episode reward: [(0, '0.648')] +[2024-06-18 08:23:17,352][12883] Updated weights for policy 0, policy_version 94713 (0.0031) +[2024-06-18 08:23:20,246][12883] Updated weights for policy 0, policy_version 94723 (0.0031) +[2024-06-18 08:23:21,998][12645] Fps is (10 sec: 42581.4, 60 sec: 43141.7, 300 sec: 42820.0). Total num frames: 1551990784. Throughput: 0: 42620.2. Samples: 1552096280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 08:23:21,998][12645] Avg episode reward: [(0, '0.692')] +[2024-06-18 08:23:24,920][12883] Updated weights for policy 0, policy_version 94733 (0.0036) +[2024-06-18 08:23:26,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42709.4). Total num frames: 1552203776. Throughput: 0: 42883.0. Samples: 1552356700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 08:23:26,994][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 08:23:27,925][12883] Updated weights for policy 0, policy_version 94743 (0.0035) +[2024-06-18 08:23:31,994][12645] Fps is (10 sec: 40976.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1552400384. Throughput: 0: 42793.3. Samples: 1552487080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 08:23:31,994][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 08:23:32,348][12883] Updated weights for policy 0, policy_version 94753 (0.0034) +[2024-06-18 08:23:35,612][12883] Updated weights for policy 0, policy_version 94763 (0.0042) +[2024-06-18 08:23:36,994][12645] Fps is (10 sec: 42599.9, 60 sec: 42874.9, 300 sec: 42820.6). Total num frames: 1552629760. Throughput: 0: 42842.1. Samples: 1552741600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 08:23:36,994][12645] Avg episode reward: [(0, '0.656')] +[2024-06-18 08:23:39,957][12883] Updated weights for policy 0, policy_version 94773 (0.0033) +[2024-06-18 08:23:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42056.6, 300 sec: 42653.9). Total num frames: 1552826368. Throughput: 0: 42971.6. Samples: 1553001780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 08:23:41,994][12645] Avg episode reward: [(0, '0.527')] +[2024-06-18 08:23:43,441][12883] Updated weights for policy 0, policy_version 94783 (0.0031) +[2024-06-18 08:23:46,151][12862] Signal inference workers to stop experience collection... (22700 times) +[2024-06-18 08:23:46,152][12862] Signal inference workers to resume experience collection... (22700 times) +[2024-06-18 08:23:46,194][12883] InferenceWorker_p0-w0: stopping experience collection (22700 times) +[2024-06-18 08:23:46,194][12883] InferenceWorker_p0-w0: resuming experience collection (22700 times) +[2024-06-18 08:23:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1553039360. Throughput: 0: 42757.9. Samples: 1553129640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 08:23:46,994][12645] Avg episode reward: [(0, '0.561')] +[2024-06-18 08:23:47,681][12883] Updated weights for policy 0, policy_version 94793 (0.0023) +[2024-06-18 08:23:51,119][12883] Updated weights for policy 0, policy_version 94803 (0.0030) +[2024-06-18 08:23:51,994][12645] Fps is (10 sec: 45875.6, 60 sec: 43417.9, 300 sec: 42820.6). Total num frames: 1553285120. Throughput: 0: 42785.5. Samples: 1553382460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 08:23:51,994][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 08:23:55,667][12883] Updated weights for policy 0, policy_version 94813 (0.0023) +[2024-06-18 08:23:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1553481728. 
Throughput: 0: 42808.9. Samples: 1553636020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 08:23:56,994][12645] Avg episode reward: [(0, '0.203')] +[2024-06-18 08:23:58,848][12883] Updated weights for policy 0, policy_version 94823 (0.0042) +[2024-06-18 08:24:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1553678336. Throughput: 0: 42629.0. Samples: 1553763700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 08:24:01,994][12645] Avg episode reward: [(0, '0.077')] +[2024-06-18 08:24:03,208][12883] Updated weights for policy 0, policy_version 94833 (0.0036) +[2024-06-18 08:24:06,537][12883] Updated weights for policy 0, policy_version 94843 (0.0034) +[2024-06-18 08:24:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1553924096. Throughput: 0: 42818.9. Samples: 1554022960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 08:24:06,994][12645] Avg episode reward: [(0, '0.183')] +[2024-06-18 08:24:10,711][12883] Updated weights for policy 0, policy_version 94853 (0.0033) +[2024-06-18 08:24:11,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 1554137088. Throughput: 0: 42704.8. Samples: 1554278500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 08:24:11,997][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 08:24:13,930][12883] Updated weights for policy 0, policy_version 94863 (0.0040) +[2024-06-18 08:24:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1554317312. Throughput: 0: 42758.7. Samples: 1554411220. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 08:24:16,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 08:24:18,177][12883] Updated weights for policy 0, policy_version 94873 (0.0040) +[2024-06-18 08:24:21,866][12883] Updated weights for policy 0, policy_version 94883 (0.0040) +[2024-06-18 08:24:21,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42874.4, 300 sec: 42820.6). Total num frames: 1554563072. Throughput: 0: 42709.8. Samples: 1554663540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 08:24:21,994][12645] Avg episode reward: [(0, '0.252')] +[2024-06-18 08:24:25,658][12883] Updated weights for policy 0, policy_version 94893 (0.0029) +[2024-06-18 08:24:26,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 1554776064. Throughput: 0: 42715.1. Samples: 1554923960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 08:24:26,994][12645] Avg episode reward: [(0, '0.252')] +[2024-06-18 08:24:29,331][12883] Updated weights for policy 0, policy_version 94903 (0.0029) +[2024-06-18 08:24:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42710.4). Total num frames: 1554972672. Throughput: 0: 42730.7. Samples: 1555052520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 08:24:31,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 08:24:33,161][12883] Updated weights for policy 0, policy_version 94913 (0.0036) +[2024-06-18 08:24:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1555202048. Throughput: 0: 42956.3. Samples: 1555315500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 08:24:36,994][12645] Avg episode reward: [(0, '0.623')] +[2024-06-18 08:24:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000094922_1555202048.pth... 
+[2024-06-18 08:24:37,103][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000094297_1544962048.pth +[2024-06-18 08:24:37,247][12883] Updated weights for policy 0, policy_version 94923 (0.0028) +[2024-06-18 08:24:40,713][12883] Updated weights for policy 0, policy_version 94933 (0.0034) +[2024-06-18 08:24:41,994][12645] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1555415040. Throughput: 0: 43064.8. Samples: 1555573940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 08:24:41,994][12645] Avg episode reward: [(0, '0.729')] +[2024-06-18 08:24:44,694][12883] Updated weights for policy 0, policy_version 94943 (0.0031) +[2024-06-18 08:24:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1555611648. Throughput: 0: 43039.1. Samples: 1555700460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 08:24:46,994][12645] Avg episode reward: [(0, '0.592')] +[2024-06-18 08:24:48,299][12883] Updated weights for policy 0, policy_version 94953 (0.0021) +[2024-06-18 08:24:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 1555841024. Throughput: 0: 43003.2. Samples: 1555958100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 08:24:51,994][12645] Avg episode reward: [(0, '0.586')] +[2024-06-18 08:24:52,586][12883] Updated weights for policy 0, policy_version 94963 (0.0028) +[2024-06-18 08:24:56,613][12883] Updated weights for policy 0, policy_version 94973 (0.0028) +[2024-06-18 08:24:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1556070400. Throughput: 0: 43051.0. Samples: 1556215700. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 08:24:56,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 08:25:00,150][12883] Updated weights for policy 0, policy_version 94983 (0.0021) +[2024-06-18 08:25:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1556250624. Throughput: 0: 42828.5. Samples: 1556338500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:25:02,000][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 08:25:04,035][12883] Updated weights for policy 0, policy_version 94993 (0.0027) +[2024-06-18 08:25:04,619][12862] Signal inference workers to stop experience collection... (22750 times) +[2024-06-18 08:25:04,619][12862] Signal inference workers to resume experience collection... (22750 times) +[2024-06-18 08:25:04,663][12883] InferenceWorker_p0-w0: stopping experience collection (22750 times) +[2024-06-18 08:25:04,663][12883] InferenceWorker_p0-w0: resuming experience collection (22750 times) +[2024-06-18 08:25:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1556496384. Throughput: 0: 43078.2. Samples: 1556602060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:25:06,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 08:25:07,686][12883] Updated weights for policy 0, policy_version 95003 (0.0043) +[2024-06-18 08:25:11,553][12883] Updated weights for policy 0, policy_version 95013 (0.0029) +[2024-06-18 08:25:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 1556709376. Throughput: 0: 42948.0. Samples: 1556856620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:25:11,994][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 08:25:15,243][12883] Updated weights for policy 0, policy_version 95023 (0.0040) +[2024-06-18 08:25:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1556889600. 
Throughput: 0: 42917.3. Samples: 1556983800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:25:16,994][12645] Avg episode reward: [(0, '0.551')] +[2024-06-18 08:25:19,395][12883] Updated weights for policy 0, policy_version 95033 (0.0039) +[2024-06-18 08:25:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1557135360. Throughput: 0: 42698.7. Samples: 1557236940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:25:21,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 08:25:23,219][12883] Updated weights for policy 0, policy_version 95043 (0.0030) +[2024-06-18 08:25:26,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 1557331968. Throughput: 0: 42742.3. Samples: 1557497440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:25:26,997][12645] Avg episode reward: [(0, '0.677')] +[2024-06-18 08:25:27,080][12883] Updated weights for policy 0, policy_version 95053 (0.0030) +[2024-06-18 08:25:30,680][12883] Updated weights for policy 0, policy_version 95063 (0.0040) +[2024-06-18 08:25:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1557544960. Throughput: 0: 42710.2. Samples: 1557622420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:25:31,994][12645] Avg episode reward: [(0, '0.704')] +[2024-06-18 08:25:34,435][12883] Updated weights for policy 0, policy_version 95073 (0.0046) +[2024-06-18 08:25:36,994][12645] Fps is (10 sec: 45885.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1557790720. Throughput: 0: 42808.4. Samples: 1557884480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:25:36,994][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 08:25:38,829][12883] Updated weights for policy 0, policy_version 95083 (0.0040) +[2024-06-18 08:25:41,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42709.7). Total num frames: 1557987328. 
Throughput: 0: 42907.0. Samples: 1558146520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:25:41,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 08:25:42,181][12883] Updated weights for policy 0, policy_version 95093 (0.0030) +[2024-06-18 08:25:46,275][12883] Updated weights for policy 0, policy_version 95103 (0.0038) +[2024-06-18 08:25:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1558200320. Throughput: 0: 43023.9. Samples: 1558274580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:25:46,994][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 08:25:49,632][12883] Updated weights for policy 0, policy_version 95113 (0.0026) +[2024-06-18 08:25:51,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1558413312. Throughput: 0: 42924.9. Samples: 1558533680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 08:25:51,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 08:25:53,589][12883] Updated weights for policy 0, policy_version 95123 (0.0032) +[2024-06-18 08:25:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1558626304. Throughput: 0: 43031.9. Samples: 1558793060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:25:56,994][12645] Avg episode reward: [(0, '0.178')] +[2024-06-18 08:25:57,319][12883] Updated weights for policy 0, policy_version 95133 (0.0039) +[2024-06-18 08:26:01,152][12883] Updated weights for policy 0, policy_version 95143 (0.0041) +[2024-06-18 08:26:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1558839296. Throughput: 0: 43018.6. Samples: 1558919640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:26:01,994][12645] Avg episode reward: [(0, '0.203')] +[2024-06-18 08:26:04,891][12883] Updated weights for policy 0, policy_version 95153 (0.0045) +[2024-06-18 08:26:06,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1559068672. Throughput: 0: 43191.2. Samples: 1559180540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:26:06,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 08:26:08,864][12883] Updated weights for policy 0, policy_version 95163 (0.0034) +[2024-06-18 08:26:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1559265280. Throughput: 0: 43074.7. Samples: 1559435700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:26:11,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 08:26:12,748][12883] Updated weights for policy 0, policy_version 95173 (0.0038) +[2024-06-18 08:26:16,506][12883] Updated weights for policy 0, policy_version 95183 (0.0032) +[2024-06-18 08:26:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 1559478272. Throughput: 0: 43208.4. Samples: 1559566800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:26:16,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 08:26:20,374][12883] Updated weights for policy 0, policy_version 95193 (0.0030) +[2024-06-18 08:26:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1559707648. Throughput: 0: 43124.6. Samples: 1559825080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:26:21,994][12645] Avg episode reward: [(0, '0.228')] +[2024-06-18 08:26:24,310][12883] Updated weights for policy 0, policy_version 95203 (0.0051) +[2024-06-18 08:26:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 1559920640. Throughput: 0: 42995.2. Samples: 1560081300. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:26:26,995][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 08:26:28,081][12883] Updated weights for policy 0, policy_version 95213 (0.0032) +[2024-06-18 08:26:31,812][12883] Updated weights for policy 0, policy_version 95223 (0.0027) +[2024-06-18 08:26:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1560133632. Throughput: 0: 42908.9. Samples: 1560205480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:26:31,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 08:26:35,811][12883] Updated weights for policy 0, policy_version 95233 (0.0032) +[2024-06-18 08:26:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1560346624. Throughput: 0: 42964.7. Samples: 1560467100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:26:36,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 08:26:37,136][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000095237_1560363008.pth... +[2024-06-18 08:26:37,188][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000094608_1550057472.pth +[2024-06-18 08:26:39,486][12883] Updated weights for policy 0, policy_version 95243 (0.0029) +[2024-06-18 08:26:40,219][12862] Signal inference workers to stop experience collection... (22800 times) +[2024-06-18 08:26:40,267][12883] InferenceWorker_p0-w0: stopping experience collection (22800 times) +[2024-06-18 08:26:40,275][12862] Signal inference workers to resume experience collection... (22800 times) +[2024-06-18 08:26:40,284][12883] InferenceWorker_p0-w0: resuming experience collection (22800 times) +[2024-06-18 08:26:41,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 1560576000. Throughput: 0: 42838.8. Samples: 1560720800. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:26:41,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 08:26:43,739][12883] Updated weights for policy 0, policy_version 95253 (0.0045) +[2024-06-18 08:26:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1560772608. Throughput: 0: 42876.3. Samples: 1560849080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:26:46,995][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 08:26:47,135][12883] Updated weights for policy 0, policy_version 95263 (0.0038) +[2024-06-18 08:26:51,293][12883] Updated weights for policy 0, policy_version 95273 (0.0038) +[2024-06-18 08:26:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1560985600. Throughput: 0: 42759.5. Samples: 1561104720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 08:26:51,994][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 08:26:54,879][12883] Updated weights for policy 0, policy_version 95283 (0.0031) +[2024-06-18 08:26:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1561198592. Throughput: 0: 42774.5. Samples: 1561360560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 08:26:56,994][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 08:26:59,135][12883] Updated weights for policy 0, policy_version 95293 (0.0037) +[2024-06-18 08:27:01,996][12645] Fps is (10 sec: 44226.5, 60 sec: 43142.9, 300 sec: 42986.9). Total num frames: 1561427968. Throughput: 0: 42756.1. Samples: 1561490920. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 08:27:01,997][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 08:27:02,437][12883] Updated weights for policy 0, policy_version 95303 (0.0037) +[2024-06-18 08:27:06,645][12883] Updated weights for policy 0, policy_version 95313 (0.0039) +[2024-06-18 08:27:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1561624576. Throughput: 0: 42806.5. Samples: 1561751380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 08:27:06,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 08:27:10,194][12883] Updated weights for policy 0, policy_version 95323 (0.0035) +[2024-06-18 08:27:11,994][12645] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1561853952. Throughput: 0: 42726.0. Samples: 1562003960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 08:27:11,994][12645] Avg episode reward: [(0, '0.080')] +[2024-06-18 08:27:14,127][12883] Updated weights for policy 0, policy_version 95333 (0.0030) +[2024-06-18 08:27:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1562066944. Throughput: 0: 42849.8. Samples: 1562133720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 08:27:16,996][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 08:27:17,785][12883] Updated weights for policy 0, policy_version 95343 (0.0042) +[2024-06-18 08:27:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1562247168. Throughput: 0: 42612.6. Samples: 1562384660. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 08:27:21,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 08:27:22,109][12883] Updated weights for policy 0, policy_version 95353 (0.0029) +[2024-06-18 08:27:25,542][12883] Updated weights for policy 0, policy_version 95363 (0.0034) +[2024-06-18 08:27:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1562476544. Throughput: 0: 42658.2. Samples: 1562640420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 08:27:26,994][12645] Avg episode reward: [(0, '0.672')] +[2024-06-18 08:27:29,680][12883] Updated weights for policy 0, policy_version 95373 (0.0035) +[2024-06-18 08:27:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.7). Total num frames: 1562673152. Throughput: 0: 42701.1. Samples: 1562770620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 08:27:31,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 08:27:33,289][12883] Updated weights for policy 0, policy_version 95383 (0.0040) +[2024-06-18 08:27:36,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 1562902528. Throughput: 0: 42566.0. Samples: 1563020200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 08:27:36,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 08:27:37,272][12883] Updated weights for policy 0, policy_version 95393 (0.0031) +[2024-06-18 08:27:40,898][12883] Updated weights for policy 0, policy_version 95403 (0.0031) +[2024-06-18 08:27:41,993][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1563115520. Throughput: 0: 42646.4. Samples: 1563279640. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 08:27:41,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 08:27:45,037][12883] Updated weights for policy 0, policy_version 95413 (0.0043) +[2024-06-18 08:27:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1563328512. Throughput: 0: 42720.4. Samples: 1563413240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 08:27:46,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 08:27:48,439][12883] Updated weights for policy 0, policy_version 95423 (0.0033) +[2024-06-18 08:27:51,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1563541504. Throughput: 0: 42468.9. Samples: 1563662480. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) +[2024-06-18 08:27:51,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 08:27:52,790][12883] Updated weights for policy 0, policy_version 95433 (0.0031) +[2024-06-18 08:27:56,401][12883] Updated weights for policy 0, policy_version 95443 (0.0027) +[2024-06-18 08:27:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1563754496. Throughput: 0: 42580.4. Samples: 1563920080. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) +[2024-06-18 08:27:56,994][12645] Avg episode reward: [(0, '0.401')] +[2024-06-18 08:28:00,398][12883] Updated weights for policy 0, policy_version 95453 (0.0030) +[2024-06-18 08:28:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42053.9, 300 sec: 42765.0). Total num frames: 1563951104. Throughput: 0: 42577.4. Samples: 1564049700. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) +[2024-06-18 08:28:01,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 08:28:03,874][12883] Updated weights for policy 0, policy_version 95463 (0.0043) +[2024-06-18 08:28:06,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1564196864. Throughput: 0: 42640.7. Samples: 1564303500. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) +[2024-06-18 08:28:06,994][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 08:28:07,847][12883] Updated weights for policy 0, policy_version 95473 (0.0029) +[2024-06-18 08:28:07,863][12862] Signal inference workers to stop experience collection... (22850 times) +[2024-06-18 08:28:07,863][12862] Signal inference workers to resume experience collection... (22850 times) +[2024-06-18 08:28:07,905][12883] InferenceWorker_p0-w0: stopping experience collection (22850 times) +[2024-06-18 08:28:07,906][12883] InferenceWorker_p0-w0: resuming experience collection (22850 times) +[2024-06-18 08:28:11,476][12883] Updated weights for policy 0, policy_version 95483 (0.0033) +[2024-06-18 08:28:11,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 1564393472. Throughput: 0: 42781.2. Samples: 1564565580. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) +[2024-06-18 08:28:11,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 08:28:15,376][12883] Updated weights for policy 0, policy_version 95493 (0.0033) +[2024-06-18 08:28:16,996][12645] Fps is (10 sec: 40951.5, 60 sec: 42323.8, 300 sec: 42765.3). Total num frames: 1564606464. Throughput: 0: 42778.3. Samples: 1564695740. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) +[2024-06-18 08:28:16,997][12645] Avg episode reward: [(0, '0.517')] +[2024-06-18 08:28:19,163][12883] Updated weights for policy 0, policy_version 95503 (0.0043) +[2024-06-18 08:28:21,994][12645] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1564835840. Throughput: 0: 42873.5. Samples: 1564949500. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) +[2024-06-18 08:28:21,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 08:28:22,875][12883] Updated weights for policy 0, policy_version 95513 (0.0030) +[2024-06-18 08:28:26,733][12883] Updated weights for policy 0, policy_version 95523 (0.0035) +[2024-06-18 08:28:26,994][12645] Fps is (10 sec: 44246.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1565048832. Throughput: 0: 42992.7. Samples: 1565214320. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) +[2024-06-18 08:28:26,994][12645] Avg episode reward: [(0, '0.188')] +[2024-06-18 08:28:30,464][12883] Updated weights for policy 0, policy_version 95533 (0.0037) +[2024-06-18 08:28:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1565261824. Throughput: 0: 42808.0. Samples: 1565339600. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) +[2024-06-18 08:28:31,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 08:28:34,280][12883] Updated weights for policy 0, policy_version 95543 (0.0037) +[2024-06-18 08:28:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1565491200. Throughput: 0: 42969.8. Samples: 1565596120. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) +[2024-06-18 08:28:36,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 08:28:37,117][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000095551_1565507584.pth... +[2024-06-18 08:28:37,170][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000094922_1555202048.pth +[2024-06-18 08:28:38,146][12883] Updated weights for policy 0, policy_version 95553 (0.0025) +[2024-06-18 08:28:41,903][12883] Updated weights for policy 0, policy_version 95563 (0.0035) +[2024-06-18 08:28:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1565704192. Throughput: 0: 43018.1. Samples: 1565855900. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) +[2024-06-18 08:28:41,994][12645] Avg episode reward: [(0, '0.571')] +[2024-06-18 08:28:45,566][12883] Updated weights for policy 0, policy_version 95573 (0.0032) +[2024-06-18 08:28:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1565900800. Throughput: 0: 42933.7. Samples: 1565981720. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) +[2024-06-18 08:28:46,994][12645] Avg episode reward: [(0, '0.571')] +[2024-06-18 08:28:49,480][12883] Updated weights for policy 0, policy_version 95583 (0.0033) +[2024-06-18 08:28:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1566130176. Throughput: 0: 43079.2. Samples: 1566242060. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 08:28:51,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 08:28:53,537][12883] Updated weights for policy 0, policy_version 95593 (0.0041) +[2024-06-18 08:28:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1566343168. Throughput: 0: 42951.7. Samples: 1566498400. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 08:28:56,994][12645] Avg episode reward: [(0, '0.413')] +[2024-06-18 08:28:57,093][12883] Updated weights for policy 0, policy_version 95603 (0.0042) +[2024-06-18 08:29:01,612][12883] Updated weights for policy 0, policy_version 95613 (0.0027) +[2024-06-18 08:29:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1566539776. Throughput: 0: 42873.3. Samples: 1566624940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 08:29:01,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 08:29:04,605][12883] Updated weights for policy 0, policy_version 95623 (0.0031) +[2024-06-18 08:29:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 1566769152. Throughput: 0: 43004.3. Samples: 1566884700. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 08:29:06,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 08:29:09,229][12883] Updated weights for policy 0, policy_version 95633 (0.0028) +[2024-06-18 08:29:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1566982144. Throughput: 0: 42870.7. Samples: 1567143500. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 08:29:11,996][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 08:29:12,218][12883] Updated weights for policy 0, policy_version 95643 (0.0028) +[2024-06-18 08:29:16,830][12883] Updated weights for policy 0, policy_version 95653 (0.0037) +[2024-06-18 08:29:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43146.1, 300 sec: 42820.5). Total num frames: 1567195136. Throughput: 0: 42961.2. Samples: 1567272860. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 08:29:16,994][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 08:29:19,867][12883] Updated weights for policy 0, policy_version 95663 (0.0034) +[2024-06-18 08:29:21,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 1567408128. Throughput: 0: 42745.9. Samples: 1567519780. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 08:29:21,996][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 08:29:24,333][12883] Updated weights for policy 0, policy_version 95673 (0.0042) +[2024-06-18 08:29:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1567621120. Throughput: 0: 42913.8. Samples: 1567787020. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 08:29:26,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 08:29:27,644][12883] Updated weights for policy 0, policy_version 95683 (0.0050) +[2024-06-18 08:29:31,983][12883] Updated weights for policy 0, policy_version 95693 (0.0032) +[2024-06-18 08:29:31,996][12645] Fps is (10 sec: 42598.4, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 1567834112. Throughput: 0: 42873.5. Samples: 1567911120. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 08:29:31,996][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 08:29:35,294][12883] Updated weights for policy 0, policy_version 95703 (0.0037) +[2024-06-18 08:29:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1568047104. Throughput: 0: 42659.6. Samples: 1568161740. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 08:29:36,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 08:29:37,190][12862] Signal inference workers to stop experience collection... (22900 times) +[2024-06-18 08:29:37,238][12883] InferenceWorker_p0-w0: stopping experience collection (22900 times) +[2024-06-18 08:29:37,248][12862] Signal inference workers to resume experience collection... (22900 times) +[2024-06-18 08:29:37,260][12883] InferenceWorker_p0-w0: resuming experience collection (22900 times) +[2024-06-18 08:29:39,566][12883] Updated weights for policy 0, policy_version 95713 (0.0030) +[2024-06-18 08:29:41,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1568243712. Throughput: 0: 42854.7. Samples: 1568426860. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 08:29:41,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 08:29:43,001][12883] Updated weights for policy 0, policy_version 95723 (0.0028) +[2024-06-18 08:29:46,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1568473088. 
Throughput: 0: 42760.2. Samples: 1568549160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:29:46,994][12645] Avg episode reward: [(0, '0.609')] +[2024-06-18 08:29:47,139][12883] Updated weights for policy 0, policy_version 95733 (0.0036) +[2024-06-18 08:29:50,670][12883] Updated weights for policy 0, policy_version 95743 (0.0041) +[2024-06-18 08:29:51,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1568702464. Throughput: 0: 42650.2. Samples: 1568803960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:29:51,994][12645] Avg episode reward: [(0, '0.656')] +[2024-06-18 08:29:54,918][12883] Updated weights for policy 0, policy_version 95753 (0.0037) +[2024-06-18 08:29:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1568899072. Throughput: 0: 42702.2. Samples: 1569065100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:29:56,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 08:29:58,221][12883] Updated weights for policy 0, policy_version 95763 (0.0033) +[2024-06-18 08:30:02,000][12645] Fps is (10 sec: 40934.6, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 1569112064. Throughput: 0: 42591.5. Samples: 1569189740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:30:02,001][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 08:30:03,096][12883] Updated weights for policy 0, policy_version 95773 (0.0044) +[2024-06-18 08:30:06,045][12883] Updated weights for policy 0, policy_version 95783 (0.0030) +[2024-06-18 08:30:06,996][12645] Fps is (10 sec: 45864.9, 60 sec: 43143.0, 300 sec: 42875.8). Total num frames: 1569357824. Throughput: 0: 42860.5. Samples: 1569448500. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:30:06,997][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 08:30:10,781][12883] Updated weights for policy 0, policy_version 95793 (0.0030) +[2024-06-18 08:30:11,994][12645] Fps is (10 sec: 44264.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1569554432. Throughput: 0: 42733.0. Samples: 1569710000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:30:11,994][12645] Avg episode reward: [(0, '0.494')] +[2024-06-18 08:30:13,511][12883] Updated weights for policy 0, policy_version 95803 (0.0032) +[2024-06-18 08:30:16,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1569767424. Throughput: 0: 42696.8. Samples: 1569832380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:30:16,994][12645] Avg episode reward: [(0, '0.447')] +[2024-06-18 08:30:18,319][12883] Updated weights for policy 0, policy_version 95813 (0.0033) +[2024-06-18 08:30:21,015][12883] Updated weights for policy 0, policy_version 95823 (0.0035) +[2024-06-18 08:30:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43146.1, 300 sec: 42932.0). Total num frames: 1569996800. Throughput: 0: 42861.3. Samples: 1570090500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:30:21,994][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 08:30:25,887][12883] Updated weights for policy 0, policy_version 95833 (0.0043) +[2024-06-18 08:30:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1570177024. Throughput: 0: 42820.9. Samples: 1570353800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:30:26,994][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 08:30:28,873][12883] Updated weights for policy 0, policy_version 95843 (0.0026) +[2024-06-18 08:30:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 1570422784. Throughput: 0: 42791.3. Samples: 1570474760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:30:31,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 08:30:33,432][12883] Updated weights for policy 0, policy_version 95853 (0.0037) +[2024-06-18 08:30:36,464][12883] Updated weights for policy 0, policy_version 95863 (0.0044) +[2024-06-18 08:30:37,000][12645] Fps is (10 sec: 45846.3, 60 sec: 43140.0, 300 sec: 42875.2). Total num frames: 1570635776. Throughput: 0: 42861.2. Samples: 1570732980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:30:37,001][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 08:30:37,091][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000095865_1570652160.pth... +[2024-06-18 08:30:37,153][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000095237_1560363008.pth +[2024-06-18 08:30:41,643][12883] Updated weights for policy 0, policy_version 95873 (0.0041) +[2024-06-18 08:30:41,994][12645] Fps is (10 sec: 36044.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1570783232. Throughput: 0: 42934.7. Samples: 1570997160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:30:41,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 08:30:44,235][12883] Updated weights for policy 0, policy_version 95883 (0.0028) +[2024-06-18 08:30:46,996][12645] Fps is (10 sec: 40976.5, 60 sec: 42870.0, 300 sec: 42820.2). Total num frames: 1571045376. Throughput: 0: 42602.5. Samples: 1571106680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 08:30:46,997][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 08:30:49,336][12883] Updated weights for policy 0, policy_version 95893 (0.0033) +[2024-06-18 08:30:51,922][12883] Updated weights for policy 0, policy_version 95903 (0.0031) +[2024-06-18 08:30:51,994][12645] Fps is (10 sec: 49151.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1571274752. Throughput: 0: 42622.6. Samples: 1571366420. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 08:30:51,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 08:30:56,996][12645] Fps is (10 sec: 37683.1, 60 sec: 42050.7, 300 sec: 42653.6). Total num frames: 1571422208. Throughput: 0: 42645.0. Samples: 1571629120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 08:30:56,996][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 08:30:57,086][12883] Updated weights for policy 0, policy_version 95913 (0.0034) +[2024-06-18 08:30:59,531][12883] Updated weights for policy 0, policy_version 95923 (0.0027) +[2024-06-18 08:31:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 1571684352. Throughput: 0: 42511.1. Samples: 1571745380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 08:31:01,994][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 08:31:04,657][12883] Updated weights for policy 0, policy_version 95933 (0.0032) +[2024-06-18 08:31:06,321][12862] Signal inference workers to stop experience collection... (22950 times) +[2024-06-18 08:31:06,321][12862] Signal inference workers to resume experience collection... (22950 times) +[2024-06-18 08:31:06,332][12883] InferenceWorker_p0-w0: stopping experience collection (22950 times) +[2024-06-18 08:31:06,332][12883] InferenceWorker_p0-w0: resuming experience collection (22950 times) +[2024-06-18 08:31:06,994][12645] Fps is (10 sec: 49163.3, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 1571913728. Throughput: 0: 42755.2. Samples: 1572014480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 08:31:06,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 08:31:07,100][12883] Updated weights for policy 0, policy_version 95943 (0.0040) +[2024-06-18 08:31:11,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1572077568. Throughput: 0: 42671.5. Samples: 1572274020. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 08:31:11,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 08:31:12,247][12883] Updated weights for policy 0, policy_version 95953 (0.0031) +[2024-06-18 08:31:15,040][12883] Updated weights for policy 0, policy_version 95963 (0.0041) +[2024-06-18 08:31:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1572323328. Throughput: 0: 42570.6. Samples: 1572390440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 08:31:16,994][12645] Avg episode reward: [(0, '0.397')] +[2024-06-18 08:31:19,882][12883] Updated weights for policy 0, policy_version 95973 (0.0031) +[2024-06-18 08:31:21,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1572552704. Throughput: 0: 42727.7. Samples: 1572655460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 08:31:22,005][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 08:31:22,665][12883] Updated weights for policy 0, policy_version 95983 (0.0032) +[2024-06-18 08:31:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1572716544. Throughput: 0: 42491.5. Samples: 1572909280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 08:31:26,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 08:31:27,559][12883] Updated weights for policy 0, policy_version 95993 (0.0034) +[2024-06-18 08:31:30,242][12883] Updated weights for policy 0, policy_version 96003 (0.0038) +[2024-06-18 08:31:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1572962304. Throughput: 0: 42691.4. Samples: 1573027700. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 08:31:31,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 08:31:35,117][12883] Updated weights for policy 0, policy_version 96013 (0.0037) +[2024-06-18 08:31:36,994][12645] Fps is (10 sec: 47513.8, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 1573191680. Throughput: 0: 42878.2. Samples: 1573295940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 08:31:36,994][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 08:31:37,792][12883] Updated weights for policy 0, policy_version 96023 (0.0023) +[2024-06-18 08:31:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1573371904. Throughput: 0: 42666.5. Samples: 1573549020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 08:31:41,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 08:31:42,556][12883] Updated weights for policy 0, policy_version 96033 (0.0029) +[2024-06-18 08:31:45,481][12883] Updated weights for policy 0, policy_version 96043 (0.0028) +[2024-06-18 08:31:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42873.0, 300 sec: 42820.5). Total num frames: 1573617664. Throughput: 0: 42895.0. Samples: 1573675660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 08:31:46,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 08:31:50,562][12883] Updated weights for policy 0, policy_version 96053 (0.0033) +[2024-06-18 08:31:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1573814272. Throughput: 0: 42780.8. Samples: 1573939620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 08:31:51,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 08:31:53,162][12883] Updated weights for policy 0, policy_version 96063 (0.0029) +[2024-06-18 08:31:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43419.2, 300 sec: 42709.8). Total num frames: 1574027264. Throughput: 0: 42551.2. Samples: 1574188820. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 08:31:56,994][12645] Avg episode reward: [(0, '0.537')] +[2024-06-18 08:31:58,159][12883] Updated weights for policy 0, policy_version 96073 (0.0029) +[2024-06-18 08:32:00,764][12883] Updated weights for policy 0, policy_version 96083 (0.0032) +[2024-06-18 08:32:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 1574256640. Throughput: 0: 42807.4. Samples: 1574316780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 08:32:01,994][12645] Avg episode reward: [(0, '0.680')] +[2024-06-18 08:32:05,772][12883] Updated weights for policy 0, policy_version 96093 (0.0035) +[2024-06-18 08:32:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1574469632. Throughput: 0: 42895.3. Samples: 1574585740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 08:32:06,994][12645] Avg episode reward: [(0, '0.518')] +[2024-06-18 08:32:08,254][12883] Updated weights for policy 0, policy_version 96103 (0.0033) +[2024-06-18 08:32:10,180][12862] Signal inference workers to stop experience collection... (23000 times) +[2024-06-18 08:32:10,180][12862] Signal inference workers to resume experience collection... (23000 times) +[2024-06-18 08:32:10,196][12883] InferenceWorker_p0-w0: stopping experience collection (23000 times) +[2024-06-18 08:32:10,197][12883] InferenceWorker_p0-w0: resuming experience collection (23000 times) +[2024-06-18 08:32:11,994][12645] Fps is (10 sec: 42599.1, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 1574682624. Throughput: 0: 42758.3. Samples: 1574833400. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 08:32:11,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 08:32:13,241][12883] Updated weights for policy 0, policy_version 96113 (0.0037) +[2024-06-18 08:32:15,835][12883] Updated weights for policy 0, policy_version 96123 (0.0034) +[2024-06-18 08:32:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1574895616. Throughput: 0: 42949.4. Samples: 1574960420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 08:32:16,994][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 08:32:20,795][12883] Updated weights for policy 0, policy_version 96133 (0.0037) +[2024-06-18 08:32:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1575092224. Throughput: 0: 42711.1. Samples: 1575217940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 08:32:21,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 08:32:23,789][12883] Updated weights for policy 0, policy_version 96143 (0.0029) +[2024-06-18 08:32:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1575305216. Throughput: 0: 42731.1. Samples: 1575471920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 08:32:26,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 08:32:28,495][12883] Updated weights for policy 0, policy_version 96153 (0.0035) +[2024-06-18 08:32:31,560][12883] Updated weights for policy 0, policy_version 96163 (0.0038) +[2024-06-18 08:32:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1575550976. Throughput: 0: 42758.3. Samples: 1575599780. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 08:32:31,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 08:32:36,042][12883] Updated weights for policy 0, policy_version 96173 (0.0044) +[2024-06-18 08:32:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1575731200. Throughput: 0: 42626.8. Samples: 1575857820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 08:32:36,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 08:32:37,077][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000096176_1575747584.pth... +[2024-06-18 08:32:37,141][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000095551_1565507584.pth +[2024-06-18 08:32:39,293][12883] Updated weights for policy 0, policy_version 96183 (0.0028) +[2024-06-18 08:32:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1575944192. Throughput: 0: 42588.9. Samples: 1576105320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 08:32:41,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 08:32:43,743][12883] Updated weights for policy 0, policy_version 96193 (0.0032) +[2024-06-18 08:32:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1576173568. Throughput: 0: 42686.0. Samples: 1576237640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 08:32:46,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 08:32:47,103][12883] Updated weights for policy 0, policy_version 96203 (0.0040) +[2024-06-18 08:32:51,458][12883] Updated weights for policy 0, policy_version 96213 (0.0041) +[2024-06-18 08:32:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1576370176. Throughput: 0: 42497.7. Samples: 1576498140. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 08:32:51,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 08:32:54,826][12883] Updated weights for policy 0, policy_version 96223 (0.0035) +[2024-06-18 08:32:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1576583168. Throughput: 0: 42623.5. Samples: 1576751460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 08:32:56,994][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 08:32:59,310][12883] Updated weights for policy 0, policy_version 96233 (0.0041) +[2024-06-18 08:33:01,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1576828928. Throughput: 0: 42553.4. Samples: 1576875320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 08:33:01,994][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 08:33:02,441][12883] Updated weights for policy 0, policy_version 96243 (0.0034) +[2024-06-18 08:33:06,838][12883] Updated weights for policy 0, policy_version 96253 (0.0044) +[2024-06-18 08:33:06,996][12645] Fps is (10 sec: 42588.2, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 1577009152. Throughput: 0: 42588.0. Samples: 1577134500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 08:33:06,997][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 08:33:10,255][12883] Updated weights for policy 0, policy_version 96263 (0.0032) +[2024-06-18 08:33:11,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 1577222144. Throughput: 0: 42672.6. Samples: 1577392180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 08:33:11,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 08:33:14,455][12883] Updated weights for policy 0, policy_version 96273 (0.0032) +[2024-06-18 08:33:16,994][12645] Fps is (10 sec: 45885.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 1577467904. Throughput: 0: 42689.5. Samples: 1577520820. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 08:33:16,995][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 08:33:17,881][12883] Updated weights for policy 0, policy_version 96283 (0.0042) +[2024-06-18 08:33:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1577648128. Throughput: 0: 42713.4. Samples: 1577779920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 08:33:21,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 08:33:22,051][12883] Updated weights for policy 0, policy_version 96293 (0.0031) +[2024-06-18 08:33:25,695][12883] Updated weights for policy 0, policy_version 96303 (0.0031) +[2024-06-18 08:33:26,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1577877504. Throughput: 0: 42765.7. Samples: 1578029780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 08:33:26,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 08:33:27,872][12862] Signal inference workers to stop experience collection... (23050 times) +[2024-06-18 08:33:27,928][12883] InferenceWorker_p0-w0: stopping experience collection (23050 times) +[2024-06-18 08:33:27,928][12862] Signal inference workers to resume experience collection... (23050 times) +[2024-06-18 08:33:27,940][12883] InferenceWorker_p0-w0: resuming experience collection (23050 times) +[2024-06-18 08:33:29,698][12883] Updated weights for policy 0, policy_version 96313 (0.0043) +[2024-06-18 08:33:31,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1578106880. Throughput: 0: 42779.0. Samples: 1578162700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 08:33:31,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 08:33:33,395][12883] Updated weights for policy 0, policy_version 96323 (0.0034) +[2024-06-18 08:33:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1578287104. 
Throughput: 0: 42735.0. Samples: 1578421220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 08:33:36,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 08:33:37,425][12883] Updated weights for policy 0, policy_version 96333 (0.0031) +[2024-06-18 08:33:41,122][12883] Updated weights for policy 0, policy_version 96343 (0.0034) +[2024-06-18 08:33:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1578516480. Throughput: 0: 42704.0. Samples: 1578673140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:33:41,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 08:33:45,095][12883] Updated weights for policy 0, policy_version 96353 (0.0045) +[2024-06-18 08:33:46,994][12645] Fps is (10 sec: 47513.5, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 1578762240. Throughput: 0: 42824.8. Samples: 1578802440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:33:46,994][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 08:33:48,665][12883] Updated weights for policy 0, policy_version 96363 (0.0033) +[2024-06-18 08:33:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1578926080. Throughput: 0: 42851.9. Samples: 1579062740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:33:51,994][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 08:33:52,722][12883] Updated weights for policy 0, policy_version 96373 (0.0026) +[2024-06-18 08:33:56,446][12883] Updated weights for policy 0, policy_version 96383 (0.0039) +[2024-06-18 08:33:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1579155456. Throughput: 0: 42640.8. Samples: 1579311020. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:33:56,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 08:34:00,491][12883] Updated weights for policy 0, policy_version 96393 (0.0032) +[2024-06-18 08:34:01,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1579384832. Throughput: 0: 42743.8. Samples: 1579444280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:34:01,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 08:34:04,026][12883] Updated weights for policy 0, policy_version 96403 (0.0038) +[2024-06-18 08:34:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1579581440. Throughput: 0: 42703.5. Samples: 1579701580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:34:06,994][12645] Avg episode reward: [(0, '0.189')] +[2024-06-18 08:34:08,097][12883] Updated weights for policy 0, policy_version 96413 (0.0038) +[2024-06-18 08:34:11,649][12883] Updated weights for policy 0, policy_version 96423 (0.0041) +[2024-06-18 08:34:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1579810816. Throughput: 0: 42716.5. Samples: 1579952020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:34:11,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 08:34:15,761][12883] Updated weights for policy 0, policy_version 96433 (0.0043) +[2024-06-18 08:34:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42765.3). Total num frames: 1580023808. Throughput: 0: 42604.4. Samples: 1580079900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:34:16,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 08:34:19,334][12883] Updated weights for policy 0, policy_version 96443 (0.0042) +[2024-06-18 08:34:21,996][12645] Fps is (10 sec: 39312.7, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 1580204032. Throughput: 0: 42588.1. Samples: 1580337780. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:34:21,996][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 08:34:23,327][12883] Updated weights for policy 0, policy_version 96453 (0.0028) +[2024-06-18 08:34:26,768][12883] Updated weights for policy 0, policy_version 96463 (0.0032) +[2024-06-18 08:34:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 1580449792. Throughput: 0: 42697.3. Samples: 1580594520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:34:26,994][12645] Avg episode reward: [(0, '0.635')] +[2024-06-18 08:34:30,874][12883] Updated weights for policy 0, policy_version 96473 (0.0031) +[2024-06-18 08:34:31,994][12645] Fps is (10 sec: 45885.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1580662784. Throughput: 0: 42804.8. Samples: 1580728660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:34:31,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 08:34:34,325][12883] Updated weights for policy 0, policy_version 96483 (0.0031) +[2024-06-18 08:34:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1580843008. Throughput: 0: 42615.2. Samples: 1580980420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:34:36,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 08:34:37,097][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000096488_1580859392.pth... +[2024-06-18 08:34:37,163][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000095865_1570652160.pth +[2024-06-18 08:34:38,916][12883] Updated weights for policy 0, policy_version 96493 (0.0029) +[2024-06-18 08:34:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1581088768. Throughput: 0: 42676.7. Samples: 1581231480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:34:41,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 08:34:42,126][12883] Updated weights for policy 0, policy_version 96503 (0.0034) +[2024-06-18 08:34:46,546][12883] Updated weights for policy 0, policy_version 96513 (0.0036) +[2024-06-18 08:34:46,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1581301760. Throughput: 0: 42673.3. Samples: 1581364580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:34:46,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 08:34:49,691][12883] Updated weights for policy 0, policy_version 96523 (0.0036) +[2024-06-18 08:34:50,872][12862] Signal inference workers to stop experience collection... (23100 times) +[2024-06-18 08:34:50,872][12862] Signal inference workers to resume experience collection... (23100 times) +[2024-06-18 08:34:50,910][12883] InferenceWorker_p0-w0: stopping experience collection (23100 times) +[2024-06-18 08:34:50,910][12883] InferenceWorker_p0-w0: resuming experience collection (23100 times) +[2024-06-18 08:34:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1581498368. Throughput: 0: 42646.2. Samples: 1581620660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:34:51,994][12645] Avg episode reward: [(0, '0.574')] +[2024-06-18 08:34:54,120][12883] Updated weights for policy 0, policy_version 96533 (0.0027) +[2024-06-18 08:34:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 1581727744. Throughput: 0: 42781.2. Samples: 1581877180. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:34:56,994][12645] Avg episode reward: [(0, '0.676')] +[2024-06-18 08:34:57,521][12883] Updated weights for policy 0, policy_version 96543 (0.0035) +[2024-06-18 08:35:01,604][12883] Updated weights for policy 0, policy_version 96553 (0.0039) +[2024-06-18 08:35:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 1581940736. Throughput: 0: 42815.7. Samples: 1582006600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:35:01,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 08:35:05,494][12883] Updated weights for policy 0, policy_version 96563 (0.0040) +[2024-06-18 08:35:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1582120960. Throughput: 0: 42535.4. Samples: 1582251780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:35:06,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 08:35:09,553][12883] Updated weights for policy 0, policy_version 96573 (0.0045) +[2024-06-18 08:35:11,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1582350336. Throughput: 0: 42499.1. Samples: 1582506980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:35:11,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 08:35:13,380][12883] Updated weights for policy 0, policy_version 96583 (0.0023) +[2024-06-18 08:35:16,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1582563328. Throughput: 0: 42448.5. Samples: 1582638840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:35:16,994][12645] Avg episode reward: [(0, '0.451')] +[2024-06-18 08:35:17,100][12883] Updated weights for policy 0, policy_version 96593 (0.0037) +[2024-06-18 08:35:21,175][12883] Updated weights for policy 0, policy_version 96603 (0.0035) +[2024-06-18 08:35:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42873.0, 300 sec: 42709.5). Total num frames: 1582776320. Throughput: 0: 42476.9. Samples: 1582891880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:35:21,994][12645] Avg episode reward: [(0, '0.647')] +[2024-06-18 08:35:24,783][12883] Updated weights for policy 0, policy_version 96613 (0.0040) +[2024-06-18 08:35:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1582989312. Throughput: 0: 42497.0. Samples: 1583143840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:35:26,994][12645] Avg episode reward: [(0, '0.259')] +[2024-06-18 08:35:29,077][12883] Updated weights for policy 0, policy_version 96623 (0.0036) +[2024-06-18 08:35:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42599.3). Total num frames: 1583202304. Throughput: 0: 42498.7. Samples: 1583277020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:35:31,994][12645] Avg episode reward: [(0, '0.569')] +[2024-06-18 08:35:32,650][12883] Updated weights for policy 0, policy_version 96633 (0.0030) +[2024-06-18 08:35:36,784][12883] Updated weights for policy 0, policy_version 96643 (0.0038) +[2024-06-18 08:35:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1583398912. Throughput: 0: 42388.0. Samples: 1583528120. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) +[2024-06-18 08:35:36,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 08:35:40,266][12883] Updated weights for policy 0, policy_version 96653 (0.0036) +[2024-06-18 08:35:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42654.3). Total num frames: 1583628288. Throughput: 0: 42247.7. Samples: 1583778320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) +[2024-06-18 08:35:41,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 08:35:44,902][12883] Updated weights for policy 0, policy_version 96663 (0.0037) +[2024-06-18 08:35:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1583841280. Throughput: 0: 42317.3. Samples: 1583910880. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) +[2024-06-18 08:35:46,994][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 08:35:47,742][12883] Updated weights for policy 0, policy_version 96673 (0.0041) +[2024-06-18 08:35:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 1584021504. Throughput: 0: 42410.0. Samples: 1584160220. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) +[2024-06-18 08:35:51,994][12645] Avg episode reward: [(0, '0.551')] +[2024-06-18 08:35:52,631][12883] Updated weights for policy 0, policy_version 96683 (0.0023) +[2024-06-18 08:35:55,531][12883] Updated weights for policy 0, policy_version 96693 (0.0026) +[2024-06-18 08:35:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1584267264. Throughput: 0: 42393.4. Samples: 1584414680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) +[2024-06-18 08:35:56,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 08:36:00,163][12883] Updated weights for policy 0, policy_version 96703 (0.0026) +[2024-06-18 08:36:01,994][12645] Fps is (10 sec: 47512.6, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1584496640. Throughput: 0: 42515.0. Samples: 1584552020. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) +[2024-06-18 08:36:01,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 08:36:03,006][12883] Updated weights for policy 0, policy_version 96713 (0.0032) +[2024-06-18 08:36:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1584676864. Throughput: 0: 42511.1. Samples: 1584804880. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) +[2024-06-18 08:36:06,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 08:36:07,747][12883] Updated weights for policy 0, policy_version 96723 (0.0028) +[2024-06-18 08:36:08,231][12862] Signal inference workers to stop experience collection... (23150 times) +[2024-06-18 08:36:08,236][12862] Signal inference workers to resume experience collection... (23150 times) +[2024-06-18 08:36:08,282][12883] InferenceWorker_p0-w0: stopping experience collection (23150 times) +[2024-06-18 08:36:08,283][12883] InferenceWorker_p0-w0: resuming experience collection (23150 times) +[2024-06-18 08:36:10,768][12883] Updated weights for policy 0, policy_version 96733 (0.0031) +[2024-06-18 08:36:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1584906240. Throughput: 0: 42550.7. Samples: 1585058620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) +[2024-06-18 08:36:11,994][12645] Avg episode reward: [(0, '0.174')] +[2024-06-18 08:36:15,311][12883] Updated weights for policy 0, policy_version 96743 (0.0037) +[2024-06-18 08:36:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1585102848. Throughput: 0: 42735.6. Samples: 1585200120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) +[2024-06-18 08:36:16,994][12645] Avg episode reward: [(0, '0.172')] +[2024-06-18 08:36:18,257][12883] Updated weights for policy 0, policy_version 96753 (0.0039) +[2024-06-18 08:36:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1585315840. 
Throughput: 0: 42707.7. Samples: 1585449960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) +[2024-06-18 08:36:21,994][12645] Avg episode reward: [(0, '0.216')] +[2024-06-18 08:36:22,864][12883] Updated weights for policy 0, policy_version 96763 (0.0028) +[2024-06-18 08:36:25,806][12883] Updated weights for policy 0, policy_version 96773 (0.0033) +[2024-06-18 08:36:26,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1585561600. Throughput: 0: 42817.4. Samples: 1585705100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) +[2024-06-18 08:36:26,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 08:36:30,372][12883] Updated weights for policy 0, policy_version 96783 (0.0034) +[2024-06-18 08:36:31,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1585774592. Throughput: 0: 42860.8. Samples: 1585839620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) +[2024-06-18 08:36:31,994][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 08:36:33,553][12883] Updated weights for policy 0, policy_version 96793 (0.0036) +[2024-06-18 08:36:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1585971200. Throughput: 0: 42983.4. Samples: 1586094480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 08:36:36,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 08:36:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000096800_1585971200.pth... +[2024-06-18 08:36:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000096176_1575747584.pth +[2024-06-18 08:36:38,049][12883] Updated weights for policy 0, policy_version 96803 (0.0031) +[2024-06-18 08:36:41,491][12883] Updated weights for policy 0, policy_version 96813 (0.0034) +[2024-06-18 08:36:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1586216960. Throughput: 0: 43036.5. 
Samples: 1586351320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 08:36:41,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 08:36:45,663][12883] Updated weights for policy 0, policy_version 96823 (0.0023) +[2024-06-18 08:36:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1586413568. Throughput: 0: 43027.3. Samples: 1586488240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 08:36:46,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 08:36:49,126][12883] Updated weights for policy 0, policy_version 96833 (0.0034) +[2024-06-18 08:36:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1586610176. Throughput: 0: 42981.3. Samples: 1586739040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 08:36:51,999][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 08:36:53,314][12883] Updated weights for policy 0, policy_version 96843 (0.0037) +[2024-06-18 08:36:56,662][12883] Updated weights for policy 0, policy_version 96853 (0.0046) +[2024-06-18 08:36:56,994][12645] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1586855936. Throughput: 0: 43059.9. Samples: 1586996320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 08:36:56,994][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 08:37:01,055][12883] Updated weights for policy 0, policy_version 96863 (0.0037) +[2024-06-18 08:37:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1587052544. Throughput: 0: 42843.5. Samples: 1587128080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 08:37:01,994][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 08:37:04,172][12883] Updated weights for policy 0, policy_version 96873 (0.0024) +[2024-06-18 08:37:06,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1587249152. Throughput: 0: 42919.1. 
Samples: 1587381320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 08:37:06,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 08:37:08,631][12883] Updated weights for policy 0, policy_version 96883 (0.0027) +[2024-06-18 08:37:11,697][12883] Updated weights for policy 0, policy_version 96893 (0.0029) +[2024-06-18 08:37:11,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1587494912. Throughput: 0: 42826.1. Samples: 1587632280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 08:37:11,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 08:37:16,183][12883] Updated weights for policy 0, policy_version 96903 (0.0025) +[2024-06-18 08:37:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1587675136. Throughput: 0: 42786.3. Samples: 1587765000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 08:37:16,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 08:37:19,405][12883] Updated weights for policy 0, policy_version 96913 (0.0034) +[2024-06-18 08:37:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1587904512. Throughput: 0: 42741.0. Samples: 1588017820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 08:37:21,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 08:37:23,858][12883] Updated weights for policy 0, policy_version 96923 (0.0026) +[2024-06-18 08:37:26,994][12645] Fps is (10 sec: 47513.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1588150272. Throughput: 0: 42783.9. Samples: 1588276600. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 08:37:26,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 08:37:26,998][12883] Updated weights for policy 0, policy_version 96933 (0.0032) +[2024-06-18 08:37:31,895][12883] Updated weights for policy 0, policy_version 96943 (0.0041) +[2024-06-18 08:37:31,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1588314112. Throughput: 0: 42611.3. Samples: 1588405760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:37:31,994][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 08:37:34,754][12883] Updated weights for policy 0, policy_version 96953 (0.0031) +[2024-06-18 08:37:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1588559872. Throughput: 0: 42681.4. Samples: 1588659700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:37:36,994][12645] Avg episode reward: [(0, '0.243')] +[2024-06-18 08:37:39,506][12883] Updated weights for policy 0, policy_version 96963 (0.0037) +[2024-06-18 08:37:39,856][12862] Signal inference workers to stop experience collection... (23200 times) +[2024-06-18 08:37:39,888][12883] InferenceWorker_p0-w0: stopping experience collection (23200 times) +[2024-06-18 08:37:39,909][12862] Signal inference workers to resume experience collection... (23200 times) +[2024-06-18 08:37:39,913][12883] InferenceWorker_p0-w0: resuming experience collection (23200 times) +[2024-06-18 08:37:41,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1588772864. Throughput: 0: 42607.3. Samples: 1588913640. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:37:41,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 08:37:42,408][12883] Updated weights for policy 0, policy_version 96973 (0.0032) +[2024-06-18 08:37:46,959][12883] Updated weights for policy 0, policy_version 96983 (0.0027) +[2024-06-18 08:37:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1588969472. Throughput: 0: 42568.9. Samples: 1589043680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:37:46,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 08:37:49,934][12883] Updated weights for policy 0, policy_version 96993 (0.0040) +[2024-06-18 08:37:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1589182464. Throughput: 0: 42617.7. Samples: 1589299120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:37:51,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 08:37:54,857][12883] Updated weights for policy 0, policy_version 97003 (0.0049) +[2024-06-18 08:37:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 1589395456. Throughput: 0: 42708.2. Samples: 1589554140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:37:56,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 08:37:57,964][12883] Updated weights for policy 0, policy_version 97013 (0.0052) +[2024-06-18 08:38:01,995][12645] Fps is (10 sec: 40953.7, 60 sec: 42324.2, 300 sec: 42654.1). Total num frames: 1589592064. Throughput: 0: 42632.7. Samples: 1589683540. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:38:01,996][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 08:38:02,486][12883] Updated weights for policy 0, policy_version 97023 (0.0037) +[2024-06-18 08:38:05,785][12883] Updated weights for policy 0, policy_version 97033 (0.0036) +[2024-06-18 08:38:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1589821440. Throughput: 0: 42560.4. Samples: 1589933040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:38:06,994][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 08:38:10,263][12883] Updated weights for policy 0, policy_version 97043 (0.0039) +[2024-06-18 08:38:11,994][12645] Fps is (10 sec: 44243.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1590034432. Throughput: 0: 42492.9. Samples: 1590188780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:38:11,998][12645] Avg episode reward: [(0, '0.212')] +[2024-06-18 08:38:13,574][12883] Updated weights for policy 0, policy_version 97053 (0.0043) +[2024-06-18 08:38:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1590231040. Throughput: 0: 42461.9. Samples: 1590316540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:38:16,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 08:38:17,770][12883] Updated weights for policy 0, policy_version 97063 (0.0034) +[2024-06-18 08:38:21,160][12883] Updated weights for policy 0, policy_version 97073 (0.0036) +[2024-06-18 08:38:21,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 1590476800. Throughput: 0: 42386.8. Samples: 1590567200. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:38:21,997][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 08:38:25,295][12883] Updated weights for policy 0, policy_version 97083 (0.0032) +[2024-06-18 08:38:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1590673408. Throughput: 0: 42414.9. Samples: 1590822320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 08:38:26,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 08:38:28,842][12883] Updated weights for policy 0, policy_version 97093 (0.0033) +[2024-06-18 08:38:31,994][12645] Fps is (10 sec: 37691.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1590853632. Throughput: 0: 42334.2. Samples: 1590948720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 08:38:31,994][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 08:38:33,244][12883] Updated weights for policy 0, policy_version 97103 (0.0029) +[2024-06-18 08:38:36,784][12883] Updated weights for policy 0, policy_version 97113 (0.0030) +[2024-06-18 08:38:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1591099392. Throughput: 0: 42404.9. Samples: 1591207340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 08:38:36,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 08:38:37,131][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000097114_1591115776.pth... +[2024-06-18 08:38:37,178][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000096488_1580859392.pth +[2024-06-18 08:38:40,967][12883] Updated weights for policy 0, policy_version 97123 (0.0041) +[2024-06-18 08:38:41,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1591312384. Throughput: 0: 42383.5. Samples: 1591461400. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 08:38:41,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 08:38:44,353][12883] Updated weights for policy 0, policy_version 97133 (0.0040) +[2024-06-18 08:38:46,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1591508992. Throughput: 0: 42337.6. Samples: 1591588660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 08:38:46,994][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 08:38:48,740][12883] Updated weights for policy 0, policy_version 97143 (0.0036) +[2024-06-18 08:38:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1591738368. Throughput: 0: 42578.7. Samples: 1591849080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 08:38:51,994][12645] Avg episode reward: [(0, '0.581')] +[2024-06-18 08:38:52,026][12883] Updated weights for policy 0, policy_version 97153 (0.0031) +[2024-06-18 08:38:56,321][12883] Updated weights for policy 0, policy_version 97163 (0.0029) +[2024-06-18 08:38:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1591951360. Throughput: 0: 42568.0. Samples: 1592104340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 08:38:57,002][12645] Avg episode reward: [(0, '0.538')] +[2024-06-18 08:38:59,635][12883] Updated weights for policy 0, policy_version 97173 (0.0035) +[2024-06-18 08:39:01,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42326.4, 300 sec: 42542.9). Total num frames: 1592131584. Throughput: 0: 42490.6. Samples: 1592228620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 08:39:02,004][12645] Avg episode reward: [(0, '0.704')] +[2024-06-18 08:39:03,914][12883] Updated weights for policy 0, policy_version 97183 (0.0039) +[2024-06-18 08:39:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1592377344. Throughput: 0: 42624.3. Samples: 1592485200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 08:39:06,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 08:39:07,000][12862] Signal inference workers to stop experience collection... (23250 times) +[2024-06-18 08:39:07,004][12862] Signal inference workers to resume experience collection... (23250 times) +[2024-06-18 08:39:07,038][12883] InferenceWorker_p0-w0: stopping experience collection (23250 times) +[2024-06-18 08:39:07,038][12883] InferenceWorker_p0-w0: resuming experience collection (23250 times) +[2024-06-18 08:39:07,303][12883] Updated weights for policy 0, policy_version 97193 (0.0032) +[2024-06-18 08:39:11,751][12883] Updated weights for policy 0, policy_version 97203 (0.0033) +[2024-06-18 08:39:11,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1592590336. Throughput: 0: 42719.7. Samples: 1592744700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 08:39:11,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 08:39:14,971][12883] Updated weights for policy 0, policy_version 97213 (0.0040) +[2024-06-18 08:39:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 1592770560. Throughput: 0: 42696.4. Samples: 1592870060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 08:39:16,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 08:39:19,460][12883] Updated weights for policy 0, policy_version 97223 (0.0040) +[2024-06-18 08:39:21,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42326.8, 300 sec: 42598.4). Total num frames: 1593016320. Throughput: 0: 42557.7. Samples: 1593122440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 08:39:21,995][12645] Avg episode reward: [(0, '0.243')] +[2024-06-18 08:39:22,977][12883] Updated weights for policy 0, policy_version 97233 (0.0043) +[2024-06-18 08:39:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1593212928. 
Throughput: 0: 42793.3. Samples: 1593387100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 08:39:26,994][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 08:39:27,114][12883] Updated weights for policy 0, policy_version 97243 (0.0028) +[2024-06-18 08:39:30,675][12883] Updated weights for policy 0, policy_version 97253 (0.0028) +[2024-06-18 08:39:31,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1593425920. Throughput: 0: 42687.6. Samples: 1593509600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:39:31,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 08:39:34,530][12883] Updated weights for policy 0, policy_version 97263 (0.0036) +[2024-06-18 08:39:36,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1593671680. Throughput: 0: 42628.0. Samples: 1593767340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:39:36,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 08:39:38,377][12883] Updated weights for policy 0, policy_version 97273 (0.0036) +[2024-06-18 08:39:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1593851904. Throughput: 0: 42711.7. Samples: 1594026360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:39:41,994][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 08:39:42,206][12883] Updated weights for policy 0, policy_version 97283 (0.0022) +[2024-06-18 08:39:45,997][12883] Updated weights for policy 0, policy_version 97293 (0.0028) +[2024-06-18 08:39:46,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 1594064896. Throughput: 0: 42663.9. Samples: 1594148500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:39:46,994][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 08:39:49,746][12883] Updated weights for policy 0, policy_version 97303 (0.0042) +[2024-06-18 08:39:51,994][12645] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1594327040. Throughput: 0: 42837.3. Samples: 1594412880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:39:51,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 08:39:53,480][12883] Updated weights for policy 0, policy_version 97313 (0.0028) +[2024-06-18 08:39:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1594507264. Throughput: 0: 42831.9. Samples: 1594672140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:39:56,994][12645] Avg episode reward: [(0, '0.666')] +[2024-06-18 08:39:57,434][12883] Updated weights for policy 0, policy_version 97323 (0.0024) +[2024-06-18 08:40:00,939][12883] Updated weights for policy 0, policy_version 97333 (0.0039) +[2024-06-18 08:40:01,994][12645] Fps is (10 sec: 39320.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1594720256. Throughput: 0: 42788.3. Samples: 1594795540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:40:01,995][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 08:40:05,362][12883] Updated weights for policy 0, policy_version 97343 (0.0028) +[2024-06-18 08:40:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1594949632. Throughput: 0: 42913.4. Samples: 1595053540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:40:06,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 08:40:08,416][12883] Updated weights for policy 0, policy_version 97353 (0.0028) +[2024-06-18 08:40:11,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1595146240. Throughput: 0: 42839.1. Samples: 1595314860. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:40:11,994][12645] Avg episode reward: [(0, '0.485')] +[2024-06-18 08:40:13,001][12883] Updated weights for policy 0, policy_version 97363 (0.0038) +[2024-06-18 08:40:16,423][12883] Updated weights for policy 0, policy_version 97373 (0.0034) +[2024-06-18 08:40:16,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1595375616. Throughput: 0: 42876.8. Samples: 1595439060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:40:16,994][12645] Avg episode reward: [(0, '0.581')] +[2024-06-18 08:40:20,335][12883] Updated weights for policy 0, policy_version 97383 (0.0039) +[2024-06-18 08:40:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1595572224. Throughput: 0: 43005.3. Samples: 1595702580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:40:21,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 08:40:23,922][12883] Updated weights for policy 0, policy_version 97393 (0.0058) +[2024-06-18 08:40:27,000][12645] Fps is (10 sec: 42571.6, 60 sec: 43140.0, 300 sec: 42708.6). Total num frames: 1595801600. Throughput: 0: 43071.2. Samples: 1595964840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 08:40:27,000][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 08:40:27,879][12883] Updated weights for policy 0, policy_version 97403 (0.0030) +[2024-06-18 08:40:31,377][12883] Updated weights for policy 0, policy_version 97413 (0.0048) +[2024-06-18 08:40:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 1596030976. Throughput: 0: 43183.7. Samples: 1596091760. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:40:31,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 08:40:35,717][12883] Updated weights for policy 0, policy_version 97423 (0.0029) +[2024-06-18 08:40:36,996][12645] Fps is (10 sec: 42615.5, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 1596227584. Throughput: 0: 42978.7. Samples: 1596347020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:40:36,996][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 08:40:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000097426_1596227584.pth... +[2024-06-18 08:40:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000096800_1585971200.pth +[2024-06-18 08:40:38,998][12883] Updated weights for policy 0, policy_version 97433 (0.0036) +[2024-06-18 08:40:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 1596440576. Throughput: 0: 42958.2. Samples: 1596605260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:40:41,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 08:40:43,439][12883] Updated weights for policy 0, policy_version 97443 (0.0038) +[2024-06-18 08:40:46,917][12883] Updated weights for policy 0, policy_version 97453 (0.0038) +[2024-06-18 08:40:46,994][12645] Fps is (10 sec: 44246.3, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1596669952. Throughput: 0: 42962.3. Samples: 1596728840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:40:46,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 08:40:51,090][12883] Updated weights for policy 0, policy_version 97463 (0.0037) +[2024-06-18 08:40:51,879][12862] Signal inference workers to stop experience collection... 
(23300 times) +[2024-06-18 08:40:51,912][12883] InferenceWorker_p0-w0: stopping experience collection (23300 times) +[2024-06-18 08:40:51,936][12862] Signal inference workers to resume experience collection... (23300 times) +[2024-06-18 08:40:51,937][12883] InferenceWorker_p0-w0: resuming experience collection (23300 times) +[2024-06-18 08:40:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1596850176. Throughput: 0: 42837.4. Samples: 1596981220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:40:51,994][12645] Avg episode reward: [(0, '0.572')] +[2024-06-18 08:40:54,729][12883] Updated weights for policy 0, policy_version 97473 (0.0045) +[2024-06-18 08:40:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1597079552. Throughput: 0: 42772.5. Samples: 1597239620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:40:56,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 08:40:58,676][12883] Updated weights for policy 0, policy_version 97483 (0.0038) +[2024-06-18 08:41:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1597292544. Throughput: 0: 42855.1. Samples: 1597367540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:41:01,994][12645] Avg episode reward: [(0, '0.586')] +[2024-06-18 08:41:02,228][12883] Updated weights for policy 0, policy_version 97493 (0.0051) +[2024-06-18 08:41:06,348][12883] Updated weights for policy 0, policy_version 97503 (0.0027) +[2024-06-18 08:41:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1597489152. Throughput: 0: 42662.3. Samples: 1597622380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:41:06,994][12645] Avg episode reward: [(0, '0.569')] +[2024-06-18 08:41:10,029][12883] Updated weights for policy 0, policy_version 97513 (0.0049) +[2024-06-18 08:41:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1597702144. Throughput: 0: 42552.6. Samples: 1597879440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:41:11,994][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 08:41:14,009][12883] Updated weights for policy 0, policy_version 97523 (0.0029) +[2024-06-18 08:41:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1597931520. Throughput: 0: 42506.6. Samples: 1598004560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:41:16,994][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 08:41:17,621][12883] Updated weights for policy 0, policy_version 97533 (0.0037) +[2024-06-18 08:41:21,565][12883] Updated weights for policy 0, policy_version 97543 (0.0035) +[2024-06-18 08:41:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1598144512. Throughput: 0: 42585.7. Samples: 1598263280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:41:21,994][12645] Avg episode reward: [(0, '0.241')] +[2024-06-18 08:41:25,198][12883] Updated weights for policy 0, policy_version 97553 (0.0026) +[2024-06-18 08:41:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42329.6, 300 sec: 42598.4). Total num frames: 1598341120. Throughput: 0: 42507.5. Samples: 1598518100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 08:41:26,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 08:41:29,393][12883] Updated weights for policy 0, policy_version 97563 (0.0040) +[2024-06-18 08:41:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1598570496. Throughput: 0: 42572.5. Samples: 1598644600. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 08:41:31,994][12645] Avg episode reward: [(0, '0.174')] +[2024-06-18 08:41:32,805][12883] Updated weights for policy 0, policy_version 97573 (0.0039) +[2024-06-18 08:41:36,994][12645] Fps is (10 sec: 44237.9, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 1598783488. Throughput: 0: 42833.0. Samples: 1598908700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 08:41:36,994][12645] Avg episode reward: [(0, '0.380')] +[2024-06-18 08:41:37,039][12883] Updated weights for policy 0, policy_version 97583 (0.0041) +[2024-06-18 08:41:40,437][12883] Updated weights for policy 0, policy_version 97593 (0.0036) +[2024-06-18 08:41:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1598980096. Throughput: 0: 42628.0. Samples: 1599157880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 08:41:41,994][12645] Avg episode reward: [(0, '0.516')] +[2024-06-18 08:41:44,587][12883] Updated weights for policy 0, policy_version 97603 (0.0031) +[2024-06-18 08:41:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 1599209472. Throughput: 0: 42539.6. Samples: 1599281820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 08:41:46,994][12645] Avg episode reward: [(0, '0.625')] +[2024-06-18 08:41:48,170][12883] Updated weights for policy 0, policy_version 97613 (0.0023) +[2024-06-18 08:41:52,000][12645] Fps is (10 sec: 44209.2, 60 sec: 42867.0, 300 sec: 42597.5). Total num frames: 1599422464. Throughput: 0: 42740.8. Samples: 1599545980. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 08:41:52,001][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 08:41:52,261][12883] Updated weights for policy 0, policy_version 97623 (0.0032) +[2024-06-18 08:41:56,110][12883] Updated weights for policy 0, policy_version 97633 (0.0035) +[2024-06-18 08:41:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1599619072. Throughput: 0: 42509.0. Samples: 1599792340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 08:41:56,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 08:41:59,876][12883] Updated weights for policy 0, policy_version 97643 (0.0028) +[2024-06-18 08:42:01,994][12645] Fps is (10 sec: 44264.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1599864832. Throughput: 0: 42585.4. Samples: 1599920900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 08:42:01,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 08:42:04,012][12883] Updated weights for policy 0, policy_version 97653 (0.0034) +[2024-06-18 08:42:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1600045056. Throughput: 0: 42541.3. Samples: 1600177640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 08:42:06,994][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 08:42:07,719][12883] Updated weights for policy 0, policy_version 97663 (0.0024) +[2024-06-18 08:42:11,812][12883] Updated weights for policy 0, policy_version 97673 (0.0048) +[2024-06-18 08:42:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1600274432. Throughput: 0: 42385.9. Samples: 1600425460. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 08:42:11,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 08:42:15,452][12883] Updated weights for policy 0, policy_version 97683 (0.0039) +[2024-06-18 08:42:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1600487424. Throughput: 0: 42545.0. Samples: 1600559120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 08:42:16,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 08:42:19,422][12862] Signal inference workers to stop experience collection... (23350 times) +[2024-06-18 08:42:19,459][12883] InferenceWorker_p0-w0: stopping experience collection (23350 times) +[2024-06-18 08:42:19,470][12862] Signal inference workers to resume experience collection... (23350 times) +[2024-06-18 08:42:19,482][12883] InferenceWorker_p0-w0: resuming experience collection (23350 times) +[2024-06-18 08:42:19,598][12883] Updated weights for policy 0, policy_version 97693 (0.0047) +[2024-06-18 08:42:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1600667648. Throughput: 0: 42321.7. Samples: 1600813180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 08:42:21,994][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 08:42:23,187][12883] Updated weights for policy 0, policy_version 97703 (0.0029) +[2024-06-18 08:42:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1600897024. Throughput: 0: 42233.7. Samples: 1601058400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:42:26,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 08:42:28,006][12883] Updated weights for policy 0, policy_version 97713 (0.0041) +[2024-06-18 08:42:30,980][12883] Updated weights for policy 0, policy_version 97723 (0.0047) +[2024-06-18 08:42:31,994][12645] Fps is (10 sec: 47512.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1601142784. 
Throughput: 0: 42487.8. Samples: 1601193780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:42:31,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 08:42:35,694][12883] Updated weights for policy 0, policy_version 97733 (0.0034) +[2024-06-18 08:42:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1601323008. Throughput: 0: 42383.2. Samples: 1601452960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:42:36,994][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 08:42:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000097737_1601323008.pth... +[2024-06-18 08:42:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000097114_1591115776.pth +[2024-06-18 08:42:38,636][12883] Updated weights for policy 0, policy_version 97743 (0.0032) +[2024-06-18 08:42:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1601552384. Throughput: 0: 42366.6. Samples: 1601698840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:42:41,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 08:42:43,299][12883] Updated weights for policy 0, policy_version 97753 (0.0033) +[2024-06-18 08:42:46,556][12883] Updated weights for policy 0, policy_version 97763 (0.0037) +[2024-06-18 08:42:46,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1601781760. Throughput: 0: 42506.7. Samples: 1601833700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:42:46,994][12645] Avg episode reward: [(0, '0.302')] +[2024-06-18 08:42:50,867][12883] Updated weights for policy 0, policy_version 97773 (0.0026) +[2024-06-18 08:42:51,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42056.7, 300 sec: 42542.9). Total num frames: 1601945600. Throughput: 0: 42560.1. Samples: 1602092840. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:42:51,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 08:42:54,137][12883] Updated weights for policy 0, policy_version 97783 (0.0039) +[2024-06-18 08:42:56,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42765.2). Total num frames: 1602207744. Throughput: 0: 42610.6. Samples: 1602342940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:42:56,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 08:42:58,439][12883] Updated weights for policy 0, policy_version 97793 (0.0032) +[2024-06-18 08:43:01,789][12883] Updated weights for policy 0, policy_version 97803 (0.0033) +[2024-06-18 08:43:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1602404352. Throughput: 0: 42518.7. Samples: 1602472460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:43:01,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 08:43:06,137][12883] Updated weights for policy 0, policy_version 97813 (0.0035) +[2024-06-18 08:43:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1602600960. Throughput: 0: 42581.3. Samples: 1602729340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:43:06,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 08:43:09,625][12883] Updated weights for policy 0, policy_version 97823 (0.0037) +[2024-06-18 08:43:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1602830336. Throughput: 0: 42630.8. Samples: 1602976780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:43:11,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 08:43:13,717][12883] Updated weights for policy 0, policy_version 97833 (0.0050) +[2024-06-18 08:43:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 1603026944. Throughput: 0: 42492.5. Samples: 1603105940. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:43:16,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 08:43:17,447][12883] Updated weights for policy 0, policy_version 97843 (0.0061) +[2024-06-18 08:43:21,360][12883] Updated weights for policy 0, policy_version 97853 (0.0039) +[2024-06-18 08:43:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1603239936. Throughput: 0: 42366.6. Samples: 1603359460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 08:43:21,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 08:43:24,873][12883] Updated weights for policy 0, policy_version 97863 (0.0032) +[2024-06-18 08:43:26,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1603452928. Throughput: 0: 42552.2. Samples: 1603613680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 08:43:26,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 08:43:28,955][12883] Updated weights for policy 0, policy_version 97873 (0.0033) +[2024-06-18 08:43:31,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 1603649536. Throughput: 0: 42479.6. Samples: 1603745280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 08:43:31,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 08:43:32,685][12883] Updated weights for policy 0, policy_version 97883 (0.0045) +[2024-06-18 08:43:36,508][12883] Updated weights for policy 0, policy_version 97893 (0.0038) +[2024-06-18 08:43:36,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1603878912. Throughput: 0: 42408.2. Samples: 1604001220. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 08:43:36,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 08:43:40,684][12883] Updated weights for policy 0, policy_version 97903 (0.0029) +[2024-06-18 08:43:41,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1604108288. Throughput: 0: 42253.0. Samples: 1604244320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 08:43:41,994][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 08:43:44,143][12862] Signal inference workers to stop experience collection... (23400 times) +[2024-06-18 08:43:44,143][12862] Signal inference workers to resume experience collection... (23400 times) +[2024-06-18 08:43:44,187][12883] InferenceWorker_p0-w0: stopping experience collection (23400 times) +[2024-06-18 08:43:44,188][12883] InferenceWorker_p0-w0: resuming experience collection (23400 times) +[2024-06-18 08:43:44,282][12883] Updated weights for policy 0, policy_version 97913 (0.0030) +[2024-06-18 08:43:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.0, 300 sec: 42487.3). Total num frames: 1604272128. Throughput: 0: 42262.1. Samples: 1604374260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 08:43:46,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 08:43:48,289][12883] Updated weights for policy 0, policy_version 97923 (0.0031) +[2024-06-18 08:43:51,970][12883] Updated weights for policy 0, policy_version 97933 (0.0041) +[2024-06-18 08:43:51,996][12645] Fps is (10 sec: 42588.7, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 1604534272. Throughput: 0: 42269.9. Samples: 1604631580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 08:43:51,996][12645] Avg episode reward: [(0, '0.516')] +[2024-06-18 08:43:56,055][12883] Updated weights for policy 0, policy_version 97943 (0.0041) +[2024-06-18 08:43:56,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1604730880. 
Throughput: 0: 42374.5. Samples: 1604883640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 08:43:56,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 08:44:00,001][12883] Updated weights for policy 0, policy_version 97953 (0.0037) +[2024-06-18 08:44:01,994][12645] Fps is (10 sec: 39330.2, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1604927488. Throughput: 0: 42292.4. Samples: 1605009100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 08:44:01,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 08:44:03,893][12883] Updated weights for policy 0, policy_version 97963 (0.0040) +[2024-06-18 08:44:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1605140480. Throughput: 0: 42218.7. Samples: 1605259300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 08:44:06,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 08:44:07,751][12883] Updated weights for policy 0, policy_version 97973 (0.0035) +[2024-06-18 08:44:11,665][12883] Updated weights for policy 0, policy_version 97983 (0.0031) +[2024-06-18 08:44:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1605353472. Throughput: 0: 42299.9. Samples: 1605517180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 08:44:11,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 08:44:15,411][12883] Updated weights for policy 0, policy_version 97993 (0.0032) +[2024-06-18 08:44:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1605566464. Throughput: 0: 42205.3. Samples: 1605644520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 08:44:16,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 08:44:19,283][12883] Updated weights for policy 0, policy_version 98003 (0.0023) +[2024-06-18 08:44:21,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42596.9, 300 sec: 42653.6). Total num frames: 1605795840. 
Throughput: 0: 42208.7. Samples: 1605900700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 08:44:21,996][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 08:44:23,126][12883] Updated weights for policy 0, policy_version 98013 (0.0042) +[2024-06-18 08:44:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1605992448. Throughput: 0: 42444.0. Samples: 1606154300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 08:44:26,994][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 08:44:27,001][12883] Updated weights for policy 0, policy_version 98023 (0.0040) +[2024-06-18 08:44:31,080][12883] Updated weights for policy 0, policy_version 98033 (0.0038) +[2024-06-18 08:44:31,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1606221824. Throughput: 0: 42399.1. Samples: 1606282220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 08:44:31,996][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 08:44:34,627][12883] Updated weights for policy 0, policy_version 98043 (0.0039) +[2024-06-18 08:44:36,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 1606418432. Throughput: 0: 42268.9. Samples: 1606533680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 08:44:36,997][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 08:44:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098048_1606418432.pth... +[2024-06-18 08:44:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000097426_1596227584.pth +[2024-06-18 08:44:38,806][12883] Updated weights for policy 0, policy_version 98053 (0.0036) +[2024-06-18 08:44:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1606631424. Throughput: 0: 42347.3. Samples: 1606789260. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 08:44:41,994][12645] Avg episode reward: [(0, '0.612')] +[2024-06-18 08:44:42,146][12883] Updated weights for policy 0, policy_version 98063 (0.0028) +[2024-06-18 08:44:46,341][12883] Updated weights for policy 0, policy_version 98073 (0.0045) +[2024-06-18 08:44:46,994][12645] Fps is (10 sec: 44246.3, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 1606860800. Throughput: 0: 42482.2. Samples: 1606920800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 08:44:46,994][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 08:44:49,962][12883] Updated weights for policy 0, policy_version 98083 (0.0031) +[2024-06-18 08:44:51,994][12645] Fps is (10 sec: 39320.7, 60 sec: 41507.6, 300 sec: 42431.8). Total num frames: 1607024640. Throughput: 0: 42394.6. Samples: 1607167060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 08:44:51,994][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 08:44:54,080][12883] Updated weights for policy 0, policy_version 98093 (0.0040) +[2024-06-18 08:44:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1607270400. Throughput: 0: 42471.0. Samples: 1607428380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 08:44:56,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 08:44:57,733][12883] Updated weights for policy 0, policy_version 98103 (0.0035) +[2024-06-18 08:45:01,920][12883] Updated weights for policy 0, policy_version 98113 (0.0043) +[2024-06-18 08:45:01,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1607483392. Throughput: 0: 42514.1. Samples: 1607557660. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 08:45:02,000][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 08:45:05,480][12883] Updated weights for policy 0, policy_version 98123 (0.0046) +[2024-06-18 08:45:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1607680000. Throughput: 0: 42324.6. Samples: 1607805220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 08:45:06,994][12645] Avg episode reward: [(0, '0.200')] +[2024-06-18 08:45:09,310][12883] Updated weights for policy 0, policy_version 98133 (0.0033) +[2024-06-18 08:45:11,612][12862] Signal inference workers to stop experience collection... (23450 times) +[2024-06-18 08:45:11,644][12883] InferenceWorker_p0-w0: stopping experience collection (23450 times) +[2024-06-18 08:45:11,670][12862] Signal inference workers to resume experience collection... (23450 times) +[2024-06-18 08:45:11,671][12883] InferenceWorker_p0-w0: resuming experience collection (23450 times) +[2024-06-18 08:45:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1607909376. Throughput: 0: 42562.2. Samples: 1608069600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 08:45:11,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 08:45:13,015][12883] Updated weights for policy 0, policy_version 98143 (0.0033) +[2024-06-18 08:45:16,849][12883] Updated weights for policy 0, policy_version 98153 (0.0038) +[2024-06-18 08:45:16,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1608138752. Throughput: 0: 42680.9. Samples: 1608202860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 08:45:16,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 08:45:20,596][12883] Updated weights for policy 0, policy_version 98163 (0.0040) +[2024-06-18 08:45:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42053.8, 300 sec: 42432.7). Total num frames: 1608318976. 
Throughput: 0: 42626.1. Samples: 1608451760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 08:45:21,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 08:45:24,483][12883] Updated weights for policy 0, policy_version 98173 (0.0028) +[2024-06-18 08:45:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1608548352. Throughput: 0: 42706.5. Samples: 1608711060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:45:26,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 08:45:28,326][12883] Updated weights for policy 0, policy_version 98183 (0.0031) +[2024-06-18 08:45:31,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 1608761344. Throughput: 0: 42700.7. Samples: 1608842320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:45:31,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 08:45:32,306][12883] Updated weights for policy 0, policy_version 98193 (0.0032) +[2024-06-18 08:45:35,892][12883] Updated weights for policy 0, policy_version 98203 (0.0036) +[2024-06-18 08:45:36,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 1608974336. Throughput: 0: 42848.1. Samples: 1609095220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:45:36,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 08:45:40,035][12883] Updated weights for policy 0, policy_version 98213 (0.0041) +[2024-06-18 08:45:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1609170944. Throughput: 0: 42598.4. Samples: 1609345300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:45:41,994][12645] Avg episode reward: [(0, '0.296')] +[2024-06-18 08:45:43,730][12883] Updated weights for policy 0, policy_version 98223 (0.0024) +[2024-06-18 08:45:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1609400320. 
Throughput: 0: 42437.3. Samples: 1609467340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:45:46,994][12645] Avg episode reward: [(0, '0.734')] +[2024-06-18 08:45:47,782][12883] Updated weights for policy 0, policy_version 98233 (0.0035) +[2024-06-18 08:45:51,909][12883] Updated weights for policy 0, policy_version 98243 (0.0036) +[2024-06-18 08:45:51,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 1609613312. Throughput: 0: 42685.9. Samples: 1609726080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:45:51,994][12645] Avg episode reward: [(0, '0.157')] +[2024-06-18 08:45:55,371][12883] Updated weights for policy 0, policy_version 98253 (0.0043) +[2024-06-18 08:45:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1609809920. Throughput: 0: 42356.0. Samples: 1609975620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:45:56,994][12645] Avg episode reward: [(0, '0.157')] +[2024-06-18 08:45:59,485][12883] Updated weights for policy 0, policy_version 98263 (0.0033) +[2024-06-18 08:46:01,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1610022912. Throughput: 0: 42219.7. Samples: 1610102740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:46:01,994][12645] Avg episode reward: [(0, '0.154')] +[2024-06-18 08:46:03,018][12883] Updated weights for policy 0, policy_version 98273 (0.0046) +[2024-06-18 08:46:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1610252288. Throughput: 0: 42559.7. Samples: 1610366940. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:46:06,994][12645] Avg episode reward: [(0, '0.503')] +[2024-06-18 08:46:06,998][12883] Updated weights for policy 0, policy_version 98283 (0.0037) +[2024-06-18 08:46:10,886][12883] Updated weights for policy 0, policy_version 98293 (0.0024) +[2024-06-18 08:46:11,994][12645] Fps is (10 sec: 44235.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1610465280. Throughput: 0: 42267.1. Samples: 1610613080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:46:11,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 08:46:14,621][12883] Updated weights for policy 0, policy_version 98303 (0.0035) +[2024-06-18 08:46:17,000][12645] Fps is (10 sec: 39296.7, 60 sec: 41774.9, 300 sec: 42375.3). Total num frames: 1610645504. Throughput: 0: 42189.1. Samples: 1610741100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:46:17,001][12645] Avg episode reward: [(0, '0.648')] +[2024-06-18 08:46:18,482][12883] Updated weights for policy 0, policy_version 98313 (0.0031) +[2024-06-18 08:46:21,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1610891264. Throughput: 0: 42376.9. Samples: 1611002180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) +[2024-06-18 08:46:21,994][12645] Avg episode reward: [(0, '0.587')] +[2024-06-18 08:46:22,314][12883] Updated weights for policy 0, policy_version 98323 (0.0023) +[2024-06-18 08:46:24,993][12862] Signal inference workers to stop experience collection... (23500 times) +[2024-06-18 08:46:25,032][12883] InferenceWorker_p0-w0: stopping experience collection (23500 times) +[2024-06-18 08:46:25,041][12862] Signal inference workers to resume experience collection... 
(23500 times) +[2024-06-18 08:46:25,049][12883] InferenceWorker_p0-w0: resuming experience collection (23500 times) +[2024-06-18 08:46:26,171][12883] Updated weights for policy 0, policy_version 98333 (0.0041) +[2024-06-18 08:46:26,994][12645] Fps is (10 sec: 45903.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1611104256. Throughput: 0: 42327.9. Samples: 1611250060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:46:26,994][12645] Avg episode reward: [(0, '0.401')] +[2024-06-18 08:46:29,916][12883] Updated weights for policy 0, policy_version 98343 (0.0048) +[2024-06-18 08:46:32,000][12645] Fps is (10 sec: 40934.1, 60 sec: 42320.8, 300 sec: 42430.9). Total num frames: 1611300864. Throughput: 0: 42437.2. Samples: 1611377280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:46:32,001][12645] Avg episode reward: [(0, '0.662')] +[2024-06-18 08:46:34,153][12883] Updated weights for policy 0, policy_version 98353 (0.0035) +[2024-06-18 08:46:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1611530240. Throughput: 0: 42568.1. Samples: 1611641640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:46:36,994][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 08:46:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098361_1611546624.pth... +[2024-06-18 08:46:37,057][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000097737_1601323008.pth +[2024-06-18 08:46:38,189][12883] Updated weights for policy 0, policy_version 98363 (0.0035) +[2024-06-18 08:46:41,832][12883] Updated weights for policy 0, policy_version 98373 (0.0028) +[2024-06-18 08:46:41,994][12645] Fps is (10 sec: 44264.7, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1611743232. Throughput: 0: 42656.0. Samples: 1611895140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:46:41,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 08:46:45,635][12883] Updated weights for policy 0, policy_version 98383 (0.0028) +[2024-06-18 08:46:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42488.2). Total num frames: 1611956224. Throughput: 0: 42634.1. Samples: 1612021280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:46:46,994][12645] Avg episode reward: [(0, '0.451')] +[2024-06-18 08:46:49,381][12883] Updated weights for policy 0, policy_version 98393 (0.0038) +[2024-06-18 08:46:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1612169216. Throughput: 0: 42543.4. Samples: 1612281400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:46:51,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 08:46:53,513][12883] Updated weights for policy 0, policy_version 98403 (0.0039) +[2024-06-18 08:46:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1612382208. Throughput: 0: 42770.4. Samples: 1612537740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:46:56,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 08:46:57,016][12883] Updated weights for policy 0, policy_version 98413 (0.0040) +[2024-06-18 08:47:01,087][12883] Updated weights for policy 0, policy_version 98423 (0.0035) +[2024-06-18 08:47:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1612595200. Throughput: 0: 42753.5. Samples: 1612664740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:47:01,994][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 08:47:05,287][12883] Updated weights for policy 0, policy_version 98433 (0.0032) +[2024-06-18 08:47:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1612808192. Throughput: 0: 42653.4. Samples: 1612921580. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:47:06,994][12645] Avg episode reward: [(0, '0.313')] +[2024-06-18 08:47:08,792][12883] Updated weights for policy 0, policy_version 98443 (0.0033) +[2024-06-18 08:47:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1613021184. Throughput: 0: 42718.6. Samples: 1613172400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:47:11,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 08:47:13,000][12883] Updated weights for policy 0, policy_version 98453 (0.0029) +[2024-06-18 08:47:16,550][12883] Updated weights for policy 0, policy_version 98463 (0.0033) +[2024-06-18 08:47:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42876.0, 300 sec: 42542.9). Total num frames: 1613217792. Throughput: 0: 42776.3. Samples: 1613301940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:47:16,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 08:47:20,671][12883] Updated weights for policy 0, policy_version 98473 (0.0031) +[2024-06-18 08:47:21,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1613463552. Throughput: 0: 42698.2. Samples: 1613563060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 08:47:21,994][12645] Avg episode reward: [(0, '0.217')] +[2024-06-18 08:47:24,144][12883] Updated weights for policy 0, policy_version 98483 (0.0039) +[2024-06-18 08:47:26,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1613676544. Throughput: 0: 42591.6. Samples: 1613811760. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:47:26,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 08:47:28,466][12883] Updated weights for policy 0, policy_version 98493 (0.0047) +[2024-06-18 08:47:31,660][12883] Updated weights for policy 0, policy_version 98503 (0.0038) +[2024-06-18 08:47:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42876.0, 300 sec: 42542.9). Total num frames: 1613873152. Throughput: 0: 42763.5. Samples: 1613945640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:47:31,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 08:47:32,885][12862] Signal inference workers to stop experience collection... (23550 times) +[2024-06-18 08:47:32,885][12862] Signal inference workers to resume experience collection... (23550 times) +[2024-06-18 08:47:32,937][12883] InferenceWorker_p0-w0: stopping experience collection (23550 times) +[2024-06-18 08:47:32,937][12883] InferenceWorker_p0-w0: resuming experience collection (23550 times) +[2024-06-18 08:47:35,996][12883] Updated weights for policy 0, policy_version 98513 (0.0038) +[2024-06-18 08:47:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1614069760. Throughput: 0: 42788.5. Samples: 1614206880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:47:36,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 08:47:39,183][12883] Updated weights for policy 0, policy_version 98523 (0.0035) +[2024-06-18 08:47:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1614315520. Throughput: 0: 42767.6. Samples: 1614462280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:47:41,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 08:47:43,702][12883] Updated weights for policy 0, policy_version 98533 (0.0032) +[2024-06-18 08:47:46,734][12883] Updated weights for policy 0, policy_version 98543 (0.0029) +[2024-06-18 08:47:46,996][12645] Fps is (10 sec: 45864.9, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1614528512. Throughput: 0: 42838.3. Samples: 1614592560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:47:46,996][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 08:47:51,478][12883] Updated weights for policy 0, policy_version 98553 (0.0036) +[2024-06-18 08:47:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1614708736. Throughput: 0: 42737.7. Samples: 1614844780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:47:51,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 08:47:54,505][12883] Updated weights for policy 0, policy_version 98563 (0.0031) +[2024-06-18 08:47:56,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1614938112. Throughput: 0: 42827.2. Samples: 1615099620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:47:56,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 08:47:59,054][12883] Updated weights for policy 0, policy_version 98573 (0.0031) +[2024-06-18 08:48:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1615151104. Throughput: 0: 42960.4. Samples: 1615235160. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:48:01,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 08:48:02,186][12883] Updated weights for policy 0, policy_version 98583 (0.0031) +[2024-06-18 08:48:06,619][12883] Updated weights for policy 0, policy_version 98593 (0.0031) +[2024-06-18 08:48:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 1615347712. Throughput: 0: 42770.5. Samples: 1615487740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:48:06,994][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 08:48:09,767][12883] Updated weights for policy 0, policy_version 98603 (0.0029) +[2024-06-18 08:48:11,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1615577088. Throughput: 0: 42727.4. Samples: 1615734500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:48:11,994][12645] Avg episode reward: [(0, '0.622')] +[2024-06-18 08:48:14,067][12883] Updated weights for policy 0, policy_version 98613 (0.0040) +[2024-06-18 08:48:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1615790080. Throughput: 0: 42669.8. Samples: 1615865780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:48:16,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 08:48:17,390][12883] Updated weights for policy 0, policy_version 98623 (0.0041) +[2024-06-18 08:48:21,732][12883] Updated weights for policy 0, policy_version 98633 (0.0038) +[2024-06-18 08:48:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1616003072. Throughput: 0: 42551.5. Samples: 1616121700. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 08:48:21,995][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 08:48:25,027][12883] Updated weights for policy 0, policy_version 98643 (0.0026) +[2024-06-18 08:48:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1616216064. Throughput: 0: 42670.6. Samples: 1616382460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 08:48:26,994][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 08:48:29,221][12883] Updated weights for policy 0, policy_version 98653 (0.0024) +[2024-06-18 08:48:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1616445440. Throughput: 0: 42623.4. Samples: 1616510520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 08:48:31,994][12645] Avg episode reward: [(0, '0.555')] +[2024-06-18 08:48:32,786][12883] Updated weights for policy 0, policy_version 98663 (0.0033) +[2024-06-18 08:48:36,764][12883] Updated weights for policy 0, policy_version 98673 (0.0041) +[2024-06-18 08:48:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 1616658432. Throughput: 0: 42715.5. Samples: 1616766980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 08:48:36,994][12645] Avg episode reward: [(0, '0.673')] +[2024-06-18 08:48:37,113][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098674_1616674816.pth... +[2024-06-18 08:48:37,170][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098048_1606418432.pth +[2024-06-18 08:48:40,469][12883] Updated weights for policy 0, policy_version 98683 (0.0023) +[2024-06-18 08:48:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1616871424. Throughput: 0: 42772.1. Samples: 1617024360. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 08:48:41,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 08:48:44,313][12883] Updated weights for policy 0, policy_version 98693 (0.0038) +[2024-06-18 08:48:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42600.0, 300 sec: 42543.2). Total num frames: 1617084416. Throughput: 0: 42586.7. Samples: 1617151560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 08:48:46,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 08:48:48,245][12883] Updated weights for policy 0, policy_version 98703 (0.0026) +[2024-06-18 08:48:51,949][12883] Updated weights for policy 0, policy_version 98713 (0.0032) +[2024-06-18 08:48:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 1617313792. Throughput: 0: 42905.3. Samples: 1617418480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 08:48:51,994][12645] Avg episode reward: [(0, '0.186')] +[2024-06-18 08:48:53,932][12862] Signal inference workers to stop experience collection... (23600 times) +[2024-06-18 08:48:53,978][12883] InferenceWorker_p0-w0: stopping experience collection (23600 times) +[2024-06-18 08:48:53,986][12862] Signal inference workers to resume experience collection... (23600 times) +[2024-06-18 08:48:53,992][12883] InferenceWorker_p0-w0: resuming experience collection (23600 times) +[2024-06-18 08:48:55,851][12883] Updated weights for policy 0, policy_version 98723 (0.0029) +[2024-06-18 08:48:57,000][12645] Fps is (10 sec: 42571.3, 60 sec: 42867.0, 300 sec: 42653.0). Total num frames: 1617510400. Throughput: 0: 42995.0. Samples: 1617669540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 08:48:57,009][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 08:48:59,608][12883] Updated weights for policy 0, policy_version 98733 (0.0032) +[2024-06-18 08:49:01,994][12645] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1617739776. 
Throughput: 0: 42947.6. Samples: 1617798420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 08:49:01,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 08:49:03,464][12883] Updated weights for policy 0, policy_version 98743 (0.0034) +[2024-06-18 08:49:06,994][12645] Fps is (10 sec: 44264.6, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 1617952768. Throughput: 0: 43121.8. Samples: 1618062180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 08:49:06,994][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 08:49:07,111][12883] Updated weights for policy 0, policy_version 98753 (0.0035) +[2024-06-18 08:49:11,098][12883] Updated weights for policy 0, policy_version 98763 (0.0025) +[2024-06-18 08:49:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.7, 300 sec: 42653.9). Total num frames: 1618149376. Throughput: 0: 42992.6. Samples: 1618317120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 08:49:11,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 08:49:14,803][12883] Updated weights for policy 0, policy_version 98773 (0.0025) +[2024-06-18 08:49:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 1618345984. Throughput: 0: 42861.8. Samples: 1618439300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 08:49:16,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 08:49:19,181][12883] Updated weights for policy 0, policy_version 98783 (0.0036) +[2024-06-18 08:49:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1618591744. Throughput: 0: 42823.1. Samples: 1618694020. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 08:49:21,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 08:49:22,304][12883] Updated weights for policy 0, policy_version 98793 (0.0029) +[2024-06-18 08:49:26,756][12883] Updated weights for policy 0, policy_version 98803 (0.0042) +[2024-06-18 08:49:26,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1618804736. Throughput: 0: 43030.1. Samples: 1618960720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 08:49:26,994][12645] Avg episode reward: [(0, '0.646')] +[2024-06-18 08:49:29,849][12883] Updated weights for policy 0, policy_version 98813 (0.0040) +[2024-06-18 08:49:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 1619001344. Throughput: 0: 42997.3. Samples: 1619086440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 08:49:31,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 08:49:34,282][12883] Updated weights for policy 0, policy_version 98823 (0.0039) +[2024-06-18 08:49:36,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1619230720. Throughput: 0: 42911.7. Samples: 1619349500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 08:49:36,994][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 08:49:37,418][12883] Updated weights for policy 0, policy_version 98833 (0.0031) +[2024-06-18 08:49:41,629][12883] Updated weights for policy 0, policy_version 98843 (0.0043) +[2024-06-18 08:49:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1619443712. Throughput: 0: 43067.7. Samples: 1619607320. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 08:49:41,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 08:49:45,082][12883] Updated weights for policy 0, policy_version 98853 (0.0046) +[2024-06-18 08:49:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1619640320. Throughput: 0: 43056.4. Samples: 1619735960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 08:49:46,994][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 08:49:49,747][12883] Updated weights for policy 0, policy_version 98863 (0.0027) +[2024-06-18 08:49:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1619886080. Throughput: 0: 42866.2. Samples: 1619991160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 08:49:51,994][12645] Avg episode reward: [(0, '0.146')] +[2024-06-18 08:49:52,949][12883] Updated weights for policy 0, policy_version 98873 (0.0028) +[2024-06-18 08:49:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 1620082688. Throughput: 0: 42905.7. Samples: 1620247880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 08:49:56,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 08:49:57,087][12883] Updated weights for policy 0, policy_version 98883 (0.0032) +[2024-06-18 08:50:00,469][12883] Updated weights for policy 0, policy_version 98893 (0.0041) +[2024-06-18 08:50:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1620279296. Throughput: 0: 42940.4. Samples: 1620371620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 08:50:01,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 08:50:04,716][12883] Updated weights for policy 0, policy_version 98903 (0.0033) +[2024-06-18 08:50:06,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1620541440. Throughput: 0: 43140.4. Samples: 1620635340. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 08:50:06,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 08:50:08,039][12883] Updated weights for policy 0, policy_version 98913 (0.0029) +[2024-06-18 08:50:11,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1620738048. Throughput: 0: 42800.5. Samples: 1620886740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 08:50:11,994][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 08:50:12,218][12883] Updated weights for policy 0, policy_version 98923 (0.0027) +[2024-06-18 08:50:16,045][12883] Updated weights for policy 0, policy_version 98933 (0.0045) +[2024-06-18 08:50:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1620934656. Throughput: 0: 42889.6. Samples: 1621016480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 08:50:16,994][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 08:50:19,743][12883] Updated weights for policy 0, policy_version 98943 (0.0040) +[2024-06-18 08:50:20,660][12862] Signal inference workers to stop experience collection... (23650 times) +[2024-06-18 08:50:20,664][12862] Signal inference workers to resume experience collection... (23650 times) +[2024-06-18 08:50:20,682][12883] InferenceWorker_p0-w0: stopping experience collection (23650 times) +[2024-06-18 08:50:20,682][12883] InferenceWorker_p0-w0: resuming experience collection (23650 times) +[2024-06-18 08:50:21,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 1621164032. Throughput: 0: 42893.4. Samples: 1621279800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 08:50:21,996][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 08:50:23,730][12883] Updated weights for policy 0, policy_version 98953 (0.0031) +[2024-06-18 08:50:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1621360640. 
Throughput: 0: 42749.3. Samples: 1621531040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 08:50:26,994][12645] Avg episode reward: [(0, '0.267')] +[2024-06-18 08:50:27,705][12883] Updated weights for policy 0, policy_version 98963 (0.0044) +[2024-06-18 08:50:31,301][12883] Updated weights for policy 0, policy_version 98973 (0.0027) +[2024-06-18 08:50:31,994][12645] Fps is (10 sec: 42608.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1621590016. Throughput: 0: 42782.7. Samples: 1621661180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 08:50:31,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 08:50:35,268][12883] Updated weights for policy 0, policy_version 98983 (0.0027) +[2024-06-18 08:50:36,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1621819392. Throughput: 0: 42874.6. Samples: 1621920520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 08:50:36,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 08:50:37,123][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098989_1621835776.pth... +[2024-06-18 08:50:37,171][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098361_1611546624.pth +[2024-06-18 08:50:38,911][12883] Updated weights for policy 0, policy_version 98993 (0.0030) +[2024-06-18 08:50:41,996][12645] Fps is (10 sec: 44226.7, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 1622032384. Throughput: 0: 42962.2. Samples: 1622181280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 08:50:41,997][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 08:50:42,769][12883] Updated weights for policy 0, policy_version 99003 (0.0033) +[2024-06-18 08:50:46,382][12883] Updated weights for policy 0, policy_version 99013 (0.0027) +[2024-06-18 08:50:46,994][12645] Fps is (10 sec: 42599.3, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1622245376. Throughput: 0: 43109.4. 
Samples: 1622311540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 08:50:46,994][12645] Avg episode reward: [(0, '0.538')] +[2024-06-18 08:50:50,268][12883] Updated weights for policy 0, policy_version 99023 (0.0042) +[2024-06-18 08:50:51,994][12645] Fps is (10 sec: 44247.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1622474752. Throughput: 0: 43006.8. Samples: 1622570640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 08:50:51,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 08:50:54,046][12883] Updated weights for policy 0, policy_version 99033 (0.0028) +[2024-06-18 08:50:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 1622687744. Throughput: 0: 43281.4. Samples: 1622834400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 08:50:56,994][12645] Avg episode reward: [(0, '0.161')] +[2024-06-18 08:50:57,814][12883] Updated weights for policy 0, policy_version 99043 (0.0039) +[2024-06-18 08:51:01,640][12883] Updated weights for policy 0, policy_version 99053 (0.0031) +[2024-06-18 08:51:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 1622900736. Throughput: 0: 43159.7. Samples: 1622958660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 08:51:01,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 08:51:05,343][12883] Updated weights for policy 0, policy_version 99063 (0.0032) +[2024-06-18 08:51:06,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1623130112. Throughput: 0: 43139.8. Samples: 1623221000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 08:51:06,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 08:51:09,203][12883] Updated weights for policy 0, policy_version 99073 (0.0022) +[2024-06-18 08:51:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42988.1). Total num frames: 1623326720. Throughput: 0: 43402.3. 
Samples: 1623484140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 08:51:11,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 08:51:12,756][12883] Updated weights for policy 0, policy_version 99083 (0.0045) +[2024-06-18 08:51:16,814][12883] Updated weights for policy 0, policy_version 99093 (0.0035) +[2024-06-18 08:51:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 1623556096. Throughput: 0: 43208.8. Samples: 1623605580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 08:51:16,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 08:51:20,202][12883] Updated weights for policy 0, policy_version 99103 (0.0027) +[2024-06-18 08:51:21,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43692.3, 300 sec: 42987.2). Total num frames: 1623785472. Throughput: 0: 43328.6. Samples: 1623870300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 08:51:21,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 08:51:24,224][12883] Updated weights for policy 0, policy_version 99113 (0.0044) +[2024-06-18 08:51:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 42988.1). Total num frames: 1623982080. Throughput: 0: 43412.8. Samples: 1624134760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 08:51:26,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 08:51:28,005][12862] Signal inference workers to stop experience collection... (23700 times) +[2024-06-18 08:51:28,052][12883] InferenceWorker_p0-w0: stopping experience collection (23700 times) +[2024-06-18 08:51:28,052][12862] Signal inference workers to resume experience collection... 
(23700 times) +[2024-06-18 08:51:28,063][12883] InferenceWorker_p0-w0: resuming experience collection (23700 times) +[2024-06-18 08:51:28,214][12883] Updated weights for policy 0, policy_version 99123 (0.0028) +[2024-06-18 08:51:31,689][12883] Updated weights for policy 0, policy_version 99133 (0.0038) +[2024-06-18 08:51:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 1624195072. Throughput: 0: 43229.7. Samples: 1624256880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 08:51:31,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 08:51:35,673][12883] Updated weights for policy 0, policy_version 99143 (0.0024) +[2024-06-18 08:51:36,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 1624440832. Throughput: 0: 43403.4. Samples: 1624523800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 08:51:36,994][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 08:51:39,578][12883] Updated weights for policy 0, policy_version 99153 (0.0041) +[2024-06-18 08:51:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43419.2, 300 sec: 42987.2). Total num frames: 1624637440. Throughput: 0: 43290.2. Samples: 1624782460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 08:51:41,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 08:51:43,250][12883] Updated weights for policy 0, policy_version 99163 (0.0035) +[2024-06-18 08:51:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1624834048. Throughput: 0: 43239.5. Samples: 1624904440. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 08:51:46,994][12645] Avg episode reward: [(0, '0.486')] +[2024-06-18 08:51:47,085][12883] Updated weights for policy 0, policy_version 99173 (0.0029) +[2024-06-18 08:51:50,801][12883] Updated weights for policy 0, policy_version 99183 (0.0042) +[2024-06-18 08:51:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1625063424. Throughput: 0: 43294.8. Samples: 1625169260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 08:51:51,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 08:51:54,836][12883] Updated weights for policy 0, policy_version 99193 (0.0032) +[2024-06-18 08:51:56,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42869.9, 300 sec: 42931.3). Total num frames: 1625260032. Throughput: 0: 43137.9. Samples: 1625425440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 08:51:56,996][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 08:51:58,583][12883] Updated weights for policy 0, policy_version 99203 (0.0033) +[2024-06-18 08:52:01,994][12645] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 1625473024. Throughput: 0: 43137.3. Samples: 1625546760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 08:52:01,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 08:52:02,432][12883] Updated weights for policy 0, policy_version 99213 (0.0031) +[2024-06-18 08:52:06,085][12883] Updated weights for policy 0, policy_version 99223 (0.0033) +[2024-06-18 08:52:06,994][12645] Fps is (10 sec: 44246.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1625702400. Throughput: 0: 43172.7. Samples: 1625813080. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 08:52:06,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 08:52:09,908][12883] Updated weights for policy 0, policy_version 99233 (0.0036) +[2024-06-18 08:52:11,994][12645] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 1625915392. Throughput: 0: 42944.9. Samples: 1626067280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 08:52:11,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 08:52:13,753][12883] Updated weights for policy 0, policy_version 99243 (0.0043) +[2024-06-18 08:52:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1626112000. Throughput: 0: 42975.5. Samples: 1626190780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 08:52:16,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 08:52:17,994][12883] Updated weights for policy 0, policy_version 99253 (0.0032) +[2024-06-18 08:52:21,447][12883] Updated weights for policy 0, policy_version 99263 (0.0036) +[2024-06-18 08:52:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1626341376. Throughput: 0: 42744.1. Samples: 1626447280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 08:52:21,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 08:52:25,559][12883] Updated weights for policy 0, policy_version 99273 (0.0029) +[2024-06-18 08:52:26,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1626554368. Throughput: 0: 42776.5. Samples: 1626707400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 08:52:26,994][12645] Avg episode reward: [(0, '0.602')] +[2024-06-18 08:52:29,421][12883] Updated weights for policy 0, policy_version 99283 (0.0036) +[2024-06-18 08:52:31,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 1626750976. Throughput: 0: 42834.1. Samples: 1626831980. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 08:52:31,994][12645] Avg episode reward: [(0, '0.603')] +[2024-06-18 08:52:33,228][12883] Updated weights for policy 0, policy_version 99293 (0.0040) +[2024-06-18 08:52:36,881][12883] Updated weights for policy 0, policy_version 99303 (0.0030) +[2024-06-18 08:52:36,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 1626980352. Throughput: 0: 42604.7. Samples: 1627086480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 08:52:36,994][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 08:52:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000099303_1626980352.pth... +[2024-06-18 08:52:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098674_1616674816.pth +[2024-06-18 08:52:40,914][12883] Updated weights for policy 0, policy_version 99313 (0.0042) +[2024-06-18 08:52:41,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 1627193344. Throughput: 0: 42670.1. Samples: 1627345500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 08:52:41,994][12645] Avg episode reward: [(0, '0.283')] +[2024-06-18 08:52:44,509][12883] Updated weights for policy 0, policy_version 99323 (0.0031) +[2024-06-18 08:52:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 1627389952. Throughput: 0: 42713.0. Samples: 1627468840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 08:52:46,994][12645] Avg episode reward: [(0, '0.380')] +[2024-06-18 08:52:48,751][12883] Updated weights for policy 0, policy_version 99333 (0.0035) +[2024-06-18 08:52:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 1627619328. Throughput: 0: 42565.9. Samples: 1627728540. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 08:52:51,994][12645] Avg episode reward: [(0, '0.658')] +[2024-06-18 08:52:52,174][12883] Updated weights for policy 0, policy_version 99343 (0.0037) +[2024-06-18 08:52:56,352][12883] Updated weights for policy 0, policy_version 99353 (0.0031) +[2024-06-18 08:52:56,572][12862] Signal inference workers to stop experience collection... (23750 times) +[2024-06-18 08:52:56,625][12862] Signal inference workers to resume experience collection... (23750 times) +[2024-06-18 08:52:56,626][12883] InferenceWorker_p0-w0: stopping experience collection (23750 times) +[2024-06-18 08:52:56,643][12883] InferenceWorker_p0-w0: resuming experience collection (23750 times) +[2024-06-18 08:52:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42873.1, 300 sec: 42987.2). Total num frames: 1627832320. Throughput: 0: 42721.8. Samples: 1627989760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 08:52:56,994][12645] Avg episode reward: [(0, '0.609')] +[2024-06-18 08:52:59,791][12883] Updated weights for policy 0, policy_version 99363 (0.0028) +[2024-06-18 08:53:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 1628045312. Throughput: 0: 42809.4. Samples: 1628117200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 08:53:01,994][12645] Avg episode reward: [(0, '0.586')] +[2024-06-18 08:53:03,943][12883] Updated weights for policy 0, policy_version 99373 (0.0033) +[2024-06-18 08:53:06,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 1628274688. Throughput: 0: 42853.2. Samples: 1628375680. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 08:53:06,994][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 08:53:07,402][12883] Updated weights for policy 0, policy_version 99383 (0.0041) +[2024-06-18 08:53:11,718][12883] Updated weights for policy 0, policy_version 99393 (0.0039) +[2024-06-18 08:53:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 1628471296. Throughput: 0: 42929.2. Samples: 1628639220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 08:53:11,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 08:53:15,019][12883] Updated weights for policy 0, policy_version 99403 (0.0027) +[2024-06-18 08:53:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1628684288. Throughput: 0: 42850.3. Samples: 1628760240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 08:53:16,994][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 08:53:19,355][12883] Updated weights for policy 0, policy_version 99413 (0.0034) +[2024-06-18 08:53:21,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 1628913664. Throughput: 0: 42809.1. Samples: 1629012880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) +[2024-06-18 08:53:21,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 08:53:22,907][12883] Updated weights for policy 0, policy_version 99423 (0.0039) +[2024-06-18 08:53:26,997][12645] Fps is (10 sec: 39306.9, 60 sec: 42049.6, 300 sec: 42820.0). Total num frames: 1629077504. Throughput: 0: 42907.1. Samples: 1629276480. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 08:53:26,998][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 08:53:27,386][12883] Updated weights for policy 0, policy_version 99433 (0.0047) +[2024-06-18 08:53:30,239][12883] Updated weights for policy 0, policy_version 99443 (0.0028) +[2024-06-18 08:53:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 1629323264. Throughput: 0: 42843.7. Samples: 1629396800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 08:53:31,994][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 08:53:34,931][12883] Updated weights for policy 0, policy_version 99453 (0.0033) +[2024-06-18 08:53:36,994][12645] Fps is (10 sec: 47530.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1629552640. Throughput: 0: 42805.7. Samples: 1629654800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 08:53:36,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 08:53:37,898][12883] Updated weights for policy 0, policy_version 99463 (0.0027) +[2024-06-18 08:53:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1629732864. Throughput: 0: 42908.9. Samples: 1629920660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 08:53:41,994][12645] Avg episode reward: [(0, '0.562')] +[2024-06-18 08:53:42,408][12883] Updated weights for policy 0, policy_version 99473 (0.0036) +[2024-06-18 08:53:45,675][12883] Updated weights for policy 0, policy_version 99483 (0.0032) +[2024-06-18 08:53:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1629962240. Throughput: 0: 42878.1. Samples: 1630046720. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 08:53:46,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 08:53:49,858][12883] Updated weights for policy 0, policy_version 99493 (0.0037) +[2024-06-18 08:53:51,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42988.1). Total num frames: 1630191616. Throughput: 0: 42907.2. Samples: 1630306500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 08:53:51,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 08:53:53,460][12883] Updated weights for policy 0, policy_version 99503 (0.0042) +[2024-06-18 08:53:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1630388224. Throughput: 0: 42701.3. Samples: 1630560780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 08:53:56,994][12645] Avg episode reward: [(0, '0.526')] +[2024-06-18 08:53:57,356][12883] Updated weights for policy 0, policy_version 99513 (0.0035) +[2024-06-18 08:54:01,292][12883] Updated weights for policy 0, policy_version 99523 (0.0041) +[2024-06-18 08:54:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1630617600. Throughput: 0: 42832.9. Samples: 1630687720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 08:54:01,994][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 08:54:04,933][12883] Updated weights for policy 0, policy_version 99533 (0.0038) +[2024-06-18 08:54:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 1630830592. Throughput: 0: 43031.9. Samples: 1630949320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 08:54:06,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 08:54:08,921][12883] Updated weights for policy 0, policy_version 99543 (0.0029) +[2024-06-18 08:54:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 1631043584. Throughput: 0: 42888.4. Samples: 1631206300. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 08:54:11,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 08:54:12,374][12883] Updated weights for policy 0, policy_version 99553 (0.0027) +[2024-06-18 08:54:16,269][12883] Updated weights for policy 0, policy_version 99563 (0.0038) +[2024-06-18 08:54:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1631256576. Throughput: 0: 43186.9. Samples: 1631340220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 08:54:16,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 08:54:19,856][12883] Updated weights for policy 0, policy_version 99573 (0.0039) +[2024-06-18 08:54:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 1631485952. Throughput: 0: 43106.3. Samples: 1631594580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) +[2024-06-18 08:54:21,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 08:54:23,695][12883] Updated weights for policy 0, policy_version 99583 (0.0027) +[2024-06-18 08:54:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43420.3, 300 sec: 42987.2). Total num frames: 1631682560. Throughput: 0: 42871.1. Samples: 1631849860. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 08:54:26,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 08:54:27,920][12883] Updated weights for policy 0, policy_version 99593 (0.0035) +[2024-06-18 08:54:28,011][12862] Signal inference workers to stop experience collection... (23800 times) +[2024-06-18 08:54:28,059][12862] Signal inference workers to resume experience collection... 
(23800 times) +[2024-06-18 08:54:28,071][12883] InferenceWorker_p0-w0: stopping experience collection (23800 times) +[2024-06-18 08:54:28,106][12883] InferenceWorker_p0-w0: resuming experience collection (23800 times) +[2024-06-18 08:54:31,185][12883] Updated weights for policy 0, policy_version 99603 (0.0033) +[2024-06-18 08:54:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1631895552. Throughput: 0: 42954.4. Samples: 1631979660. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 08:54:31,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 08:54:35,398][12883] Updated weights for policy 0, policy_version 99613 (0.0036) +[2024-06-18 08:54:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 1632124928. Throughput: 0: 42846.2. Samples: 1632234580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 08:54:36,994][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 08:54:37,100][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000099618_1632141312.pth... +[2024-06-18 08:54:37,159][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098989_1621835776.pth +[2024-06-18 08:54:38,761][12883] Updated weights for policy 0, policy_version 99623 (0.0045) +[2024-06-18 08:54:41,994][12645] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1632321536. Throughput: 0: 42946.7. Samples: 1632493380. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 08:54:41,994][12645] Avg episode reward: [(0, '0.503')] +[2024-06-18 08:54:43,142][12883] Updated weights for policy 0, policy_version 99633 (0.0037) +[2024-06-18 08:54:46,350][12883] Updated weights for policy 0, policy_version 99643 (0.0028) +[2024-06-18 08:54:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1632550912. Throughput: 0: 42867.2. Samples: 1632616740. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 08:54:46,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 08:54:50,687][12883] Updated weights for policy 0, policy_version 99653 (0.0040) +[2024-06-18 08:54:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42987.1). Total num frames: 1632763904. Throughput: 0: 42854.1. Samples: 1632877760. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 08:54:51,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 08:54:53,911][12883] Updated weights for policy 0, policy_version 99663 (0.0033) +[2024-06-18 08:54:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1632960512. Throughput: 0: 42912.9. Samples: 1633137380. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 08:54:56,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 08:54:58,267][12883] Updated weights for policy 0, policy_version 99673 (0.0035) +[2024-06-18 08:55:01,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1633189888. Throughput: 0: 42676.2. Samples: 1633260640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 08:55:01,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 08:55:02,229][12883] Updated weights for policy 0, policy_version 99683 (0.0036) +[2024-06-18 08:55:05,799][12883] Updated weights for policy 0, policy_version 99693 (0.0039) +[2024-06-18 08:55:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1633402880. Throughput: 0: 42712.9. Samples: 1633516660. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 08:55:06,994][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 08:55:09,996][12883] Updated weights for policy 0, policy_version 99703 (0.0037) +[2024-06-18 08:55:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1633599488. Throughput: 0: 42700.9. Samples: 1633771400. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 08:55:11,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 08:55:13,473][12883] Updated weights for policy 0, policy_version 99713 (0.0028) +[2024-06-18 08:55:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42987.5). Total num frames: 1633845248. Throughput: 0: 42604.8. Samples: 1633896880. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 08:55:16,994][12645] Avg episode reward: [(0, '0.603')] +[2024-06-18 08:55:17,831][12883] Updated weights for policy 0, policy_version 99723 (0.0043) +[2024-06-18 08:55:21,062][12883] Updated weights for policy 0, policy_version 99733 (0.0038) +[2024-06-18 08:55:21,996][12645] Fps is (10 sec: 45865.2, 60 sec: 42869.9, 300 sec: 43042.4). Total num frames: 1634058240. Throughput: 0: 42684.1. Samples: 1634155460. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) +[2024-06-18 08:55:21,996][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 08:55:25,508][12883] Updated weights for policy 0, policy_version 99743 (0.0046) +[2024-06-18 08:55:26,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1634254848. Throughput: 0: 42735.1. Samples: 1634416460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:55:26,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 08:55:28,707][12883] Updated weights for policy 0, policy_version 99753 (0.0035) +[2024-06-18 08:55:31,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1634467840. Throughput: 0: 42794.6. Samples: 1634542500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:55:31,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 08:55:33,118][12883] Updated weights for policy 0, policy_version 99763 (0.0040) +[2024-06-18 08:55:36,462][12883] Updated weights for policy 0, policy_version 99773 (0.0036) +[2024-06-18 08:55:36,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 1634697216. Throughput: 0: 42685.1. Samples: 1634798580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:55:36,994][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 08:55:40,834][12883] Updated weights for policy 0, policy_version 99783 (0.0041) +[2024-06-18 08:55:41,376][12862] Signal inference workers to stop experience collection... (23850 times) +[2024-06-18 08:55:41,376][12862] Signal inference workers to resume experience collection... (23850 times) +[2024-06-18 08:55:41,422][12883] InferenceWorker_p0-w0: stopping experience collection (23850 times) +[2024-06-18 08:55:41,422][12883] InferenceWorker_p0-w0: resuming experience collection (23850 times) +[2024-06-18 08:55:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1634893824. Throughput: 0: 42523.0. Samples: 1635050920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:55:41,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 08:55:44,400][12883] Updated weights for policy 0, policy_version 99793 (0.0046) +[2024-06-18 08:55:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1635123200. Throughput: 0: 42620.8. Samples: 1635178580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:55:46,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 08:55:48,705][12883] Updated weights for policy 0, policy_version 99803 (0.0037) +[2024-06-18 08:55:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1635319808. 
Throughput: 0: 42533.4. Samples: 1635430660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:55:51,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 08:55:52,027][12883] Updated weights for policy 0, policy_version 99813 (0.0036) +[2024-06-18 08:55:56,481][12883] Updated weights for policy 0, policy_version 99823 (0.0039) +[2024-06-18 08:55:56,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1635500032. Throughput: 0: 42579.1. Samples: 1635687460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:55:56,994][12645] Avg episode reward: [(0, '0.558')] +[2024-06-18 08:55:59,634][12883] Updated weights for policy 0, policy_version 99833 (0.0036) +[2024-06-18 08:56:01,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 1635745792. Throughput: 0: 42505.0. Samples: 1635809700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:56:01,997][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 08:56:04,373][12883] Updated weights for policy 0, policy_version 99843 (0.0032) +[2024-06-18 08:56:06,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1635958784. Throughput: 0: 42569.3. Samples: 1636070980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:56:06,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 08:56:07,261][12883] Updated weights for policy 0, policy_version 99853 (0.0039) +[2024-06-18 08:56:11,975][12883] Updated weights for policy 0, policy_version 99863 (0.0030) +[2024-06-18 08:56:11,994][12645] Fps is (10 sec: 40968.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1636155392. Throughput: 0: 42524.0. Samples: 1636330040. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:56:11,995][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 08:56:14,985][12883] Updated weights for policy 0, policy_version 99873 (0.0034) +[2024-06-18 08:56:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1636384768. Throughput: 0: 42417.8. Samples: 1636451300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:56:16,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 08:56:19,626][12883] Updated weights for policy 0, policy_version 99883 (0.0050) +[2024-06-18 08:56:21,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 1636597760. Throughput: 0: 42376.4. Samples: 1636705520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 08:56:21,994][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 08:56:22,716][12883] Updated weights for policy 0, policy_version 99893 (0.0039) +[2024-06-18 08:56:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1636794368. Throughput: 0: 42664.0. Samples: 1636970800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) +[2024-06-18 08:56:26,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 08:56:27,125][12883] Updated weights for policy 0, policy_version 99903 (0.0040) +[2024-06-18 08:56:30,911][12883] Updated weights for policy 0, policy_version 99913 (0.0033) +[2024-06-18 08:56:31,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 1637023744. Throughput: 0: 42430.3. Samples: 1637088040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) +[2024-06-18 08:56:31,997][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 08:56:34,824][12883] Updated weights for policy 0, policy_version 99923 (0.0032) +[2024-06-18 08:56:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1637236736. Throughput: 0: 42529.7. Samples: 1637344500. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) +[2024-06-18 08:56:36,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 08:56:37,126][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000099930_1637253120.pth... +[2024-06-18 08:56:37,188][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000099303_1626980352.pth +[2024-06-18 08:56:38,575][12883] Updated weights for policy 0, policy_version 99933 (0.0036) +[2024-06-18 08:56:41,996][12645] Fps is (10 sec: 40960.3, 60 sec: 42323.8, 300 sec: 42709.2). Total num frames: 1637433344. Throughput: 0: 42576.2. Samples: 1637603480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) +[2024-06-18 08:56:41,997][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 08:56:42,858][12883] Updated weights for policy 0, policy_version 99943 (0.0030) +[2024-06-18 08:56:46,255][12883] Updated weights for policy 0, policy_version 99953 (0.0033) +[2024-06-18 08:56:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1637662720. Throughput: 0: 42655.4. Samples: 1637729100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) +[2024-06-18 08:56:46,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 08:56:50,597][12883] Updated weights for policy 0, policy_version 99963 (0.0028) +[2024-06-18 08:56:51,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 1637842944. Throughput: 0: 42620.8. Samples: 1637988920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) +[2024-06-18 08:56:51,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 08:56:53,907][12883] Updated weights for policy 0, policy_version 99973 (0.0032) +[2024-06-18 08:56:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1638072320. Throughput: 0: 42384.1. Samples: 1638237320. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) +[2024-06-18 08:56:56,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 08:56:58,167][12883] Updated weights for policy 0, policy_version 99983 (0.0027) +[2024-06-18 08:57:01,398][12883] Updated weights for policy 0, policy_version 99993 (0.0028) +[2024-06-18 08:57:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42326.8, 300 sec: 42653.9). Total num frames: 1638285312. Throughput: 0: 42624.3. Samples: 1638369400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) +[2024-06-18 08:57:01,995][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 08:57:05,699][12883] Updated weights for policy 0, policy_version 100003 (0.0030) +[2024-06-18 08:57:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1638498304. Throughput: 0: 42694.2. Samples: 1638626760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) +[2024-06-18 08:57:06,994][12645] Avg episode reward: [(0, '0.586')] +[2024-06-18 08:57:08,778][12883] Updated weights for policy 0, policy_version 100013 (0.0039) +[2024-06-18 08:57:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1638727680. Throughput: 0: 42420.8. Samples: 1638879740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) +[2024-06-18 08:57:11,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 08:57:13,252][12883] Updated weights for policy 0, policy_version 100023 (0.0039) +[2024-06-18 08:57:16,310][12883] Updated weights for policy 0, policy_version 100033 (0.0035) +[2024-06-18 08:57:16,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 1638940672. Throughput: 0: 42783.6. Samples: 1639013300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) +[2024-06-18 08:57:16,997][12645] Avg episode reward: [(0, '0.690')] +[2024-06-18 08:57:19,026][12862] Signal inference workers to stop experience collection... 
(23900 times) +[2024-06-18 08:57:19,078][12883] InferenceWorker_p0-w0: stopping experience collection (23900 times) +[2024-06-18 08:57:19,138][12862] Signal inference workers to resume experience collection... (23900 times) +[2024-06-18 08:57:19,138][12883] InferenceWorker_p0-w0: resuming experience collection (23900 times) +[2024-06-18 08:57:20,760][12883] Updated weights for policy 0, policy_version 100043 (0.0045) +[2024-06-18 08:57:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1639137280. Throughput: 0: 42714.6. Samples: 1639266660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) +[2024-06-18 08:57:21,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 08:57:24,573][12883] Updated weights for policy 0, policy_version 100053 (0.0024) +[2024-06-18 08:57:26,994][12645] Fps is (10 sec: 44246.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1639383040. Throughput: 0: 42688.8. Samples: 1639524380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 08:57:26,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 08:57:28,736][12883] Updated weights for policy 0, policy_version 100063 (0.0024) +[2024-06-18 08:57:31,996][12645] Fps is (10 sec: 44227.5, 60 sec: 42598.4, 300 sec: 42709.2). Total num frames: 1639579648. Throughput: 0: 42856.2. Samples: 1639657720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 08:57:31,996][12645] Avg episode reward: [(0, '0.453')] +[2024-06-18 08:57:32,189][12883] Updated weights for policy 0, policy_version 100073 (0.0037) +[2024-06-18 08:57:36,244][12883] Updated weights for policy 0, policy_version 100083 (0.0045) +[2024-06-18 08:57:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1639792640. Throughput: 0: 42788.9. Samples: 1639914420. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 08:57:36,994][12645] Avg episode reward: [(0, '0.121')] +[2024-06-18 08:57:39,769][12883] Updated weights for policy 0, policy_version 100093 (0.0037) +[2024-06-18 08:57:41,994][12645] Fps is (10 sec: 44247.1, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 1640022016. Throughput: 0: 42885.5. Samples: 1640167160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 08:57:41,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 08:57:43,819][12883] Updated weights for policy 0, policy_version 100103 (0.0036) +[2024-06-18 08:57:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1640235008. Throughput: 0: 42975.7. Samples: 1640303300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 08:57:46,994][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 08:57:47,181][12883] Updated weights for policy 0, policy_version 100113 (0.0025) +[2024-06-18 08:57:51,208][12883] Updated weights for policy 0, policy_version 100123 (0.0032) +[2024-06-18 08:57:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1640448000. Throughput: 0: 43002.8. Samples: 1640561880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 08:57:51,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 08:57:54,899][12883] Updated weights for policy 0, policy_version 100133 (0.0045) +[2024-06-18 08:57:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 1640677376. Throughput: 0: 43039.1. Samples: 1640816500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 08:57:56,994][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 08:57:58,574][12883] Updated weights for policy 0, policy_version 100143 (0.0033) +[2024-06-18 08:58:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1640873984. Throughput: 0: 43009.7. 
Samples: 1640948640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 08:58:01,994][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 08:58:02,419][12883] Updated weights for policy 0, policy_version 100153 (0.0041) +[2024-06-18 08:58:06,129][12883] Updated weights for policy 0, policy_version 100163 (0.0039) +[2024-06-18 08:58:06,993][12645] Fps is (10 sec: 40961.2, 60 sec: 43144.7, 300 sec: 42765.1). Total num frames: 1641086976. Throughput: 0: 43080.2. Samples: 1641205260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 08:58:06,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 08:58:10,310][12883] Updated weights for policy 0, policy_version 100173 (0.0035) +[2024-06-18 08:58:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1641299968. Throughput: 0: 42939.9. Samples: 1641456680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 08:58:11,995][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 08:58:13,786][12883] Updated weights for policy 0, policy_version 100183 (0.0041) +[2024-06-18 08:58:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 1641496576. Throughput: 0: 42827.9. Samples: 1641584880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 08:58:16,994][12645] Avg episode reward: [(0, '0.671')] +[2024-06-18 08:58:17,875][12883] Updated weights for policy 0, policy_version 100193 (0.0033) +[2024-06-18 08:58:21,721][12883] Updated weights for policy 0, policy_version 100203 (0.0033) +[2024-06-18 08:58:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.6). Total num frames: 1641725952. Throughput: 0: 42729.2. Samples: 1641837240. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 08:58:21,994][12645] Avg episode reward: [(0, '0.635')] +[2024-06-18 08:58:25,474][12883] Updated weights for policy 0, policy_version 100213 (0.0034) +[2024-06-18 08:58:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1641938944. Throughput: 0: 42807.1. Samples: 1642093480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:58:26,994][12645] Avg episode reward: [(0, '0.620')] +[2024-06-18 08:58:29,158][12883] Updated weights for policy 0, policy_version 100223 (0.0031) +[2024-06-18 08:58:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42326.8, 300 sec: 42598.4). Total num frames: 1642119168. Throughput: 0: 42701.3. Samples: 1642224860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:58:31,995][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 08:58:33,006][12883] Updated weights for policy 0, policy_version 100233 (0.0036) +[2024-06-18 08:58:36,491][12862] Signal inference workers to stop experience collection... (23950 times) +[2024-06-18 08:58:36,492][12862] Signal inference workers to resume experience collection... (23950 times) +[2024-06-18 08:58:36,513][12883] InferenceWorker_p0-w0: stopping experience collection (23950 times) +[2024-06-18 08:58:36,513][12883] InferenceWorker_p0-w0: resuming experience collection (23950 times) +[2024-06-18 08:58:36,639][12883] Updated weights for policy 0, policy_version 100243 (0.0036) +[2024-06-18 08:58:36,994][12645] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1642381312. Throughput: 0: 42635.4. Samples: 1642480480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:58:36,994][12645] Avg episode reward: [(0, '0.639')] +[2024-06-18 08:58:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000100243_1642381312.pth... 
+[2024-06-18 08:58:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000099618_1632141312.pth +[2024-06-18 08:58:40,930][12883] Updated weights for policy 0, policy_version 100253 (0.0032) +[2024-06-18 08:58:41,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1642594304. Throughput: 0: 42721.0. Samples: 1642738940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:58:41,994][12645] Avg episode reward: [(0, '0.605')] +[2024-06-18 08:58:44,198][12883] Updated weights for policy 0, policy_version 100263 (0.0036) +[2024-06-18 08:58:46,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1642774528. Throughput: 0: 42677.0. Samples: 1642869100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:58:46,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 08:58:48,526][12883] Updated weights for policy 0, policy_version 100273 (0.0030) +[2024-06-18 08:58:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1643020288. Throughput: 0: 42672.4. Samples: 1643125520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:58:51,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 08:58:52,015][12883] Updated weights for policy 0, policy_version 100283 (0.0041) +[2024-06-18 08:58:56,332][12883] Updated weights for policy 0, policy_version 100293 (0.0042) +[2024-06-18 08:58:56,994][12645] Fps is (10 sec: 45874.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1643233280. Throughput: 0: 42947.5. Samples: 1643389320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:58:56,995][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 08:58:59,638][12883] Updated weights for policy 0, policy_version 100303 (0.0032) +[2024-06-18 08:59:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1643413504. Throughput: 0: 42879.2. 
Samples: 1643514440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:59:01,994][12645] Avg episode reward: [(0, '0.654')] +[2024-06-18 08:59:03,966][12883] Updated weights for policy 0, policy_version 100313 (0.0029) +[2024-06-18 08:59:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1643659264. Throughput: 0: 42943.7. Samples: 1643769700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:59:06,994][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 08:59:07,227][12883] Updated weights for policy 0, policy_version 100323 (0.0028) +[2024-06-18 08:59:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1643839488. Throughput: 0: 43008.4. Samples: 1644028860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:59:11,994][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 08:59:12,012][12883] Updated weights for policy 0, policy_version 100333 (0.0034) +[2024-06-18 08:59:14,707][12883] Updated weights for policy 0, policy_version 100343 (0.0035) +[2024-06-18 08:59:17,000][12645] Fps is (10 sec: 40934.7, 60 sec: 42867.0, 300 sec: 42653.0). Total num frames: 1644068864. Throughput: 0: 42727.0. Samples: 1644147840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:59:17,000][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 08:59:19,514][12883] Updated weights for policy 0, policy_version 100353 (0.0033) +[2024-06-18 08:59:21,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1644298240. Throughput: 0: 42834.7. Samples: 1644408040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 08:59:21,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 08:59:22,330][12883] Updated weights for policy 0, policy_version 100363 (0.0023) +[2024-06-18 08:59:26,909][12883] Updated weights for policy 0, policy_version 100373 (0.0046) +[2024-06-18 08:59:26,994][12645] Fps is (10 sec: 44264.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1644511232. Throughput: 0: 42935.9. Samples: 1644671060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 08:59:26,998][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 08:59:30,055][12883] Updated weights for policy 0, policy_version 100383 (0.0036) +[2024-06-18 08:59:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1644724224. Throughput: 0: 42731.4. Samples: 1644792020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 08:59:31,994][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 08:59:34,706][12883] Updated weights for policy 0, policy_version 100393 (0.0028) +[2024-06-18 08:59:37,000][12645] Fps is (10 sec: 44209.3, 60 sec: 42867.1, 300 sec: 42819.7). Total num frames: 1644953600. Throughput: 0: 42966.4. Samples: 1645059280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 08:59:37,001][12645] Avg episode reward: [(0, '0.503')] +[2024-06-18 08:59:37,830][12883] Updated weights for policy 0, policy_version 100403 (0.0026) +[2024-06-18 08:59:41,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 1645150208. Throughput: 0: 42795.3. Samples: 1645315200. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 08:59:41,996][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 08:59:42,523][12883] Updated weights for policy 0, policy_version 100413 (0.0039) +[2024-06-18 08:59:45,274][12883] Updated weights for policy 0, policy_version 100423 (0.0029) +[2024-06-18 08:59:46,994][12645] Fps is (10 sec: 40985.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1645363200. Throughput: 0: 42738.1. Samples: 1645437660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 08:59:46,994][12645] Avg episode reward: [(0, '0.530')] +[2024-06-18 08:59:50,165][12883] Updated weights for policy 0, policy_version 100433 (0.0045) +[2024-06-18 08:59:51,994][12645] Fps is (10 sec: 45885.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1645608960. Throughput: 0: 43013.0. Samples: 1645705280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 08:59:51,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 08:59:53,298][12883] Updated weights for policy 0, policy_version 100443 (0.0048) +[2024-06-18 08:59:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1645789184. Throughput: 0: 42988.8. Samples: 1645963360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 08:59:56,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 08:59:57,608][12883] Updated weights for policy 0, policy_version 100453 (0.0041) +[2024-06-18 09:00:00,947][12883] Updated weights for policy 0, policy_version 100463 (0.0031) +[2024-06-18 09:00:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1646018560. Throughput: 0: 43064.1. Samples: 1646085460. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 09:00:01,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 09:00:05,051][12883] Updated weights for policy 0, policy_version 100473 (0.0036) +[2024-06-18 09:00:06,713][12862] Signal inference workers to stop experience collection... (24000 times) +[2024-06-18 09:00:06,713][12862] Signal inference workers to resume experience collection... (24000 times) +[2024-06-18 09:00:06,723][12883] InferenceWorker_p0-w0: stopping experience collection (24000 times) +[2024-06-18 09:00:06,723][12883] InferenceWorker_p0-w0: resuming experience collection (24000 times) +[2024-06-18 09:00:06,994][12645] Fps is (10 sec: 47513.9, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 1646264320. Throughput: 0: 43183.7. Samples: 1646351300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 09:00:06,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 09:00:08,549][12883] Updated weights for policy 0, policy_version 100483 (0.0043) +[2024-06-18 09:00:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 1646444544. Throughput: 0: 42919.5. Samples: 1646602440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 09:00:11,994][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 09:00:12,720][12883] Updated weights for policy 0, policy_version 100493 (0.0035) +[2024-06-18 09:00:16,443][12883] Updated weights for policy 0, policy_version 100503 (0.0033) +[2024-06-18 09:00:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 43149.1, 300 sec: 42709.8). Total num frames: 1646657536. Throughput: 0: 43022.8. Samples: 1646728040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 09:00:16,994][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 09:00:20,237][12883] Updated weights for policy 0, policy_version 100513 (0.0047) +[2024-06-18 09:00:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1646886912. 
Throughput: 0: 42765.1. Samples: 1646983440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 09:00:21,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 09:00:24,186][12883] Updated weights for policy 0, policy_version 100523 (0.0035) +[2024-06-18 09:00:26,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1647083520. Throughput: 0: 42992.3. Samples: 1647249760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 09:00:26,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 09:00:27,979][12883] Updated weights for policy 0, policy_version 100533 (0.0030) +[2024-06-18 09:00:31,639][12883] Updated weights for policy 0, policy_version 100543 (0.0040) +[2024-06-18 09:00:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1647296512. Throughput: 0: 43000.1. Samples: 1647372660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 09:00:31,994][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 09:00:35,575][12883] Updated weights for policy 0, policy_version 100553 (0.0026) +[2024-06-18 09:00:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42875.9, 300 sec: 42820.6). Total num frames: 1647525888. Throughput: 0: 42839.4. Samples: 1647633060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 09:00:36,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 09:00:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000100558_1647542272.pth... +[2024-06-18 09:00:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000099930_1637253120.pth +[2024-06-18 09:00:39,155][12883] Updated weights for policy 0, policy_version 100563 (0.0036) +[2024-06-18 09:00:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 1647706112. Throughput: 0: 42949.8. Samples: 1647896100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 09:00:41,994][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 09:00:43,158][12883] Updated weights for policy 0, policy_version 100573 (0.0041) +[2024-06-18 09:00:46,715][12883] Updated weights for policy 0, policy_version 100583 (0.0053) +[2024-06-18 09:00:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1647951872. Throughput: 0: 42837.8. Samples: 1648013160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 09:00:46,994][12645] Avg episode reward: [(0, '0.435')] +[2024-06-18 09:00:50,654][12883] Updated weights for policy 0, policy_version 100593 (0.0041) +[2024-06-18 09:00:51,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42596.8, 300 sec: 42931.3). Total num frames: 1648164864. Throughput: 0: 42722.7. Samples: 1648273920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 09:00:51,997][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 09:00:54,755][12883] Updated weights for policy 0, policy_version 100603 (0.0026) +[2024-06-18 09:00:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 1648361472. Throughput: 0: 43018.2. Samples: 1648538260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 09:00:56,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 09:00:58,272][12883] Updated weights for policy 0, policy_version 100613 (0.0028) +[2024-06-18 09:01:01,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1648590848. Throughput: 0: 42973.2. Samples: 1648661840. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 09:01:01,994][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 09:01:02,337][12883] Updated weights for policy 0, policy_version 100623 (0.0040) +[2024-06-18 09:01:06,058][12883] Updated weights for policy 0, policy_version 100633 (0.0037) +[2024-06-18 09:01:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42931.7). Total num frames: 1648820224. Throughput: 0: 43064.8. Samples: 1648921360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 09:01:06,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 09:01:09,819][12883] Updated weights for policy 0, policy_version 100643 (0.0040) +[2024-06-18 09:01:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1649000448. Throughput: 0: 42954.1. Samples: 1649182700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 09:01:11,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 09:01:13,910][12883] Updated weights for policy 0, policy_version 100653 (0.0033) +[2024-06-18 09:01:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1649246208. Throughput: 0: 42970.1. Samples: 1649306320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 09:01:16,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 09:01:17,287][12883] Updated weights for policy 0, policy_version 100663 (0.0039) +[2024-06-18 09:01:21,429][12883] Updated weights for policy 0, policy_version 100673 (0.0037) +[2024-06-18 09:01:21,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 1649459200. Throughput: 0: 43088.2. Samples: 1649572020. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 09:01:21,994][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 09:01:24,934][12883] Updated weights for policy 0, policy_version 100683 (0.0043) +[2024-06-18 09:01:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 1649639424. Throughput: 0: 42825.8. Samples: 1649823260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:01:26,994][12645] Avg episode reward: [(0, '0.359')] +[2024-06-18 09:01:28,951][12883] Updated weights for policy 0, policy_version 100693 (0.0037) +[2024-06-18 09:01:31,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 1649901568. Throughput: 0: 43036.4. Samples: 1649949800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:01:31,994][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 09:01:32,429][12883] Updated weights for policy 0, policy_version 100703 (0.0027) +[2024-06-18 09:01:36,595][12883] Updated weights for policy 0, policy_version 100713 (0.0037) +[2024-06-18 09:01:36,996][12645] Fps is (10 sec: 45864.7, 60 sec: 42869.9, 300 sec: 42931.6). Total num frames: 1650098176. Throughput: 0: 43142.7. Samples: 1650215340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:01:36,997][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 09:01:37,637][12862] Signal inference workers to stop experience collection... (24050 times) +[2024-06-18 09:01:37,638][12862] Signal inference workers to resume experience collection... (24050 times) +[2024-06-18 09:01:37,679][12883] InferenceWorker_p0-w0: stopping experience collection (24050 times) +[2024-06-18 09:01:37,680][12883] InferenceWorker_p0-w0: resuming experience collection (24050 times) +[2024-06-18 09:01:39,949][12883] Updated weights for policy 0, policy_version 100723 (0.0034) +[2024-06-18 09:01:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1650294784. 
Throughput: 0: 42948.0. Samples: 1650470920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:01:41,994][12645] Avg episode reward: [(0, '0.611')] +[2024-06-18 09:01:44,412][12883] Updated weights for policy 0, policy_version 100733 (0.0042) +[2024-06-18 09:01:46,994][12645] Fps is (10 sec: 44247.0, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 1650540544. Throughput: 0: 43001.0. Samples: 1650596880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:01:46,994][12645] Avg episode reward: [(0, '0.645')] +[2024-06-18 09:01:47,571][12883] Updated weights for policy 0, policy_version 100743 (0.0046) +[2024-06-18 09:01:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 1650720768. Throughput: 0: 43048.5. Samples: 1650858540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:01:51,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 09:01:52,039][12883] Updated weights for policy 0, policy_version 100753 (0.0023) +[2024-06-18 09:01:55,420][12883] Updated weights for policy 0, policy_version 100763 (0.0046) +[2024-06-18 09:01:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 1650950144. Throughput: 0: 42821.0. Samples: 1651109640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:01:56,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 09:01:59,747][12883] Updated weights for policy 0, policy_version 100773 (0.0037) +[2024-06-18 09:02:01,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 1651179520. Throughput: 0: 42955.9. Samples: 1651239340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:02:01,995][12645] Avg episode reward: [(0, '0.447')] +[2024-06-18 09:02:03,206][12883] Updated weights for policy 0, policy_version 100783 (0.0036) +[2024-06-18 09:02:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42820.6). 
Total num frames: 1651359744. Throughput: 0: 42772.7. Samples: 1651496800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:02:06,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 09:02:07,553][12883] Updated weights for policy 0, policy_version 100793 (0.0026) +[2024-06-18 09:02:10,787][12883] Updated weights for policy 0, policy_version 100803 (0.0042) +[2024-06-18 09:02:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42931.9). Total num frames: 1651605504. Throughput: 0: 42694.5. Samples: 1651744520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:02:11,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 09:02:15,322][12883] Updated weights for policy 0, policy_version 100813 (0.0030) +[2024-06-18 09:02:16,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1651818496. Throughput: 0: 42875.1. Samples: 1651879180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:02:16,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 09:02:18,567][12883] Updated weights for policy 0, policy_version 100823 (0.0046) +[2024-06-18 09:02:21,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1651998720. Throughput: 0: 42519.9. Samples: 1652128640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:02:21,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 09:02:23,117][12883] Updated weights for policy 0, policy_version 100833 (0.0039) +[2024-06-18 09:02:26,211][12883] Updated weights for policy 0, policy_version 100843 (0.0032) +[2024-06-18 09:02:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42931.9). Total num frames: 1652244480. Throughput: 0: 42453.3. Samples: 1652381320. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:02:26,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 09:02:30,853][12883] Updated weights for policy 0, policy_version 100853 (0.0028) +[2024-06-18 09:02:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1652441088. Throughput: 0: 42530.6. Samples: 1652510760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:02:31,994][12645] Avg episode reward: [(0, '0.615')] +[2024-06-18 09:02:34,163][12883] Updated weights for policy 0, policy_version 100863 (0.0032) +[2024-06-18 09:02:36,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 1652621312. Throughput: 0: 42358.7. Samples: 1652764680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:02:36,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 09:02:37,062][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000100869_1652637696.pth... +[2024-06-18 09:02:37,123][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000100243_1642381312.pth +[2024-06-18 09:02:38,407][12883] Updated weights for policy 0, policy_version 100873 (0.0031) +[2024-06-18 09:02:41,865][12883] Updated weights for policy 0, policy_version 100883 (0.0027) +[2024-06-18 09:02:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1652867072. Throughput: 0: 42463.5. Samples: 1653020500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:02:41,994][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 09:02:45,990][12883] Updated weights for policy 0, policy_version 100893 (0.0039) +[2024-06-18 09:02:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 1653063680. Throughput: 0: 42467.2. Samples: 1653150360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:02:46,994][12645] Avg episode reward: [(0, '0.798')] +[2024-06-18 09:02:47,015][12862] Saving new best policy, reward=0.798! +[2024-06-18 09:02:49,501][12883] Updated weights for policy 0, policy_version 100903 (0.0040) +[2024-06-18 09:02:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1653260288. Throughput: 0: 42165.9. Samples: 1653394260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:02:51,994][12645] Avg episode reward: [(0, '0.573')] +[2024-06-18 09:02:53,867][12883] Updated weights for policy 0, policy_version 100913 (0.0036) +[2024-06-18 09:02:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1653489664. Throughput: 0: 42445.4. Samples: 1653654560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:02:56,994][12645] Avg episode reward: [(0, '0.656')] +[2024-06-18 09:02:57,381][12883] Updated weights for policy 0, policy_version 100923 (0.0037) +[2024-06-18 09:03:01,406][12883] Updated weights for policy 0, policy_version 100933 (0.0024) +[2024-06-18 09:03:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 1653702656. Throughput: 0: 42291.6. Samples: 1653782300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:03:01,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 09:03:02,369][12862] Signal inference workers to stop experience collection... (24100 times) +[2024-06-18 09:03:02,408][12883] InferenceWorker_p0-w0: stopping experience collection (24100 times) +[2024-06-18 09:03:02,427][12862] Signal inference workers to resume experience collection... 
(24100 times) +[2024-06-18 09:03:02,430][12883] InferenceWorker_p0-w0: resuming experience collection (24100 times) +[2024-06-18 09:03:05,007][12883] Updated weights for policy 0, policy_version 100943 (0.0035) +[2024-06-18 09:03:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1653915648. Throughput: 0: 42344.0. Samples: 1654034120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:03:06,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 09:03:09,025][12883] Updated weights for policy 0, policy_version 100953 (0.0029) +[2024-06-18 09:03:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 1654128640. Throughput: 0: 42441.0. Samples: 1654291160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:03:11,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 09:03:12,630][12883] Updated weights for policy 0, policy_version 100963 (0.0036) +[2024-06-18 09:03:16,579][12883] Updated weights for policy 0, policy_version 100973 (0.0027) +[2024-06-18 09:03:16,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 1654341632. Throughput: 0: 42466.5. Samples: 1654421760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:03:16,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 09:03:20,159][12883] Updated weights for policy 0, policy_version 100983 (0.0028) +[2024-06-18 09:03:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1654571008. Throughput: 0: 42413.2. Samples: 1654673280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 09:03:21,994][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 09:03:24,597][12883] Updated weights for policy 0, policy_version 100993 (0.0023) +[2024-06-18 09:03:26,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 1654800384. Throughput: 0: 42649.7. 
Samples: 1654939740. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-18 09:03:26,994][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 09:03:27,859][12883] Updated weights for policy 0, policy_version 101003 (0.0035) +[2024-06-18 09:03:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 1654964224. Throughput: 0: 42566.2. Samples: 1655065840. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-18 09:03:31,994][12645] Avg episode reward: [(0, '0.283')] +[2024-06-18 09:03:32,302][12883] Updated weights for policy 0, policy_version 101013 (0.0036) +[2024-06-18 09:03:35,462][12883] Updated weights for policy 0, policy_version 101023 (0.0040) +[2024-06-18 09:03:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 1655226368. Throughput: 0: 42767.4. Samples: 1655318800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-18 09:03:36,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 09:03:39,901][12883] Updated weights for policy 0, policy_version 101033 (0.0039) +[2024-06-18 09:03:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1655422976. Throughput: 0: 42899.2. Samples: 1655585020. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-18 09:03:41,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 09:03:42,866][12883] Updated weights for policy 0, policy_version 101043 (0.0029) +[2024-06-18 09:03:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1655619584. Throughput: 0: 42760.4. Samples: 1655706520. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-18 09:03:46,994][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 09:03:47,553][12883] Updated weights for policy 0, policy_version 101053 (0.0035) +[2024-06-18 09:03:50,374][12883] Updated weights for policy 0, policy_version 101063 (0.0030) +[2024-06-18 09:03:51,994][12645] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1655848960. Throughput: 0: 42818.9. Samples: 1655960980. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-18 09:03:51,995][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 09:03:55,018][12883] Updated weights for policy 0, policy_version 101073 (0.0030) +[2024-06-18 09:03:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1656045568. Throughput: 0: 43138.1. Samples: 1656232380. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-18 09:03:56,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 09:03:58,291][12883] Updated weights for policy 0, policy_version 101083 (0.0028) +[2024-06-18 09:04:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1656274944. Throughput: 0: 42911.2. Samples: 1656352760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-18 09:04:01,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 09:04:02,566][12883] Updated weights for policy 0, policy_version 101093 (0.0025) +[2024-06-18 09:04:05,925][12883] Updated weights for policy 0, policy_version 101103 (0.0027) +[2024-06-18 09:04:06,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1656504320. Throughput: 0: 43113.8. Samples: 1656613400. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-18 09:04:06,994][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 09:04:10,237][12883] Updated weights for policy 0, policy_version 101113 (0.0025) +[2024-06-18 09:04:11,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 1656700928. Throughput: 0: 43057.0. Samples: 1656877300. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-18 09:04:11,994][12645] Avg episode reward: [(0, '0.661')] +[2024-06-18 09:04:13,567][12883] Updated weights for policy 0, policy_version 101123 (0.0043) +[2024-06-18 09:04:16,996][12645] Fps is (10 sec: 42589.2, 60 sec: 43143.0, 300 sec: 42820.2). Total num frames: 1656930304. Throughput: 0: 42950.8. Samples: 1656998720. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-18 09:04:16,997][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 09:04:17,819][12883] Updated weights for policy 0, policy_version 101133 (0.0040) +[2024-06-18 09:04:19,178][12862] Signal inference workers to stop experience collection... (24150 times) +[2024-06-18 09:04:19,179][12862] Signal inference workers to resume experience collection... (24150 times) +[2024-06-18 09:04:19,212][12883] InferenceWorker_p0-w0: stopping experience collection (24150 times) +[2024-06-18 09:04:19,213][12883] InferenceWorker_p0-w0: resuming experience collection (24150 times) +[2024-06-18 09:04:21,129][12883] Updated weights for policy 0, policy_version 101143 (0.0037) +[2024-06-18 09:04:21,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1657159680. Throughput: 0: 43139.2. Samples: 1657260060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) +[2024-06-18 09:04:21,994][12645] Avg episode reward: [(0, '0.620')] +[2024-06-18 09:04:25,511][12883] Updated weights for policy 0, policy_version 101153 (0.0033) +[2024-06-18 09:04:26,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1657356288. 
Throughput: 0: 43025.3. Samples: 1657521160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 09:04:26,994][12645] Avg episode reward: [(0, '0.625')] +[2024-06-18 09:04:28,795][12883] Updated weights for policy 0, policy_version 101163 (0.0029) +[2024-06-18 09:04:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 42821.5). Total num frames: 1657585664. Throughput: 0: 43002.7. Samples: 1657641640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 09:04:31,994][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 09:04:32,922][12883] Updated weights for policy 0, policy_version 101173 (0.0036) +[2024-06-18 09:04:36,485][12883] Updated weights for policy 0, policy_version 101183 (0.0026) +[2024-06-18 09:04:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42876.4). Total num frames: 1657798656. Throughput: 0: 43139.3. Samples: 1657902240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 09:04:36,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 09:04:37,102][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000101185_1657815040.pth... +[2024-06-18 09:04:37,163][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000100558_1647542272.pth +[2024-06-18 09:04:41,017][12883] Updated weights for policy 0, policy_version 101193 (0.0028) +[2024-06-18 09:04:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1657978880. Throughput: 0: 42889.8. Samples: 1658162420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 09:04:41,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 09:04:44,302][12883] Updated weights for policy 0, policy_version 101203 (0.0041) +[2024-06-18 09:04:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43690.7, 300 sec: 42820.5). Total num frames: 1658241024. Throughput: 0: 42851.1. Samples: 1658281060. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 09:04:47,007][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 09:04:48,596][12883] Updated weights for policy 0, policy_version 101213 (0.0045) +[2024-06-18 09:04:51,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1658421248. Throughput: 0: 42932.9. Samples: 1658545380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 09:04:51,994][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 09:04:52,031][12883] Updated weights for policy 0, policy_version 101223 (0.0033) +[2024-06-18 09:04:56,196][12883] Updated weights for policy 0, policy_version 101233 (0.0043) +[2024-06-18 09:04:56,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1658617856. Throughput: 0: 42778.1. Samples: 1658802320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 09:04:56,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 09:04:59,699][12883] Updated weights for policy 0, policy_version 101243 (0.0033) +[2024-06-18 09:05:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1658880000. Throughput: 0: 42859.0. Samples: 1658927280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 09:05:01,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 09:05:03,649][12883] Updated weights for policy 0, policy_version 101253 (0.0049) +[2024-06-18 09:05:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1659043840. Throughput: 0: 42819.1. Samples: 1659186920. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 09:05:06,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 09:05:07,375][12883] Updated weights for policy 0, policy_version 101263 (0.0030) +[2024-06-18 09:05:11,677][12883] Updated weights for policy 0, policy_version 101273 (0.0044) +[2024-06-18 09:05:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1659273216. Throughput: 0: 42842.2. Samples: 1659449060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 09:05:11,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 09:05:14,863][12883] Updated weights for policy 0, policy_version 101283 (0.0031) +[2024-06-18 09:05:16,994][12645] Fps is (10 sec: 47513.1, 60 sec: 43146.0, 300 sec: 42820.5). Total num frames: 1659518976. Throughput: 0: 43000.4. Samples: 1659576660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 09:05:16,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 09:05:19,030][12883] Updated weights for policy 0, policy_version 101293 (0.0038) +[2024-06-18 09:05:21,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 1659699200. Throughput: 0: 42895.7. Samples: 1659832640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 09:05:21,996][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 09:05:22,551][12883] Updated weights for policy 0, policy_version 101303 (0.0028) +[2024-06-18 09:05:26,559][12883] Updated weights for policy 0, policy_version 101313 (0.0026) +[2024-06-18 09:05:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1659912192. Throughput: 0: 42871.7. Samples: 1660091640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 09:05:26,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 09:05:28,025][12862] Signal inference workers to stop experience collection... 
(24200 times) +[2024-06-18 09:05:28,064][12883] InferenceWorker_p0-w0: stopping experience collection (24200 times) +[2024-06-18 09:05:28,096][12862] Signal inference workers to resume experience collection... (24200 times) +[2024-06-18 09:05:28,100][12883] InferenceWorker_p0-w0: resuming experience collection (24200 times) +[2024-06-18 09:05:30,301][12883] Updated weights for policy 0, policy_version 101323 (0.0030) +[2024-06-18 09:05:31,994][12645] Fps is (10 sec: 45885.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1660157952. Throughput: 0: 43095.5. Samples: 1660220360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 09:05:31,994][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 09:05:34,159][12883] Updated weights for policy 0, policy_version 101333 (0.0036) +[2024-06-18 09:05:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1660354560. Throughput: 0: 42887.1. Samples: 1660475300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 09:05:36,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 09:05:37,845][12883] Updated weights for policy 0, policy_version 101343 (0.0043) +[2024-06-18 09:05:41,856][12883] Updated weights for policy 0, policy_version 101353 (0.0030) +[2024-06-18 09:05:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 1660567552. Throughput: 0: 43034.7. Samples: 1660738880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 09:05:41,994][12645] Avg episode reward: [(0, '0.463')] +[2024-06-18 09:05:45,442][12883] Updated weights for policy 0, policy_version 101363 (0.0036) +[2024-06-18 09:05:46,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 1660813312. Throughput: 0: 42993.9. Samples: 1660862000. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 09:05:46,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 09:05:49,816][12883] Updated weights for policy 0, policy_version 101373 (0.0033) +[2024-06-18 09:05:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1660993536. Throughput: 0: 42927.2. Samples: 1661118640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 09:05:51,994][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 09:05:53,167][12883] Updated weights for policy 0, policy_version 101383 (0.0029) +[2024-06-18 09:05:56,994][12645] Fps is (10 sec: 37682.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1661190144. Throughput: 0: 42863.5. Samples: 1661377920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 09:05:56,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 09:05:57,390][12883] Updated weights for policy 0, policy_version 101393 (0.0035) +[2024-06-18 09:06:00,761][12883] Updated weights for policy 0, policy_version 101403 (0.0036) +[2024-06-18 09:06:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1661435904. Throughput: 0: 42748.0. Samples: 1661500320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 09:06:02,003][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 09:06:05,191][12883] Updated weights for policy 0, policy_version 101413 (0.0044) +[2024-06-18 09:06:06,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1661648896. Throughput: 0: 42756.4. Samples: 1661756580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 09:06:06,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 09:06:08,782][12883] Updated weights for policy 0, policy_version 101423 (0.0033) +[2024-06-18 09:06:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1661845504. Throughput: 0: 42673.8. Samples: 1662011960. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 09:06:11,994][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 09:06:12,900][12883] Updated weights for policy 0, policy_version 101433 (0.0033) +[2024-06-18 09:06:16,507][12883] Updated weights for policy 0, policy_version 101443 (0.0038) +[2024-06-18 09:06:16,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1662042112. Throughput: 0: 42534.2. Samples: 1662134400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 09:06:16,998][12645] Avg episode reward: [(0, '0.700')] +[2024-06-18 09:06:20,603][12883] Updated weights for policy 0, policy_version 101453 (0.0036) +[2024-06-18 09:06:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 1662287872. Throughput: 0: 42652.9. Samples: 1662394680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 09:06:21,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 09:06:24,494][12883] Updated weights for policy 0, policy_version 101463 (0.0036) +[2024-06-18 09:06:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1662484480. Throughput: 0: 42318.2. Samples: 1662643200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) +[2024-06-18 09:06:26,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 09:06:28,114][12883] Updated weights for policy 0, policy_version 101473 (0.0032) +[2024-06-18 09:06:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 1662681088. Throughput: 0: 42311.5. Samples: 1662766020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 09:06:31,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 09:06:32,092][12883] Updated weights for policy 0, policy_version 101483 (0.0035) +[2024-06-18 09:06:35,761][12883] Updated weights for policy 0, policy_version 101493 (0.0028) +[2024-06-18 09:06:36,996][12645] Fps is (10 sec: 45864.6, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 1662943232. Throughput: 0: 42471.6. Samples: 1663029960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 09:06:36,997][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 09:06:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000101498_1662943232.pth... +[2024-06-18 09:06:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000100869_1652637696.pth +[2024-06-18 09:06:39,773][12883] Updated weights for policy 0, policy_version 101503 (0.0034) +[2024-06-18 09:06:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1663107072. Throughput: 0: 42472.1. Samples: 1663289160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 09:06:41,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 09:06:43,202][12883] Updated weights for policy 0, policy_version 101513 (0.0031) +[2024-06-18 09:06:43,668][12862] Signal inference workers to stop experience collection... (24250 times) +[2024-06-18 09:06:43,669][12862] Signal inference workers to resume experience collection... (24250 times) +[2024-06-18 09:06:43,713][12883] InferenceWorker_p0-w0: stopping experience collection (24250 times) +[2024-06-18 09:06:43,713][12883] InferenceWorker_p0-w0: resuming experience collection (24250 times) +[2024-06-18 09:06:46,994][12645] Fps is (10 sec: 37692.1, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 1663320064. Throughput: 0: 42451.8. Samples: 1663410640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 09:06:46,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 09:06:47,276][12883] Updated weights for policy 0, policy_version 101523 (0.0030) +[2024-06-18 09:06:50,746][12883] Updated weights for policy 0, policy_version 101533 (0.0034) +[2024-06-18 09:06:51,994][12645] Fps is (10 sec: 47513.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1663582208. Throughput: 0: 42690.7. Samples: 1663677660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 09:06:51,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 09:06:55,347][12883] Updated weights for policy 0, policy_version 101543 (0.0037) +[2024-06-18 09:06:56,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1663762432. Throughput: 0: 42692.8. Samples: 1663933140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 09:06:56,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 09:06:58,356][12883] Updated weights for policy 0, policy_version 101553 (0.0046) +[2024-06-18 09:07:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1663975424. Throughput: 0: 42709.4. Samples: 1664056320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 09:07:01,994][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 09:07:02,923][12883] Updated weights for policy 0, policy_version 101563 (0.0026) +[2024-06-18 09:07:06,271][12883] Updated weights for policy 0, policy_version 101573 (0.0021) +[2024-06-18 09:07:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1664204800. Throughput: 0: 42687.6. Samples: 1664315620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 09:07:06,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 09:07:10,362][12883] Updated weights for policy 0, policy_version 101583 (0.0042) +[2024-06-18 09:07:11,995][12645] Fps is (10 sec: 44231.1, 60 sec: 42870.5, 300 sec: 42709.3). Total num frames: 1664417792. Throughput: 0: 42919.6. Samples: 1664574640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 09:07:11,995][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 09:07:13,700][12883] Updated weights for policy 0, policy_version 101593 (0.0038) +[2024-06-18 09:07:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1664647168. Throughput: 0: 43039.5. Samples: 1664702800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 09:07:16,994][12645] Avg episode reward: [(0, '0.594')] +[2024-06-18 09:07:17,799][12883] Updated weights for policy 0, policy_version 101603 (0.0026) +[2024-06-18 09:07:21,315][12883] Updated weights for policy 0, policy_version 101613 (0.0028) +[2024-06-18 09:07:21,994][12645] Fps is (10 sec: 42604.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1664843776. Throughput: 0: 42943.1. Samples: 1664962300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 09:07:21,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 09:07:25,262][12883] Updated weights for policy 0, policy_version 101623 (0.0036) +[2024-06-18 09:07:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1665073152. Throughput: 0: 42987.9. Samples: 1665223620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 09:07:26,994][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 09:07:28,797][12883] Updated weights for policy 0, policy_version 101633 (0.0039) +[2024-06-18 09:07:31,996][12645] Fps is (10 sec: 44226.3, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 1665286144. Throughput: 0: 43072.0. 
Samples: 1665348980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:07:31,997][12645] Avg episode reward: [(0, '0.268')] +[2024-06-18 09:07:32,705][12883] Updated weights for policy 0, policy_version 101643 (0.0041) +[2024-06-18 09:07:36,247][12883] Updated weights for policy 0, policy_version 101653 (0.0033) +[2024-06-18 09:07:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 1665499136. Throughput: 0: 42948.8. Samples: 1665610360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:07:36,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 09:07:40,167][12883] Updated weights for policy 0, policy_version 101663 (0.0036) +[2024-06-18 09:07:41,994][12645] Fps is (10 sec: 40969.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1665695744. Throughput: 0: 43016.1. Samples: 1665868860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:07:41,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 09:07:44,038][12883] Updated weights for policy 0, policy_version 101673 (0.0028) +[2024-06-18 09:07:46,996][12645] Fps is (10 sec: 42589.0, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 1665925120. Throughput: 0: 43119.2. Samples: 1665996780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:07:46,996][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 09:07:47,647][12883] Updated weights for policy 0, policy_version 101683 (0.0033) +[2024-06-18 09:07:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 1666121728. Throughput: 0: 43143.8. Samples: 1666257100. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:07:51,994][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 09:07:52,159][12883] Updated weights for policy 0, policy_version 101693 (0.0027) +[2024-06-18 09:07:55,343][12883] Updated weights for policy 0, policy_version 101703 (0.0027) +[2024-06-18 09:07:56,994][12645] Fps is (10 sec: 40968.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1666334720. Throughput: 0: 43019.3. Samples: 1666510460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:07:56,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 09:07:59,879][12883] Updated weights for policy 0, policy_version 101713 (0.0030) +[2024-06-18 09:08:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1666564096. Throughput: 0: 43018.3. Samples: 1666638620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:08:01,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 09:08:02,927][12883] Updated weights for policy 0, policy_version 101723 (0.0032) +[2024-06-18 09:08:06,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1666760704. Throughput: 0: 42833.3. Samples: 1666889800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:08:06,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 09:08:07,554][12883] Updated weights for policy 0, policy_version 101733 (0.0030) +[2024-06-18 09:08:07,649][12862] Signal inference workers to stop experience collection... (24300 times) +[2024-06-18 09:08:07,701][12883] InferenceWorker_p0-w0: stopping experience collection (24300 times) +[2024-06-18 09:08:07,706][12862] Signal inference workers to resume experience collection... 
(24300 times) +[2024-06-18 09:08:07,716][12883] InferenceWorker_p0-w0: resuming experience collection (24300 times) +[2024-06-18 09:08:11,247][12883] Updated weights for policy 0, policy_version 101743 (0.0039) +[2024-06-18 09:08:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42599.3, 300 sec: 42820.6). Total num frames: 1666973696. Throughput: 0: 42673.8. Samples: 1667143940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:08:11,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 09:08:15,160][12883] Updated weights for policy 0, policy_version 101753 (0.0027) +[2024-06-18 09:08:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1667203072. Throughput: 0: 42812.8. Samples: 1667275460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:08:16,994][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 09:08:18,817][12883] Updated weights for policy 0, policy_version 101763 (0.0045) +[2024-06-18 09:08:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1667399680. Throughput: 0: 42672.5. Samples: 1667530620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:08:22,000][12645] Avg episode reward: [(0, '0.647')] +[2024-06-18 09:08:22,722][12883] Updated weights for policy 0, policy_version 101773 (0.0032) +[2024-06-18 09:08:26,343][12883] Updated weights for policy 0, policy_version 101783 (0.0027) +[2024-06-18 09:08:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1667612672. Throughput: 0: 42620.3. Samples: 1667786780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:08:26,994][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 09:08:30,471][12883] Updated weights for policy 0, policy_version 101793 (0.0035) +[2024-06-18 09:08:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42600.1, 300 sec: 42765.1). Total num frames: 1667842048. Throughput: 0: 42675.1. Samples: 1667917060. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) +[2024-06-18 09:08:31,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 09:08:33,897][12883] Updated weights for policy 0, policy_version 101803 (0.0033) +[2024-06-18 09:08:36,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1668055040. Throughput: 0: 42657.1. Samples: 1668176660. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) +[2024-06-18 09:08:36,994][12645] Avg episode reward: [(0, '0.197')] +[2024-06-18 09:08:37,038][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000101811_1668071424.pth... +[2024-06-18 09:08:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000101185_1657815040.pth +[2024-06-18 09:08:38,267][12883] Updated weights for policy 0, policy_version 101813 (0.0032) +[2024-06-18 09:08:41,646][12883] Updated weights for policy 0, policy_version 101823 (0.0029) +[2024-06-18 09:08:41,995][12645] Fps is (10 sec: 42591.0, 60 sec: 42870.3, 300 sec: 42875.9). Total num frames: 1668268032. Throughput: 0: 42574.1. Samples: 1668426360. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) +[2024-06-18 09:08:41,996][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 09:08:45,861][12883] Updated weights for policy 0, policy_version 101833 (0.0042) +[2024-06-18 09:08:46,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42599.9, 300 sec: 42820.6). Total num frames: 1668481024. Throughput: 0: 42628.3. Samples: 1668556900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) +[2024-06-18 09:08:46,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 09:08:49,312][12883] Updated weights for policy 0, policy_version 101843 (0.0046) +[2024-06-18 09:08:51,994][12645] Fps is (10 sec: 44244.4, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 1668710400. Throughput: 0: 42871.6. Samples: 1668819020. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) +[2024-06-18 09:08:51,994][12645] Avg episode reward: [(0, '0.717')] +[2024-06-18 09:08:53,342][12883] Updated weights for policy 0, policy_version 101853 (0.0041) +[2024-06-18 09:08:56,995][12645] Fps is (10 sec: 42593.2, 60 sec: 42870.6, 300 sec: 42820.4). Total num frames: 1668907008. Throughput: 0: 42893.0. Samples: 1669074180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) +[2024-06-18 09:08:56,995][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 09:08:57,082][12883] Updated weights for policy 0, policy_version 101863 (0.0032) +[2024-06-18 09:09:00,739][12883] Updated weights for policy 0, policy_version 101873 (0.0032) +[2024-06-18 09:09:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1669136384. Throughput: 0: 42726.8. Samples: 1669198160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) +[2024-06-18 09:09:01,994][12645] Avg episode reward: [(0, '0.653')] +[2024-06-18 09:09:04,703][12883] Updated weights for policy 0, policy_version 101883 (0.0024) +[2024-06-18 09:09:06,994][12645] Fps is (10 sec: 44243.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1669349376. Throughput: 0: 42984.5. Samples: 1669464920. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) +[2024-06-18 09:09:06,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 09:09:08,205][12883] Updated weights for policy 0, policy_version 101893 (0.0040) +[2024-06-18 09:09:09,080][12862] Signal inference workers to stop experience collection... (24350 times) +[2024-06-18 09:09:09,080][12862] Signal inference workers to resume experience collection... (24350 times) +[2024-06-18 09:09:09,114][12883] InferenceWorker_p0-w0: stopping experience collection (24350 times) +[2024-06-18 09:09:09,114][12883] InferenceWorker_p0-w0: resuming experience collection (24350 times) +[2024-06-18 09:09:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 1669545984. 
Throughput: 0: 43050.0. Samples: 1669724020. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) +[2024-06-18 09:09:11,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 09:09:12,253][12883] Updated weights for policy 0, policy_version 101903 (0.0044) +[2024-06-18 09:09:15,849][12883] Updated weights for policy 0, policy_version 101913 (0.0040) +[2024-06-18 09:09:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1669775360. Throughput: 0: 42926.1. Samples: 1669848740. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) +[2024-06-18 09:09:16,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 09:09:19,689][12883] Updated weights for policy 0, policy_version 101923 (0.0029) +[2024-06-18 09:09:21,994][12645] Fps is (10 sec: 47512.9, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 1670021120. Throughput: 0: 42977.2. Samples: 1670110640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) +[2024-06-18 09:09:21,994][12645] Avg episode reward: [(0, '0.600')] +[2024-06-18 09:09:23,319][12883] Updated weights for policy 0, policy_version 101933 (0.0034) +[2024-06-18 09:09:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1670217728. Throughput: 0: 43267.7. Samples: 1670373340. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) +[2024-06-18 09:09:26,994][12645] Avg episode reward: [(0, '0.570')] +[2024-06-18 09:09:27,239][12883] Updated weights for policy 0, policy_version 101943 (0.0035) +[2024-06-18 09:09:31,063][12883] Updated weights for policy 0, policy_version 101953 (0.0030) +[2024-06-18 09:09:31,994][12645] Fps is (10 sec: 40959.4, 60 sec: 43144.3, 300 sec: 42820.5). Total num frames: 1670430720. Throughput: 0: 43071.5. Samples: 1670495120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:09:31,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 09:09:34,822][12883] Updated weights for policy 0, policy_version 101963 (0.0027) +[2024-06-18 09:09:36,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 43042.7). Total num frames: 1670676480. Throughput: 0: 43139.4. Samples: 1670760300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:09:36,994][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 09:09:38,755][12883] Updated weights for policy 0, policy_version 101973 (0.0036) +[2024-06-18 09:09:41,996][12645] Fps is (10 sec: 42589.7, 60 sec: 43144.1, 300 sec: 42764.7). Total num frames: 1670856704. Throughput: 0: 43300.9. Samples: 1671022760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:09:41,996][12645] Avg episode reward: [(0, '0.621')] +[2024-06-18 09:09:42,725][12883] Updated weights for policy 0, policy_version 101983 (0.0036) +[2024-06-18 09:09:46,423][12883] Updated weights for policy 0, policy_version 101993 (0.0030) +[2024-06-18 09:09:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 1671086080. Throughput: 0: 43245.3. Samples: 1671144200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:09:46,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 09:09:50,238][12883] Updated weights for policy 0, policy_version 102003 (0.0037) +[2024-06-18 09:09:51,994][12645] Fps is (10 sec: 44246.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1671299072. Throughput: 0: 43126.7. Samples: 1671405620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:09:51,994][12645] Avg episode reward: [(0, '0.197')] +[2024-06-18 09:09:53,914][12883] Updated weights for policy 0, policy_version 102013 (0.0032) +[2024-06-18 09:09:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43145.4, 300 sec: 42765.0). Total num frames: 1671495680. Throughput: 0: 43162.1. 
Samples: 1671666320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:09:56,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 09:09:57,769][12883] Updated weights for policy 0, policy_version 102023 (0.0025) +[2024-06-18 09:10:01,438][12883] Updated weights for policy 0, policy_version 102033 (0.0030) +[2024-06-18 09:10:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 1671725056. Throughput: 0: 43182.1. Samples: 1671791940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:10:01,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 09:10:05,774][12883] Updated weights for policy 0, policy_version 102043 (0.0041) +[2024-06-18 09:10:06,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 1671954432. Throughput: 0: 43223.6. Samples: 1672055700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:10:06,994][12645] Avg episode reward: [(0, '0.677')] +[2024-06-18 09:10:09,145][12883] Updated weights for policy 0, policy_version 102053 (0.0026) +[2024-06-18 09:10:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1672118272. Throughput: 0: 43035.9. Samples: 1672309960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:10:11,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 09:10:13,280][12883] Updated weights for policy 0, policy_version 102063 (0.0029) +[2024-06-18 09:10:16,678][12883] Updated weights for policy 0, policy_version 102073 (0.0047) +[2024-06-18 09:10:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42987.5). Total num frames: 1672380416. Throughput: 0: 43029.1. Samples: 1672431420. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:10:16,994][12645] Avg episode reward: [(0, '0.201')] +[2024-06-18 09:10:20,853][12883] Updated weights for policy 0, policy_version 102083 (0.0038) +[2024-06-18 09:10:21,994][12645] Fps is (10 sec: 47514.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1672593408. Throughput: 0: 43037.0. Samples: 1672696960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:10:21,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 09:10:24,370][12883] Updated weights for policy 0, policy_version 102093 (0.0032) +[2024-06-18 09:10:26,996][12645] Fps is (10 sec: 39312.6, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 1672773632. Throughput: 0: 42846.2. Samples: 1672950840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:10:26,997][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 09:10:28,474][12883] Updated weights for policy 0, policy_version 102103 (0.0026) +[2024-06-18 09:10:31,916][12883] Updated weights for policy 0, policy_version 102113 (0.0031) +[2024-06-18 09:10:31,994][12645] Fps is (10 sec: 42597.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1673019392. Throughput: 0: 42919.4. Samples: 1673075580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 09:10:31,994][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 09:10:35,921][12883] Updated weights for policy 0, policy_version 102123 (0.0031) +[2024-06-18 09:10:36,419][12862] Signal inference workers to stop experience collection... (24400 times) +[2024-06-18 09:10:36,420][12862] Signal inference workers to resume experience collection... (24400 times) +[2024-06-18 09:10:36,463][12883] InferenceWorker_p0-w0: stopping experience collection (24400 times) +[2024-06-18 09:10:36,463][12883] InferenceWorker_p0-w0: resuming experience collection (24400 times) +[2024-06-18 09:10:36,994][12645] Fps is (10 sec: 44247.3, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1673216000. 
Throughput: 0: 42993.4. Samples: 1673340320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 09:10:36,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 09:10:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000102126_1673232384.pth... +[2024-06-18 09:10:37,090][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000101498_1662943232.pth +[2024-06-18 09:10:39,525][12883] Updated weights for policy 0, policy_version 102133 (0.0047) +[2024-06-18 09:10:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 1673428992. Throughput: 0: 42793.8. Samples: 1673592040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 09:10:41,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 09:10:43,849][12883] Updated weights for policy 0, policy_version 102143 (0.0035) +[2024-06-18 09:10:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1673658368. Throughput: 0: 42829.0. Samples: 1673719240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 09:10:46,994][12645] Avg episode reward: [(0, '0.187')] +[2024-06-18 09:10:47,238][12883] Updated weights for policy 0, policy_version 102153 (0.0034) +[2024-06-18 09:10:51,695][12883] Updated weights for policy 0, policy_version 102163 (0.0031) +[2024-06-18 09:10:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 1673871360. Throughput: 0: 42735.6. Samples: 1673978800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 09:10:51,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 09:10:54,897][12883] Updated weights for policy 0, policy_version 102173 (0.0031) +[2024-06-18 09:10:56,996][12645] Fps is (10 sec: 39312.6, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 1674051584. Throughput: 0: 42692.7. Samples: 1674231220. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 09:10:56,997][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 09:10:59,229][12883] Updated weights for policy 0, policy_version 102183 (0.0040) +[2024-06-18 09:11:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1674297344. Throughput: 0: 42821.2. Samples: 1674358380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 09:11:01,995][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 09:11:02,474][12883] Updated weights for policy 0, policy_version 102193 (0.0031) +[2024-06-18 09:11:06,771][12883] Updated weights for policy 0, policy_version 102203 (0.0030) +[2024-06-18 09:11:06,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1674493952. Throughput: 0: 42746.7. Samples: 1674620560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 09:11:06,994][12645] Avg episode reward: [(0, '0.451')] +[2024-06-18 09:11:10,601][12883] Updated weights for policy 0, policy_version 102213 (0.0046) +[2024-06-18 09:11:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1674706944. Throughput: 0: 42656.2. Samples: 1674870280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 09:11:11,994][12645] Avg episode reward: [(0, '0.590')] +[2024-06-18 09:11:14,518][12883] Updated weights for policy 0, policy_version 102223 (0.0035) +[2024-06-18 09:11:16,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1674936320. Throughput: 0: 42775.7. Samples: 1675000480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 09:11:16,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 09:11:18,113][12883] Updated weights for policy 0, policy_version 102233 (0.0047) +[2024-06-18 09:11:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 1675116544. Throughput: 0: 42580.8. Samples: 1675256460. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 09:11:21,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 09:11:22,174][12883] Updated weights for policy 0, policy_version 102243 (0.0031) +[2024-06-18 09:11:25,936][12883] Updated weights for policy 0, policy_version 102253 (0.0035) +[2024-06-18 09:11:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 1675345920. Throughput: 0: 42493.3. Samples: 1675504240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 09:11:26,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 09:11:29,767][12883] Updated weights for policy 0, policy_version 102263 (0.0035) +[2024-06-18 09:11:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.6, 300 sec: 42820.9). Total num frames: 1675575296. Throughput: 0: 42627.1. Samples: 1675637460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 09:11:31,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 09:11:33,691][12883] Updated weights for policy 0, policy_version 102273 (0.0028) +[2024-06-18 09:11:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 1675771904. Throughput: 0: 42541.7. Samples: 1675893180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 09:11:36,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 09:11:37,498][12883] Updated weights for policy 0, policy_version 102283 (0.0031) +[2024-06-18 09:11:41,602][12883] Updated weights for policy 0, policy_version 102293 (0.0042) +[2024-06-18 09:11:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 1675984896. Throughput: 0: 42540.4. Samples: 1676145440. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 09:11:41,994][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 09:11:45,273][12883] Updated weights for policy 0, policy_version 102303 (0.0045) +[2024-06-18 09:11:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1676214272. Throughput: 0: 42561.8. Samples: 1676273660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 09:11:46,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 09:11:49,204][12883] Updated weights for policy 0, policy_version 102313 (0.0040) +[2024-06-18 09:11:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 1676394496. Throughput: 0: 42388.0. Samples: 1676528020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 09:11:51,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 09:11:52,964][12883] Updated weights for policy 0, policy_version 102323 (0.0038) +[2024-06-18 09:11:56,733][12883] Updated weights for policy 0, policy_version 102333 (0.0044) +[2024-06-18 09:11:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42873.0, 300 sec: 42876.1). Total num frames: 1676623872. Throughput: 0: 42563.2. Samples: 1676785620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 09:11:56,994][12645] Avg episode reward: [(0, '0.627')] +[2024-06-18 09:12:00,526][12883] Updated weights for policy 0, policy_version 102343 (0.0041) +[2024-06-18 09:12:01,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1676853248. Throughput: 0: 42579.1. Samples: 1676916540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 09:12:01,994][12645] Avg episode reward: [(0, '0.569')] +[2024-06-18 09:12:02,078][12862] Signal inference workers to stop experience collection... (24450 times) +[2024-06-18 09:12:02,078][12862] Signal inference workers to resume experience collection... 
(24450 times) +[2024-06-18 09:12:02,097][12883] InferenceWorker_p0-w0: stopping experience collection (24450 times) +[2024-06-18 09:12:02,098][12883] InferenceWorker_p0-w0: resuming experience collection (24450 times) +[2024-06-18 09:12:04,507][12883] Updated weights for policy 0, policy_version 102353 (0.0030) +[2024-06-18 09:12:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42765.2). Total num frames: 1677033472. Throughput: 0: 42544.0. Samples: 1677170940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 09:12:06,994][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 09:12:07,980][12883] Updated weights for policy 0, policy_version 102363 (0.0028) +[2024-06-18 09:12:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1677246464. Throughput: 0: 42771.0. Samples: 1677428940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 09:12:11,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 09:12:12,218][12883] Updated weights for policy 0, policy_version 102373 (0.0027) +[2024-06-18 09:12:15,611][12883] Updated weights for policy 0, policy_version 102383 (0.0035) +[2024-06-18 09:12:16,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1677492224. Throughput: 0: 42724.0. Samples: 1677560040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 09:12:16,994][12645] Avg episode reward: [(0, '0.573')] +[2024-06-18 09:12:19,818][12883] Updated weights for policy 0, policy_version 102393 (0.0033) +[2024-06-18 09:12:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1677656064. Throughput: 0: 42600.9. Samples: 1677810220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 09:12:21,994][12645] Avg episode reward: [(0, '0.829')] +[2024-06-18 09:12:22,039][12862] Saving new best policy, reward=0.829! 
+[2024-06-18 09:12:23,312][12883] Updated weights for policy 0, policy_version 102403 (0.0034) +[2024-06-18 09:12:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 1677885440. Throughput: 0: 42736.4. Samples: 1678068580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 09:12:26,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 09:12:27,749][12883] Updated weights for policy 0, policy_version 102413 (0.0023) +[2024-06-18 09:12:30,900][12883] Updated weights for policy 0, policy_version 102423 (0.0031) +[2024-06-18 09:12:31,994][12645] Fps is (10 sec: 47514.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1678131200. Throughput: 0: 42821.5. Samples: 1678200620. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) +[2024-06-18 09:12:31,994][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 09:12:35,224][12883] Updated weights for policy 0, policy_version 102433 (0.0036) +[2024-06-18 09:12:36,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1678311424. Throughput: 0: 42846.9. Samples: 1678456140. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) +[2024-06-18 09:12:36,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 09:12:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000102436_1678311424.pth... +[2024-06-18 09:12:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000101811_1668071424.pth +[2024-06-18 09:12:38,602][12883] Updated weights for policy 0, policy_version 102443 (0.0021) +[2024-06-18 09:12:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 1678540800. Throughput: 0: 42897.9. Samples: 1678716020. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) +[2024-06-18 09:12:41,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 09:12:42,884][12883] Updated weights for policy 0, policy_version 102453 (0.0042) +[2024-06-18 09:12:46,078][12883] Updated weights for policy 0, policy_version 102463 (0.0035) +[2024-06-18 09:12:46,994][12645] Fps is (10 sec: 47514.4, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 1678786560. Throughput: 0: 42828.1. Samples: 1678843800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) +[2024-06-18 09:12:46,994][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 09:12:50,485][12883] Updated weights for policy 0, policy_version 102473 (0.0036) +[2024-06-18 09:12:51,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 1678966784. Throughput: 0: 42767.9. Samples: 1679095500. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) +[2024-06-18 09:12:51,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 09:12:53,720][12883] Updated weights for policy 0, policy_version 102483 (0.0031) +[2024-06-18 09:12:56,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1679179776. Throughput: 0: 42805.3. Samples: 1679355180. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) +[2024-06-18 09:12:56,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 09:12:58,105][12883] Updated weights for policy 0, policy_version 102493 (0.0033) +[2024-06-18 09:13:01,331][12883] Updated weights for policy 0, policy_version 102503 (0.0030) +[2024-06-18 09:13:01,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1679409152. Throughput: 0: 42812.4. Samples: 1679486600. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) +[2024-06-18 09:13:01,994][12645] Avg episode reward: [(0, '0.562')] +[2024-06-18 09:13:05,842][12883] Updated weights for policy 0, policy_version 102513 (0.0031) +[2024-06-18 09:13:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1679605760. Throughput: 0: 42944.9. Samples: 1679742740. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) +[2024-06-18 09:13:06,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 09:13:09,007][12883] Updated weights for policy 0, policy_version 102523 (0.0039) +[2024-06-18 09:13:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1679818752. Throughput: 0: 42792.0. Samples: 1679994220. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) +[2024-06-18 09:13:11,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 09:13:13,557][12883] Updated weights for policy 0, policy_version 102533 (0.0031) +[2024-06-18 09:13:16,641][12883] Updated weights for policy 0, policy_version 102543 (0.0034) +[2024-06-18 09:13:16,996][12645] Fps is (10 sec: 45865.3, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 1680064512. Throughput: 0: 42741.3. Samples: 1680124080. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) +[2024-06-18 09:13:16,996][12645] Avg episode reward: [(0, '0.547')] +[2024-06-18 09:13:21,427][12883] Updated weights for policy 0, policy_version 102553 (0.0028) +[2024-06-18 09:13:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1680261120. Throughput: 0: 42651.6. Samples: 1680375460. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) +[2024-06-18 09:13:22,008][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 09:13:23,965][12862] Signal inference workers to stop experience collection... (24500 times) +[2024-06-18 09:13:23,966][12862] Signal inference workers to resume experience collection... 
(24500 times) +[2024-06-18 09:13:24,010][12883] InferenceWorker_p0-w0: stopping experience collection (24500 times) +[2024-06-18 09:13:24,010][12883] InferenceWorker_p0-w0: resuming experience collection (24500 times) +[2024-06-18 09:13:24,286][12883] Updated weights for policy 0, policy_version 102563 (0.0032) +[2024-06-18 09:13:26,994][12645] Fps is (10 sec: 40969.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1680474112. Throughput: 0: 42520.4. Samples: 1680629440. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) +[2024-06-18 09:13:26,996][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 09:13:29,142][12883] Updated weights for policy 0, policy_version 102573 (0.0044) +[2024-06-18 09:13:31,877][12883] Updated weights for policy 0, policy_version 102583 (0.0034) +[2024-06-18 09:13:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1680719872. Throughput: 0: 42568.4. Samples: 1680759380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 09:13:31,994][12645] Avg episode reward: [(0, '0.250')] +[2024-06-18 09:13:36,675][12883] Updated weights for policy 0, policy_version 102593 (0.0025) +[2024-06-18 09:13:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.2). Total num frames: 1680883712. Throughput: 0: 42664.0. Samples: 1681015380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 09:13:36,994][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 09:13:39,650][12883] Updated weights for policy 0, policy_version 102603 (0.0046) +[2024-06-18 09:13:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1681129472. Throughput: 0: 42469.9. Samples: 1681266320. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 09:13:41,994][12645] Avg episode reward: [(0, '0.594')] +[2024-06-18 09:13:44,256][12883] Updated weights for policy 0, policy_version 102613 (0.0033) +[2024-06-18 09:13:46,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1681342464. Throughput: 0: 42543.6. Samples: 1681401060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 09:13:46,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 09:13:47,491][12883] Updated weights for policy 0, policy_version 102623 (0.0028) +[2024-06-18 09:13:51,903][12883] Updated weights for policy 0, policy_version 102633 (0.0035) +[2024-06-18 09:13:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.6, 300 sec: 42820.7). Total num frames: 1681539072. Throughput: 0: 42516.0. Samples: 1681655960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 09:13:51,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 09:13:55,125][12883] Updated weights for policy 0, policy_version 102643 (0.0045) +[2024-06-18 09:13:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 1681768448. Throughput: 0: 42451.6. Samples: 1681904540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 09:13:56,994][12645] Avg episode reward: [(0, '0.187')] +[2024-06-18 09:14:00,078][12883] Updated weights for policy 0, policy_version 102653 (0.0030) +[2024-06-18 09:14:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1681965056. Throughput: 0: 42587.3. Samples: 1682040420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 09:14:01,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 09:14:02,774][12883] Updated weights for policy 0, policy_version 102663 (0.0041) +[2024-06-18 09:14:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1682161664. Throughput: 0: 42655.6. Samples: 1682294960. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 09:14:06,994][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 09:14:07,698][12883] Updated weights for policy 0, policy_version 102673 (0.0036) +[2024-06-18 09:14:10,654][12883] Updated weights for policy 0, policy_version 102683 (0.0030) +[2024-06-18 09:14:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 1682423808. Throughput: 0: 42605.7. Samples: 1682546700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 09:14:11,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 09:14:15,231][12883] Updated weights for policy 0, policy_version 102693 (0.0031) +[2024-06-18 09:14:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 1682604032. Throughput: 0: 42748.9. Samples: 1682683080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 09:14:16,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 09:14:18,329][12883] Updated weights for policy 0, policy_version 102703 (0.0022) +[2024-06-18 09:14:21,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1682817024. Throughput: 0: 42582.8. Samples: 1682931600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 09:14:21,996][12645] Avg episode reward: [(0, '0.134')] +[2024-06-18 09:14:22,783][12883] Updated weights for policy 0, policy_version 102713 (0.0024) +[2024-06-18 09:14:26,098][12883] Updated weights for policy 0, policy_version 102723 (0.0040) +[2024-06-18 09:14:26,994][12645] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1683062784. Throughput: 0: 42752.0. Samples: 1683190160. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 09:14:26,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 09:14:30,397][12883] Updated weights for policy 0, policy_version 102733 (0.0042) +[2024-06-18 09:14:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 1683226624. Throughput: 0: 42679.0. Samples: 1683321620. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 09:14:32,003][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 09:14:34,092][12883] Updated weights for policy 0, policy_version 102743 (0.0039) +[2024-06-18 09:14:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 1683472384. Throughput: 0: 42638.7. Samples: 1683574700. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 09:14:36,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 09:14:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000102751_1683472384.pth... +[2024-06-18 09:14:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000102126_1673232384.pth +[2024-06-18 09:14:38,053][12883] Updated weights for policy 0, policy_version 102753 (0.0029) +[2024-06-18 09:14:40,077][12862] Signal inference workers to stop experience collection... (24550 times) +[2024-06-18 09:14:40,077][12862] Signal inference workers to resume experience collection... (24550 times) +[2024-06-18 09:14:40,101][12883] InferenceWorker_p0-w0: stopping experience collection (24550 times) +[2024-06-18 09:14:40,101][12883] InferenceWorker_p0-w0: resuming experience collection (24550 times) +[2024-06-18 09:14:41,692][12883] Updated weights for policy 0, policy_version 102763 (0.0036) +[2024-06-18 09:14:41,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1683685376. Throughput: 0: 42834.3. Samples: 1683832080. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 09:14:41,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 09:14:45,622][12883] Updated weights for policy 0, policy_version 102773 (0.0034) +[2024-06-18 09:14:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1683881984. Throughput: 0: 42613.5. Samples: 1683958020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 09:14:46,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 09:14:49,208][12883] Updated weights for policy 0, policy_version 102783 (0.0053) +[2024-06-18 09:14:51,994][12645] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1684127744. Throughput: 0: 42604.3. Samples: 1684212160. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 09:14:51,994][12645] Avg episode reward: [(0, '0.597')] +[2024-06-18 09:14:53,170][12883] Updated weights for policy 0, policy_version 102793 (0.0026) +[2024-06-18 09:14:56,855][12883] Updated weights for policy 0, policy_version 102803 (0.0029) +[2024-06-18 09:14:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1684324352. Throughput: 0: 42901.9. Samples: 1684477280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 09:14:56,994][12645] Avg episode reward: [(0, '0.227')] +[2024-06-18 09:15:00,739][12883] Updated weights for policy 0, policy_version 102813 (0.0039) +[2024-06-18 09:15:01,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1684520960. Throughput: 0: 42619.6. Samples: 1684600960. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 09:15:01,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 09:15:04,700][12883] Updated weights for policy 0, policy_version 102823 (0.0035) +[2024-06-18 09:15:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1684766720. Throughput: 0: 42730.3. Samples: 1684854460. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 09:15:06,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 09:15:08,638][12883] Updated weights for policy 0, policy_version 102833 (0.0029) +[2024-06-18 09:15:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 1684946944. Throughput: 0: 42892.0. Samples: 1685120300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 09:15:11,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 09:15:12,197][12883] Updated weights for policy 0, policy_version 102843 (0.0041) +[2024-06-18 09:15:16,269][12883] Updated weights for policy 0, policy_version 102853 (0.0027) +[2024-06-18 09:15:16,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1685176320. Throughput: 0: 42770.7. Samples: 1685246300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 09:15:16,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 09:15:20,126][12883] Updated weights for policy 0, policy_version 102863 (0.0031) +[2024-06-18 09:15:21,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42869.9, 300 sec: 42765.0). Total num frames: 1685389312. Throughput: 0: 42712.5. Samples: 1685496860. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 09:15:21,996][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 09:15:23,797][12883] Updated weights for policy 0, policy_version 102873 (0.0031) +[2024-06-18 09:15:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1685585920. Throughput: 0: 42730.2. Samples: 1685754940. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) +[2024-06-18 09:15:26,994][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 09:15:27,713][12883] Updated weights for policy 0, policy_version 102883 (0.0046) +[2024-06-18 09:15:31,383][12883] Updated weights for policy 0, policy_version 102893 (0.0029) +[2024-06-18 09:15:31,996][12645] Fps is (10 sec: 44237.0, 60 sec: 43416.0, 300 sec: 42764.7). Total num frames: 1685831680. Throughput: 0: 42715.7. Samples: 1685880320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 09:15:31,996][12645] Avg episode reward: [(0, '0.535')] +[2024-06-18 09:15:35,427][12883] Updated weights for policy 0, policy_version 102903 (0.0035) +[2024-06-18 09:15:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1686028288. Throughput: 0: 42800.7. Samples: 1686138180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 09:15:36,994][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 09:15:38,928][12883] Updated weights for policy 0, policy_version 102913 (0.0027) +[2024-06-18 09:15:41,994][12645] Fps is (10 sec: 39330.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1686224896. Throughput: 0: 42649.4. Samples: 1686396500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 09:15:41,994][12645] Avg episode reward: [(0, '0.148')] +[2024-06-18 09:15:43,083][12883] Updated weights for policy 0, policy_version 102923 (0.0032) +[2024-06-18 09:15:43,577][12862] Signal inference workers to stop experience collection... (24600 times) +[2024-06-18 09:15:43,577][12862] Signal inference workers to resume experience collection... 
(24600 times) +[2024-06-18 09:15:43,607][12883] InferenceWorker_p0-w0: stopping experience collection (24600 times) +[2024-06-18 09:15:43,608][12883] InferenceWorker_p0-w0: resuming experience collection (24600 times) +[2024-06-18 09:15:46,916][12883] Updated weights for policy 0, policy_version 102933 (0.0041) +[2024-06-18 09:15:46,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1686454272. Throughput: 0: 42753.4. Samples: 1686524960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 09:15:46,996][12645] Avg episode reward: [(0, '0.261')] +[2024-06-18 09:15:50,984][12883] Updated weights for policy 0, policy_version 102943 (0.0028) +[2024-06-18 09:15:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42765.3). Total num frames: 1686667264. Throughput: 0: 42820.3. Samples: 1686781380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 09:15:51,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 09:15:54,509][12883] Updated weights for policy 0, policy_version 102953 (0.0039) +[2024-06-18 09:15:56,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1686863872. Throughput: 0: 42535.5. Samples: 1687034400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 09:15:56,994][12645] Avg episode reward: [(0, '0.438')] +[2024-06-18 09:15:58,523][12883] Updated weights for policy 0, policy_version 102963 (0.0033) +[2024-06-18 09:16:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1687109632. Throughput: 0: 42648.5. Samples: 1687165480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 09:16:01,994][12883] Updated weights for policy 0, policy_version 102973 (0.0033) +[2024-06-18 09:16:01,994][12645] Avg episode reward: [(0, '0.601')] +[2024-06-18 09:16:05,866][12883] Updated weights for policy 0, policy_version 102983 (0.0029) +[2024-06-18 09:16:06,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42050.6, 300 sec: 42653.6). Total num frames: 1687289856. Throughput: 0: 42843.1. Samples: 1687424800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 09:16:06,996][12645] Avg episode reward: [(0, '0.767')] +[2024-06-18 09:16:09,370][12883] Updated weights for policy 0, policy_version 102993 (0.0028) +[2024-06-18 09:16:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1687519232. Throughput: 0: 42803.0. Samples: 1687681080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 09:16:11,994][12645] Avg episode reward: [(0, '0.528')] +[2024-06-18 09:16:13,611][12883] Updated weights for policy 0, policy_version 103003 (0.0024) +[2024-06-18 09:16:16,994][12645] Fps is (10 sec: 44246.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1687732224. Throughput: 0: 42880.7. Samples: 1687809860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 09:16:16,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 09:16:17,260][12883] Updated weights for policy 0, policy_version 103013 (0.0040) +[2024-06-18 09:16:21,381][12883] Updated weights for policy 0, policy_version 103023 (0.0028) +[2024-06-18 09:16:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1687945216. Throughput: 0: 42944.8. Samples: 1688070700. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 09:16:21,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 09:16:24,867][12883] Updated weights for policy 0, policy_version 103033 (0.0027) +[2024-06-18 09:16:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1688158208. Throughput: 0: 42766.5. Samples: 1688321000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 09:16:26,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 09:16:29,135][12883] Updated weights for policy 0, policy_version 103043 (0.0039) +[2024-06-18 09:16:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 1688371200. Throughput: 0: 42892.7. Samples: 1688455040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 09:16:32,000][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 09:16:32,556][12883] Updated weights for policy 0, policy_version 103053 (0.0037) +[2024-06-18 09:16:36,835][12883] Updated weights for policy 0, policy_version 103063 (0.0033) +[2024-06-18 09:16:36,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 1688584192. Throughput: 0: 42827.6. Samples: 1688708720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 09:16:36,997][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 09:16:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000103063_1688584192.pth... +[2024-06-18 09:16:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000102436_1678311424.pth +[2024-06-18 09:16:40,191][12883] Updated weights for policy 0, policy_version 103073 (0.0039) +[2024-06-18 09:16:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1688813568. Throughput: 0: 42707.4. Samples: 1688956240. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 09:16:41,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 09:16:44,552][12883] Updated weights for policy 0, policy_version 103083 (0.0044) +[2024-06-18 09:16:46,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42326.8, 300 sec: 42709.4). Total num frames: 1688993792. Throughput: 0: 42694.1. Samples: 1689086720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 09:16:46,995][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 09:16:47,783][12883] Updated weights for policy 0, policy_version 103093 (0.0035) +[2024-06-18 09:16:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1689223168. Throughput: 0: 42590.5. Samples: 1689341280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 09:16:51,994][12645] Avg episode reward: [(0, '0.668')] +[2024-06-18 09:16:52,114][12883] Updated weights for policy 0, policy_version 103103 (0.0030) +[2024-06-18 09:16:55,697][12883] Updated weights for policy 0, policy_version 103113 (0.0044) +[2024-06-18 09:16:56,994][12645] Fps is (10 sec: 44238.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1689436160. Throughput: 0: 42475.2. Samples: 1689592460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 09:16:56,994][12645] Avg episode reward: [(0, '0.633')] +[2024-06-18 09:16:59,811][12883] Updated weights for policy 0, policy_version 103123 (0.0033) +[2024-06-18 09:17:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1689632768. Throughput: 0: 42636.9. Samples: 1689728520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 09:17:01,995][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 09:17:03,365][12883] Updated weights for policy 0, policy_version 103133 (0.0039) +[2024-06-18 09:17:06,995][12645] Fps is (10 sec: 42590.9, 60 sec: 42871.9, 300 sec: 42764.8). Total num frames: 1689862144. Throughput: 0: 42325.5. 
Samples: 1689975420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 09:17:06,996][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 09:17:07,492][12883] Updated weights for policy 0, policy_version 103143 (0.0032) +[2024-06-18 09:17:11,206][12883] Updated weights for policy 0, policy_version 103153 (0.0047) +[2024-06-18 09:17:11,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1690075136. Throughput: 0: 42411.7. Samples: 1690229520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 09:17:11,994][12645] Avg episode reward: [(0, '0.083')] +[2024-06-18 09:17:14,812][12862] Signal inference workers to stop experience collection... (24650 times) +[2024-06-18 09:17:14,867][12883] InferenceWorker_p0-w0: stopping experience collection (24650 times) +[2024-06-18 09:17:14,870][12862] Signal inference workers to resume experience collection... (24650 times) +[2024-06-18 09:17:14,881][12883] InferenceWorker_p0-w0: resuming experience collection (24650 times) +[2024-06-18 09:17:15,191][12883] Updated weights for policy 0, policy_version 103163 (0.0030) +[2024-06-18 09:17:16,996][12645] Fps is (10 sec: 40957.7, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 1690271744. Throughput: 0: 42318.8. Samples: 1690359480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 09:17:16,996][12645] Avg episode reward: [(0, '0.203')] +[2024-06-18 09:17:18,839][12883] Updated weights for policy 0, policy_version 103173 (0.0034) +[2024-06-18 09:17:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1690501120. Throughput: 0: 42351.1. Samples: 1690614420. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 09:17:21,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 09:17:22,957][12883] Updated weights for policy 0, policy_version 103183 (0.0032) +[2024-06-18 09:17:26,536][12883] Updated weights for policy 0, policy_version 103193 (0.0042) +[2024-06-18 09:17:26,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1690714112. Throughput: 0: 42384.5. Samples: 1690863540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 09:17:26,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 09:17:30,543][12883] Updated weights for policy 0, policy_version 103203 (0.0025) +[2024-06-18 09:17:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1690910720. Throughput: 0: 42373.1. Samples: 1690993500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 09:17:31,994][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 09:17:34,661][12883] Updated weights for policy 0, policy_version 103213 (0.0031) +[2024-06-18 09:17:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 1691140096. Throughput: 0: 42469.4. Samples: 1691252400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 09:17:36,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 09:17:38,130][12883] Updated weights for policy 0, policy_version 103223 (0.0036) +[2024-06-18 09:17:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1691353088. Throughput: 0: 42577.8. Samples: 1691508460. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 09:17:41,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 09:17:42,328][12883] Updated weights for policy 0, policy_version 103233 (0.0024) +[2024-06-18 09:17:46,093][12883] Updated weights for policy 0, policy_version 103243 (0.0036) +[2024-06-18 09:17:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1691549696. Throughput: 0: 42553.4. Samples: 1691643420. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 09:17:46,994][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 09:17:49,992][12883] Updated weights for policy 0, policy_version 103253 (0.0032) +[2024-06-18 09:17:51,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42596.9, 300 sec: 42709.2). Total num frames: 1691779072. Throughput: 0: 42538.2. Samples: 1691889660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 09:17:51,997][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 09:17:54,153][12883] Updated weights for policy 0, policy_version 103263 (0.0032) +[2024-06-18 09:17:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1691992064. Throughput: 0: 42579.9. Samples: 1692145620. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 09:17:56,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 09:17:57,495][12883] Updated weights for policy 0, policy_version 103273 (0.0033) +[2024-06-18 09:18:01,626][12883] Updated weights for policy 0, policy_version 103283 (0.0032) +[2024-06-18 09:18:01,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 1692188672. Throughput: 0: 42771.5. Samples: 1692284100. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 09:18:01,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 09:18:05,158][12883] Updated weights for policy 0, policy_version 103293 (0.0042) +[2024-06-18 09:18:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42326.5, 300 sec: 42653.9). Total num frames: 1692401664. Throughput: 0: 42679.6. Samples: 1692535000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 09:18:06,994][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 09:18:09,121][12883] Updated weights for policy 0, policy_version 103303 (0.0033) +[2024-06-18 09:18:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 1692631040. Throughput: 0: 42752.5. Samples: 1692787400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 09:18:11,994][12645] Avg episode reward: [(0, '0.108')] +[2024-06-18 09:18:12,733][12883] Updated weights for policy 0, policy_version 103313 (0.0024) +[2024-06-18 09:18:16,592][12883] Updated weights for policy 0, policy_version 103323 (0.0037) +[2024-06-18 09:18:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42654.0). Total num frames: 1692844032. Throughput: 0: 42825.8. Samples: 1692920660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 09:18:16,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 09:18:20,444][12883] Updated weights for policy 0, policy_version 103333 (0.0040) +[2024-06-18 09:18:21,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 1693057024. Throughput: 0: 42685.9. Samples: 1693173360. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 09:18:21,996][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 09:18:24,237][12883] Updated weights for policy 0, policy_version 103343 (0.0038) +[2024-06-18 09:18:26,996][12645] Fps is (10 sec: 44226.5, 60 sec: 42869.9, 300 sec: 42598.1). Total num frames: 1693286400. Throughput: 0: 42740.4. Samples: 1693431880. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 09:18:26,997][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 09:18:28,186][12883] Updated weights for policy 0, policy_version 103353 (0.0031) +[2024-06-18 09:18:31,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1693483008. Throughput: 0: 42629.8. Samples: 1693561760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 09:18:31,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 09:18:32,119][12883] Updated weights for policy 0, policy_version 103363 (0.0051) +[2024-06-18 09:18:36,229][12883] Updated weights for policy 0, policy_version 103373 (0.0045) +[2024-06-18 09:18:36,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1693696000. Throughput: 0: 42866.5. Samples: 1693818560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:18:36,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 09:18:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000103375_1693696000.pth... +[2024-06-18 09:18:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000102751_1683472384.pth +[2024-06-18 09:18:39,639][12883] Updated weights for policy 0, policy_version 103383 (0.0028) +[2024-06-18 09:18:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1693925376. Throughput: 0: 42757.7. Samples: 1694069720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:18:41,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 09:18:43,758][12883] Updated weights for policy 0, policy_version 103393 (0.0034) +[2024-06-18 09:18:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1694121984. Throughput: 0: 42611.9. Samples: 1694201640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:18:46,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 09:18:47,270][12883] Updated weights for policy 0, policy_version 103403 (0.0027) +[2024-06-18 09:18:49,504][12862] Signal inference workers to stop experience collection... (24700 times) +[2024-06-18 09:18:49,537][12883] InferenceWorker_p0-w0: stopping experience collection (24700 times) +[2024-06-18 09:18:49,562][12862] Signal inference workers to resume experience collection... (24700 times) +[2024-06-18 09:18:49,563][12883] InferenceWorker_p0-w0: resuming experience collection (24700 times) +[2024-06-18 09:18:51,299][12883] Updated weights for policy 0, policy_version 103413 (0.0041) +[2024-06-18 09:18:51,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 1694334976. Throughput: 0: 42645.3. Samples: 1694454040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:18:51,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 09:18:54,817][12883] Updated weights for policy 0, policy_version 103423 (0.0035) +[2024-06-18 09:18:56,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1694564352. Throughput: 0: 42795.2. Samples: 1694713180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:18:56,994][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 09:18:58,992][12883] Updated weights for policy 0, policy_version 103433 (0.0033) +[2024-06-18 09:19:01,994][12645] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1694777344. Throughput: 0: 42745.6. Samples: 1694844220. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:19:01,994][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 09:19:02,415][12883] Updated weights for policy 0, policy_version 103443 (0.0028) +[2024-06-18 09:19:06,525][12883] Updated weights for policy 0, policy_version 103453 (0.0036) +[2024-06-18 09:19:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1694973952. Throughput: 0: 42758.1. Samples: 1695097380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:19:06,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 09:19:10,258][12883] Updated weights for policy 0, policy_version 103463 (0.0034) +[2024-06-18 09:19:11,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1695203328. Throughput: 0: 42637.8. Samples: 1695350480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:19:11,994][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 09:19:14,478][12883] Updated weights for policy 0, policy_version 103473 (0.0025) +[2024-06-18 09:19:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1695399936. Throughput: 0: 42691.6. Samples: 1695482880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:19:16,994][12645] Avg episode reward: [(0, '0.068')] +[2024-06-18 09:19:17,852][12883] Updated weights for policy 0, policy_version 103483 (0.0032) +[2024-06-18 09:19:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 1695612928. Throughput: 0: 42635.2. Samples: 1695737140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:19:21,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 09:19:22,014][12883] Updated weights for policy 0, policy_version 103493 (0.0048) +[2024-06-18 09:19:26,077][12883] Updated weights for policy 0, policy_version 103503 (0.0030) +[2024-06-18 09:19:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 1695842304. Throughput: 0: 42704.5. Samples: 1695991420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:19:26,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 09:19:29,899][12883] Updated weights for policy 0, policy_version 103513 (0.0039) +[2024-06-18 09:19:31,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42596.9, 300 sec: 42598.1). Total num frames: 1696038912. Throughput: 0: 42735.6. Samples: 1696124840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:19:31,996][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 09:19:33,452][12883] Updated weights for policy 0, policy_version 103523 (0.0025) +[2024-06-18 09:19:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1696251904. Throughput: 0: 42816.9. Samples: 1696380800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 09:19:36,994][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 09:19:37,474][12883] Updated weights for policy 0, policy_version 103533 (0.0035) +[2024-06-18 09:19:41,075][12883] Updated weights for policy 0, policy_version 103543 (0.0036) +[2024-06-18 09:19:41,994][12645] Fps is (10 sec: 44246.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1696481280. Throughput: 0: 42651.8. Samples: 1696632520. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 09:19:41,995][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 09:19:45,113][12883] Updated weights for policy 0, policy_version 103553 (0.0030) +[2024-06-18 09:19:46,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1696710656. Throughput: 0: 42770.8. Samples: 1696768900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 09:19:46,994][12645] Avg episode reward: [(0, '0.250')] +[2024-06-18 09:19:48,518][12883] Updated weights for policy 0, policy_version 103563 (0.0030) +[2024-06-18 09:19:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1696907264. Throughput: 0: 42881.7. Samples: 1697027060. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 09:19:51,994][12645] Avg episode reward: [(0, '0.485')] +[2024-06-18 09:19:52,827][12883] Updated weights for policy 0, policy_version 103573 (0.0037) +[2024-06-18 09:19:56,171][12883] Updated weights for policy 0, policy_version 103583 (0.0038) +[2024-06-18 09:19:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1697120256. Throughput: 0: 42792.4. Samples: 1697276140. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 09:19:56,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 09:20:00,415][12883] Updated weights for policy 0, policy_version 103593 (0.0034) +[2024-06-18 09:20:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1697349632. Throughput: 0: 42806.6. Samples: 1697409180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 09:20:01,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 09:20:03,859][12883] Updated weights for policy 0, policy_version 103603 (0.0029) +[2024-06-18 09:20:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1697546240. Throughput: 0: 42877.7. Samples: 1697666640. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 09:20:06,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 09:20:08,004][12883] Updated weights for policy 0, policy_version 103613 (0.0040) +[2024-06-18 09:20:11,679][12883] Updated weights for policy 0, policy_version 103623 (0.0036) +[2024-06-18 09:20:11,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1697775616. Throughput: 0: 42797.5. Samples: 1697917300. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 09:20:11,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 09:20:15,672][12883] Updated weights for policy 0, policy_version 103633 (0.0040) +[2024-06-18 09:20:16,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 1697972224. Throughput: 0: 42704.8. Samples: 1698046460. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 09:20:16,994][12645] Avg episode reward: [(0, '0.591')] +[2024-06-18 09:20:17,375][12862] Signal inference workers to stop experience collection... (24750 times) +[2024-06-18 09:20:17,375][12862] Signal inference workers to resume experience collection... (24750 times) +[2024-06-18 09:20:17,420][12883] InferenceWorker_p0-w0: stopping experience collection (24750 times) +[2024-06-18 09:20:17,420][12883] InferenceWorker_p0-w0: resuming experience collection (24750 times) +[2024-06-18 09:20:19,353][12883] Updated weights for policy 0, policy_version 103643 (0.0039) +[2024-06-18 09:20:21,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1698168832. Throughput: 0: 42565.2. Samples: 1698296240. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 09:20:21,994][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 09:20:23,776][12883] Updated weights for policy 0, policy_version 103653 (0.0049) +[2024-06-18 09:20:26,992][12883] Updated weights for policy 0, policy_version 103663 (0.0042) +[2024-06-18 09:20:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 1698414592. Throughput: 0: 42581.4. Samples: 1698548680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 09:20:26,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 09:20:31,374][12883] Updated weights for policy 0, policy_version 103673 (0.0034) +[2024-06-18 09:20:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 1698611200. Throughput: 0: 42410.3. Samples: 1698677360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) +[2024-06-18 09:20:31,994][12645] Avg episode reward: [(0, '0.296')] +[2024-06-18 09:20:34,619][12883] Updated weights for policy 0, policy_version 103683 (0.0033) +[2024-06-18 09:20:36,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1698791424. Throughput: 0: 42298.7. Samples: 1698930500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 09:20:36,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 09:20:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000103686_1698791424.pth... +[2024-06-18 09:20:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000103063_1688584192.pth +[2024-06-18 09:20:39,059][12883] Updated weights for policy 0, policy_version 103693 (0.0042) +[2024-06-18 09:20:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42654.2). Total num frames: 1699037184. Throughput: 0: 42359.9. Samples: 1699182340. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 09:20:41,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 09:20:42,366][12883] Updated weights for policy 0, policy_version 103703 (0.0038) +[2024-06-18 09:20:46,681][12883] Updated weights for policy 0, policy_version 103713 (0.0033) +[2024-06-18 09:20:46,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1699266560. Throughput: 0: 42434.3. Samples: 1699318720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 09:20:46,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 09:20:49,956][12883] Updated weights for policy 0, policy_version 103723 (0.0033) +[2024-06-18 09:20:51,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1699430400. Throughput: 0: 42231.6. Samples: 1699567060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 09:20:51,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 09:20:54,278][12883] Updated weights for policy 0, policy_version 103733 (0.0037) +[2024-06-18 09:20:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1699676160. Throughput: 0: 42456.4. Samples: 1699827840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 09:20:56,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 09:20:57,740][12883] Updated weights for policy 0, policy_version 103743 (0.0041) +[2024-06-18 09:21:01,883][12883] Updated weights for policy 0, policy_version 103753 (0.0037) +[2024-06-18 09:21:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 1699889152. Throughput: 0: 42474.2. Samples: 1699957800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 09:21:01,994][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 09:21:05,639][12883] Updated weights for policy 0, policy_version 103763 (0.0027) +[2024-06-18 09:21:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1700085760. Throughput: 0: 42424.0. Samples: 1700205320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 09:21:06,994][12645] Avg episode reward: [(0, '0.480')] +[2024-06-18 09:21:09,721][12883] Updated weights for policy 0, policy_version 103773 (0.0032) +[2024-06-18 09:21:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1700298752. Throughput: 0: 42472.5. Samples: 1700459940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 09:21:11,994][12645] Avg episode reward: [(0, '0.779')] +[2024-06-18 09:21:13,384][12883] Updated weights for policy 0, policy_version 103783 (0.0042) +[2024-06-18 09:21:16,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1700495360. Throughput: 0: 42408.1. Samples: 1700585720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 09:21:16,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 09:21:17,474][12883] Updated weights for policy 0, policy_version 103793 (0.0035) +[2024-06-18 09:21:20,905][12883] Updated weights for policy 0, policy_version 103803 (0.0042) +[2024-06-18 09:21:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1700741120. Throughput: 0: 42473.7. Samples: 1700841820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 09:21:21,998][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 09:21:25,034][12883] Updated weights for policy 0, policy_version 103813 (0.0031) +[2024-06-18 09:21:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1700954112. Throughput: 0: 42386.8. 
Samples: 1701089740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 09:21:26,994][12645] Avg episode reward: [(0, '0.603')] +[2024-06-18 09:21:28,850][12883] Updated weights for policy 0, policy_version 103823 (0.0034) +[2024-06-18 09:21:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 1701134336. Throughput: 0: 42350.7. Samples: 1701224500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 09:21:31,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 09:21:32,785][12883] Updated weights for policy 0, policy_version 103833 (0.0035) +[2024-06-18 09:21:36,401][12883] Updated weights for policy 0, policy_version 103843 (0.0034) +[2024-06-18 09:21:36,671][12862] Signal inference workers to stop experience collection... (24800 times) +[2024-06-18 09:21:36,672][12862] Signal inference workers to resume experience collection... (24800 times) +[2024-06-18 09:21:36,716][12883] InferenceWorker_p0-w0: stopping experience collection (24800 times) +[2024-06-18 09:21:36,716][12883] InferenceWorker_p0-w0: resuming experience collection (24800 times) +[2024-06-18 09:21:36,996][12645] Fps is (10 sec: 44227.0, 60 sec: 43416.0, 300 sec: 42653.6). Total num frames: 1701396480. Throughput: 0: 42682.8. Samples: 1701487880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 09:21:36,997][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 09:21:40,702][12883] Updated weights for policy 0, policy_version 103853 (0.0040) +[2024-06-18 09:21:41,996][12645] Fps is (10 sec: 45864.7, 60 sec: 42596.9, 300 sec: 42709.2). Total num frames: 1701593088. Throughput: 0: 42405.0. Samples: 1701736160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 09:21:41,997][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 09:21:44,144][12883] Updated weights for policy 0, policy_version 103863 (0.0051) +[2024-06-18 09:21:46,994][12645] Fps is (10 sec: 37691.8, 60 sec: 41779.3, 300 sec: 42542.9). 
Total num frames: 1701773312. Throughput: 0: 42212.0. Samples: 1701857340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 09:21:46,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 09:21:48,523][12883] Updated weights for policy 0, policy_version 103873 (0.0031) +[2024-06-18 09:21:51,963][12883] Updated weights for policy 0, policy_version 103883 (0.0030) +[2024-06-18 09:21:51,994][12645] Fps is (10 sec: 42607.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1702019072. Throughput: 0: 42563.2. Samples: 1702120660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 09:21:51,994][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 09:21:56,087][12883] Updated weights for policy 0, policy_version 103893 (0.0036) +[2024-06-18 09:21:56,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1702232064. Throughput: 0: 42573.6. Samples: 1702375760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 09:21:56,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 09:21:59,592][12883] Updated weights for policy 0, policy_version 103903 (0.0035) +[2024-06-18 09:22:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42543.1). Total num frames: 1702412288. Throughput: 0: 42567.5. Samples: 1702501260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 09:22:01,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 09:22:03,587][12883] Updated weights for policy 0, policy_version 103913 (0.0028) +[2024-06-18 09:22:06,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1702641664. Throughput: 0: 42542.2. Samples: 1702756220. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 09:22:06,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 09:22:07,374][12883] Updated weights for policy 0, policy_version 103923 (0.0031) +[2024-06-18 09:22:11,532][12883] Updated weights for policy 0, policy_version 103933 (0.0033) +[2024-06-18 09:22:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 1702854656. Throughput: 0: 42719.6. Samples: 1703012120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 09:22:11,994][12645] Avg episode reward: [(0, '0.224')] +[2024-06-18 09:22:15,094][12883] Updated weights for policy 0, policy_version 103943 (0.0033) +[2024-06-18 09:22:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1703051264. Throughput: 0: 42583.5. Samples: 1703140760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 09:22:16,994][12645] Avg episode reward: [(0, '0.324')] +[2024-06-18 09:22:19,122][12883] Updated weights for policy 0, policy_version 103953 (0.0024) +[2024-06-18 09:22:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1703280640. Throughput: 0: 42457.3. Samples: 1703398360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 09:22:21,994][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 09:22:22,819][12883] Updated weights for policy 0, policy_version 103963 (0.0029) +[2024-06-18 09:22:26,668][12883] Updated weights for policy 0, policy_version 103973 (0.0035) +[2024-06-18 09:22:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1703493632. Throughput: 0: 42591.8. Samples: 1703652700. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 09:22:26,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 09:22:30,511][12883] Updated weights for policy 0, policy_version 103983 (0.0040) +[2024-06-18 09:22:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1703706624. Throughput: 0: 42791.5. Samples: 1703782960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) +[2024-06-18 09:22:31,994][12645] Avg episode reward: [(0, '0.214')] +[2024-06-18 09:22:34,526][12883] Updated weights for policy 0, policy_version 103993 (0.0038) +[2024-06-18 09:22:36,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42325.3, 300 sec: 42653.6). Total num frames: 1703936000. Throughput: 0: 42646.7. Samples: 1704039860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 09:22:36,997][12645] Avg episode reward: [(0, '0.314')] +[2024-06-18 09:22:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104000_1703936000.pth... +[2024-06-18 09:22:37,084][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000103375_1693696000.pth +[2024-06-18 09:22:38,253][12883] Updated weights for policy 0, policy_version 104003 (0.0037) +[2024-06-18 09:22:41,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42325.4, 300 sec: 42653.6). Total num frames: 1704132608. Throughput: 0: 42626.5. Samples: 1704294040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 09:22:41,996][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 09:22:42,083][12883] Updated weights for policy 0, policy_version 104013 (0.0028) +[2024-06-18 09:22:45,827][12883] Updated weights for policy 0, policy_version 104023 (0.0042) +[2024-06-18 09:22:46,994][12645] Fps is (10 sec: 42608.0, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 1704361984. Throughput: 0: 42616.4. Samples: 1704419000. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 09:22:46,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 09:22:49,737][12883] Updated weights for policy 0, policy_version 104033 (0.0034) +[2024-06-18 09:22:51,996][12645] Fps is (10 sec: 42598.6, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 1704558592. Throughput: 0: 42786.9. Samples: 1704681720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 09:22:51,996][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 09:22:53,649][12883] Updated weights for policy 0, policy_version 104043 (0.0036) +[2024-06-18 09:22:56,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42323.8, 300 sec: 42653.6). Total num frames: 1704771584. Throughput: 0: 42610.7. Samples: 1704929700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 09:22:56,996][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 09:22:57,375][12883] Updated weights for policy 0, policy_version 104053 (0.0036) +[2024-06-18 09:23:01,269][12883] Updated weights for policy 0, policy_version 104063 (0.0023) +[2024-06-18 09:23:01,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1704984576. Throughput: 0: 42649.4. Samples: 1705059980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 09:23:01,994][12645] Avg episode reward: [(0, '0.447')] +[2024-06-18 09:23:04,967][12883] Updated weights for policy 0, policy_version 104073 (0.0031) +[2024-06-18 09:23:06,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1705197568. Throughput: 0: 42646.6. Samples: 1705317460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 09:23:06,994][12645] Avg episode reward: [(0, '0.252')] +[2024-06-18 09:23:07,028][12862] Signal inference workers to stop experience collection... (24850 times) +[2024-06-18 09:23:07,028][12862] Signal inference workers to resume experience collection... 
(24850 times) +[2024-06-18 09:23:07,048][12883] InferenceWorker_p0-w0: stopping experience collection (24850 times) +[2024-06-18 09:23:07,048][12883] InferenceWorker_p0-w0: resuming experience collection (24850 times) +[2024-06-18 09:23:08,783][12883] Updated weights for policy 0, policy_version 104083 (0.0041) +[2024-06-18 09:23:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1705410560. Throughput: 0: 42583.3. Samples: 1705568940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 09:23:11,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 09:23:12,633][12883] Updated weights for policy 0, policy_version 104093 (0.0027) +[2024-06-18 09:23:16,333][12883] Updated weights for policy 0, policy_version 104103 (0.0042) +[2024-06-18 09:23:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 1705623552. Throughput: 0: 42588.9. Samples: 1705699460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 09:23:16,994][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 09:23:20,431][12883] Updated weights for policy 0, policy_version 104113 (0.0041) +[2024-06-18 09:23:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 1705820160. Throughput: 0: 42491.1. Samples: 1705951860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 09:23:21,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 09:23:24,420][12883] Updated weights for policy 0, policy_version 104123 (0.0042) +[2024-06-18 09:23:26,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1706065920. Throughput: 0: 42552.8. Samples: 1706208820. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 09:23:26,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 09:23:28,396][12883] Updated weights for policy 0, policy_version 104133 (0.0036) +[2024-06-18 09:23:31,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1706262528. Throughput: 0: 42699.6. Samples: 1706340480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 09:23:31,994][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 09:23:32,162][12883] Updated weights for policy 0, policy_version 104143 (0.0051) +[2024-06-18 09:23:36,127][12883] Updated weights for policy 0, policy_version 104153 (0.0032) +[2024-06-18 09:23:36,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42053.9, 300 sec: 42487.3). Total num frames: 1706459136. Throughput: 0: 42484.3. Samples: 1706593420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 09:23:36,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 09:23:39,686][12883] Updated weights for policy 0, policy_version 104163 (0.0031) +[2024-06-18 09:23:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 1706688512. Throughput: 0: 42722.6. Samples: 1706852120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 09:23:41,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 09:23:43,981][12883] Updated weights for policy 0, policy_version 104173 (0.0031) +[2024-06-18 09:23:46,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1706901504. Throughput: 0: 42604.2. Samples: 1706977180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 09:23:46,995][12645] Avg episode reward: [(0, '0.674')] +[2024-06-18 09:23:47,846][12883] Updated weights for policy 0, policy_version 104183 (0.0025) +[2024-06-18 09:23:51,707][12883] Updated weights for policy 0, policy_version 104193 (0.0037) +[2024-06-18 09:23:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42326.8, 300 sec: 42487.3). Total num frames: 1707098112. Throughput: 0: 42532.8. Samples: 1707231440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 09:23:51,994][12645] Avg episode reward: [(0, '0.624')] +[2024-06-18 09:23:55,277][12883] Updated weights for policy 0, policy_version 104203 (0.0043) +[2024-06-18 09:23:56,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42873.0, 300 sec: 42598.4). Total num frames: 1707343872. Throughput: 0: 42640.8. Samples: 1707487780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 09:23:56,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 09:23:59,328][12883] Updated weights for policy 0, policy_version 104213 (0.0037) +[2024-06-18 09:24:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1707540480. Throughput: 0: 42646.3. Samples: 1707618540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 09:24:01,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 09:24:02,716][12883] Updated weights for policy 0, policy_version 104223 (0.0045) +[2024-06-18 09:24:06,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1707720704. Throughput: 0: 42803.1. Samples: 1707878000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 09:24:06,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 09:24:07,244][12883] Updated weights for policy 0, policy_version 104233 (0.0029) +[2024-06-18 09:24:10,350][12883] Updated weights for policy 0, policy_version 104243 (0.0033) +[2024-06-18 09:24:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1707982848. Throughput: 0: 42633.7. Samples: 1708127340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 09:24:11,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 09:24:14,698][12862] Signal inference workers to stop experience collection... (24900 times) +[2024-06-18 09:24:14,699][12862] Signal inference workers to resume experience collection... (24900 times) +[2024-06-18 09:24:14,712][12883] InferenceWorker_p0-w0: stopping experience collection (24900 times) +[2024-06-18 09:24:14,740][12883] InferenceWorker_p0-w0: resuming experience collection (24900 times) +[2024-06-18 09:24:14,857][12883] Updated weights for policy 0, policy_version 104253 (0.0029) +[2024-06-18 09:24:16,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1708179456. Throughput: 0: 42803.9. Samples: 1708266660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 09:24:16,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 09:24:18,321][12883] Updated weights for policy 0, policy_version 104263 (0.0031) +[2024-06-18 09:24:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1708376064. Throughput: 0: 42566.5. Samples: 1708508920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 09:24:21,994][12645] Avg episode reward: [(0, '0.413')] +[2024-06-18 09:24:22,428][12883] Updated weights for policy 0, policy_version 104273 (0.0046) +[2024-06-18 09:24:25,818][12883] Updated weights for policy 0, policy_version 104283 (0.0039) +[2024-06-18 09:24:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 1708621824. Throughput: 0: 42539.1. Samples: 1708766380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 09:24:26,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 09:24:29,915][12883] Updated weights for policy 0, policy_version 104293 (0.0033) +[2024-06-18 09:24:31,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1708802048. Throughput: 0: 42749.2. Samples: 1708900880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) +[2024-06-18 09:24:31,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 09:24:33,417][12883] Updated weights for policy 0, policy_version 104303 (0.0029) +[2024-06-18 09:24:36,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1709031424. Throughput: 0: 42562.2. Samples: 1709146740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 09:24:36,995][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 09:24:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104311_1709031424.pth... +[2024-06-18 09:24:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000103686_1698791424.pth +[2024-06-18 09:24:37,510][12883] Updated weights for policy 0, policy_version 104313 (0.0031) +[2024-06-18 09:24:41,238][12883] Updated weights for policy 0, policy_version 104323 (0.0034) +[2024-06-18 09:24:41,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1709260800. Throughput: 0: 42566.3. Samples: 1709403260. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 09:24:41,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 09:24:45,237][12883] Updated weights for policy 0, policy_version 104333 (0.0031) +[2024-06-18 09:24:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1709457408. Throughput: 0: 42585.3. Samples: 1709534880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 09:24:46,994][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 09:24:48,957][12883] Updated weights for policy 0, policy_version 104343 (0.0035) +[2024-06-18 09:24:51,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42869.9, 300 sec: 42542.5). Total num frames: 1709670400. Throughput: 0: 42411.6. Samples: 1709786620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 09:24:51,997][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 09:24:53,008][12883] Updated weights for policy 0, policy_version 104353 (0.0029) +[2024-06-18 09:24:56,537][12883] Updated weights for policy 0, policy_version 104363 (0.0028) +[2024-06-18 09:24:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1709883392. Throughput: 0: 42602.2. Samples: 1710044440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 09:24:56,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 09:25:00,570][12883] Updated weights for policy 0, policy_version 104373 (0.0039) +[2024-06-18 09:25:01,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1710096384. Throughput: 0: 42451.2. Samples: 1710176960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 09:25:01,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 09:25:04,338][12883] Updated weights for policy 0, policy_version 104383 (0.0038) +[2024-06-18 09:25:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43417.5, 300 sec: 42542.8). Total num frames: 1710325760. Throughput: 0: 42778.3. Samples: 1710433940. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 09:25:06,994][12645] Avg episode reward: [(0, '0.435')] +[2024-06-18 09:25:08,233][12883] Updated weights for policy 0, policy_version 104393 (0.0035) +[2024-06-18 09:25:11,926][12883] Updated weights for policy 0, policy_version 104403 (0.0043) +[2024-06-18 09:25:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1710538752. Throughput: 0: 42797.7. Samples: 1710692280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 09:25:11,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 09:25:15,744][12883] Updated weights for policy 0, policy_version 104413 (0.0040) +[2024-06-18 09:25:16,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1710751744. Throughput: 0: 42687.1. Samples: 1710821900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 09:25:16,996][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 09:25:19,574][12883] Updated weights for policy 0, policy_version 104423 (0.0031) +[2024-06-18 09:25:21,994][12645] Fps is (10 sec: 44237.5, 60 sec: 43417.7, 300 sec: 42598.4). Total num frames: 1710981120. Throughput: 0: 42884.6. Samples: 1711076540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 09:25:21,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 09:25:23,590][12883] Updated weights for policy 0, policy_version 104433 (0.0042) +[2024-06-18 09:25:26,994][12645] Fps is (10 sec: 40968.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1711161344. Throughput: 0: 42927.0. Samples: 1711334980. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 09:25:26,994][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 09:25:27,438][12883] Updated weights for policy 0, policy_version 104443 (0.0039) +[2024-06-18 09:25:31,113][12883] Updated weights for policy 0, policy_version 104453 (0.0036) +[2024-06-18 09:25:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1711390720. Throughput: 0: 42681.8. Samples: 1711455560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 09:25:31,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 09:25:35,198][12883] Updated weights for policy 0, policy_version 104463 (0.0036) +[2024-06-18 09:25:36,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1711620096. Throughput: 0: 42774.9. Samples: 1711711400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 09:25:36,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 09:25:38,999][12883] Updated weights for policy 0, policy_version 104473 (0.0050) +[2024-06-18 09:25:41,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1711783936. Throughput: 0: 42740.4. Samples: 1711967760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 09:25:41,994][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 09:25:42,882][12862] Signal inference workers to stop experience collection... (24950 times) +[2024-06-18 09:25:42,928][12883] InferenceWorker_p0-w0: stopping experience collection (24950 times) +[2024-06-18 09:25:42,931][12862] Signal inference workers to resume experience collection... 
(24950 times) +[2024-06-18 09:25:42,943][12883] InferenceWorker_p0-w0: resuming experience collection (24950 times) +[2024-06-18 09:25:43,071][12883] Updated weights for policy 0, policy_version 104483 (0.0044) +[2024-06-18 09:25:46,547][12883] Updated weights for policy 0, policy_version 104493 (0.0037) +[2024-06-18 09:25:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1712029696. Throughput: 0: 42372.0. Samples: 1712083700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 09:25:46,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 09:25:50,716][12883] Updated weights for policy 0, policy_version 104503 (0.0036) +[2024-06-18 09:25:51,994][12645] Fps is (10 sec: 47513.4, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 1712259072. Throughput: 0: 42523.0. Samples: 1712347480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 09:25:51,994][12645] Avg episode reward: [(0, '0.526')] +[2024-06-18 09:25:54,100][12883] Updated weights for policy 0, policy_version 104513 (0.0040) +[2024-06-18 09:25:56,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1712406528. Throughput: 0: 42628.9. Samples: 1712610580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 09:25:56,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 09:25:58,471][12883] Updated weights for policy 0, policy_version 104523 (0.0026) +[2024-06-18 09:26:01,813][12883] Updated weights for policy 0, policy_version 104533 (0.0030) +[2024-06-18 09:26:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1712668672. Throughput: 0: 42369.7. Samples: 1712728440. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 09:26:01,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 09:26:06,076][12883] Updated weights for policy 0, policy_version 104543 (0.0028) +[2024-06-18 09:26:06,994][12645] Fps is (10 sec: 47514.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1712881664. Throughput: 0: 42636.5. Samples: 1712995180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 09:26:06,994][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 09:26:09,371][12883] Updated weights for policy 0, policy_version 104553 (0.0044) +[2024-06-18 09:26:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 1713061888. Throughput: 0: 42561.0. Samples: 1713250220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 09:26:11,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 09:26:13,773][12883] Updated weights for policy 0, policy_version 104563 (0.0040) +[2024-06-18 09:26:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 1713307648. Throughput: 0: 42513.0. Samples: 1713368640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 09:26:16,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 09:26:17,018][12883] Updated weights for policy 0, policy_version 104573 (0.0041) +[2024-06-18 09:26:21,541][12883] Updated weights for policy 0, policy_version 104583 (0.0033) +[2024-06-18 09:26:21,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1713520640. Throughput: 0: 42590.9. Samples: 1713627980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 09:26:21,994][12645] Avg episode reward: [(0, '0.556')] +[2024-06-18 09:26:25,038][12883] Updated weights for policy 0, policy_version 104593 (0.0038) +[2024-06-18 09:26:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 1713700864. Throughput: 0: 42552.1. 
Samples: 1713882600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 09:26:26,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 09:26:29,269][12883] Updated weights for policy 0, policy_version 104603 (0.0028) +[2024-06-18 09:26:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 1713946624. Throughput: 0: 42758.3. Samples: 1714007820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 09:26:31,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 09:26:32,674][12883] Updated weights for policy 0, policy_version 104613 (0.0038) +[2024-06-18 09:26:36,780][12883] Updated weights for policy 0, policy_version 104623 (0.0036) +[2024-06-18 09:26:36,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 1714143232. Throughput: 0: 42689.4. Samples: 1714268500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) +[2024-06-18 09:26:36,994][12645] Avg episode reward: [(0, '0.413')] +[2024-06-18 09:26:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104623_1714143232.pth... +[2024-06-18 09:26:37,092][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104000_1703936000.pth +[2024-06-18 09:26:40,431][12883] Updated weights for policy 0, policy_version 104633 (0.0041) +[2024-06-18 09:26:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1714356224. Throughput: 0: 42479.6. Samples: 1714522160. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 09:26:41,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 09:26:44,339][12883] Updated weights for policy 0, policy_version 104643 (0.0030) +[2024-06-18 09:26:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1714585600. Throughput: 0: 42740.4. Samples: 1714651760. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 09:26:46,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 09:26:47,836][12883] Updated weights for policy 0, policy_version 104653 (0.0039) +[2024-06-18 09:26:51,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 1714782208. Throughput: 0: 42559.1. Samples: 1714910340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 09:26:51,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 09:26:52,015][12883] Updated weights for policy 0, policy_version 104663 (0.0038) +[2024-06-18 09:26:55,448][12883] Updated weights for policy 0, policy_version 104673 (0.0034) +[2024-06-18 09:26:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 1714995200. Throughput: 0: 42545.9. Samples: 1715164780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 09:26:56,994][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 09:26:59,605][12883] Updated weights for policy 0, policy_version 104683 (0.0037) +[2024-06-18 09:27:01,016][12862] Signal inference workers to stop experience collection... (25000 times) +[2024-06-18 09:27:01,027][12883] InferenceWorker_p0-w0: stopping experience collection (25000 times) +[2024-06-18 09:27:01,074][12862] Signal inference workers to resume experience collection... (25000 times) +[2024-06-18 09:27:01,074][12883] InferenceWorker_p0-w0: resuming experience collection (25000 times) +[2024-06-18 09:27:01,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1715240960. Throughput: 0: 42803.0. Samples: 1715294780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 09:27:01,996][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 09:27:03,059][12883] Updated weights for policy 0, policy_version 104693 (0.0029) +[2024-06-18 09:27:06,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1715421184. 
Throughput: 0: 42813.2. Samples: 1715554580. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 09:27:06,994][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 09:27:07,325][12883] Updated weights for policy 0, policy_version 104703 (0.0036) +[2024-06-18 09:27:10,859][12883] Updated weights for policy 0, policy_version 104713 (0.0028) +[2024-06-18 09:27:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1715650560. Throughput: 0: 42711.4. Samples: 1715804620. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 09:27:11,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 09:27:14,926][12883] Updated weights for policy 0, policy_version 104723 (0.0028) +[2024-06-18 09:27:16,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1715879936. Throughput: 0: 42880.8. Samples: 1715937460. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 09:27:16,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 09:27:18,406][12883] Updated weights for policy 0, policy_version 104733 (0.0041) +[2024-06-18 09:27:21,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1716076544. Throughput: 0: 42788.5. Samples: 1716193980. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 09:27:21,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 09:27:22,632][12883] Updated weights for policy 0, policy_version 104743 (0.0038) +[2024-06-18 09:27:26,275][12883] Updated weights for policy 0, policy_version 104753 (0.0038) +[2024-06-18 09:27:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1716289536. Throughput: 0: 42708.2. Samples: 1716444020. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 09:27:26,994][12645] Avg episode reward: [(0, '0.755')] +[2024-06-18 09:27:30,381][12883] Updated weights for policy 0, policy_version 104763 (0.0048) +[2024-06-18 09:27:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 1716486144. Throughput: 0: 42738.5. Samples: 1716575000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 09:27:31,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 09:27:33,827][12883] Updated weights for policy 0, policy_version 104773 (0.0036) +[2024-06-18 09:27:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 1716715520. Throughput: 0: 42690.1. Samples: 1716831400. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 09:27:36,998][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 09:27:37,914][12883] Updated weights for policy 0, policy_version 104783 (0.0030) +[2024-06-18 09:27:41,307][12883] Updated weights for policy 0, policy_version 104793 (0.0032) +[2024-06-18 09:27:41,994][12645] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1716944896. Throughput: 0: 42634.1. Samples: 1717083320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 09:27:41,994][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 09:27:45,944][12883] Updated weights for policy 0, policy_version 104803 (0.0027) +[2024-06-18 09:27:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 1717125120. Throughput: 0: 42720.0. Samples: 1717217180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 09:27:46,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 09:27:48,896][12883] Updated weights for policy 0, policy_version 104813 (0.0042) +[2024-06-18 09:27:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 1717338112. Throughput: 0: 42665.4. 
Samples: 1717474520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 09:27:51,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 09:27:53,373][12883] Updated weights for policy 0, policy_version 104823 (0.0029) +[2024-06-18 09:27:56,589][12883] Updated weights for policy 0, policy_version 104833 (0.0023) +[2024-06-18 09:27:56,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1717583872. Throughput: 0: 42774.3. Samples: 1717729460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 09:27:56,994][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 09:28:00,964][12883] Updated weights for policy 0, policy_version 104843 (0.0031) +[2024-06-18 09:28:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1717780480. Throughput: 0: 42855.6. Samples: 1717865960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 09:28:01,994][12645] Avg episode reward: [(0, '0.533')] +[2024-06-18 09:28:04,262][12883] Updated weights for policy 0, policy_version 104853 (0.0033) +[2024-06-18 09:28:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1717993472. Throughput: 0: 42671.0. Samples: 1718114180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 09:28:06,994][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 09:28:09,303][12883] Updated weights for policy 0, policy_version 104863 (0.0030) +[2024-06-18 09:28:11,851][12883] Updated weights for policy 0, policy_version 104873 (0.0031) +[2024-06-18 09:28:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1718239232. Throughput: 0: 42696.8. Samples: 1718365380. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 09:28:11,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 09:28:16,792][12883] Updated weights for policy 0, policy_version 104883 (0.0043) +[2024-06-18 09:28:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1718403072. Throughput: 0: 42718.7. Samples: 1718497340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 09:28:16,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 09:28:18,301][12862] Signal inference workers to stop experience collection... (25050 times) +[2024-06-18 09:28:18,342][12883] InferenceWorker_p0-w0: stopping experience collection (25050 times) +[2024-06-18 09:28:18,351][12862] Signal inference workers to resume experience collection... (25050 times) +[2024-06-18 09:28:18,361][12883] InferenceWorker_p0-w0: resuming experience collection (25050 times) +[2024-06-18 09:28:19,903][12883] Updated weights for policy 0, policy_version 104893 (0.0035) +[2024-06-18 09:28:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1718648832. Throughput: 0: 42620.8. Samples: 1718749340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 09:28:21,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 09:28:24,354][12883] Updated weights for policy 0, policy_version 104903 (0.0027) +[2024-06-18 09:28:26,996][12645] Fps is (10 sec: 45865.1, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 1718861824. Throughput: 0: 42864.5. Samples: 1719012320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 09:28:27,008][12645] Avg episode reward: [(0, '0.684')] +[2024-06-18 09:28:27,440][12883] Updated weights for policy 0, policy_version 104913 (0.0034) +[2024-06-18 09:28:31,921][12883] Updated weights for policy 0, policy_version 104923 (0.0044) +[2024-06-18 09:28:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1719058432. 
Throughput: 0: 42697.3. Samples: 1719138560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 09:28:31,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 09:28:35,014][12883] Updated weights for policy 0, policy_version 104933 (0.0039) +[2024-06-18 09:28:36,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1719287808. Throughput: 0: 42577.4. Samples: 1719390500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 09:28:36,994][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 09:28:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104937_1719287808.pth... +[2024-06-18 09:28:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104311_1709031424.pth +[2024-06-18 09:28:39,331][12883] Updated weights for policy 0, policy_version 104943 (0.0033) +[2024-06-18 09:28:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1719500800. Throughput: 0: 42835.1. Samples: 1719657040. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 09:28:41,994][12645] Avg episode reward: [(0, '0.571')] +[2024-06-18 09:28:42,777][12883] Updated weights for policy 0, policy_version 104953 (0.0036) +[2024-06-18 09:28:46,823][12883] Updated weights for policy 0, policy_version 104963 (0.0030) +[2024-06-18 09:28:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1719713792. Throughput: 0: 42694.7. Samples: 1719787220. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 09:28:46,994][12645] Avg episode reward: [(0, '0.552')] +[2024-06-18 09:28:50,407][12883] Updated weights for policy 0, policy_version 104973 (0.0047) +[2024-06-18 09:28:51,996][12645] Fps is (10 sec: 42588.8, 60 sec: 43143.0, 300 sec: 42653.6). Total num frames: 1719926784. Throughput: 0: 42745.5. Samples: 1720037820. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 09:28:51,997][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 09:28:54,749][12883] Updated weights for policy 0, policy_version 104983 (0.0042) +[2024-06-18 09:28:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1720139776. Throughput: 0: 42888.1. Samples: 1720295340. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 09:28:56,994][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 09:28:58,083][12883] Updated weights for policy 0, policy_version 104993 (0.0043) +[2024-06-18 09:29:01,994][12645] Fps is (10 sec: 42607.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1720352768. Throughput: 0: 42772.4. Samples: 1720422100. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 09:29:01,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 09:29:02,293][12883] Updated weights for policy 0, policy_version 105003 (0.0024) +[2024-06-18 09:29:05,767][12883] Updated weights for policy 0, policy_version 105013 (0.0035) +[2024-06-18 09:29:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 1720582144. Throughput: 0: 42880.2. Samples: 1720678940. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 09:29:06,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 09:29:09,922][12883] Updated weights for policy 0, policy_version 105023 (0.0034) +[2024-06-18 09:29:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 1720762368. Throughput: 0: 42693.7. Samples: 1720933440. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 09:29:11,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 09:29:13,494][12883] Updated weights for policy 0, policy_version 105033 (0.0038) +[2024-06-18 09:29:17,000][12645] Fps is (10 sec: 40933.7, 60 sec: 43140.0, 300 sec: 42764.1). Total num frames: 1720991744. Throughput: 0: 42667.0. Samples: 1721058840. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 09:29:17,001][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 09:29:17,563][12883] Updated weights for policy 0, policy_version 105043 (0.0023) +[2024-06-18 09:29:21,139][12883] Updated weights for policy 0, policy_version 105053 (0.0034) +[2024-06-18 09:29:21,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1721221120. Throughput: 0: 42847.0. Samples: 1721318620. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 09:29:22,000][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 09:29:25,416][12883] Updated weights for policy 0, policy_version 105063 (0.0041) +[2024-06-18 09:29:26,994][12645] Fps is (10 sec: 42625.5, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 1721417728. Throughput: 0: 42649.8. Samples: 1721576280. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 09:29:26,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 09:29:28,737][12883] Updated weights for policy 0, policy_version 105073 (0.0036) +[2024-06-18 09:29:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1721630720. Throughput: 0: 42536.4. Samples: 1721701360. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 09:29:31,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 09:29:32,963][12883] Updated weights for policy 0, policy_version 105083 (0.0038) +[2024-06-18 09:29:33,533][12862] Signal inference workers to stop experience collection... (25100 times) +[2024-06-18 09:29:33,587][12883] InferenceWorker_p0-w0: stopping experience collection (25100 times) +[2024-06-18 09:29:33,595][12862] Signal inference workers to resume experience collection... 
(25100 times) +[2024-06-18 09:29:33,600][12883] InferenceWorker_p0-w0: resuming experience collection (25100 times) +[2024-06-18 09:29:36,767][12883] Updated weights for policy 0, policy_version 105093 (0.0032) +[2024-06-18 09:29:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1721843712. Throughput: 0: 42733.7. Samples: 1721960740. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) +[2024-06-18 09:29:36,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 09:29:40,491][12883] Updated weights for policy 0, policy_version 105103 (0.0029) +[2024-06-18 09:29:41,998][12645] Fps is (10 sec: 42579.8, 60 sec: 42595.3, 300 sec: 42708.9). Total num frames: 1722056704. Throughput: 0: 42750.0. Samples: 1722219280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 09:29:41,998][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 09:29:44,337][12883] Updated weights for policy 0, policy_version 105113 (0.0040) +[2024-06-18 09:29:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 1722269696. Throughput: 0: 42800.1. Samples: 1722348100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 09:29:46,994][12645] Avg episode reward: [(0, '0.623')] +[2024-06-18 09:29:48,195][12883] Updated weights for policy 0, policy_version 105123 (0.0028) +[2024-06-18 09:29:51,920][12883] Updated weights for policy 0, policy_version 105133 (0.0058) +[2024-06-18 09:29:51,994][12645] Fps is (10 sec: 44256.3, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 1722499072. Throughput: 0: 42751.9. Samples: 1722602780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 09:29:51,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 09:29:55,806][12883] Updated weights for policy 0, policy_version 105143 (0.0036) +[2024-06-18 09:29:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1722712064. Throughput: 0: 42920.9. 
Samples: 1722864880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 09:29:56,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 09:29:59,428][12883] Updated weights for policy 0, policy_version 105153 (0.0032) +[2024-06-18 09:30:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1722925056. Throughput: 0: 43003.3. Samples: 1722993720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 09:30:01,994][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 09:30:03,371][12883] Updated weights for policy 0, policy_version 105163 (0.0022) +[2024-06-18 09:30:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1723138048. Throughput: 0: 43004.5. Samples: 1723253820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 09:30:06,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 09:30:07,121][12883] Updated weights for policy 0, policy_version 105173 (0.0038) +[2024-06-18 09:30:10,956][12883] Updated weights for policy 0, policy_version 105183 (0.0045) +[2024-06-18 09:30:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 1723334656. Throughput: 0: 42982.1. Samples: 1723510480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 09:30:11,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 09:30:14,811][12883] Updated weights for policy 0, policy_version 105193 (0.0027) +[2024-06-18 09:30:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42875.9, 300 sec: 42653.9). Total num frames: 1723564032. Throughput: 0: 43023.5. Samples: 1723637420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 09:30:16,994][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 09:30:18,573][12883] Updated weights for policy 0, policy_version 105203 (0.0029) +[2024-06-18 09:30:21,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1723777024. 
Throughput: 0: 42941.4. Samples: 1723893100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 09:30:21,994][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 09:30:22,446][12883] Updated weights for policy 0, policy_version 105213 (0.0040) +[2024-06-18 09:30:26,166][12883] Updated weights for policy 0, policy_version 105223 (0.0036) +[2024-06-18 09:30:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1723990016. Throughput: 0: 42943.2. Samples: 1724151540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 09:30:26,994][12645] Avg episode reward: [(0, '0.780')] +[2024-06-18 09:30:30,060][12883] Updated weights for policy 0, policy_version 105233 (0.0030) +[2024-06-18 09:30:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1724203008. Throughput: 0: 42935.5. Samples: 1724280200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 09:30:31,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 09:30:33,934][12883] Updated weights for policy 0, policy_version 105243 (0.0031) +[2024-06-18 09:30:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1724432384. Throughput: 0: 43096.8. Samples: 1724542140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 09:30:36,994][12645] Avg episode reward: [(0, '0.533')] +[2024-06-18 09:30:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000105251_1724432384.pth... +[2024-06-18 09:30:37,048][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104623_1714143232.pth +[2024-06-18 09:30:37,596][12883] Updated weights for policy 0, policy_version 105253 (0.0034) +[2024-06-18 09:30:41,528][12883] Updated weights for policy 0, policy_version 105263 (0.0035) +[2024-06-18 09:30:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42874.5, 300 sec: 42709.5). Total num frames: 1724628992. Throughput: 0: 42731.5. 
Samples: 1724787800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 09:30:41,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 09:30:44,065][12862] Signal inference workers to stop experience collection... (25150 times) +[2024-06-18 09:30:44,065][12862] Signal inference workers to resume experience collection... (25150 times) +[2024-06-18 09:30:44,100][12883] InferenceWorker_p0-w0: stopping experience collection (25150 times) +[2024-06-18 09:30:44,100][12883] InferenceWorker_p0-w0: resuming experience collection (25150 times) +[2024-06-18 09:30:45,201][12883] Updated weights for policy 0, policy_version 105273 (0.0039) +[2024-06-18 09:30:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1724841984. Throughput: 0: 42756.6. Samples: 1724917760. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 09:30:46,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 09:30:49,341][12883] Updated weights for policy 0, policy_version 105283 (0.0032) +[2024-06-18 09:30:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1725071360. Throughput: 0: 42752.4. Samples: 1725177680. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 09:30:51,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 09:30:52,984][12883] Updated weights for policy 0, policy_version 105293 (0.0036) +[2024-06-18 09:30:56,895][12883] Updated weights for policy 0, policy_version 105303 (0.0029) +[2024-06-18 09:30:57,000][12645] Fps is (10 sec: 44209.0, 60 sec: 42867.0, 300 sec: 42764.1). Total num frames: 1725284352. Throughput: 0: 42672.8. Samples: 1725431020. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 09:30:57,000][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 09:31:00,739][12883] Updated weights for policy 0, policy_version 105313 (0.0032) +[2024-06-18 09:31:01,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42653.9). 
Total num frames: 1725464576. Throughput: 0: 42648.6. Samples: 1725556600. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 09:31:01,994][12645] Avg episode reward: [(0, '0.508')] +[2024-06-18 09:31:04,932][12883] Updated weights for policy 0, policy_version 105323 (0.0044) +[2024-06-18 09:31:06,994][12645] Fps is (10 sec: 40985.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1725693952. Throughput: 0: 42684.0. Samples: 1725813880. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 09:31:06,994][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 09:31:08,146][12883] Updated weights for policy 0, policy_version 105333 (0.0030) +[2024-06-18 09:31:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1725906944. Throughput: 0: 42733.4. Samples: 1726074540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 09:31:11,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 09:31:12,360][12883] Updated weights for policy 0, policy_version 105343 (0.0033) +[2024-06-18 09:31:15,847][12883] Updated weights for policy 0, policy_version 105353 (0.0033) +[2024-06-18 09:31:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1726119936. Throughput: 0: 42768.5. Samples: 1726204780. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 09:31:16,994][12645] Avg episode reward: [(0, '0.510')] +[2024-06-18 09:31:20,062][12883] Updated weights for policy 0, policy_version 105363 (0.0036) +[2024-06-18 09:31:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1726332928. Throughput: 0: 42535.6. Samples: 1726456240. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 09:31:21,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 09:31:23,523][12883] Updated weights for policy 0, policy_version 105373 (0.0035) +[2024-06-18 09:31:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1726545920. Throughput: 0: 42712.1. Samples: 1726709840. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 09:31:26,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 09:31:27,653][12883] Updated weights for policy 0, policy_version 105383 (0.0036) +[2024-06-18 09:31:31,550][12883] Updated weights for policy 0, policy_version 105393 (0.0036) +[2024-06-18 09:31:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1726775296. Throughput: 0: 42799.2. Samples: 1726843720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 09:31:31,994][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 09:31:35,142][12883] Updated weights for policy 0, policy_version 105403 (0.0032) +[2024-06-18 09:31:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1726971904. Throughput: 0: 42656.1. Samples: 1727097200. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) +[2024-06-18 09:31:36,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 09:31:39,331][12883] Updated weights for policy 0, policy_version 105413 (0.0033) +[2024-06-18 09:31:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1727201280. Throughput: 0: 42517.9. Samples: 1727344060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 09:31:41,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 09:31:42,870][12883] Updated weights for policy 0, policy_version 105423 (0.0027) +[2024-06-18 09:31:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1727414272. Throughput: 0: 42715.1. 
Samples: 1727478780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 09:31:46,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 09:31:46,994][12883] Updated weights for policy 0, policy_version 105433 (0.0038) +[2024-06-18 09:31:50,952][12883] Updated weights for policy 0, policy_version 105443 (0.0047) +[2024-06-18 09:31:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1727610880. Throughput: 0: 42679.9. Samples: 1727734480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 09:31:51,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 09:31:54,682][12883] Updated weights for policy 0, policy_version 105453 (0.0042) +[2024-06-18 09:31:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42875.9, 300 sec: 42765.0). Total num frames: 1727856640. Throughput: 0: 42402.6. Samples: 1727982660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 09:31:56,994][12645] Avg episode reward: [(0, '0.537')] +[2024-06-18 09:31:58,627][12883] Updated weights for policy 0, policy_version 105463 (0.0021) +[2024-06-18 09:32:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1728053248. Throughput: 0: 42656.9. Samples: 1728124340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 09:32:01,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 09:32:02,307][12883] Updated weights for policy 0, policy_version 105473 (0.0035) +[2024-06-18 09:32:06,071][12883] Updated weights for policy 0, policy_version 105483 (0.0038) +[2024-06-18 09:32:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1728249856. Throughput: 0: 42677.7. Samples: 1728376740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 09:32:06,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 09:32:09,882][12883] Updated weights for policy 0, policy_version 105493 (0.0033) +[2024-06-18 09:32:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1728479232. Throughput: 0: 42793.3. Samples: 1728635540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 09:32:11,994][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 09:32:13,604][12883] Updated weights for policy 0, policy_version 105503 (0.0032) +[2024-06-18 09:32:16,306][12862] Signal inference workers to stop experience collection... (25200 times) +[2024-06-18 09:32:16,306][12862] Signal inference workers to resume experience collection... (25200 times) +[2024-06-18 09:32:16,322][12883] InferenceWorker_p0-w0: stopping experience collection (25200 times) +[2024-06-18 09:32:16,322][12883] InferenceWorker_p0-w0: resuming experience collection (25200 times) +[2024-06-18 09:32:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1728692224. Throughput: 0: 42753.2. Samples: 1728767620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 09:32:16,995][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 09:32:17,527][12883] Updated weights for policy 0, policy_version 105513 (0.0038) +[2024-06-18 09:32:21,302][12883] Updated weights for policy 0, policy_version 105523 (0.0035) +[2024-06-18 09:32:21,995][12645] Fps is (10 sec: 42594.5, 60 sec: 42870.8, 300 sec: 42764.9). Total num frames: 1728905216. Throughput: 0: 42729.8. Samples: 1729020080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 09:32:21,995][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 09:32:25,088][12883] Updated weights for policy 0, policy_version 105533 (0.0032) +[2024-06-18 09:32:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1729118208. 
Throughput: 0: 42985.3. Samples: 1729278400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 09:32:26,995][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 09:32:28,953][12883] Updated weights for policy 0, policy_version 105543 (0.0031) +[2024-06-18 09:32:31,996][12645] Fps is (10 sec: 42592.8, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 1729331200. Throughput: 0: 42717.9. Samples: 1729401180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 09:32:31,997][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 09:32:32,641][12883] Updated weights for policy 0, policy_version 105553 (0.0033) +[2024-06-18 09:32:36,641][12883] Updated weights for policy 0, policy_version 105563 (0.0039) +[2024-06-18 09:32:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1729560576. Throughput: 0: 42703.1. Samples: 1729656120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 09:32:36,994][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 09:32:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000105564_1729560576.pth... +[2024-06-18 09:32:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104937_1719287808.pth +[2024-06-18 09:32:40,319][12883] Updated weights for policy 0, policy_version 105573 (0.0040) +[2024-06-18 09:32:41,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1729757184. Throughput: 0: 43012.6. Samples: 1729918220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:32:41,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 09:32:44,209][12883] Updated weights for policy 0, policy_version 105583 (0.0038) +[2024-06-18 09:32:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1729953792. Throughput: 0: 42571.9. Samples: 1730040080. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:32:46,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 09:32:48,330][12883] Updated weights for policy 0, policy_version 105593 (0.0028) +[2024-06-18 09:32:51,881][12883] Updated weights for policy 0, policy_version 105603 (0.0028) +[2024-06-18 09:32:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 1730199552. Throughput: 0: 42666.4. Samples: 1730296720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:32:51,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 09:32:56,233][12883] Updated weights for policy 0, policy_version 105613 (0.0037) +[2024-06-18 09:32:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1730396160. Throughput: 0: 42665.2. Samples: 1730555480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:32:56,995][12645] Avg episode reward: [(0, '0.618')] +[2024-06-18 09:32:59,649][12883] Updated weights for policy 0, policy_version 105623 (0.0035) +[2024-06-18 09:33:01,997][12645] Fps is (10 sec: 37671.3, 60 sec: 42050.1, 300 sec: 42653.5). Total num frames: 1730576384. Throughput: 0: 42385.1. Samples: 1730675080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:33:01,997][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 09:33:03,826][12883] Updated weights for policy 0, policy_version 105633 (0.0036) +[2024-06-18 09:33:06,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1730822144. Throughput: 0: 42409.4. Samples: 1730928560. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:33:06,996][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 09:33:07,243][12883] Updated weights for policy 0, policy_version 105643 (0.0035) +[2024-06-18 09:33:11,943][12883] Updated weights for policy 0, policy_version 105653 (0.0039) +[2024-06-18 09:33:11,994][12645] Fps is (10 sec: 44250.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1731018752. Throughput: 0: 42415.2. Samples: 1731187080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:33:11,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 09:33:15,201][12883] Updated weights for policy 0, policy_version 105663 (0.0025) +[2024-06-18 09:33:16,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1731231744. Throughput: 0: 42358.9. Samples: 1731307240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:33:16,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 09:33:19,514][12883] Updated weights for policy 0, policy_version 105673 (0.0034) +[2024-06-18 09:33:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42599.1, 300 sec: 42709.8). Total num frames: 1731461120. Throughput: 0: 42425.9. Samples: 1731565280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:33:21,994][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 09:33:22,666][12883] Updated weights for policy 0, policy_version 105683 (0.0025) +[2024-06-18 09:33:26,109][12862] Signal inference workers to stop experience collection... (25250 times) +[2024-06-18 09:33:26,110][12862] Signal inference workers to resume experience collection... (25250 times) +[2024-06-18 09:33:26,140][12883] InferenceWorker_p0-w0: stopping experience collection (25250 times) +[2024-06-18 09:33:26,140][12883] InferenceWorker_p0-w0: resuming experience collection (25250 times) +[2024-06-18 09:33:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 1731641344. 
Throughput: 0: 42328.0. Samples: 1731822980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:33:26,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 09:33:27,238][12883] Updated weights for policy 0, policy_version 105693 (0.0030) +[2024-06-18 09:33:30,646][12883] Updated weights for policy 0, policy_version 105703 (0.0026) +[2024-06-18 09:33:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 1731870720. Throughput: 0: 42373.4. Samples: 1731946880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:33:31,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 09:33:34,839][12883] Updated weights for policy 0, policy_version 105713 (0.0035) +[2024-06-18 09:33:36,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1732083712. Throughput: 0: 42309.9. Samples: 1732200680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:33:36,994][12645] Avg episode reward: [(0, '0.514')] +[2024-06-18 09:33:38,305][12883] Updated weights for policy 0, policy_version 105723 (0.0029) +[2024-06-18 09:33:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1732280320. Throughput: 0: 42398.3. Samples: 1732463400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 09:33:41,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 09:33:42,506][12883] Updated weights for policy 0, policy_version 105733 (0.0028) +[2024-06-18 09:33:46,094][12883] Updated weights for policy 0, policy_version 105743 (0.0030) +[2024-06-18 09:33:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 1732526080. Throughput: 0: 42519.3. Samples: 1732588320. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 09:33:46,994][12645] Avg episode reward: [(0, '0.186')] +[2024-06-18 09:33:50,057][12883] Updated weights for policy 0, policy_version 105753 (0.0026) +[2024-06-18 09:33:51,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1732739072. Throughput: 0: 42431.4. Samples: 1732837880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 09:33:51,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 09:33:53,898][12883] Updated weights for policy 0, policy_version 105763 (0.0032) +[2024-06-18 09:33:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1732919296. Throughput: 0: 42776.8. Samples: 1733112040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 09:33:56,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 09:33:57,731][12883] Updated weights for policy 0, policy_version 105773 (0.0024) +[2024-06-18 09:34:01,347][12883] Updated weights for policy 0, policy_version 105783 (0.0022) +[2024-06-18 09:34:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43146.8, 300 sec: 42653.9). Total num frames: 1733165056. Throughput: 0: 42651.6. Samples: 1733226560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 09:34:01,994][12645] Avg episode reward: [(0, '0.397')] +[2024-06-18 09:34:05,359][12883] Updated weights for policy 0, policy_version 105793 (0.0036) +[2024-06-18 09:34:06,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1733378048. Throughput: 0: 42653.8. Samples: 1733484700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 09:34:06,994][12645] Avg episode reward: [(0, '0.380')] +[2024-06-18 09:34:09,018][12883] Updated weights for policy 0, policy_version 105803 (0.0046) +[2024-06-18 09:34:11,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42543.8). Total num frames: 1733541888. Throughput: 0: 42886.7. Samples: 1733752880. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 09:34:11,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 09:34:12,886][12883] Updated weights for policy 0, policy_version 105813 (0.0040) +[2024-06-18 09:34:16,611][12883] Updated weights for policy 0, policy_version 105823 (0.0035) +[2024-06-18 09:34:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1733804032. Throughput: 0: 42731.0. Samples: 1733869780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 09:34:16,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 09:34:20,671][12883] Updated weights for policy 0, policy_version 105833 (0.0040) +[2024-06-18 09:34:21,994][12645] Fps is (10 sec: 49151.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1734033408. Throughput: 0: 42870.3. Samples: 1734129840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 09:34:21,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 09:34:24,362][12883] Updated weights for policy 0, policy_version 105843 (0.0037) +[2024-06-18 09:34:26,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1734197248. Throughput: 0: 42807.6. Samples: 1734389740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 09:34:26,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 09:34:28,255][12883] Updated weights for policy 0, policy_version 105853 (0.0023) +[2024-06-18 09:34:30,146][12862] Signal inference workers to stop experience collection... (25300 times) +[2024-06-18 09:34:30,146][12862] Signal inference workers to resume experience collection... (25300 times) +[2024-06-18 09:34:30,169][12883] InferenceWorker_p0-w0: stopping experience collection (25300 times) +[2024-06-18 09:34:30,169][12883] InferenceWorker_p0-w0: resuming experience collection (25300 times) +[2024-06-18 09:34:31,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1734443008. 
Throughput: 0: 42686.4. Samples: 1734509200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 09:34:31,994][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 09:34:32,103][12883] Updated weights for policy 0, policy_version 105863 (0.0038) +[2024-06-18 09:34:35,952][12883] Updated weights for policy 0, policy_version 105873 (0.0037) +[2024-06-18 09:34:36,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.6, 300 sec: 42710.1). Total num frames: 1734656000. Throughput: 0: 42917.8. Samples: 1734769180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 09:34:36,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 09:34:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000105875_1734656000.pth... +[2024-06-18 09:34:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000105251_1724432384.pth +[2024-06-18 09:34:39,713][12883] Updated weights for policy 0, policy_version 105883 (0.0030) +[2024-06-18 09:34:41,994][12645] Fps is (10 sec: 37682.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1734819840. Throughput: 0: 42436.5. Samples: 1735021680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 09:34:41,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 09:34:43,837][12883] Updated weights for policy 0, policy_version 105893 (0.0023) +[2024-06-18 09:34:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1735081984. Throughput: 0: 42536.1. Samples: 1735140680. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) +[2024-06-18 09:34:46,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 09:34:47,438][12883] Updated weights for policy 0, policy_version 105903 (0.0030) +[2024-06-18 09:34:51,706][12883] Updated weights for policy 0, policy_version 105913 (0.0026) +[2024-06-18 09:34:51,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1735294976. Throughput: 0: 42675.0. 
Samples: 1735405080. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) +[2024-06-18 09:34:51,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 09:34:55,115][12883] Updated weights for policy 0, policy_version 105923 (0.0030) +[2024-06-18 09:34:56,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1735475200. Throughput: 0: 42268.5. Samples: 1735654960. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) +[2024-06-18 09:34:56,994][12645] Avg episode reward: [(0, '0.229')] +[2024-06-18 09:34:59,388][12883] Updated weights for policy 0, policy_version 105933 (0.0049) +[2024-06-18 09:35:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1735704576. Throughput: 0: 42305.1. Samples: 1735773500. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) +[2024-06-18 09:35:01,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 09:35:03,033][12883] Updated weights for policy 0, policy_version 105943 (0.0052) +[2024-06-18 09:35:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1735901184. Throughput: 0: 42422.9. Samples: 1736038860. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) +[2024-06-18 09:35:06,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 09:35:07,181][12883] Updated weights for policy 0, policy_version 105953 (0.0028) +[2024-06-18 09:35:11,163][12883] Updated weights for policy 0, policy_version 105963 (0.0032) +[2024-06-18 09:35:11,994][12645] Fps is (10 sec: 39320.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1736097792. Throughput: 0: 42063.0. Samples: 1736282580. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) +[2024-06-18 09:35:11,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 09:35:15,000][12883] Updated weights for policy 0, policy_version 105973 (0.0028) +[2024-06-18 09:35:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1736343552. Throughput: 0: 42205.3. 
Samples: 1736408440. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) +[2024-06-18 09:35:16,994][12645] Avg episode reward: [(0, '0.691')] +[2024-06-18 09:35:18,781][12883] Updated weights for policy 0, policy_version 105983 (0.0034) +[2024-06-18 09:35:21,994][12645] Fps is (10 sec: 40960.9, 60 sec: 41233.2, 300 sec: 42431.8). Total num frames: 1736507392. Throughput: 0: 42132.5. Samples: 1736665140. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) +[2024-06-18 09:35:21,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 09:35:22,883][12883] Updated weights for policy 0, policy_version 105993 (0.0030) +[2024-06-18 09:35:26,492][12883] Updated weights for policy 0, policy_version 106003 (0.0024) +[2024-06-18 09:35:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1736753152. Throughput: 0: 41787.6. Samples: 1736902120. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) +[2024-06-18 09:35:26,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 09:35:30,649][12883] Updated weights for policy 0, policy_version 106013 (0.0046) +[2024-06-18 09:35:31,994][12645] Fps is (10 sec: 49151.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1736998912. Throughput: 0: 42242.7. Samples: 1737041600. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) +[2024-06-18 09:35:31,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 09:35:34,128][12883] Updated weights for policy 0, policy_version 106023 (0.0026) +[2024-06-18 09:35:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41233.1, 300 sec: 42376.3). Total num frames: 1737129984. Throughput: 0: 42055.1. Samples: 1737297560. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) +[2024-06-18 09:35:36,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 09:35:37,256][12862] Signal inference workers to stop experience collection... (25350 times) +[2024-06-18 09:35:37,256][12862] Signal inference workers to resume experience collection... 
(25350 times) +[2024-06-18 09:35:37,268][12883] InferenceWorker_p0-w0: stopping experience collection (25350 times) +[2024-06-18 09:35:37,268][12883] InferenceWorker_p0-w0: resuming experience collection (25350 times) +[2024-06-18 09:35:38,287][12883] Updated weights for policy 0, policy_version 106033 (0.0029) +[2024-06-18 09:35:41,994][12645] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1737375744. Throughput: 0: 42024.4. Samples: 1737546060. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) +[2024-06-18 09:35:41,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 09:35:42,346][12883] Updated weights for policy 0, policy_version 106043 (0.0039) +[2024-06-18 09:35:45,899][12883] Updated weights for policy 0, policy_version 106053 (0.0032) +[2024-06-18 09:35:46,994][12645] Fps is (10 sec: 50790.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1737637888. Throughput: 0: 42486.2. Samples: 1737685380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:35:46,994][12645] Avg episode reward: [(0, '0.253')] +[2024-06-18 09:35:50,035][12883] Updated weights for policy 0, policy_version 106063 (0.0027) +[2024-06-18 09:35:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41233.1, 300 sec: 42321.6). Total num frames: 1737768960. Throughput: 0: 42060.4. Samples: 1737931580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:35:51,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 09:35:53,892][12883] Updated weights for policy 0, policy_version 106073 (0.0025) +[2024-06-18 09:35:56,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1738014720. Throughput: 0: 42225.0. Samples: 1738182700. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:35:56,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 09:35:57,679][12883] Updated weights for policy 0, policy_version 106083 (0.0035) +[2024-06-18 09:36:01,468][12883] Updated weights for policy 0, policy_version 106093 (0.0037) +[2024-06-18 09:36:01,994][12645] Fps is (10 sec: 49152.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1738260480. Throughput: 0: 42470.7. Samples: 1738319620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:36:01,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 09:36:05,294][12883] Updated weights for policy 0, policy_version 106103 (0.0048) +[2024-06-18 09:36:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1738440704. Throughput: 0: 42281.6. Samples: 1738567820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:36:06,994][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 09:36:09,358][12883] Updated weights for policy 0, policy_version 106113 (0.0038) +[2024-06-18 09:36:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1738670080. Throughput: 0: 42589.7. Samples: 1738818660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:36:11,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 09:36:12,973][12883] Updated weights for policy 0, policy_version 106123 (0.0037) +[2024-06-18 09:36:16,918][12883] Updated weights for policy 0, policy_version 106133 (0.0035) +[2024-06-18 09:36:16,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1738883072. Throughput: 0: 42480.5. Samples: 1738953220. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:36:16,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 09:36:20,784][12883] Updated weights for policy 0, policy_version 106143 (0.0029) +[2024-06-18 09:36:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1739079680. Throughput: 0: 42512.4. Samples: 1739210620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:36:21,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 09:36:24,381][12883] Updated weights for policy 0, policy_version 106153 (0.0036) +[2024-06-18 09:36:26,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1739325440. Throughput: 0: 42667.0. Samples: 1739466080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:36:26,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 09:36:28,393][12883] Updated weights for policy 0, policy_version 106163 (0.0027) +[2024-06-18 09:36:31,829][12862] Signal inference workers to stop experience collection... (25400 times) +[2024-06-18 09:36:31,830][12862] Signal inference workers to resume experience collection... (25400 times) +[2024-06-18 09:36:31,847][12883] InferenceWorker_p0-w0: stopping experience collection (25400 times) +[2024-06-18 09:36:31,847][12883] InferenceWorker_p0-w0: resuming experience collection (25400 times) +[2024-06-18 09:36:31,982][12883] Updated weights for policy 0, policy_version 106173 (0.0040) +[2024-06-18 09:36:31,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1739538432. Throughput: 0: 42477.7. Samples: 1739596880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:36:31,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 09:36:35,983][12883] Updated weights for policy 0, policy_version 106183 (0.0035) +[2024-06-18 09:36:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42487.3). Total num frames: 1739735040. 
Throughput: 0: 42673.3. Samples: 1739851880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:36:36,994][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 09:36:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000106185_1739735040.pth... +[2024-06-18 09:36:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000105564_1729560576.pth +[2024-06-18 09:36:39,677][12883] Updated weights for policy 0, policy_version 106193 (0.0027) +[2024-06-18 09:36:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1739964416. Throughput: 0: 42652.0. Samples: 1740102040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:36:41,994][12645] Avg episode reward: [(0, '0.665')] +[2024-06-18 09:36:44,087][12883] Updated weights for policy 0, policy_version 106203 (0.0041) +[2024-06-18 09:36:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1740161024. Throughput: 0: 42531.6. Samples: 1740233540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 09:36:46,994][12645] Avg episode reward: [(0, '0.799')] +[2024-06-18 09:36:47,280][12883] Updated weights for policy 0, policy_version 106213 (0.0036) +[2024-06-18 09:36:51,727][12883] Updated weights for policy 0, policy_version 106223 (0.0033) +[2024-06-18 09:36:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 42376.3). Total num frames: 1740357632. Throughput: 0: 42749.0. Samples: 1740491520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 09:36:51,994][12645] Avg episode reward: [(0, '0.799')] +[2024-06-18 09:36:55,096][12883] Updated weights for policy 0, policy_version 106233 (0.0040) +[2024-06-18 09:36:56,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 1740603392. Throughput: 0: 42641.9. Samples: 1740737540. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 09:36:56,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 09:36:59,246][12883] Updated weights for policy 0, policy_version 106243 (0.0029) +[2024-06-18 09:37:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1740783616. Throughput: 0: 42605.7. Samples: 1740870480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 09:37:01,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 09:37:03,009][12883] Updated weights for policy 0, policy_version 106253 (0.0047) +[2024-06-18 09:37:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1740996608. Throughput: 0: 42518.3. Samples: 1741123940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 09:37:06,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 09:37:07,140][12883] Updated weights for policy 0, policy_version 106263 (0.0052) +[2024-06-18 09:37:10,652][12883] Updated weights for policy 0, policy_version 106273 (0.0036) +[2024-06-18 09:37:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1741225984. Throughput: 0: 42393.8. Samples: 1741373800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 09:37:11,994][12645] Avg episode reward: [(0, '0.552')] +[2024-06-18 09:37:14,877][12883] Updated weights for policy 0, policy_version 106283 (0.0038) +[2024-06-18 09:37:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42487.4). Total num frames: 1741438976. Throughput: 0: 42438.2. Samples: 1741506600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 09:37:17,000][12645] Avg episode reward: [(0, '0.552')] +[2024-06-18 09:37:18,216][12883] Updated weights for policy 0, policy_version 106293 (0.0028) +[2024-06-18 09:37:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1741635584. Throughput: 0: 42383.0. Samples: 1741759120. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 09:37:21,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 09:37:22,493][12883] Updated weights for policy 0, policy_version 106303 (0.0039) +[2024-06-18 09:37:25,726][12883] Updated weights for policy 0, policy_version 106313 (0.0035) +[2024-06-18 09:37:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42487.6). Total num frames: 1741864960. Throughput: 0: 42560.0. Samples: 1742017240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 09:37:26,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 09:37:29,973][12883] Updated weights for policy 0, policy_version 106323 (0.0037) +[2024-06-18 09:37:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1742077952. Throughput: 0: 42511.0. Samples: 1742146540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 09:37:31,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 09:37:33,385][12883] Updated weights for policy 0, policy_version 106333 (0.0039) +[2024-06-18 09:37:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1742274560. Throughput: 0: 42577.8. Samples: 1742407520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 09:37:36,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 09:37:37,541][12883] Updated weights for policy 0, policy_version 106343 (0.0031) +[2024-06-18 09:37:41,292][12883] Updated weights for policy 0, policy_version 106353 (0.0033) +[2024-06-18 09:37:41,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1742503936. Throughput: 0: 42551.2. Samples: 1742652340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 09:37:41,994][12645] Avg episode reward: [(0, '0.622')] +[2024-06-18 09:37:45,508][12883] Updated weights for policy 0, policy_version 106363 (0.0024) +[2024-06-18 09:37:46,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42323.7, 300 sec: 42375.9). Total num frames: 1742700544. Throughput: 0: 42513.0. Samples: 1742783660. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) +[2024-06-18 09:37:46,996][12645] Avg episode reward: [(0, '0.722')] +[2024-06-18 09:37:48,839][12883] Updated weights for policy 0, policy_version 106373 (0.0026) +[2024-06-18 09:37:51,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1742913536. Throughput: 0: 42517.7. Samples: 1743037240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) +[2024-06-18 09:37:51,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 09:37:53,077][12883] Updated weights for policy 0, policy_version 106383 (0.0029) +[2024-06-18 09:37:56,894][12883] Updated weights for policy 0, policy_version 106393 (0.0032) +[2024-06-18 09:37:56,994][12645] Fps is (10 sec: 44246.9, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 1743142912. Throughput: 0: 42609.9. Samples: 1743291240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) +[2024-06-18 09:37:56,994][12645] Avg episode reward: [(0, '0.533')] +[2024-06-18 09:38:00,661][12883] Updated weights for policy 0, policy_version 106403 (0.0035) +[2024-06-18 09:38:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42487.6). Total num frames: 1743355904. Throughput: 0: 42551.6. Samples: 1743421420. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) +[2024-06-18 09:38:01,994][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 09:38:04,292][12862] Signal inference workers to stop experience collection... (25450 times) +[2024-06-18 09:38:04,293][12862] Signal inference workers to resume experience collection... 
(25450 times) +[2024-06-18 09:38:04,309][12883] InferenceWorker_p0-w0: stopping experience collection (25450 times) +[2024-06-18 09:38:04,309][12883] InferenceWorker_p0-w0: resuming experience collection (25450 times) +[2024-06-18 09:38:04,670][12883] Updated weights for policy 0, policy_version 106413 (0.0023) +[2024-06-18 09:38:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1743568896. Throughput: 0: 42560.6. Samples: 1743674340. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) +[2024-06-18 09:38:06,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 09:38:08,255][12883] Updated weights for policy 0, policy_version 106423 (0.0030) +[2024-06-18 09:38:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1743765504. Throughput: 0: 42596.9. Samples: 1743934100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) +[2024-06-18 09:38:11,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 09:38:12,226][12883] Updated weights for policy 0, policy_version 106433 (0.0030) +[2024-06-18 09:38:16,177][12883] Updated weights for policy 0, policy_version 106443 (0.0028) +[2024-06-18 09:38:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1743994880. Throughput: 0: 42514.7. Samples: 1744059700. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) +[2024-06-18 09:38:16,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 09:38:19,954][12883] Updated weights for policy 0, policy_version 106453 (0.0032) +[2024-06-18 09:38:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1744207872. Throughput: 0: 42407.0. Samples: 1744315840. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) +[2024-06-18 09:38:21,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 09:38:23,831][12883] Updated weights for policy 0, policy_version 106463 (0.0033) +[2024-06-18 09:38:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1744420864. Throughput: 0: 42727.4. Samples: 1744575080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) +[2024-06-18 09:38:26,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 09:38:27,581][12883] Updated weights for policy 0, policy_version 106473 (0.0038) +[2024-06-18 09:38:31,638][12883] Updated weights for policy 0, policy_version 106483 (0.0034) +[2024-06-18 09:38:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1744650240. Throughput: 0: 42627.1. Samples: 1744701780. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) +[2024-06-18 09:38:31,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 09:38:35,293][12883] Updated weights for policy 0, policy_version 106493 (0.0048) +[2024-06-18 09:38:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1744830464. Throughput: 0: 42550.3. Samples: 1744952000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) +[2024-06-18 09:38:36,994][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 09:38:37,117][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000106497_1744846848.pth... +[2024-06-18 09:38:37,185][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000105875_1734656000.pth +[2024-06-18 09:38:39,255][12883] Updated weights for policy 0, policy_version 106503 (0.0039) +[2024-06-18 09:38:41,994][12645] Fps is (10 sec: 39319.3, 60 sec: 42324.9, 300 sec: 42431.7). Total num frames: 1745043456. Throughput: 0: 42640.8. Samples: 1745210100. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) +[2024-06-18 09:38:41,995][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 09:38:43,064][12883] Updated weights for policy 0, policy_version 106513 (0.0037) +[2024-06-18 09:38:46,940][12883] Updated weights for policy 0, policy_version 106523 (0.0044) +[2024-06-18 09:38:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42873.0, 300 sec: 42487.3). Total num frames: 1745272832. Throughput: 0: 42597.8. Samples: 1745338320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 09:38:46,994][12645] Avg episode reward: [(0, '0.581')] +[2024-06-18 09:38:50,889][12883] Updated weights for policy 0, policy_version 106533 (0.0038) +[2024-06-18 09:38:51,994][12645] Fps is (10 sec: 42600.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1745469440. Throughput: 0: 42610.6. Samples: 1745591820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 09:38:51,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 09:38:54,547][12883] Updated weights for policy 0, policy_version 106543 (0.0036) +[2024-06-18 09:38:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1745682432. Throughput: 0: 42450.1. Samples: 1745844360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 09:38:56,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 09:38:58,729][12883] Updated weights for policy 0, policy_version 106553 (0.0031) +[2024-06-18 09:39:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1745911808. Throughput: 0: 42513.0. Samples: 1745972780. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 09:39:01,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 09:39:02,160][12883] Updated weights for policy 0, policy_version 106563 (0.0031) +[2024-06-18 09:39:06,578][12883] Updated weights for policy 0, policy_version 106573 (0.0023) +[2024-06-18 09:39:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 1746092032. Throughput: 0: 42423.4. Samples: 1746224900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 09:39:06,995][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 09:39:10,042][12883] Updated weights for policy 0, policy_version 106583 (0.0042) +[2024-06-18 09:39:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1746321408. Throughput: 0: 42242.2. Samples: 1746475980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 09:39:11,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 09:39:14,614][12883] Updated weights for policy 0, policy_version 106593 (0.0044) +[2024-06-18 09:39:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1746534400. Throughput: 0: 42388.8. Samples: 1746609280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 09:39:17,003][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 09:39:18,162][12883] Updated weights for policy 0, policy_version 106603 (0.0036) +[2024-06-18 09:39:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 1746714624. Throughput: 0: 42439.9. Samples: 1746861800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 09:39:21,995][12645] Avg episode reward: [(0, '0.113')] +[2024-06-18 09:39:22,449][12883] Updated weights for policy 0, policy_version 106613 (0.0030) +[2024-06-18 09:39:22,538][12862] Signal inference workers to stop experience collection... 
(25500 times) +[2024-06-18 09:39:22,596][12883] InferenceWorker_p0-w0: stopping experience collection (25500 times) +[2024-06-18 09:39:22,653][12862] Signal inference workers to resume experience collection... (25500 times) +[2024-06-18 09:39:22,653][12883] InferenceWorker_p0-w0: resuming experience collection (25500 times) +[2024-06-18 09:39:25,698][12883] Updated weights for policy 0, policy_version 106623 (0.0043) +[2024-06-18 09:39:26,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1746960384. Throughput: 0: 42181.5. Samples: 1747108240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 09:39:26,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 09:39:30,184][12883] Updated weights for policy 0, policy_version 106633 (0.0045) +[2024-06-18 09:39:31,994][12645] Fps is (10 sec: 44237.7, 60 sec: 41779.2, 300 sec: 42376.3). Total num frames: 1747156992. Throughput: 0: 42499.3. Samples: 1747250780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 09:39:31,994][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 09:39:33,327][12883] Updated weights for policy 0, policy_version 106643 (0.0038) +[2024-06-18 09:39:36,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1747353600. Throughput: 0: 42195.0. Samples: 1747490600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 09:39:36,994][12645] Avg episode reward: [(0, '0.397')] +[2024-06-18 09:39:37,856][12883] Updated weights for policy 0, policy_version 106653 (0.0051) +[2024-06-18 09:39:40,890][12883] Updated weights for policy 0, policy_version 106663 (0.0038) +[2024-06-18 09:39:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.8, 300 sec: 42431.8). Total num frames: 1747599360. Throughput: 0: 42133.0. Samples: 1747740340. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 09:39:41,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 09:39:45,434][12883] Updated weights for policy 0, policy_version 106673 (0.0031) +[2024-06-18 09:39:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 1747795968. Throughput: 0: 42401.3. Samples: 1747880840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 09:39:46,994][12645] Avg episode reward: [(0, '0.149')] +[2024-06-18 09:39:48,500][12883] Updated weights for policy 0, policy_version 106683 (0.0031) +[2024-06-18 09:39:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1748008960. Throughput: 0: 42390.8. Samples: 1748132480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 09:39:51,994][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 09:39:53,067][12883] Updated weights for policy 0, policy_version 106693 (0.0036) +[2024-06-18 09:39:55,966][12883] Updated weights for policy 0, policy_version 106703 (0.0033) +[2024-06-18 09:39:56,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1748254720. Throughput: 0: 42397.3. Samples: 1748383860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 09:39:56,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 09:40:00,861][12883] Updated weights for policy 0, policy_version 106713 (0.0038) +[2024-06-18 09:40:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1748451328. Throughput: 0: 42499.2. Samples: 1748521740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 09:40:01,994][12645] Avg episode reward: [(0, '0.198')] +[2024-06-18 09:40:03,753][12883] Updated weights for policy 0, policy_version 106723 (0.0042) +[2024-06-18 09:40:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1748664320. Throughput: 0: 42546.8. 
Samples: 1748776400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 09:40:06,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 09:40:08,506][12883] Updated weights for policy 0, policy_version 106733 (0.0036) +[2024-06-18 09:40:11,365][12883] Updated weights for policy 0, policy_version 106743 (0.0030) +[2024-06-18 09:40:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1748893696. Throughput: 0: 42737.8. Samples: 1749031440. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 09:40:11,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 09:40:16,262][12883] Updated weights for policy 0, policy_version 106753 (0.0028) +[2024-06-18 09:40:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1749073920. Throughput: 0: 42497.7. Samples: 1749163180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 09:40:16,994][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 09:40:18,933][12883] Updated weights for policy 0, policy_version 106763 (0.0026) +[2024-06-18 09:40:21,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1749286912. Throughput: 0: 42877.4. Samples: 1749420080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 09:40:22,004][12645] Avg episode reward: [(0, '0.526')] +[2024-06-18 09:40:23,895][12883] Updated weights for policy 0, policy_version 106773 (0.0024) +[2024-06-18 09:40:26,446][12883] Updated weights for policy 0, policy_version 106783 (0.0033) +[2024-06-18 09:40:26,994][12645] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 1749549056. Throughput: 0: 42949.9. Samples: 1749673080. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 09:40:26,994][12645] Avg episode reward: [(0, '0.526')] +[2024-06-18 09:40:31,275][12883] Updated weights for policy 0, policy_version 106793 (0.0032) +[2024-06-18 09:40:32,000][12645] Fps is (10 sec: 44209.3, 60 sec: 42866.9, 300 sec: 42708.6). Total num frames: 1749729280. Throughput: 0: 43003.8. Samples: 1749816280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 09:40:32,001][12645] Avg episode reward: [(0, '0.526')] +[2024-06-18 09:40:34,074][12883] Updated weights for policy 0, policy_version 106803 (0.0039) +[2024-06-18 09:40:36,994][12645] Fps is (10 sec: 39321.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1749942272. Throughput: 0: 43011.5. Samples: 1750068000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 09:40:36,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 09:40:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000106808_1749942272.pth... +[2024-06-18 09:40:37,099][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000106185_1739735040.pth +[2024-06-18 09:40:38,698][12862] Signal inference workers to stop experience collection... (25550 times) +[2024-06-18 09:40:38,698][12862] Signal inference workers to resume experience collection... (25550 times) +[2024-06-18 09:40:38,734][12883] InferenceWorker_p0-w0: stopping experience collection (25550 times) +[2024-06-18 09:40:38,734][12883] InferenceWorker_p0-w0: resuming experience collection (25550 times) +[2024-06-18 09:40:38,852][12883] Updated weights for policy 0, policy_version 106813 (0.0039) +[2024-06-18 09:40:41,614][12883] Updated weights for policy 0, policy_version 106823 (0.0026) +[2024-06-18 09:40:41,994][12645] Fps is (10 sec: 47543.5, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 1750204416. Throughput: 0: 43071.2. Samples: 1750322060. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) +[2024-06-18 09:40:41,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 09:40:46,576][12883] Updated weights for policy 0, policy_version 106833 (0.0041) +[2024-06-18 09:40:46,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1750368256. Throughput: 0: 42968.6. Samples: 1750455320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 09:40:46,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 09:40:49,344][12883] Updated weights for policy 0, policy_version 106843 (0.0032) +[2024-06-18 09:40:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1750597632. Throughput: 0: 42948.0. Samples: 1750709060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 09:40:51,994][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 09:40:54,136][12883] Updated weights for policy 0, policy_version 106853 (0.0046) +[2024-06-18 09:40:56,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1750827008. Throughput: 0: 43000.3. Samples: 1750966460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 09:40:56,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 09:40:57,310][12883] Updated weights for policy 0, policy_version 106863 (0.0033) +[2024-06-18 09:41:01,583][12883] Updated weights for policy 0, policy_version 106873 (0.0046) +[2024-06-18 09:41:01,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1751007232. Throughput: 0: 43042.1. Samples: 1751100080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 09:41:01,995][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 09:41:04,865][12883] Updated weights for policy 0, policy_version 106883 (0.0038) +[2024-06-18 09:41:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1751236608. Throughput: 0: 42946.6. 
Samples: 1751352680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 09:41:06,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 09:41:09,075][12883] Updated weights for policy 0, policy_version 106893 (0.0033) +[2024-06-18 09:41:11,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1751465984. Throughput: 0: 43130.1. Samples: 1751613940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 09:41:11,994][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 09:41:12,361][12883] Updated weights for policy 0, policy_version 106903 (0.0026) +[2024-06-18 09:41:16,587][12883] Updated weights for policy 0, policy_version 106913 (0.0025) +[2024-06-18 09:41:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1751662592. Throughput: 0: 42927.3. Samples: 1751747740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 09:41:16,994][12645] Avg episode reward: [(0, '0.533')] +[2024-06-18 09:41:19,874][12883] Updated weights for policy 0, policy_version 106923 (0.0026) +[2024-06-18 09:41:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 1751875584. Throughput: 0: 43041.8. Samples: 1752004880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 09:41:21,994][12645] Avg episode reward: [(0, '0.571')] +[2024-06-18 09:41:24,430][12883] Updated weights for policy 0, policy_version 106933 (0.0034) +[2024-06-18 09:41:26,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1752121344. Throughput: 0: 43072.4. Samples: 1752260320. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 09:41:26,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 09:41:27,560][12883] Updated weights for policy 0, policy_version 106943 (0.0036) +[2024-06-18 09:41:31,974][12883] Updated weights for policy 0, policy_version 106953 (0.0027) +[2024-06-18 09:41:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43148.9, 300 sec: 42653.9). Total num frames: 1752317952. Throughput: 0: 43053.1. Samples: 1752392720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 09:41:31,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 09:41:35,287][12883] Updated weights for policy 0, policy_version 106963 (0.0045) +[2024-06-18 09:41:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1752530944. Throughput: 0: 43055.1. Samples: 1752646540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 09:41:36,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 09:41:39,637][12883] Updated weights for policy 0, policy_version 106973 (0.0032) +[2024-06-18 09:41:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1752760320. Throughput: 0: 42952.8. Samples: 1752899340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 09:41:41,994][12645] Avg episode reward: [(0, '0.574')] +[2024-06-18 09:41:43,208][12883] Updated weights for policy 0, policy_version 106983 (0.0033) +[2024-06-18 09:41:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1752940544. Throughput: 0: 43003.2. Samples: 1753035220. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) +[2024-06-18 09:41:46,994][12645] Avg episode reward: [(0, '0.537')] +[2024-06-18 09:41:47,247][12883] Updated weights for policy 0, policy_version 106993 (0.0031) +[2024-06-18 09:41:50,711][12883] Updated weights for policy 0, policy_version 107003 (0.0034) +[2024-06-18 09:41:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1753169920. Throughput: 0: 42963.2. Samples: 1753286020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 09:41:51,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 09:41:54,940][12883] Updated weights for policy 0, policy_version 107013 (0.0033) +[2024-06-18 09:41:56,994][12645] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1753415680. Throughput: 0: 42855.0. Samples: 1753542420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 09:41:56,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 09:41:58,162][12883] Updated weights for policy 0, policy_version 107023 (0.0032) +[2024-06-18 09:42:01,994][12645] Fps is (10 sec: 42599.1, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 1753595904. Throughput: 0: 42869.8. Samples: 1753676880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 09:42:01,994][12645] Avg episode reward: [(0, '0.561')] +[2024-06-18 09:42:02,524][12883] Updated weights for policy 0, policy_version 107033 (0.0031) +[2024-06-18 09:42:03,979][12862] Signal inference workers to stop experience collection... (25600 times) +[2024-06-18 09:42:03,980][12862] Signal inference workers to resume experience collection... 
(25600 times) +[2024-06-18 09:42:04,004][12883] InferenceWorker_p0-w0: stopping experience collection (25600 times) +[2024-06-18 09:42:04,004][12883] InferenceWorker_p0-w0: resuming experience collection (25600 times) +[2024-06-18 09:42:05,958][12883] Updated weights for policy 0, policy_version 107043 (0.0034) +[2024-06-18 09:42:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1753808896. Throughput: 0: 42689.3. Samples: 1753925900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 09:42:06,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 09:42:10,402][12883] Updated weights for policy 0, policy_version 107053 (0.0031) +[2024-06-18 09:42:11,998][12645] Fps is (10 sec: 44216.4, 60 sec: 42868.2, 300 sec: 42708.8). Total num frames: 1754038272. Throughput: 0: 42674.3. Samples: 1754180860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 09:42:11,999][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 09:42:13,545][12883] Updated weights for policy 0, policy_version 107063 (0.0033) +[2024-06-18 09:42:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1754234880. Throughput: 0: 42727.3. Samples: 1754315440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 09:42:16,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 09:42:17,897][12883] Updated weights for policy 0, policy_version 107073 (0.0039) +[2024-06-18 09:42:21,198][12883] Updated weights for policy 0, policy_version 107083 (0.0039) +[2024-06-18 09:42:21,994][12645] Fps is (10 sec: 40978.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1754447872. Throughput: 0: 42807.6. Samples: 1754572880. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 09:42:21,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 09:42:25,495][12883] Updated weights for policy 0, policy_version 107093 (0.0028) +[2024-06-18 09:42:26,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1754693632. Throughput: 0: 42749.4. Samples: 1754823060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 09:42:26,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 09:42:28,823][12883] Updated weights for policy 0, policy_version 107103 (0.0028) +[2024-06-18 09:42:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1754857472. Throughput: 0: 42742.3. Samples: 1754958620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 09:42:31,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 09:42:33,162][12883] Updated weights for policy 0, policy_version 107113 (0.0046) +[2024-06-18 09:42:36,548][12883] Updated weights for policy 0, policy_version 107123 (0.0030) +[2024-06-18 09:42:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1755103232. Throughput: 0: 42826.8. Samples: 1755213220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 09:42:36,994][12645] Avg episode reward: [(0, '0.213')] +[2024-06-18 09:42:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000107123_1755103232.pth... +[2024-06-18 09:42:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000106497_1744846848.pth +[2024-06-18 09:42:40,950][12883] Updated weights for policy 0, policy_version 107133 (0.0040) +[2024-06-18 09:42:41,994][12645] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 1755332608. Throughput: 0: 42730.6. Samples: 1755465300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 09:42:41,994][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 09:42:44,235][12883] Updated weights for policy 0, policy_version 107143 (0.0031) +[2024-06-18 09:42:46,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 1755512832. Throughput: 0: 42603.1. Samples: 1755594120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 09:42:46,996][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 09:42:48,776][12883] Updated weights for policy 0, policy_version 107153 (0.0031) +[2024-06-18 09:42:51,994][12645] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1755758592. Throughput: 0: 42814.3. Samples: 1755852540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:42:51,995][12645] Avg episode reward: [(0, '0.486')] +[2024-06-18 09:42:51,995][12883] Updated weights for policy 0, policy_version 107163 (0.0045) +[2024-06-18 09:42:56,260][12883] Updated weights for policy 0, policy_version 107173 (0.0045) +[2024-06-18 09:42:56,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1755971584. Throughput: 0: 42848.3. Samples: 1756108840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:42:56,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 09:42:59,634][12883] Updated weights for policy 0, policy_version 107183 (0.0028) +[2024-06-18 09:43:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1756168192. Throughput: 0: 42739.9. Samples: 1756238740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:43:01,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 09:43:03,749][12883] Updated weights for policy 0, policy_version 107193 (0.0038) +[2024-06-18 09:43:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1756381184. Throughput: 0: 42659.9. 
Samples: 1756492580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:43:06,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 09:43:07,311][12883] Updated weights for policy 0, policy_version 107203 (0.0029) +[2024-06-18 09:43:11,561][12883] Updated weights for policy 0, policy_version 107213 (0.0035) +[2024-06-18 09:43:11,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42600.0, 300 sec: 42709.2). Total num frames: 1756594176. Throughput: 0: 42745.8. Samples: 1756746720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:43:11,997][12645] Avg episode reward: [(0, '0.687')] +[2024-06-18 09:43:15,294][12883] Updated weights for policy 0, policy_version 107223 (0.0027) +[2024-06-18 09:43:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1756790784. Throughput: 0: 42502.7. Samples: 1756871240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:43:16,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 09:43:19,057][12883] Updated weights for policy 0, policy_version 107233 (0.0041) +[2024-06-18 09:43:21,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1757020160. Throughput: 0: 42502.1. Samples: 1757125820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:43:21,994][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 09:43:23,169][12883] Updated weights for policy 0, policy_version 107243 (0.0044) +[2024-06-18 09:43:26,851][12883] Updated weights for policy 0, policy_version 107253 (0.0040) +[2024-06-18 09:43:26,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1757249536. Throughput: 0: 42515.6. Samples: 1757378500. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:43:26,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 09:43:31,043][12883] Updated weights for policy 0, policy_version 107263 (0.0031) +[2024-06-18 09:43:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1757429760. Throughput: 0: 42452.8. Samples: 1757504400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:43:31,994][12645] Avg episode reward: [(0, '0.697')] +[2024-06-18 09:43:32,917][12862] Signal inference workers to stop experience collection... (25650 times) +[2024-06-18 09:43:32,917][12862] Signal inference workers to resume experience collection... (25650 times) +[2024-06-18 09:43:32,937][12883] InferenceWorker_p0-w0: stopping experience collection (25650 times) +[2024-06-18 09:43:32,965][12883] InferenceWorker_p0-w0: resuming experience collection (25650 times) +[2024-06-18 09:43:34,563][12883] Updated weights for policy 0, policy_version 107273 (0.0028) +[2024-06-18 09:43:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42709.6). Total num frames: 1757642752. Throughput: 0: 42368.0. Samples: 1757759100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:43:36,994][12645] Avg episode reward: [(0, '0.697')] +[2024-06-18 09:43:38,760][12883] Updated weights for policy 0, policy_version 107283 (0.0039) +[2024-06-18 09:43:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1757872128. Throughput: 0: 42521.3. Samples: 1758022300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:43:41,994][12645] Avg episode reward: [(0, '0.699')] +[2024-06-18 09:43:42,244][12883] Updated weights for policy 0, policy_version 107293 (0.0038) +[2024-06-18 09:43:46,464][12883] Updated weights for policy 0, policy_version 107303 (0.0027) +[2024-06-18 09:43:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1758068736. 
Throughput: 0: 42486.2. Samples: 1758150620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 09:43:46,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 09:43:49,822][12883] Updated weights for policy 0, policy_version 107313 (0.0038) +[2024-06-18 09:43:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1758281728. Throughput: 0: 42355.5. Samples: 1758398580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 09:43:51,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 09:43:54,209][12883] Updated weights for policy 0, policy_version 107323 (0.0023) +[2024-06-18 09:43:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1758511104. Throughput: 0: 42328.8. Samples: 1758651420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 09:43:56,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 09:43:57,506][12883] Updated weights for policy 0, policy_version 107333 (0.0024) +[2024-06-18 09:44:01,996][12645] Fps is (10 sec: 40951.1, 60 sec: 42050.7, 300 sec: 42709.2). Total num frames: 1758691328. Throughput: 0: 42455.7. Samples: 1758781840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 09:44:01,996][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 09:44:02,023][12883] Updated weights for policy 0, policy_version 107343 (0.0041) +[2024-06-18 09:44:05,609][12883] Updated weights for policy 0, policy_version 107353 (0.0048) +[2024-06-18 09:44:06,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42323.8, 300 sec: 42709.2). Total num frames: 1758920704. Throughput: 0: 42530.4. Samples: 1759039780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 09:44:06,996][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 09:44:09,581][12883] Updated weights for policy 0, policy_version 107363 (0.0046) +[2024-06-18 09:44:11,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42326.9, 300 sec: 42709.5). 
Total num frames: 1759133696. Throughput: 0: 42516.9. Samples: 1759291760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 09:44:11,994][12645] Avg episode reward: [(0, '0.503')] +[2024-06-18 09:44:13,116][12883] Updated weights for policy 0, policy_version 107373 (0.0036) +[2024-06-18 09:44:16,994][12645] Fps is (10 sec: 40969.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1759330304. Throughput: 0: 42576.1. Samples: 1759420320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 09:44:16,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 09:44:17,362][12883] Updated weights for policy 0, policy_version 107383 (0.0040) +[2024-06-18 09:44:20,711][12883] Updated weights for policy 0, policy_version 107393 (0.0038) +[2024-06-18 09:44:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1759576064. Throughput: 0: 42657.3. Samples: 1759678680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 09:44:21,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 09:44:24,907][12883] Updated weights for policy 0, policy_version 107403 (0.0032) +[2024-06-18 09:44:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 1759772672. Throughput: 0: 42521.0. Samples: 1759935740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 09:44:26,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 09:44:28,243][12883] Updated weights for policy 0, policy_version 107413 (0.0038) +[2024-06-18 09:44:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1759969280. Throughput: 0: 42465.4. Samples: 1760061560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 09:44:31,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 09:44:32,532][12883] Updated weights for policy 0, policy_version 107423 (0.0037) +[2024-06-18 09:44:35,987][12883] Updated weights for policy 0, policy_version 107433 (0.0039) +[2024-06-18 09:44:36,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1760231424. Throughput: 0: 42752.0. Samples: 1760322420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 09:44:36,994][12645] Avg episode reward: [(0, '0.535')] +[2024-06-18 09:44:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000107436_1760231424.pth... +[2024-06-18 09:44:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000106808_1749942272.pth +[2024-06-18 09:44:40,331][12883] Updated weights for policy 0, policy_version 107443 (0.0027) +[2024-06-18 09:44:41,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1760428032. Throughput: 0: 42793.3. Samples: 1760577120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 09:44:41,994][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 09:44:43,647][12883] Updated weights for policy 0, policy_version 107453 (0.0034) +[2024-06-18 09:44:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1760624640. Throughput: 0: 42683.8. Samples: 1760702520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 09:44:46,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 09:44:48,126][12883] Updated weights for policy 0, policy_version 107463 (0.0038) +[2024-06-18 09:44:51,368][12883] Updated weights for policy 0, policy_version 107473 (0.0032) +[2024-06-18 09:44:51,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1760854016. Throughput: 0: 42755.6. Samples: 1760963680. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:44:51,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 09:44:56,043][12883] Updated weights for policy 0, policy_version 107483 (0.0027) +[2024-06-18 09:44:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1761050624. Throughput: 0: 42748.9. Samples: 1761215460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:44:56,994][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 09:44:59,134][12883] Updated weights for policy 0, policy_version 107493 (0.0047) +[2024-06-18 09:45:01,994][12645] Fps is (10 sec: 40957.4, 60 sec: 42872.7, 300 sec: 42709.4). Total num frames: 1761263616. Throughput: 0: 42639.9. Samples: 1761339140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:45:01,995][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 09:45:03,799][12862] Signal inference workers to stop experience collection... (25700 times) +[2024-06-18 09:45:03,857][12883] InferenceWorker_p0-w0: stopping experience collection (25700 times) +[2024-06-18 09:45:03,864][12862] Signal inference workers to resume experience collection... (25700 times) +[2024-06-18 09:45:03,880][12883] InferenceWorker_p0-w0: resuming experience collection (25700 times) +[2024-06-18 09:45:03,883][12883] Updated weights for policy 0, policy_version 107503 (0.0036) +[2024-06-18 09:45:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 1761476608. Throughput: 0: 42473.5. Samples: 1761589980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:45:06,994][12645] Avg episode reward: [(0, '0.620')] +[2024-06-18 09:45:07,092][12883] Updated weights for policy 0, policy_version 107513 (0.0024) +[2024-06-18 09:45:11,767][12883] Updated weights for policy 0, policy_version 107523 (0.0037) +[2024-06-18 09:45:11,994][12645] Fps is (10 sec: 40961.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1761673216. 
Throughput: 0: 42649.6. Samples: 1761854980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:45:11,994][12645] Avg episode reward: [(0, '0.635')] +[2024-06-18 09:45:14,616][12883] Updated weights for policy 0, policy_version 107533 (0.0036) +[2024-06-18 09:45:16,998][12645] Fps is (10 sec: 42579.8, 60 sec: 42868.4, 300 sec: 42764.4). Total num frames: 1761902592. Throughput: 0: 42517.3. Samples: 1761975020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:45:16,999][12645] Avg episode reward: [(0, '0.741')] +[2024-06-18 09:45:19,351][12883] Updated weights for policy 0, policy_version 107543 (0.0033) +[2024-06-18 09:45:21,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1762131968. Throughput: 0: 42411.6. Samples: 1762230940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:45:21,994][12645] Avg episode reward: [(0, '0.720')] +[2024-06-18 09:45:22,459][12883] Updated weights for policy 0, policy_version 107553 (0.0031) +[2024-06-18 09:45:26,868][12883] Updated weights for policy 0, policy_version 107563 (0.0039) +[2024-06-18 09:45:26,994][12645] Fps is (10 sec: 40977.5, 60 sec: 42325.3, 300 sec: 42654.8). Total num frames: 1762312192. Throughput: 0: 42607.6. Samples: 1762494460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:45:26,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 09:45:30,057][12883] Updated weights for policy 0, policy_version 107573 (0.0027) +[2024-06-18 09:45:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1762525184. Throughput: 0: 42464.5. Samples: 1762613420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:45:31,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 09:45:34,356][12883] Updated weights for policy 0, policy_version 107583 (0.0037) +[2024-06-18 09:45:36,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42598.4). 
Total num frames: 1762770944. Throughput: 0: 42442.0. Samples: 1762873580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:45:36,994][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 09:45:37,594][12883] Updated weights for policy 0, policy_version 107593 (0.0034) +[2024-06-18 09:45:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1762951168. Throughput: 0: 42580.4. Samples: 1763131580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:45:41,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 09:45:42,453][12883] Updated weights for policy 0, policy_version 107603 (0.0049) +[2024-06-18 09:45:45,093][12883] Updated weights for policy 0, policy_version 107613 (0.0042) +[2024-06-18 09:45:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1763180544. Throughput: 0: 42492.0. Samples: 1763251260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 09:45:46,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 09:45:49,980][12883] Updated weights for policy 0, policy_version 107623 (0.0032) +[2024-06-18 09:45:51,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1763409920. Throughput: 0: 42808.7. Samples: 1763516380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 09:45:51,994][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 09:45:52,625][12883] Updated weights for policy 0, policy_version 107633 (0.0031) +[2024-06-18 09:45:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1763590144. Throughput: 0: 42728.0. Samples: 1763777740. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 09:45:56,994][12645] Avg episode reward: [(0, '0.296')] +[2024-06-18 09:45:57,463][12883] Updated weights for policy 0, policy_version 107643 (0.0032) +[2024-06-18 09:46:00,412][12883] Updated weights for policy 0, policy_version 107653 (0.0033) +[2024-06-18 09:46:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.8, 300 sec: 42709.5). Total num frames: 1763835904. Throughput: 0: 42741.3. Samples: 1763898200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 09:46:01,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 09:46:04,961][12883] Updated weights for policy 0, policy_version 107663 (0.0032) +[2024-06-18 09:46:06,994][12645] Fps is (10 sec: 47514.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1764065280. Throughput: 0: 42924.5. Samples: 1764162540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 09:46:06,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 09:46:08,122][12883] Updated weights for policy 0, policy_version 107673 (0.0030) +[2024-06-18 09:46:11,996][12645] Fps is (10 sec: 42589.2, 60 sec: 43143.0, 300 sec: 42709.2). Total num frames: 1764261888. Throughput: 0: 42769.4. Samples: 1764419180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 09:46:11,996][12645] Avg episode reward: [(0, '0.545')] +[2024-06-18 09:46:12,715][12883] Updated weights for policy 0, policy_version 107683 (0.0038) +[2024-06-18 09:46:14,875][12862] Signal inference workers to stop experience collection... (25750 times) +[2024-06-18 09:46:14,875][12862] Signal inference workers to resume experience collection... 
(25750 times) +[2024-06-18 09:46:14,908][12883] InferenceWorker_p0-w0: stopping experience collection (25750 times) +[2024-06-18 09:46:14,908][12883] InferenceWorker_p0-w0: resuming experience collection (25750 times) +[2024-06-18 09:46:15,580][12883] Updated weights for policy 0, policy_version 107693 (0.0034) +[2024-06-18 09:46:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42874.5, 300 sec: 42709.5). Total num frames: 1764474880. Throughput: 0: 43012.4. Samples: 1764548980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 09:46:16,996][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 09:46:20,110][12883] Updated weights for policy 0, policy_version 107703 (0.0034) +[2024-06-18 09:46:21,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1764687872. Throughput: 0: 42959.6. Samples: 1764806760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 09:46:21,996][12645] Avg episode reward: [(0, '0.596')] +[2024-06-18 09:46:23,215][12883] Updated weights for policy 0, policy_version 107713 (0.0039) +[2024-06-18 09:46:26,995][12645] Fps is (10 sec: 42595.0, 60 sec: 43143.9, 300 sec: 42653.8). Total num frames: 1764900864. Throughput: 0: 42909.8. Samples: 1765062560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 09:46:26,995][12645] Avg episode reward: [(0, '0.604')] +[2024-06-18 09:46:27,630][12883] Updated weights for policy 0, policy_version 107723 (0.0036) +[2024-06-18 09:46:30,977][12883] Updated weights for policy 0, policy_version 107733 (0.0030) +[2024-06-18 09:46:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1765113856. Throughput: 0: 43078.2. Samples: 1765189780. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 09:46:31,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 09:46:35,313][12883] Updated weights for policy 0, policy_version 107743 (0.0025) +[2024-06-18 09:46:36,994][12645] Fps is (10 sec: 44240.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1765343232. Throughput: 0: 42987.2. Samples: 1765450800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 09:46:36,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 09:46:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000107748_1765343232.pth... +[2024-06-18 09:46:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000107123_1755103232.pth +[2024-06-18 09:46:38,497][12883] Updated weights for policy 0, policy_version 107753 (0.0029) +[2024-06-18 09:46:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1765539840. Throughput: 0: 42882.2. Samples: 1765707440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 09:46:41,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 09:46:43,092][12883] Updated weights for policy 0, policy_version 107763 (0.0034) +[2024-06-18 09:46:46,373][12883] Updated weights for policy 0, policy_version 107773 (0.0045) +[2024-06-18 09:46:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1765769216. Throughput: 0: 43050.7. Samples: 1765835480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) +[2024-06-18 09:46:46,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 09:46:50,685][12883] Updated weights for policy 0, policy_version 107783 (0.0036) +[2024-06-18 09:46:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1765965824. Throughput: 0: 42829.3. Samples: 1766089860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 09:46:51,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 09:46:53,891][12883] Updated weights for policy 0, policy_version 107793 (0.0023) +[2024-06-18 09:46:56,994][12645] Fps is (10 sec: 39320.6, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1766162432. Throughput: 0: 42767.7. Samples: 1766343640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 09:46:56,995][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 09:46:58,243][12883] Updated weights for policy 0, policy_version 107803 (0.0023) +[2024-06-18 09:47:01,503][12883] Updated weights for policy 0, policy_version 107813 (0.0042) +[2024-06-18 09:47:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1766408192. Throughput: 0: 42701.4. Samples: 1766470540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 09:47:01,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 09:47:06,058][12883] Updated weights for policy 0, policy_version 107823 (0.0031) +[2024-06-18 09:47:06,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42598.3, 300 sec: 42654.6). Total num frames: 1766621184. Throughput: 0: 42700.8. Samples: 1766728300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 09:47:06,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 09:47:09,055][12883] Updated weights for policy 0, policy_version 107833 (0.0033) +[2024-06-18 09:47:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 1766817792. Throughput: 0: 42699.0. Samples: 1766983980. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 09:47:11,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 09:47:13,652][12883] Updated weights for policy 0, policy_version 107843 (0.0034) +[2024-06-18 09:47:16,949][12883] Updated weights for policy 0, policy_version 107853 (0.0031) +[2024-06-18 09:47:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1767063552. Throughput: 0: 42635.6. Samples: 1767108380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 09:47:16,994][12645] Avg episode reward: [(0, '0.401')] +[2024-06-18 09:47:21,259][12883] Updated weights for policy 0, policy_version 107863 (0.0031) +[2024-06-18 09:47:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1767260160. Throughput: 0: 42716.0. Samples: 1767373020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 09:47:21,994][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 09:47:24,367][12883] Updated weights for policy 0, policy_version 107873 (0.0041) +[2024-06-18 09:47:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42599.0, 300 sec: 42709.5). Total num frames: 1767456768. Throughput: 0: 42634.2. Samples: 1767625980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 09:47:26,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 09:47:29,045][12883] Updated weights for policy 0, policy_version 107883 (0.0044) +[2024-06-18 09:47:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1767702528. Throughput: 0: 42517.7. Samples: 1767748780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 09:47:31,996][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 09:47:32,290][12883] Updated weights for policy 0, policy_version 107893 (0.0031) +[2024-06-18 09:47:36,405][12862] Signal inference workers to stop experience collection... 
(25800 times) +[2024-06-18 09:47:36,405][12862] Signal inference workers to resume experience collection... (25800 times) +[2024-06-18 09:47:36,421][12883] InferenceWorker_p0-w0: stopping experience collection (25800 times) +[2024-06-18 09:47:36,421][12883] InferenceWorker_p0-w0: resuming experience collection (25800 times) +[2024-06-18 09:47:36,547][12883] Updated weights for policy 0, policy_version 107903 (0.0038) +[2024-06-18 09:47:36,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1767899136. Throughput: 0: 42660.0. Samples: 1768009660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 09:47:36,997][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 09:47:40,190][12883] Updated weights for policy 0, policy_version 107913 (0.0034) +[2024-06-18 09:47:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 1768095744. Throughput: 0: 42809.6. Samples: 1768270060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 09:47:41,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 09:47:44,368][12883] Updated weights for policy 0, policy_version 107923 (0.0033) +[2024-06-18 09:47:46,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1768341504. Throughput: 0: 42672.8. Samples: 1768390820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 09:47:47,003][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 09:47:47,795][12883] Updated weights for policy 0, policy_version 107933 (0.0031) +[2024-06-18 09:47:51,976][12883] Updated weights for policy 0, policy_version 107943 (0.0034) +[2024-06-18 09:47:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1768538112. Throughput: 0: 42671.6. Samples: 1768648520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 09:47:51,994][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 09:47:55,503][12883] Updated weights for policy 0, policy_version 107953 (0.0027) +[2024-06-18 09:47:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1768734720. Throughput: 0: 42795.1. Samples: 1768909760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 09:47:56,994][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 09:47:59,693][12883] Updated weights for policy 0, policy_version 107963 (0.0036) +[2024-06-18 09:48:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1768996864. Throughput: 0: 42828.8. Samples: 1769035680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 09:48:01,994][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 09:48:03,066][12883] Updated weights for policy 0, policy_version 107973 (0.0049) +[2024-06-18 09:48:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 1769160704. Throughput: 0: 42566.6. Samples: 1769288520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 09:48:06,996][12645] Avg episode reward: [(0, '0.530')] +[2024-06-18 09:48:07,510][12883] Updated weights for policy 0, policy_version 107983 (0.0031) +[2024-06-18 09:48:10,955][12883] Updated weights for policy 0, policy_version 107993 (0.0027) +[2024-06-18 09:48:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1769390080. Throughput: 0: 42610.2. Samples: 1769543440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 09:48:11,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 09:48:15,163][12883] Updated weights for policy 0, policy_version 108003 (0.0032) +[2024-06-18 09:48:16,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1769619456. Throughput: 0: 42727.2. 
Samples: 1769671500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 09:48:16,994][12645] Avg episode reward: [(0, '0.222')] +[2024-06-18 09:48:18,639][12883] Updated weights for policy 0, policy_version 108013 (0.0035) +[2024-06-18 09:48:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1769799680. Throughput: 0: 42777.3. Samples: 1769934540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 09:48:21,994][12645] Avg episode reward: [(0, '0.222')] +[2024-06-18 09:48:22,665][12883] Updated weights for policy 0, policy_version 108023 (0.0037) +[2024-06-18 09:48:26,260][12883] Updated weights for policy 0, policy_version 108033 (0.0050) +[2024-06-18 09:48:26,994][12645] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1770045440. Throughput: 0: 42597.6. Samples: 1770186960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 09:48:26,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 09:48:30,199][12883] Updated weights for policy 0, policy_version 108043 (0.0028) +[2024-06-18 09:48:31,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1770258432. Throughput: 0: 42899.1. Samples: 1770321280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 09:48:31,994][12645] Avg episode reward: [(0, '0.757')] +[2024-06-18 09:48:33,837][12883] Updated weights for policy 0, policy_version 108053 (0.0031) +[2024-06-18 09:48:36,996][12645] Fps is (10 sec: 40951.3, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 1770455040. Throughput: 0: 42872.1. Samples: 1770577860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 09:48:36,996][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 09:48:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108060_1770455040.pth... 
+[2024-06-18 09:48:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000107436_1760231424.pth +[2024-06-18 09:48:38,127][12883] Updated weights for policy 0, policy_version 108063 (0.0043) +[2024-06-18 09:48:41,560][12883] Updated weights for policy 0, policy_version 108073 (0.0032) +[2024-06-18 09:48:41,995][12645] Fps is (10 sec: 40955.0, 60 sec: 42870.5, 300 sec: 42709.3). Total num frames: 1770668032. Throughput: 0: 42671.8. Samples: 1770830040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 09:48:41,996][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 09:48:45,779][12883] Updated weights for policy 0, policy_version 108083 (0.0039) +[2024-06-18 09:48:46,994][12645] Fps is (10 sec: 44246.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1770897408. Throughput: 0: 42709.4. Samples: 1770957600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 09:48:46,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 09:48:49,240][12883] Updated weights for policy 0, policy_version 108093 (0.0028) +[2024-06-18 09:48:51,994][12645] Fps is (10 sec: 44242.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1771110400. Throughput: 0: 42827.2. Samples: 1771215740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 09:48:51,994][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 09:48:53,785][12883] Updated weights for policy 0, policy_version 108103 (0.0025) +[2024-06-18 09:48:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 1771290624. Throughput: 0: 42906.8. Samples: 1771474240. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 09:48:56,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 09:48:57,159][12883] Updated weights for policy 0, policy_version 108113 (0.0031) +[2024-06-18 09:49:01,276][12883] Updated weights for policy 0, policy_version 108123 (0.0035) +[2024-06-18 09:49:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 1771520000. Throughput: 0: 42751.9. Samples: 1771595340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 09:49:01,994][12645] Avg episode reward: [(0, '0.413')] +[2024-06-18 09:49:04,423][12862] Signal inference workers to stop experience collection... (25850 times) +[2024-06-18 09:49:04,424][12862] Signal inference workers to resume experience collection... (25850 times) +[2024-06-18 09:49:04,465][12883] InferenceWorker_p0-w0: stopping experience collection (25850 times) +[2024-06-18 09:49:04,465][12883] InferenceWorker_p0-w0: resuming experience collection (25850 times) +[2024-06-18 09:49:04,701][12883] Updated weights for policy 0, policy_version 108133 (0.0027) +[2024-06-18 09:49:06,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1771749376. Throughput: 0: 42651.5. Samples: 1771853860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 09:49:06,994][12645] Avg episode reward: [(0, '0.677')] +[2024-06-18 09:49:08,922][12883] Updated weights for policy 0, policy_version 108143 (0.0045) +[2024-06-18 09:49:11,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42596.9, 300 sec: 42764.7). Total num frames: 1771945984. Throughput: 0: 42761.0. Samples: 1772111300. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 09:49:11,997][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 09:49:12,270][12883] Updated weights for policy 0, policy_version 108153 (0.0021) +[2024-06-18 09:49:16,515][12883] Updated weights for policy 0, policy_version 108163 (0.0036) +[2024-06-18 09:49:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1772158976. Throughput: 0: 42505.9. Samples: 1772234040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 09:49:16,994][12645] Avg episode reward: [(0, '0.280')] +[2024-06-18 09:49:19,840][12883] Updated weights for policy 0, policy_version 108173 (0.0029) +[2024-06-18 09:49:21,994][12645] Fps is (10 sec: 44247.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1772388352. Throughput: 0: 42480.4. Samples: 1772489380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 09:49:21,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 09:49:24,289][12883] Updated weights for policy 0, policy_version 108183 (0.0037) +[2024-06-18 09:49:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1772584960. Throughput: 0: 42622.2. Samples: 1772747980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 09:49:26,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 09:49:27,928][12883] Updated weights for policy 0, policy_version 108193 (0.0038) +[2024-06-18 09:49:31,860][12883] Updated weights for policy 0, policy_version 108203 (0.0035) +[2024-06-18 09:49:31,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1772797952. Throughput: 0: 42549.3. Samples: 1772872320. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 09:49:31,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 09:49:35,817][12883] Updated weights for policy 0, policy_version 108213 (0.0032) +[2024-06-18 09:49:36,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42873.0, 300 sec: 42709.5). Total num frames: 1773027328. Throughput: 0: 42678.1. Samples: 1773136260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 09:49:36,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 09:49:39,506][12883] Updated weights for policy 0, policy_version 108223 (0.0039) +[2024-06-18 09:49:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42599.3, 300 sec: 42709.5). Total num frames: 1773223936. Throughput: 0: 42510.2. Samples: 1773387200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 09:49:41,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 09:49:43,563][12883] Updated weights for policy 0, policy_version 108233 (0.0024) +[2024-06-18 09:49:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1773436928. Throughput: 0: 42560.9. Samples: 1773510580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 09:49:46,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 09:49:47,212][12883] Updated weights for policy 0, policy_version 108243 (0.0030) +[2024-06-18 09:49:51,284][12883] Updated weights for policy 0, policy_version 108253 (0.0037) +[2024-06-18 09:49:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1773649920. Throughput: 0: 42452.9. Samples: 1773764240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 09:49:51,994][12645] Avg episode reward: [(0, '0.184')] +[2024-06-18 09:49:54,899][12883] Updated weights for policy 0, policy_version 108263 (0.0035) +[2024-06-18 09:49:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 1773846528. Throughput: 0: 42404.8. Samples: 1774019420. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 09:49:56,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 09:49:58,879][12883] Updated weights for policy 0, policy_version 108273 (0.0029) +[2024-06-18 09:50:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1774075904. Throughput: 0: 42414.1. Samples: 1774142680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 09:50:01,994][12645] Avg episode reward: [(0, '0.451')] +[2024-06-18 09:50:02,408][12883] Updated weights for policy 0, policy_version 108283 (0.0038) +[2024-06-18 09:50:06,551][12883] Updated weights for policy 0, policy_version 108293 (0.0036) +[2024-06-18 09:50:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1774288896. Throughput: 0: 42497.5. Samples: 1774401780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 09:50:06,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 09:50:10,249][12883] Updated weights for policy 0, policy_version 108303 (0.0032) +[2024-06-18 09:50:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42326.9, 300 sec: 42654.5). Total num frames: 1774485504. Throughput: 0: 42579.0. Samples: 1774664040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 09:50:11,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 09:50:14,258][12883] Updated weights for policy 0, policy_version 108313 (0.0023) +[2024-06-18 09:50:16,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1774714880. Throughput: 0: 42534.3. Samples: 1774786360. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 09:50:16,994][12645] Avg episode reward: [(0, '0.518')] +[2024-06-18 09:50:17,946][12883] Updated weights for policy 0, policy_version 108323 (0.0037) +[2024-06-18 09:50:21,904][12883] Updated weights for policy 0, policy_version 108333 (0.0039) +[2024-06-18 09:50:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1774927872. Throughput: 0: 42348.5. Samples: 1775041940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 09:50:21,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 09:50:23,321][12862] Signal inference workers to stop experience collection... (25900 times) +[2024-06-18 09:50:23,321][12862] Signal inference workers to resume experience collection... (25900 times) +[2024-06-18 09:50:23,343][12883] InferenceWorker_p0-w0: stopping experience collection (25900 times) +[2024-06-18 09:50:23,343][12883] InferenceWorker_p0-w0: resuming experience collection (25900 times) +[2024-06-18 09:50:26,207][12883] Updated weights for policy 0, policy_version 108343 (0.0022) +[2024-06-18 09:50:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1775124480. Throughput: 0: 42522.6. Samples: 1775300720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 09:50:26,994][12645] Avg episode reward: [(0, '0.602')] +[2024-06-18 09:50:29,700][12883] Updated weights for policy 0, policy_version 108353 (0.0036) +[2024-06-18 09:50:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1775370240. Throughput: 0: 42589.4. Samples: 1775427100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 09:50:31,994][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 09:50:33,811][12883] Updated weights for policy 0, policy_version 108363 (0.0026) +[2024-06-18 09:50:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 1775534080. 
Throughput: 0: 42603.2. Samples: 1775681380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 09:50:36,994][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 09:50:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108371_1775550464.pth... +[2024-06-18 09:50:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000107748_1765343232.pth +[2024-06-18 09:50:37,398][12883] Updated weights for policy 0, policy_version 108373 (0.0027) +[2024-06-18 09:50:41,407][12883] Updated weights for policy 0, policy_version 108383 (0.0029) +[2024-06-18 09:50:41,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1775779840. Throughput: 0: 42637.3. Samples: 1775938100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 09:50:41,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 09:50:44,993][12883] Updated weights for policy 0, policy_version 108393 (0.0045) +[2024-06-18 09:50:46,994][12645] Fps is (10 sec: 49151.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1776025600. Throughput: 0: 42704.0. Samples: 1776064360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 09:50:46,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 09:50:49,106][12883] Updated weights for policy 0, policy_version 108403 (0.0039) +[2024-06-18 09:50:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1776173056. Throughput: 0: 42517.9. Samples: 1776315080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 09:50:51,996][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 09:50:52,869][12883] Updated weights for policy 0, policy_version 108413 (0.0033) +[2024-06-18 09:50:56,851][12883] Updated weights for policy 0, policy_version 108423 (0.0034) +[2024-06-18 09:50:56,997][12645] Fps is (10 sec: 37669.5, 60 sec: 42595.8, 300 sec: 42597.9). Total num frames: 1776402432. Throughput: 0: 42359.7. 
Samples: 1776570380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 09:50:56,998][12645] Avg episode reward: [(0, '0.215')] +[2024-06-18 09:51:00,378][12883] Updated weights for policy 0, policy_version 108433 (0.0031) +[2024-06-18 09:51:01,994][12645] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1776648192. Throughput: 0: 42643.0. Samples: 1776705300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 09:51:01,994][12645] Avg episode reward: [(0, '0.194')] +[2024-06-18 09:51:04,413][12883] Updated weights for policy 0, policy_version 108443 (0.0031) +[2024-06-18 09:51:06,994][12645] Fps is (10 sec: 40975.4, 60 sec: 42052.4, 300 sec: 42543.2). Total num frames: 1776812032. Throughput: 0: 42569.9. Samples: 1776957580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 09:51:06,994][12645] Avg episode reward: [(0, '0.620')] +[2024-06-18 09:51:08,394][12883] Updated weights for policy 0, policy_version 108453 (0.0037) +[2024-06-18 09:51:11,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1777041408. Throughput: 0: 42388.1. Samples: 1777208180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 09:51:11,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 09:51:12,042][12883] Updated weights for policy 0, policy_version 108463 (0.0034) +[2024-06-18 09:51:16,033][12883] Updated weights for policy 0, policy_version 108473 (0.0027) +[2024-06-18 09:51:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1777270784. Throughput: 0: 42508.9. Samples: 1777340000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 09:51:16,994][12645] Avg episode reward: [(0, '0.596')] +[2024-06-18 09:51:19,646][12883] Updated weights for policy 0, policy_version 108483 (0.0030) +[2024-06-18 09:51:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42543.0). Total num frames: 1777451008. 
Throughput: 0: 42435.5. Samples: 1777590980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 09:51:21,994][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 09:51:23,575][12883] Updated weights for policy 0, policy_version 108493 (0.0043) +[2024-06-18 09:51:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1777696768. Throughput: 0: 42310.3. Samples: 1777842060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 09:51:26,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 09:51:27,395][12883] Updated weights for policy 0, policy_version 108503 (0.0035) +[2024-06-18 09:51:31,238][12883] Updated weights for policy 0, policy_version 108513 (0.0046) +[2024-06-18 09:51:31,996][12645] Fps is (10 sec: 45865.0, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 1777909760. Throughput: 0: 42517.6. Samples: 1777977740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 09:51:31,996][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 09:51:34,955][12883] Updated weights for policy 0, policy_version 108523 (0.0029) +[2024-06-18 09:51:36,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 1778089984. Throughput: 0: 42585.9. Samples: 1778231540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 09:51:36,996][12645] Avg episode reward: [(0, '0.412')] +[2024-06-18 09:51:37,525][12862] Signal inference workers to stop experience collection... (25950 times) +[2024-06-18 09:51:37,580][12862] Signal inference workers to resume experience collection... 
(25950 times) +[2024-06-18 09:51:37,581][12883] InferenceWorker_p0-w0: stopping experience collection (25950 times) +[2024-06-18 09:51:37,602][12883] InferenceWorker_p0-w0: resuming experience collection (25950 times) +[2024-06-18 09:51:38,815][12883] Updated weights for policy 0, policy_version 108533 (0.0041) +[2024-06-18 09:51:41,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1778335744. Throughput: 0: 42543.6. Samples: 1778484680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 09:51:41,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 09:51:42,402][12883] Updated weights for policy 0, policy_version 108543 (0.0022) +[2024-06-18 09:51:46,498][12883] Updated weights for policy 0, policy_version 108553 (0.0035) +[2024-06-18 09:51:46,994][12645] Fps is (10 sec: 45885.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1778548736. Throughput: 0: 42590.8. Samples: 1778621880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 09:51:47,000][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 09:51:50,376][12883] Updated weights for policy 0, policy_version 108563 (0.0041) +[2024-06-18 09:51:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1778745344. Throughput: 0: 42438.2. Samples: 1778867300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 09:51:51,994][12645] Avg episode reward: [(0, '0.596')] +[2024-06-18 09:51:54,253][12883] Updated weights for policy 0, policy_version 108573 (0.0043) +[2024-06-18 09:51:57,000][12645] Fps is (10 sec: 42571.4, 60 sec: 42869.6, 300 sec: 42597.5). Total num frames: 1778974720. Throughput: 0: 42441.1. Samples: 1779118300. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) +[2024-06-18 09:51:57,001][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 09:51:57,911][12883] Updated weights for policy 0, policy_version 108583 (0.0031) +[2024-06-18 09:52:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 1779171328. Throughput: 0: 42540.9. Samples: 1779254340. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) +[2024-06-18 09:52:01,994][12645] Avg episode reward: [(0, '0.123')] +[2024-06-18 09:52:02,018][12883] Updated weights for policy 0, policy_version 108593 (0.0031) +[2024-06-18 09:52:05,765][12883] Updated weights for policy 0, policy_version 108603 (0.0039) +[2024-06-18 09:52:06,994][12645] Fps is (10 sec: 40986.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1779384320. Throughput: 0: 42556.9. Samples: 1779506040. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) +[2024-06-18 09:52:06,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 09:52:09,676][12883] Updated weights for policy 0, policy_version 108613 (0.0027) +[2024-06-18 09:52:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1779613696. Throughput: 0: 42608.5. Samples: 1779759440. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) +[2024-06-18 09:52:11,994][12645] Avg episode reward: [(0, '0.612')] +[2024-06-18 09:52:13,334][12883] Updated weights for policy 0, policy_version 108623 (0.0047) +[2024-06-18 09:52:16,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1779810304. Throughput: 0: 42498.4. Samples: 1779890080. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) +[2024-06-18 09:52:16,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 09:52:17,428][12883] Updated weights for policy 0, policy_version 108633 (0.0044) +[2024-06-18 09:52:20,831][12883] Updated weights for policy 0, policy_version 108643 (0.0030) +[2024-06-18 09:52:21,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1780039680. Throughput: 0: 42684.7. Samples: 1780152260. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) +[2024-06-18 09:52:21,995][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 09:52:25,079][12883] Updated weights for policy 0, policy_version 108653 (0.0023) +[2024-06-18 09:52:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1780252672. Throughput: 0: 42817.7. Samples: 1780411480. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) +[2024-06-18 09:52:26,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 09:52:28,727][12883] Updated weights for policy 0, policy_version 108663 (0.0035) +[2024-06-18 09:52:31,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42873.1, 300 sec: 42654.3). Total num frames: 1780482048. Throughput: 0: 42462.8. Samples: 1780532700. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) +[2024-06-18 09:52:31,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 09:52:32,838][12883] Updated weights for policy 0, policy_version 108673 (0.0038) +[2024-06-18 09:52:36,249][12883] Updated weights for policy 0, policy_version 108683 (0.0051) +[2024-06-18 09:52:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 1780662272. Throughput: 0: 42653.3. Samples: 1780786700. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) +[2024-06-18 09:52:36,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 09:52:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108683_1780662272.pth... 
+[2024-06-18 09:52:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108060_1770455040.pth +[2024-06-18 09:52:40,563][12883] Updated weights for policy 0, policy_version 108693 (0.0032) +[2024-06-18 09:52:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1780875264. Throughput: 0: 42935.5. Samples: 1781050120. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) +[2024-06-18 09:52:41,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 09:52:43,797][12883] Updated weights for policy 0, policy_version 108703 (0.0026) +[2024-06-18 09:52:46,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1781104640. Throughput: 0: 42644.7. Samples: 1781173360. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) +[2024-06-18 09:52:46,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 09:52:47,772][12862] Signal inference workers to stop experience collection... (26000 times) +[2024-06-18 09:52:47,824][12883] InferenceWorker_p0-w0: stopping experience collection (26000 times) +[2024-06-18 09:52:47,887][12862] Signal inference workers to resume experience collection... (26000 times) +[2024-06-18 09:52:47,887][12883] InferenceWorker_p0-w0: resuming experience collection (26000 times) +[2024-06-18 09:52:48,022][12883] Updated weights for policy 0, policy_version 108713 (0.0046) +[2024-06-18 09:52:51,389][12883] Updated weights for policy 0, policy_version 108723 (0.0028) +[2024-06-18 09:52:51,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 1781317632. Throughput: 0: 42709.9. Samples: 1781428080. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) +[2024-06-18 09:52:51,997][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 09:52:55,566][12883] Updated weights for policy 0, policy_version 108733 (0.0038) +[2024-06-18 09:52:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42329.8, 300 sec: 42431.8). 
Total num frames: 1781514240. Throughput: 0: 42840.4. Samples: 1781687260. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) +[2024-06-18 09:52:56,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 09:52:59,191][12883] Updated weights for policy 0, policy_version 108743 (0.0041) +[2024-06-18 09:53:01,996][12645] Fps is (10 sec: 42598.3, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 1781743616. Throughput: 0: 42666.8. Samples: 1781810180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 09:53:01,997][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 09:53:03,052][12883] Updated weights for policy 0, policy_version 108753 (0.0042) +[2024-06-18 09:53:06,724][12883] Updated weights for policy 0, policy_version 108763 (0.0022) +[2024-06-18 09:53:06,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1781972992. Throughput: 0: 42664.0. Samples: 1782072140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 09:53:06,994][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 09:53:10,587][12883] Updated weights for policy 0, policy_version 108773 (0.0030) +[2024-06-18 09:53:11,994][12645] Fps is (10 sec: 40968.9, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1782153216. Throughput: 0: 42569.2. Samples: 1782327100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 09:53:11,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 09:53:14,810][12883] Updated weights for policy 0, policy_version 108783 (0.0028) +[2024-06-18 09:53:16,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1782382592. Throughput: 0: 42660.9. Samples: 1782452540. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 09:53:16,997][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 09:53:18,208][12883] Updated weights for policy 0, policy_version 108793 (0.0036) +[2024-06-18 09:53:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1782595584. Throughput: 0: 42688.9. Samples: 1782707700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 09:53:21,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 09:53:22,370][12883] Updated weights for policy 0, policy_version 108803 (0.0037) +[2024-06-18 09:53:26,089][12883] Updated weights for policy 0, policy_version 108813 (0.0039) +[2024-06-18 09:53:26,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1782792192. Throughput: 0: 42540.8. Samples: 1782964460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 09:53:26,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 09:53:30,329][12883] Updated weights for policy 0, policy_version 108823 (0.0037) +[2024-06-18 09:53:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 1783021568. Throughput: 0: 42651.8. Samples: 1783092680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 09:53:31,994][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 09:53:33,923][12883] Updated weights for policy 0, policy_version 108833 (0.0038) +[2024-06-18 09:53:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42598.6). Total num frames: 1783234560. Throughput: 0: 42770.1. Samples: 1783352640. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 09:53:36,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 09:53:37,886][12883] Updated weights for policy 0, policy_version 108843 (0.0024) +[2024-06-18 09:53:41,667][12883] Updated weights for policy 0, policy_version 108853 (0.0039) +[2024-06-18 09:53:41,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 1783447552. Throughput: 0: 42596.6. Samples: 1783604200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 09:53:41,996][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 09:53:45,639][12883] Updated weights for policy 0, policy_version 108863 (0.0040) +[2024-06-18 09:53:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1783660544. Throughput: 0: 42696.4. Samples: 1783731420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 09:53:46,994][12645] Avg episode reward: [(0, '0.508')] +[2024-06-18 09:53:49,552][12883] Updated weights for policy 0, policy_version 108873 (0.0032) +[2024-06-18 09:53:51,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 1783857152. Throughput: 0: 42550.2. Samples: 1783986900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 09:53:51,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 09:53:53,247][12883] Updated weights for policy 0, policy_version 108883 (0.0044) +[2024-06-18 09:53:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1784070144. Throughput: 0: 42517.5. Samples: 1784240380. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 09:53:56,994][12645] Avg episode reward: [(0, '0.227')] +[2024-06-18 09:53:57,295][12883] Updated weights for policy 0, policy_version 108893 (0.0037) +[2024-06-18 09:54:01,184][12883] Updated weights for policy 0, policy_version 108903 (0.0032) +[2024-06-18 09:54:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 1784299520. Throughput: 0: 42627.1. Samples: 1784370660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 09:54:01,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 09:54:04,886][12883] Updated weights for policy 0, policy_version 108913 (0.0037) +[2024-06-18 09:54:06,996][12645] Fps is (10 sec: 44226.2, 60 sec: 42323.8, 300 sec: 42598.4). Total num frames: 1784512512. Throughput: 0: 42547.2. Samples: 1784622420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 09:54:06,997][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 09:54:08,822][12883] Updated weights for policy 0, policy_version 108923 (0.0027) +[2024-06-18 09:54:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1784725504. Throughput: 0: 42517.8. Samples: 1784877760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 09:54:11,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 09:54:12,368][12862] Signal inference workers to stop experience collection... (26050 times) +[2024-06-18 09:54:12,421][12883] InferenceWorker_p0-w0: stopping experience collection (26050 times) +[2024-06-18 09:54:12,420][12862] Signal inference workers to resume experience collection... 
(26050 times) +[2024-06-18 09:54:12,446][12883] InferenceWorker_p0-w0: resuming experience collection (26050 times) +[2024-06-18 09:54:12,563][12883] Updated weights for policy 0, policy_version 108933 (0.0031) +[2024-06-18 09:54:16,607][12883] Updated weights for policy 0, policy_version 108943 (0.0026) +[2024-06-18 09:54:16,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 1784922112. Throughput: 0: 42651.0. Samples: 1785011980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 09:54:16,994][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 09:54:20,394][12883] Updated weights for policy 0, policy_version 108953 (0.0046) +[2024-06-18 09:54:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1785135104. Throughput: 0: 42444.5. Samples: 1785262640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 09:54:21,995][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 09:54:24,312][12883] Updated weights for policy 0, policy_version 108963 (0.0037) +[2024-06-18 09:54:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1785364480. Throughput: 0: 42423.8. Samples: 1785513180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 09:54:26,995][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 09:54:28,072][12883] Updated weights for policy 0, policy_version 108973 (0.0036) +[2024-06-18 09:54:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1785544704. Throughput: 0: 42571.6. Samples: 1785647140. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 09:54:31,994][12645] Avg episode reward: [(0, '0.551')] +[2024-06-18 09:54:32,202][12883] Updated weights for policy 0, policy_version 108983 (0.0027) +[2024-06-18 09:54:35,750][12883] Updated weights for policy 0, policy_version 108993 (0.0030) +[2024-06-18 09:54:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1785774080. Throughput: 0: 42592.9. Samples: 1785903580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 09:54:36,994][12645] Avg episode reward: [(0, '0.156')] +[2024-06-18 09:54:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108996_1785790464.pth... +[2024-06-18 09:54:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108371_1775550464.pth +[2024-06-18 09:54:40,001][12883] Updated weights for policy 0, policy_version 109003 (0.0035) +[2024-06-18 09:54:41,997][12645] Fps is (10 sec: 45861.8, 60 sec: 42597.9, 300 sec: 42598.0). Total num frames: 1786003456. Throughput: 0: 42474.9. Samples: 1786151880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 09:54:41,997][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 09:54:43,649][12883] Updated weights for policy 0, policy_version 109013 (0.0031) +[2024-06-18 09:54:46,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1786216448. Throughput: 0: 42441.2. Samples: 1786280520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 09:54:46,994][12645] Avg episode reward: [(0, '0.431')] +[2024-06-18 09:54:47,539][12883] Updated weights for policy 0, policy_version 109023 (0.0028) +[2024-06-18 09:54:51,098][12883] Updated weights for policy 0, policy_version 109033 (0.0034) +[2024-06-18 09:54:51,994][12645] Fps is (10 sec: 42610.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1786429440. Throughput: 0: 42694.6. Samples: 1786543580. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 09:54:51,994][12645] Avg episode reward: [(0, '0.586')] +[2024-06-18 09:54:54,942][12883] Updated weights for policy 0, policy_version 109043 (0.0039) +[2024-06-18 09:54:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1786642432. Throughput: 0: 42643.4. Samples: 1786796720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 09:54:56,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 09:54:58,882][12883] Updated weights for policy 0, policy_version 109053 (0.0038) +[2024-06-18 09:55:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1786839040. Throughput: 0: 42391.7. Samples: 1786919600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 09:55:01,994][12645] Avg episode reward: [(0, '0.613')] +[2024-06-18 09:55:02,506][12883] Updated weights for policy 0, policy_version 109063 (0.0038) +[2024-06-18 09:55:06,383][12883] Updated weights for policy 0, policy_version 109073 (0.0045) +[2024-06-18 09:55:06,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 1787084800. Throughput: 0: 42602.3. Samples: 1787179740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 09:55:06,994][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 09:55:10,461][12883] Updated weights for policy 0, policy_version 109083 (0.0042) +[2024-06-18 09:55:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1787265024. Throughput: 0: 42733.4. Samples: 1787436180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 09:55:11,994][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 09:55:13,970][12883] Updated weights for policy 0, policy_version 109093 (0.0026) +[2024-06-18 09:55:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1787494400. Throughput: 0: 42565.3. 
Samples: 1787562580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 09:55:16,994][12645] Avg episode reward: [(0, '0.494')] +[2024-06-18 09:55:17,923][12883] Updated weights for policy 0, policy_version 109103 (0.0029) +[2024-06-18 09:55:21,807][12883] Updated weights for policy 0, policy_version 109113 (0.0031) +[2024-06-18 09:55:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1787707392. Throughput: 0: 42592.0. Samples: 1787820220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 09:55:21,994][12645] Avg episode reward: [(0, '0.527')] +[2024-06-18 09:55:25,636][12883] Updated weights for policy 0, policy_version 109123 (0.0035) +[2024-06-18 09:55:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1787904000. Throughput: 0: 42872.0. Samples: 1788081000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 09:55:26,994][12645] Avg episode reward: [(0, '0.132')] +[2024-06-18 09:55:29,322][12883] Updated weights for policy 0, policy_version 109133 (0.0037) +[2024-06-18 09:55:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1788133376. Throughput: 0: 42819.7. Samples: 1788207400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 09:55:31,994][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 09:55:33,462][12862] Signal inference workers to stop experience collection... (26100 times) +[2024-06-18 09:55:33,462][12862] Signal inference workers to resume experience collection... 
(26100 times) +[2024-06-18 09:55:33,472][12883] InferenceWorker_p0-w0: stopping experience collection (26100 times) +[2024-06-18 09:55:33,472][12883] InferenceWorker_p0-w0: resuming experience collection (26100 times) +[2024-06-18 09:55:33,613][12883] Updated weights for policy 0, policy_version 109143 (0.0022) +[2024-06-18 09:55:36,942][12883] Updated weights for policy 0, policy_version 109153 (0.0023) +[2024-06-18 09:55:36,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1788362752. Throughput: 0: 42716.0. Samples: 1788465800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 09:55:36,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 09:55:41,263][12883] Updated weights for policy 0, policy_version 109163 (0.0033) +[2024-06-18 09:55:41,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42600.4, 300 sec: 42487.3). Total num frames: 1788559360. Throughput: 0: 42990.2. Samples: 1788731280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 09:55:41,995][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 09:55:44,717][12883] Updated weights for policy 0, policy_version 109173 (0.0042) +[2024-06-18 09:55:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1788788736. Throughput: 0: 43007.5. Samples: 1788854940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 09:55:46,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 09:55:48,804][12883] Updated weights for policy 0, policy_version 109183 (0.0040) +[2024-06-18 09:55:51,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42710.0). Total num frames: 1789001728. Throughput: 0: 42797.7. Samples: 1789105640. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 09:55:51,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 09:55:52,388][12883] Updated weights for policy 0, policy_version 109193 (0.0039) +[2024-06-18 09:55:56,586][12883] Updated weights for policy 0, policy_version 109203 (0.0026) +[2024-06-18 09:55:56,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1789181952. Throughput: 0: 42898.6. Samples: 1789366620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 09:55:56,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 09:56:00,110][12883] Updated weights for policy 0, policy_version 109213 (0.0032) +[2024-06-18 09:56:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1789411328. Throughput: 0: 42861.8. Samples: 1789491360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 09:56:01,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 09:56:04,144][12883] Updated weights for policy 0, policy_version 109223 (0.0026) +[2024-06-18 09:56:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1789624320. Throughput: 0: 42805.7. Samples: 1789746480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 09:56:06,994][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 09:56:07,638][12883] Updated weights for policy 0, policy_version 109233 (0.0028) +[2024-06-18 09:56:11,846][12883] Updated weights for policy 0, policy_version 109243 (0.0036) +[2024-06-18 09:56:11,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1789837312. Throughput: 0: 42677.8. Samples: 1790001500. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 09:56:11,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 09:56:15,666][12883] Updated weights for policy 0, policy_version 109253 (0.0032) +[2024-06-18 09:56:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1790050304. Throughput: 0: 42575.9. Samples: 1790123320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 09:56:16,994][12645] Avg episode reward: [(0, '0.613')] +[2024-06-18 09:56:19,529][12883] Updated weights for policy 0, policy_version 109263 (0.0042) +[2024-06-18 09:56:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1790279680. Throughput: 0: 42634.1. Samples: 1790384340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 09:56:21,994][12645] Avg episode reward: [(0, '0.587')] +[2024-06-18 09:56:23,206][12883] Updated weights for policy 0, policy_version 109273 (0.0024) +[2024-06-18 09:56:26,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42869.9, 300 sec: 42598.4). Total num frames: 1790476288. Throughput: 0: 42426.9. Samples: 1790640580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 09:56:26,996][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 09:56:27,151][12883] Updated weights for policy 0, policy_version 109283 (0.0028) +[2024-06-18 09:56:30,818][12883] Updated weights for policy 0, policy_version 109293 (0.0035) +[2024-06-18 09:56:31,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42596.7, 300 sec: 42709.5). Total num frames: 1790689280. Throughput: 0: 42431.1. Samples: 1790764440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 09:56:31,997][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 09:56:34,934][12883] Updated weights for policy 0, policy_version 109303 (0.0031) +[2024-06-18 09:56:36,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1790902272. Throughput: 0: 42688.5. 
Samples: 1791026620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 09:56:36,994][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 09:56:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000109308_1790902272.pth... +[2024-06-18 09:56:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108683_1780662272.pth +[2024-06-18 09:56:38,396][12883] Updated weights for policy 0, policy_version 109313 (0.0024) +[2024-06-18 09:56:41,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1791115264. Throughput: 0: 42426.2. Samples: 1791275800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 09:56:41,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 09:56:42,727][12883] Updated weights for policy 0, policy_version 109323 (0.0033) +[2024-06-18 09:56:45,862][12883] Updated weights for policy 0, policy_version 109333 (0.0021) +[2024-06-18 09:56:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1791344640. Throughput: 0: 42601.3. Samples: 1791408420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 09:56:46,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 09:56:50,030][12862] Signal inference workers to stop experience collection... (26150 times) +[2024-06-18 09:56:50,031][12862] Signal inference workers to resume experience collection... (26150 times) +[2024-06-18 09:56:50,071][12883] InferenceWorker_p0-w0: stopping experience collection (26150 times) +[2024-06-18 09:56:50,071][12883] InferenceWorker_p0-w0: resuming experience collection (26150 times) +[2024-06-18 09:56:50,179][12883] Updated weights for policy 0, policy_version 109343 (0.0044) +[2024-06-18 09:56:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42543.8). Total num frames: 1791524864. Throughput: 0: 42742.8. Samples: 1791669900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 09:56:51,994][12645] Avg episode reward: [(0, '0.216')] +[2024-06-18 09:56:53,471][12883] Updated weights for policy 0, policy_version 109353 (0.0024) +[2024-06-18 09:56:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1791770624. Throughput: 0: 42689.5. Samples: 1791922520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 09:56:56,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 09:56:57,797][12883] Updated weights for policy 0, policy_version 109363 (0.0044) +[2024-06-18 09:57:01,389][12883] Updated weights for policy 0, policy_version 109373 (0.0033) +[2024-06-18 09:57:01,996][12645] Fps is (10 sec: 47502.6, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 1792000000. Throughput: 0: 42986.8. Samples: 1792057820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 09:57:02,005][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 09:57:05,542][12883] Updated weights for policy 0, policy_version 109383 (0.0034) +[2024-06-18 09:57:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1792163840. Throughput: 0: 42774.8. Samples: 1792309200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 09:57:06,995][12645] Avg episode reward: [(0, '0.545')] +[2024-06-18 09:57:08,957][12883] Updated weights for policy 0, policy_version 109393 (0.0047) +[2024-06-18 09:57:11,994][12645] Fps is (10 sec: 42608.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1792425984. Throughput: 0: 42686.6. Samples: 1792561380. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 09:57:11,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 09:57:13,347][12883] Updated weights for policy 0, policy_version 109403 (0.0048) +[2024-06-18 09:57:16,639][12883] Updated weights for policy 0, policy_version 109413 (0.0041) +[2024-06-18 09:57:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1792622592. Throughput: 0: 42922.3. Samples: 1792695840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 09:57:16,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 09:57:20,997][12883] Updated weights for policy 0, policy_version 109423 (0.0038) +[2024-06-18 09:57:21,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1792802816. Throughput: 0: 42731.5. Samples: 1792949540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 09:57:22,003][12645] Avg episode reward: [(0, '0.643')] +[2024-06-18 09:57:24,379][12883] Updated weights for policy 0, policy_version 109433 (0.0040) +[2024-06-18 09:57:26,996][12645] Fps is (10 sec: 44226.6, 60 sec: 43144.5, 300 sec: 42653.6). Total num frames: 1793064960. Throughput: 0: 42798.4. Samples: 1793201820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 09:57:27,005][12645] Avg episode reward: [(0, '0.623')] +[2024-06-18 09:57:28,829][12883] Updated weights for policy 0, policy_version 109443 (0.0032) +[2024-06-18 09:57:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 1793261568. Throughput: 0: 42864.5. Samples: 1793337320. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 09:57:31,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 09:57:32,052][12883] Updated weights for policy 0, policy_version 109453 (0.0041) +[2024-06-18 09:57:36,246][12883] Updated weights for policy 0, policy_version 109463 (0.0024) +[2024-06-18 09:57:36,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1793458176. Throughput: 0: 42760.4. Samples: 1793594120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 09:57:36,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 09:57:39,546][12883] Updated weights for policy 0, policy_version 109473 (0.0033) +[2024-06-18 09:57:41,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1793687552. Throughput: 0: 42774.1. Samples: 1793847360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 09:57:41,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 09:57:43,709][12883] Updated weights for policy 0, policy_version 109483 (0.0045) +[2024-06-18 09:57:46,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1793916928. Throughput: 0: 42836.8. Samples: 1793985380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 09:57:46,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 09:57:47,174][12883] Updated weights for policy 0, policy_version 109493 (0.0030) +[2024-06-18 09:57:51,252][12883] Updated weights for policy 0, policy_version 109503 (0.0035) +[2024-06-18 09:57:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1794113536. Throughput: 0: 42947.5. Samples: 1794241840. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 09:57:51,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 09:57:54,852][12883] Updated weights for policy 0, policy_version 109513 (0.0032) +[2024-06-18 09:57:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1794342912. Throughput: 0: 42943.6. Samples: 1794493840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 09:57:56,994][12645] Avg episode reward: [(0, '0.485')] +[2024-06-18 09:57:59,066][12883] Updated weights for policy 0, policy_version 109523 (0.0031) +[2024-06-18 09:58:01,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42599.9, 300 sec: 42653.9). Total num frames: 1794555904. Throughput: 0: 42874.4. Samples: 1794625200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:58:01,995][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 09:58:02,683][12883] Updated weights for policy 0, policy_version 109533 (0.0029) +[2024-06-18 09:58:06,837][12883] Updated weights for policy 0, policy_version 109543 (0.0031) +[2024-06-18 09:58:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1794752512. Throughput: 0: 42964.9. Samples: 1794882960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:58:06,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 09:58:10,362][12883] Updated weights for policy 0, policy_version 109553 (0.0026) +[2024-06-18 09:58:11,994][12645] Fps is (10 sec: 42599.7, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 1794981888. Throughput: 0: 43122.2. Samples: 1795142220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:58:11,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 09:58:14,497][12883] Updated weights for policy 0, policy_version 109563 (0.0042) +[2024-06-18 09:58:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1795194880. Throughput: 0: 43036.7. 
Samples: 1795273980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:58:16,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 09:58:17,828][12883] Updated weights for policy 0, policy_version 109573 (0.0048) +[2024-06-18 09:58:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1795375104. Throughput: 0: 42797.9. Samples: 1795520020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:58:21,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 09:58:22,259][12883] Updated weights for policy 0, policy_version 109583 (0.0041) +[2024-06-18 09:58:25,433][12883] Updated weights for policy 0, policy_version 109593 (0.0040) +[2024-06-18 09:58:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42599.9, 300 sec: 42709.4). Total num frames: 1795620864. Throughput: 0: 42916.8. Samples: 1795778620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:58:26,994][12645] Avg episode reward: [(0, '0.599')] +[2024-06-18 09:58:29,721][12883] Updated weights for policy 0, policy_version 109603 (0.0044) +[2024-06-18 09:58:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1795833856. Throughput: 0: 42771.6. Samples: 1795910100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:58:31,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 09:58:33,442][12883] Updated weights for policy 0, policy_version 109613 (0.0022) +[2024-06-18 09:58:35,215][12862] Signal inference workers to stop experience collection... (26200 times) +[2024-06-18 09:58:35,237][12883] InferenceWorker_p0-w0: stopping experience collection (26200 times) +[2024-06-18 09:58:35,277][12862] Signal inference workers to resume experience collection... (26200 times) +[2024-06-18 09:58:35,277][12883] InferenceWorker_p0-w0: resuming experience collection (26200 times) +[2024-06-18 09:58:36,996][12645] Fps is (10 sec: 42589.5, 60 sec: 43142.9, 300 sec: 42709.5). 
Total num frames: 1796046848. Throughput: 0: 42634.7. Samples: 1796160500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:58:36,997][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 09:58:37,024][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000109622_1796046848.pth... +[2024-06-18 09:58:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108996_1785790464.pth +[2024-06-18 09:58:37,269][12883] Updated weights for policy 0, policy_version 109623 (0.0032) +[2024-06-18 09:58:41,029][12883] Updated weights for policy 0, policy_version 109633 (0.0038) +[2024-06-18 09:58:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1796259840. Throughput: 0: 42643.6. Samples: 1796412800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:58:41,994][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 09:58:45,024][12883] Updated weights for policy 0, policy_version 109643 (0.0033) +[2024-06-18 09:58:46,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1796472832. Throughput: 0: 42735.2. Samples: 1796548280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:58:46,994][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 09:58:48,684][12883] Updated weights for policy 0, policy_version 109653 (0.0025) +[2024-06-18 09:58:51,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 1796669440. Throughput: 0: 42660.4. Samples: 1796802680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:58:51,994][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 09:58:52,729][12883] Updated weights for policy 0, policy_version 109663 (0.0038) +[2024-06-18 09:58:56,439][12883] Updated weights for policy 0, policy_version 109673 (0.0029) +[2024-06-18 09:58:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.5). 
Total num frames: 1796898816. Throughput: 0: 42567.1. Samples: 1797057740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:58:56,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 09:59:00,646][12883] Updated weights for policy 0, policy_version 109683 (0.0037) +[2024-06-18 09:59:01,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.6, 300 sec: 42709.8). Total num frames: 1797111808. Throughput: 0: 42556.2. Samples: 1797189000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 09:59:01,994][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 09:59:04,003][12883] Updated weights for policy 0, policy_version 109693 (0.0035) +[2024-06-18 09:59:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1797324800. Throughput: 0: 42789.7. Samples: 1797445560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 09:59:06,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 09:59:08,089][12883] Updated weights for policy 0, policy_version 109703 (0.0047) +[2024-06-18 09:59:11,826][12883] Updated weights for policy 0, policy_version 109713 (0.0032) +[2024-06-18 09:59:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1797537792. Throughput: 0: 42889.4. Samples: 1797708640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 09:59:11,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 09:59:15,919][12883] Updated weights for policy 0, policy_version 109723 (0.0035) +[2024-06-18 09:59:16,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 1797767168. Throughput: 0: 42774.3. Samples: 1797835040. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 09:59:16,997][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 09:59:19,315][12883] Updated weights for policy 0, policy_version 109733 (0.0031) +[2024-06-18 09:59:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1797980160. Throughput: 0: 42853.6. Samples: 1798088820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 09:59:21,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 09:59:23,509][12883] Updated weights for policy 0, policy_version 109743 (0.0031) +[2024-06-18 09:59:26,843][12883] Updated weights for policy 0, policy_version 109753 (0.0030) +[2024-06-18 09:59:26,994][12645] Fps is (10 sec: 42608.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1798193152. Throughput: 0: 42952.0. Samples: 1798345640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 09:59:26,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 09:59:31,033][12883] Updated weights for policy 0, policy_version 109763 (0.0039) +[2024-06-18 09:59:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1798389760. Throughput: 0: 42709.0. Samples: 1798470180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 09:59:31,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 09:59:34,210][12883] Updated weights for policy 0, policy_version 109773 (0.0040) +[2024-06-18 09:59:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43146.1, 300 sec: 42821.0). Total num frames: 1798635520. Throughput: 0: 42995.1. Samples: 1798737460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 09:59:36,994][12645] Avg episode reward: [(0, '0.453')] +[2024-06-18 09:59:38,660][12883] Updated weights for policy 0, policy_version 109783 (0.0037) +[2024-06-18 09:59:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1798832128. Throughput: 0: 42993.3. 
Samples: 1798992440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 09:59:41,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 09:59:42,170][12883] Updated weights for policy 0, policy_version 109793 (0.0034) +[2024-06-18 09:59:46,468][12883] Updated weights for policy 0, policy_version 109803 (0.0026) +[2024-06-18 09:59:46,647][12862] Signal inference workers to stop experience collection... (26250 times) +[2024-06-18 09:59:46,677][12883] InferenceWorker_p0-w0: stopping experience collection (26250 times) +[2024-06-18 09:59:46,707][12862] Signal inference workers to resume experience collection... (26250 times) +[2024-06-18 09:59:46,712][12883] InferenceWorker_p0-w0: resuming experience collection (26250 times) +[2024-06-18 09:59:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1799045120. Throughput: 0: 42911.4. Samples: 1799120020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 09:59:46,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 09:59:49,711][12883] Updated weights for policy 0, policy_version 109813 (0.0024) +[2024-06-18 09:59:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1799258112. Throughput: 0: 42884.4. Samples: 1799375360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 09:59:51,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 09:59:54,017][12883] Updated weights for policy 0, policy_version 109823 (0.0037) +[2024-06-18 09:59:56,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 1799471104. Throughput: 0: 42874.4. Samples: 1799638080. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 09:59:56,996][12645] Avg episode reward: [(0, '0.735')] +[2024-06-18 09:59:57,288][12883] Updated weights for policy 0, policy_version 109833 (0.0044) +[2024-06-18 10:00:01,620][12883] Updated weights for policy 0, policy_version 109843 (0.0043) +[2024-06-18 10:00:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1799684096. Throughput: 0: 42932.8. Samples: 1799766920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 10:00:01,994][12645] Avg episode reward: [(0, '0.802')] +[2024-06-18 10:00:04,882][12883] Updated weights for policy 0, policy_version 109853 (0.0036) +[2024-06-18 10:00:06,994][12645] Fps is (10 sec: 44246.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1799913472. Throughput: 0: 42931.9. Samples: 1800020760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 10:00:06,994][12645] Avg episode reward: [(0, '0.657')] +[2024-06-18 10:00:09,176][12883] Updated weights for policy 0, policy_version 109863 (0.0022) +[2024-06-18 10:00:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1800126464. Throughput: 0: 43053.3. Samples: 1800283040. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 10:00:11,994][12645] Avg episode reward: [(0, '0.687')] +[2024-06-18 10:00:12,604][12883] Updated weights for policy 0, policy_version 109873 (0.0031) +[2024-06-18 10:00:16,902][12883] Updated weights for policy 0, policy_version 109883 (0.0027) +[2024-06-18 10:00:16,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 1800323072. Throughput: 0: 43177.4. Samples: 1800413160. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 10:00:16,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 10:00:20,022][12883] Updated weights for policy 0, policy_version 109893 (0.0024) +[2024-06-18 10:00:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 1800568832. Throughput: 0: 42966.9. Samples: 1800670960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 10:00:21,994][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 10:00:24,482][12883] Updated weights for policy 0, policy_version 109903 (0.0034) +[2024-06-18 10:00:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1800749056. Throughput: 0: 43131.6. Samples: 1800933360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 10:00:26,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 10:00:28,011][12883] Updated weights for policy 0, policy_version 109913 (0.0030) +[2024-06-18 10:00:31,961][12883] Updated weights for policy 0, policy_version 109923 (0.0033) +[2024-06-18 10:00:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1800978432. Throughput: 0: 42975.2. Samples: 1801053900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 10:00:31,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 10:00:35,584][12883] Updated weights for policy 0, policy_version 109933 (0.0030) +[2024-06-18 10:00:36,994][12645] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1801224192. Throughput: 0: 43198.2. Samples: 1801319280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 10:00:36,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 10:00:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000109938_1801224192.pth... 
+[2024-06-18 10:00:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000109308_1790902272.pth +[2024-06-18 10:00:39,371][12883] Updated weights for policy 0, policy_version 109943 (0.0030) +[2024-06-18 10:00:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1801404416. Throughput: 0: 43193.3. Samples: 1801581680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 10:00:41,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 10:00:43,040][12883] Updated weights for policy 0, policy_version 109953 (0.0040) +[2024-06-18 10:00:46,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1801617408. Throughput: 0: 43033.0. Samples: 1801703400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 10:00:46,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 10:00:47,064][12883] Updated weights for policy 0, policy_version 109963 (0.0033) +[2024-06-18 10:00:50,499][12883] Updated weights for policy 0, policy_version 109973 (0.0031) +[2024-06-18 10:00:51,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 1801863168. Throughput: 0: 43154.4. Samples: 1801962700. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 10:00:51,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 10:00:54,769][12883] Updated weights for policy 0, policy_version 109983 (0.0037) +[2024-06-18 10:00:56,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 1802059776. Throughput: 0: 43121.2. Samples: 1802223500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 10:00:56,994][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 10:00:57,933][12883] Updated weights for policy 0, policy_version 109993 (0.0033) +[2024-06-18 10:01:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1802272768. Throughput: 0: 42996.8. 
Samples: 1802348020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 10:01:01,994][12645] Avg episode reward: [(0, '0.200')] +[2024-06-18 10:01:02,293][12883] Updated weights for policy 0, policy_version 110003 (0.0034) +[2024-06-18 10:01:05,651][12883] Updated weights for policy 0, policy_version 110013 (0.0033) +[2024-06-18 10:01:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1802502144. Throughput: 0: 43080.7. Samples: 1802609600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 10:01:06,994][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 10:01:07,661][12862] Signal inference workers to stop experience collection... (26300 times) +[2024-06-18 10:01:07,662][12862] Signal inference workers to resume experience collection... (26300 times) +[2024-06-18 10:01:07,706][12883] InferenceWorker_p0-w0: stopping experience collection (26300 times) +[2024-06-18 10:01:07,706][12883] InferenceWorker_p0-w0: resuming experience collection (26300 times) +[2024-06-18 10:01:09,753][12883] Updated weights for policy 0, policy_version 110023 (0.0047) +[2024-06-18 10:01:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 1802682368. Throughput: 0: 43171.5. Samples: 1802876080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 10:01:11,994][12645] Avg episode reward: [(0, '0.183')] +[2024-06-18 10:01:13,199][12883] Updated weights for policy 0, policy_version 110033 (0.0028) +[2024-06-18 10:01:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1802911744. Throughput: 0: 43200.8. Samples: 1802997940. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 10:01:16,994][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 10:01:17,331][12883] Updated weights for policy 0, policy_version 110043 (0.0020) +[2024-06-18 10:01:20,704][12883] Updated weights for policy 0, policy_version 110053 (0.0037) +[2024-06-18 10:01:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 1803141120. Throughput: 0: 43087.7. Samples: 1803258220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 10:01:21,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 10:01:24,781][12883] Updated weights for policy 0, policy_version 110063 (0.0035) +[2024-06-18 10:01:26,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42932.0). Total num frames: 1803354112. Throughput: 0: 43198.2. Samples: 1803525600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 10:01:26,994][12645] Avg episode reward: [(0, '0.227')] +[2024-06-18 10:01:28,342][12883] Updated weights for policy 0, policy_version 110073 (0.0043) +[2024-06-18 10:01:31,996][12645] Fps is (10 sec: 42588.9, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 1803567104. Throughput: 0: 43283.6. Samples: 1803651260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 10:01:31,997][12645] Avg episode reward: [(0, '0.153')] +[2024-06-18 10:01:32,483][12883] Updated weights for policy 0, policy_version 110083 (0.0036) +[2024-06-18 10:01:35,768][12883] Updated weights for policy 0, policy_version 110093 (0.0030) +[2024-06-18 10:01:36,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1803796480. Throughput: 0: 43285.7. Samples: 1803910560. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 10:01:36,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 10:01:39,945][12883] Updated weights for policy 0, policy_version 110103 (0.0036) +[2024-06-18 10:01:41,994][12645] Fps is (10 sec: 42607.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1803993088. Throughput: 0: 43236.9. Samples: 1804169160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 10:01:41,994][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 10:01:43,352][12883] Updated weights for policy 0, policy_version 110113 (0.0045) +[2024-06-18 10:01:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1804206080. Throughput: 0: 43244.0. Samples: 1804294000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 10:01:47,000][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 10:01:47,676][12883] Updated weights for policy 0, policy_version 110123 (0.0041) +[2024-06-18 10:01:50,935][12883] Updated weights for policy 0, policy_version 110133 (0.0029) +[2024-06-18 10:01:51,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1804451840. Throughput: 0: 43154.7. Samples: 1804551560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 10:01:51,995][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 10:01:55,243][12883] Updated weights for policy 0, policy_version 110143 (0.0032) +[2024-06-18 10:01:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 1804632064. Throughput: 0: 43115.5. Samples: 1804816280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 10:01:56,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 10:01:58,508][12883] Updated weights for policy 0, policy_version 110153 (0.0047) +[2024-06-18 10:02:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 1804861440. Throughput: 0: 43174.5. Samples: 1804940800. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 10:02:01,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 10:02:02,856][12883] Updated weights for policy 0, policy_version 110163 (0.0033) +[2024-06-18 10:02:06,191][12883] Updated weights for policy 0, policy_version 110173 (0.0036) +[2024-06-18 10:02:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1805074432. Throughput: 0: 43075.6. Samples: 1805196620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 10:02:06,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 10:02:11,152][12883] Updated weights for policy 0, policy_version 110183 (0.0036) +[2024-06-18 10:02:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1805271040. Throughput: 0: 42910.6. Samples: 1805456580. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 10:02:11,994][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 10:02:14,275][12883] Updated weights for policy 0, policy_version 110193 (0.0046) +[2024-06-18 10:02:16,104][12862] Signal inference workers to stop experience collection... (26350 times) +[2024-06-18 10:02:16,129][12883] InferenceWorker_p0-w0: stopping experience collection (26350 times) +[2024-06-18 10:02:16,159][12862] Signal inference workers to resume experience collection... (26350 times) +[2024-06-18 10:02:16,160][12883] InferenceWorker_p0-w0: resuming experience collection (26350 times) +[2024-06-18 10:02:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 1805516800. Throughput: 0: 42735.4. Samples: 1805574260. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 10:02:16,994][12645] Avg episode reward: [(0, '0.283')] +[2024-06-18 10:02:18,838][12883] Updated weights for policy 0, policy_version 110203 (0.0047) +[2024-06-18 10:02:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 1805713408. 
Throughput: 0: 42598.8. Samples: 1805827500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 10:02:21,994][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 10:02:22,027][12883] Updated weights for policy 0, policy_version 110213 (0.0026) +[2024-06-18 10:02:26,498][12883] Updated weights for policy 0, policy_version 110223 (0.0044) +[2024-06-18 10:02:26,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1805910016. Throughput: 0: 42565.5. Samples: 1806084600. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 10:02:26,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 10:02:29,860][12883] Updated weights for policy 0, policy_version 110233 (0.0036) +[2024-06-18 10:02:31,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43419.1, 300 sec: 43098.2). Total num frames: 1806172160. Throughput: 0: 42552.8. Samples: 1806208880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 10:02:31,994][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 10:02:34,073][12883] Updated weights for policy 0, policy_version 110243 (0.0028) +[2024-06-18 10:02:36,994][12645] Fps is (10 sec: 42597.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1806336000. Throughput: 0: 42553.2. Samples: 1806466460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 10:02:36,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 10:02:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000110250_1806336000.pth... +[2024-06-18 10:02:37,053][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000109622_1796046848.pth +[2024-06-18 10:02:37,522][12883] Updated weights for policy 0, policy_version 110253 (0.0029) +[2024-06-18 10:02:41,563][12883] Updated weights for policy 0, policy_version 110263 (0.0031) +[2024-06-18 10:02:41,996][12645] Fps is (10 sec: 37675.1, 60 sec: 42596.9, 300 sec: 42820.2). Total num frames: 1806548992. Throughput: 0: 42244.6. 
Samples: 1806717380. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 10:02:41,996][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 10:02:45,431][12883] Updated weights for policy 0, policy_version 110273 (0.0027) +[2024-06-18 10:02:46,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1806794752. Throughput: 0: 42392.5. Samples: 1806848460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 10:02:46,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 10:02:49,210][12883] Updated weights for policy 0, policy_version 110283 (0.0044) +[2024-06-18 10:02:51,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42052.3, 300 sec: 42820.5). Total num frames: 1806974976. Throughput: 0: 42380.4. Samples: 1807103740. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 10:02:51,994][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 10:02:53,387][12883] Updated weights for policy 0, policy_version 110293 (0.0027) +[2024-06-18 10:02:56,817][12883] Updated weights for policy 0, policy_version 110303 (0.0024) +[2024-06-18 10:02:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1807204352. Throughput: 0: 42168.0. Samples: 1807354140. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 10:02:56,994][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 10:03:01,071][12883] Updated weights for policy 0, policy_version 110313 (0.0033) +[2024-06-18 10:03:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1807433728. Throughput: 0: 42571.6. Samples: 1807489980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) +[2024-06-18 10:03:01,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 10:03:05,092][12883] Updated weights for policy 0, policy_version 110323 (0.0042) +[2024-06-18 10:03:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 1807613952. 
Throughput: 0: 42563.5. Samples: 1807742860. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) +[2024-06-18 10:03:07,003][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 10:03:08,882][12883] Updated weights for policy 0, policy_version 110333 (0.0030) +[2024-06-18 10:03:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1807826944. Throughput: 0: 42394.2. Samples: 1807992340. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) +[2024-06-18 10:03:11,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 10:03:12,717][12883] Updated weights for policy 0, policy_version 110343 (0.0027) +[2024-06-18 10:03:16,632][12883] Updated weights for policy 0, policy_version 110353 (0.0044) +[2024-06-18 10:03:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 1808056320. Throughput: 0: 42545.8. Samples: 1808123440. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) +[2024-06-18 10:03:16,994][12645] Avg episode reward: [(0, '0.622')] +[2024-06-18 10:03:20,304][12883] Updated weights for policy 0, policy_version 110363 (0.0037) +[2024-06-18 10:03:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1808269312. Throughput: 0: 42448.2. Samples: 1808376620. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) +[2024-06-18 10:03:21,994][12645] Avg episode reward: [(0, '0.604')] +[2024-06-18 10:03:23,991][12862] Signal inference workers to stop experience collection... (26400 times) +[2024-06-18 10:03:24,039][12883] InferenceWorker_p0-w0: stopping experience collection (26400 times) +[2024-06-18 10:03:24,042][12862] Signal inference workers to resume experience collection... 
(26400 times) +[2024-06-18 10:03:24,060][12883] InferenceWorker_p0-w0: resuming experience collection (26400 times) +[2024-06-18 10:03:24,187][12883] Updated weights for policy 0, policy_version 110373 (0.0026) +[2024-06-18 10:03:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1808482304. Throughput: 0: 42518.6. Samples: 1808630620. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) +[2024-06-18 10:03:26,994][12645] Avg episode reward: [(0, '0.604')] +[2024-06-18 10:03:27,883][12883] Updated weights for policy 0, policy_version 110383 (0.0029) +[2024-06-18 10:03:31,865][12883] Updated weights for policy 0, policy_version 110393 (0.0031) +[2024-06-18 10:03:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.3, 300 sec: 42820.9). Total num frames: 1808678912. Throughput: 0: 42461.0. Samples: 1808759200. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) +[2024-06-18 10:03:31,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 10:03:35,349][12883] Updated weights for policy 0, policy_version 110403 (0.0043) +[2024-06-18 10:03:36,996][12645] Fps is (10 sec: 44227.0, 60 sec: 43143.1, 300 sec: 42931.3). Total num frames: 1808924672. Throughput: 0: 42529.0. Samples: 1809017640. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) +[2024-06-18 10:03:36,996][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 10:03:39,811][12883] Updated weights for policy 0, policy_version 110413 (0.0033) +[2024-06-18 10:03:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 1809121280. Throughput: 0: 42466.2. Samples: 1809265120. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) +[2024-06-18 10:03:41,994][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 10:03:43,193][12883] Updated weights for policy 0, policy_version 110423 (0.0031) +[2024-06-18 10:03:46,994][12645] Fps is (10 sec: 37692.0, 60 sec: 41779.3, 300 sec: 42820.6). Total num frames: 1809301504. Throughput: 0: 42313.5. 
Samples: 1809394080. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) +[2024-06-18 10:03:46,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 10:03:47,514][12883] Updated weights for policy 0, policy_version 110433 (0.0045) +[2024-06-18 10:03:51,025][12883] Updated weights for policy 0, policy_version 110443 (0.0045) +[2024-06-18 10:03:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1809547264. Throughput: 0: 42342.7. Samples: 1809648280. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) +[2024-06-18 10:03:51,994][12645] Avg episode reward: [(0, '0.657')] +[2024-06-18 10:03:55,141][12883] Updated weights for policy 0, policy_version 110453 (0.0037) +[2024-06-18 10:03:56,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1809760256. Throughput: 0: 42379.5. Samples: 1809899420. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) +[2024-06-18 10:03:56,994][12645] Avg episode reward: [(0, '0.657')] +[2024-06-18 10:03:58,623][12883] Updated weights for policy 0, policy_version 110463 (0.0020) +[2024-06-18 10:04:01,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 1809940480. Throughput: 0: 42312.4. Samples: 1810027500. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) +[2024-06-18 10:04:01,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 10:04:02,711][12883] Updated weights for policy 0, policy_version 110473 (0.0029) +[2024-06-18 10:04:06,152][12883] Updated weights for policy 0, policy_version 110483 (0.0040) +[2024-06-18 10:04:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1810202624. Throughput: 0: 42535.5. Samples: 1810290720. 
Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) +[2024-06-18 10:04:06,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 10:04:10,170][12883] Updated weights for policy 0, policy_version 110493 (0.0033) +[2024-06-18 10:04:11,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 1810399232. Throughput: 0: 42622.6. Samples: 1810548640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:04:11,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 10:04:13,596][12883] Updated weights for policy 0, policy_version 110503 (0.0031) +[2024-06-18 10:04:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1810595840. Throughput: 0: 42432.3. Samples: 1810668660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:04:16,995][12645] Avg episode reward: [(0, '0.503')] +[2024-06-18 10:04:17,956][12883] Updated weights for policy 0, policy_version 110513 (0.0038) +[2024-06-18 10:04:21,081][12883] Updated weights for policy 0, policy_version 110523 (0.0032) +[2024-06-18 10:04:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1810841600. Throughput: 0: 42667.5. Samples: 1810937580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:04:21,994][12645] Avg episode reward: [(0, '0.547')] +[2024-06-18 10:04:25,303][12883] Updated weights for policy 0, policy_version 110533 (0.0025) +[2024-06-18 10:04:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 1811021824. Throughput: 0: 42930.1. Samples: 1811196980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:04:26,994][12645] Avg episode reward: [(0, '0.564')] +[2024-06-18 10:04:28,743][12883] Updated weights for policy 0, policy_version 110543 (0.0039) +[2024-06-18 10:04:31,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1811218432. Throughput: 0: 42681.2. 
Samples: 1811314740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:04:31,996][12645] Avg episode reward: [(0, '0.614')] +[2024-06-18 10:04:32,943][12883] Updated weights for policy 0, policy_version 110553 (0.0044) +[2024-06-18 10:04:36,719][12883] Updated weights for policy 0, policy_version 110563 (0.0027) +[2024-06-18 10:04:36,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42325.3, 300 sec: 42820.2). Total num frames: 1811464192. Throughput: 0: 42745.0. Samples: 1811571900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:04:36,997][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 10:04:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000110563_1811464192.pth... +[2024-06-18 10:04:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000109938_1801224192.pth +[2024-06-18 10:04:40,859][12883] Updated weights for policy 0, policy_version 110573 (0.0024) +[2024-06-18 10:04:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1811660800. Throughput: 0: 42957.4. Samples: 1811832500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:04:41,994][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 10:04:43,035][12862] Signal inference workers to stop experience collection... (26450 times) +[2024-06-18 10:04:43,035][12862] Signal inference workers to resume experience collection... (26450 times) +[2024-06-18 10:04:43,076][12883] InferenceWorker_p0-w0: stopping experience collection (26450 times) +[2024-06-18 10:04:43,076][12883] InferenceWorker_p0-w0: resuming experience collection (26450 times) +[2024-06-18 10:04:44,473][12883] Updated weights for policy 0, policy_version 110583 (0.0029) +[2024-06-18 10:04:46,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1811873792. Throughput: 0: 42812.4. Samples: 1811954060. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:04:46,994][12645] Avg episode reward: [(0, '0.636')] +[2024-06-18 10:04:48,858][12883] Updated weights for policy 0, policy_version 110593 (0.0028) +[2024-06-18 10:04:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 1812103168. Throughput: 0: 42716.1. Samples: 1812212940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:04:51,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 10:04:52,003][12883] Updated weights for policy 0, policy_version 110603 (0.0040) +[2024-06-18 10:04:56,415][12883] Updated weights for policy 0, policy_version 110613 (0.0033) +[2024-06-18 10:04:56,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1812316160. Throughput: 0: 42795.2. Samples: 1812474420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:04:56,994][12645] Avg episode reward: [(0, '0.667')] +[2024-06-18 10:04:59,522][12883] Updated weights for policy 0, policy_version 110623 (0.0029) +[2024-06-18 10:05:01,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1812512768. Throughput: 0: 42842.7. Samples: 1812596580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:05:01,994][12645] Avg episode reward: [(0, '0.685')] +[2024-06-18 10:05:03,923][12883] Updated weights for policy 0, policy_version 110633 (0.0026) +[2024-06-18 10:05:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1812742144. Throughput: 0: 42727.9. Samples: 1812860340. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:05:06,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 10:05:07,243][12883] Updated weights for policy 0, policy_version 110643 (0.0040) +[2024-06-18 10:05:11,576][12883] Updated weights for policy 0, policy_version 110653 (0.0027) +[2024-06-18 10:05:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1812955136. Throughput: 0: 42600.5. Samples: 1813114000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 10:05:11,994][12645] Avg episode reward: [(0, '0.508')] +[2024-06-18 10:05:15,066][12883] Updated weights for policy 0, policy_version 110663 (0.0039) +[2024-06-18 10:05:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.4). Total num frames: 1813168128. Throughput: 0: 42766.2. Samples: 1813239220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 10:05:16,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 10:05:19,076][12883] Updated weights for policy 0, policy_version 110673 (0.0029) +[2024-06-18 10:05:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1813381120. Throughput: 0: 42845.2. Samples: 1813499840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 10:05:21,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 10:05:22,762][12883] Updated weights for policy 0, policy_version 110683 (0.0027) +[2024-06-18 10:05:26,789][12883] Updated weights for policy 0, policy_version 110693 (0.0033) +[2024-06-18 10:05:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1813594112. Throughput: 0: 42811.1. Samples: 1813759000. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 10:05:26,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 10:05:30,455][12883] Updated weights for policy 0, policy_version 110703 (0.0041) +[2024-06-18 10:05:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 1813823488. Throughput: 0: 42880.9. Samples: 1813883700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 10:05:31,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 10:05:34,421][12883] Updated weights for policy 0, policy_version 110713 (0.0025) +[2024-06-18 10:05:36,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42871.5, 300 sec: 42820.2). Total num frames: 1814036480. Throughput: 0: 42932.9. Samples: 1814145020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 10:05:36,996][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 10:05:37,944][12883] Updated weights for policy 0, policy_version 110723 (0.0040) +[2024-06-18 10:05:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1814233088. Throughput: 0: 42848.9. Samples: 1814402620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 10:05:41,994][12645] Avg episode reward: [(0, '0.207')] +[2024-06-18 10:05:42,031][12883] Updated weights for policy 0, policy_version 110733 (0.0033) +[2024-06-18 10:05:45,535][12883] Updated weights for policy 0, policy_version 110743 (0.0036) +[2024-06-18 10:05:46,994][12645] Fps is (10 sec: 42607.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1814462464. Throughput: 0: 42889.4. Samples: 1814526600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 10:05:46,994][12645] Avg episode reward: [(0, '0.208')] +[2024-06-18 10:05:49,505][12883] Updated weights for policy 0, policy_version 110753 (0.0031) +[2024-06-18 10:05:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1814691840. Throughput: 0: 42871.7. 
Samples: 1814789560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 10:05:51,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 10:05:53,122][12883] Updated weights for policy 0, policy_version 110763 (0.0023) +[2024-06-18 10:05:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1814888448. Throughput: 0: 42933.5. Samples: 1815046000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 10:05:56,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 10:05:57,088][12883] Updated weights for policy 0, policy_version 110773 (0.0041) +[2024-06-18 10:06:00,727][12883] Updated weights for policy 0, policy_version 110783 (0.0044) +[2024-06-18 10:06:02,000][12645] Fps is (10 sec: 40933.9, 60 sec: 43140.1, 300 sec: 42708.6). Total num frames: 1815101440. Throughput: 0: 42886.5. Samples: 1815169380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 10:06:02,001][12645] Avg episode reward: [(0, '0.788')] +[2024-06-18 10:06:04,528][12862] Signal inference workers to stop experience collection... (26500 times) +[2024-06-18 10:06:04,528][12862] Signal inference workers to resume experience collection... (26500 times) +[2024-06-18 10:06:04,561][12883] InferenceWorker_p0-w0: stopping experience collection (26500 times) +[2024-06-18 10:06:04,562][12883] InferenceWorker_p0-w0: resuming experience collection (26500 times) +[2024-06-18 10:06:04,683][12883] Updated weights for policy 0, policy_version 110793 (0.0038) +[2024-06-18 10:06:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1815314432. Throughput: 0: 42927.7. Samples: 1815431580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 10:06:06,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 10:06:08,266][12883] Updated weights for policy 0, policy_version 110803 (0.0031) +[2024-06-18 10:06:11,994][12645] Fps is (10 sec: 42625.6, 60 sec: 42871.6, 300 sec: 42765.0). 
Total num frames: 1815527424. Throughput: 0: 42993.0. Samples: 1815693680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 10:06:11,994][12645] Avg episode reward: [(0, '0.614')] +[2024-06-18 10:06:12,314][12883] Updated weights for policy 0, policy_version 110813 (0.0029) +[2024-06-18 10:06:15,883][12883] Updated weights for policy 0, policy_version 110823 (0.0039) +[2024-06-18 10:06:16,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1815756800. Throughput: 0: 43030.3. Samples: 1815820060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 10:06:16,994][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 10:06:20,182][12883] Updated weights for policy 0, policy_version 110833 (0.0038) +[2024-06-18 10:06:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1815969792. Throughput: 0: 42903.9. Samples: 1816075600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 10:06:21,996][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 10:06:23,527][12883] Updated weights for policy 0, policy_version 110843 (0.0037) +[2024-06-18 10:06:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1816166400. Throughput: 0: 42862.7. Samples: 1816331440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 10:06:26,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 10:06:28,023][12883] Updated weights for policy 0, policy_version 110853 (0.0038) +[2024-06-18 10:06:31,184][12883] Updated weights for policy 0, policy_version 110863 (0.0036) +[2024-06-18 10:06:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1816412160. Throughput: 0: 43044.9. Samples: 1816463620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 10:06:31,994][12645] Avg episode reward: [(0, '0.630')] +[2024-06-18 10:06:35,666][12883] Updated weights for policy 0, policy_version 110873 (0.0040) +[2024-06-18 10:06:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1816592384. Throughput: 0: 42864.4. Samples: 1816718460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 10:06:36,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 10:06:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000110876_1816592384.pth... +[2024-06-18 10:06:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000110250_1806336000.pth +[2024-06-18 10:06:38,838][12883] Updated weights for policy 0, policy_version 110883 (0.0036) +[2024-06-18 10:06:41,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 1816805376. Throughput: 0: 42903.9. Samples: 1816976780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 10:06:41,997][12645] Avg episode reward: [(0, '0.359')] +[2024-06-18 10:06:43,382][12883] Updated weights for policy 0, policy_version 110893 (0.0024) +[2024-06-18 10:06:46,494][12883] Updated weights for policy 0, policy_version 110903 (0.0037) +[2024-06-18 10:06:46,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1817051136. Throughput: 0: 43082.4. Samples: 1817107820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 10:06:46,994][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 10:06:51,095][12883] Updated weights for policy 0, policy_version 110913 (0.0034) +[2024-06-18 10:06:51,994][12645] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1817231360. Throughput: 0: 42845.3. Samples: 1817359620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 10:06:51,994][12645] Avg episode reward: [(0, '0.156')] +[2024-06-18 10:06:54,236][12883] Updated weights for policy 0, policy_version 110923 (0.0031) +[2024-06-18 10:06:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1817444352. Throughput: 0: 42739.9. Samples: 1817616980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 10:06:56,994][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 10:06:58,698][12883] Updated weights for policy 0, policy_version 110933 (0.0036) +[2024-06-18 10:07:01,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 1817673728. Throughput: 0: 42685.4. Samples: 1817740900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 10:07:01,994][12645] Avg episode reward: [(0, '0.359')] +[2024-06-18 10:07:02,098][12883] Updated weights for policy 0, policy_version 110943 (0.0037) +[2024-06-18 10:07:06,208][12883] Updated weights for policy 0, policy_version 110953 (0.0036) +[2024-06-18 10:07:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1817886720. Throughput: 0: 42807.6. Samples: 1818001940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 10:07:06,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 10:07:09,903][12883] Updated weights for policy 0, policy_version 110963 (0.0034) +[2024-06-18 10:07:11,993][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1818083328. Throughput: 0: 42833.4. Samples: 1818258940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 10:07:11,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 10:07:14,183][12883] Updated weights for policy 0, policy_version 110973 (0.0036) +[2024-06-18 10:07:16,999][12645] Fps is (10 sec: 44215.3, 60 sec: 42868.0, 300 sec: 42764.3). Total num frames: 1818329088. Throughput: 0: 42589.7. 
Samples: 1818380360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 10:07:16,999][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 10:07:17,401][12883] Updated weights for policy 0, policy_version 110983 (0.0033) +[2024-06-18 10:07:21,892][12883] Updated weights for policy 0, policy_version 110993 (0.0041) +[2024-06-18 10:07:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1818509312. Throughput: 0: 42731.6. Samples: 1818641380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 10:07:21,994][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 10:07:24,938][12883] Updated weights for policy 0, policy_version 111003 (0.0036) +[2024-06-18 10:07:26,994][12645] Fps is (10 sec: 39340.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1818722304. Throughput: 0: 42619.4. Samples: 1818894560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 10:07:26,997][12645] Avg episode reward: [(0, '0.401')] +[2024-06-18 10:07:29,521][12883] Updated weights for policy 0, policy_version 111013 (0.0040) +[2024-06-18 10:07:29,897][12862] Signal inference workers to stop experience collection... (26550 times) +[2024-06-18 10:07:29,897][12862] Signal inference workers to resume experience collection... (26550 times) +[2024-06-18 10:07:29,946][12883] InferenceWorker_p0-w0: stopping experience collection (26550 times) +[2024-06-18 10:07:29,947][12883] InferenceWorker_p0-w0: resuming experience collection (26550 times) +[2024-06-18 10:07:31,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1818951680. Throughput: 0: 42545.3. Samples: 1819022360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 10:07:31,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 10:07:33,031][12883] Updated weights for policy 0, policy_version 111023 (0.0038) +[2024-06-18 10:07:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.3). 
Total num frames: 1819131904. Throughput: 0: 42656.4. Samples: 1819279160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 10:07:36,994][12645] Avg episode reward: [(0, '0.573')] +[2024-06-18 10:07:37,318][12883] Updated weights for policy 0, policy_version 111033 (0.0033) +[2024-06-18 10:07:40,623][12883] Updated weights for policy 0, policy_version 111043 (0.0027) +[2024-06-18 10:07:41,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42873.1, 300 sec: 42654.0). Total num frames: 1819377664. Throughput: 0: 42506.3. Samples: 1819529760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 10:07:41,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 10:07:44,967][12883] Updated weights for policy 0, policy_version 111053 (0.0042) +[2024-06-18 10:07:46,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1819590656. Throughput: 0: 42771.2. Samples: 1819665600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 10:07:46,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 10:07:48,025][12883] Updated weights for policy 0, policy_version 111063 (0.0027) +[2024-06-18 10:07:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1819787264. Throughput: 0: 42591.1. Samples: 1819918540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 10:07:51,994][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 10:07:52,467][12883] Updated weights for policy 0, policy_version 111073 (0.0033) +[2024-06-18 10:07:56,073][12883] Updated weights for policy 0, policy_version 111083 (0.0031) +[2024-06-18 10:07:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1820033024. Throughput: 0: 42532.3. Samples: 1820172900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 10:07:56,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 10:08:00,091][12883] Updated weights for policy 0, policy_version 111093 (0.0034) +[2024-06-18 10:08:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1820229632. Throughput: 0: 42824.7. Samples: 1820307260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 10:08:01,996][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 10:08:03,599][12883] Updated weights for policy 0, policy_version 111103 (0.0036) +[2024-06-18 10:08:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1820426240. Throughput: 0: 42627.5. Samples: 1820559620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 10:08:06,994][12645] Avg episode reward: [(0, '0.337')] +[2024-06-18 10:08:07,684][12883] Updated weights for policy 0, policy_version 111113 (0.0040) +[2024-06-18 10:08:11,429][12883] Updated weights for policy 0, policy_version 111123 (0.0021) +[2024-06-18 10:08:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1820672000. Throughput: 0: 42469.8. Samples: 1820805700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 10:08:11,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 10:08:15,279][12883] Updated weights for policy 0, policy_version 111133 (0.0033) +[2024-06-18 10:08:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42328.8, 300 sec: 42709.5). Total num frames: 1820868608. Throughput: 0: 42670.0. Samples: 1820942500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 10:08:16,994][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 10:08:18,963][12883] Updated weights for policy 0, policy_version 111143 (0.0046) +[2024-06-18 10:08:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1821048832. Throughput: 0: 42560.9. 
Samples: 1821194400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 10:08:21,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 10:08:23,068][12883] Updated weights for policy 0, policy_version 111153 (0.0037) +[2024-06-18 10:08:26,858][12883] Updated weights for policy 0, policy_version 111163 (0.0041) +[2024-06-18 10:08:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1821294592. Throughput: 0: 42673.3. Samples: 1821450060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 10:08:26,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 10:08:30,731][12883] Updated weights for policy 0, policy_version 111173 (0.0030) +[2024-06-18 10:08:31,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 1821507584. Throughput: 0: 42594.6. Samples: 1821582360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 10:08:31,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 10:08:34,496][12883] Updated weights for policy 0, policy_version 111183 (0.0035) +[2024-06-18 10:08:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1821704192. Throughput: 0: 42556.1. Samples: 1821833560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 10:08:36,994][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 10:08:37,086][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000111189_1821720576.pth... +[2024-06-18 10:08:37,134][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000110563_1811464192.pth +[2024-06-18 10:08:38,646][12883] Updated weights for policy 0, policy_version 111193 (0.0022) +[2024-06-18 10:08:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1821933568. Throughput: 0: 42538.7. Samples: 1822087140. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 10:08:41,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 10:08:42,316][12883] Updated weights for policy 0, policy_version 111203 (0.0032) +[2024-06-18 10:08:46,499][12883] Updated weights for policy 0, policy_version 111213 (0.0022) +[2024-06-18 10:08:47,000][12645] Fps is (10 sec: 44209.1, 60 sec: 42593.9, 300 sec: 42708.6). Total num frames: 1822146560. Throughput: 0: 42472.3. Samples: 1822218780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 10:08:47,000][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 10:08:49,949][12883] Updated weights for policy 0, policy_version 111223 (0.0027) +[2024-06-18 10:08:51,995][12645] Fps is (10 sec: 42593.0, 60 sec: 42870.6, 300 sec: 42709.3). Total num frames: 1822359552. Throughput: 0: 42522.9. Samples: 1822473200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 10:08:51,995][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 10:08:54,210][12883] Updated weights for policy 0, policy_version 111233 (0.0033) +[2024-06-18 10:08:57,000][12645] Fps is (10 sec: 42597.6, 60 sec: 42320.8, 300 sec: 42819.6). Total num frames: 1822572544. Throughput: 0: 42704.6. Samples: 1822727680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 10:08:57,001][12645] Avg episode reward: [(0, '0.640')] +[2024-06-18 10:08:57,869][12883] Updated weights for policy 0, policy_version 111243 (0.0032) +[2024-06-18 10:09:01,988][12883] Updated weights for policy 0, policy_version 111253 (0.0037) +[2024-06-18 10:09:01,994][12645] Fps is (10 sec: 40964.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1822769152. Throughput: 0: 42544.8. Samples: 1822857020. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 10:09:01,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 10:09:05,525][12883] Updated weights for policy 0, policy_version 111263 (0.0042) +[2024-06-18 10:09:06,994][12645] Fps is (10 sec: 44264.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1823014912. Throughput: 0: 42636.8. Samples: 1823113060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 10:09:06,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 10:09:09,523][12862] Signal inference workers to stop experience collection... (26600 times) +[2024-06-18 10:09:09,524][12862] Signal inference workers to resume experience collection... (26600 times) +[2024-06-18 10:09:09,535][12883] InferenceWorker_p0-w0: stopping experience collection (26600 times) +[2024-06-18 10:09:09,547][12883] InferenceWorker_p0-w0: resuming experience collection (26600 times) +[2024-06-18 10:09:09,687][12883] Updated weights for policy 0, policy_version 111273 (0.0042) +[2024-06-18 10:09:11,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1823227904. Throughput: 0: 42490.7. Samples: 1823362140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 10:09:11,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 10:09:13,222][12883] Updated weights for policy 0, policy_version 111283 (0.0025) +[2024-06-18 10:09:16,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1823391744. Throughput: 0: 42440.4. Samples: 1823492180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:09:16,994][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 10:09:17,335][12883] Updated weights for policy 0, policy_version 111293 (0.0024) +[2024-06-18 10:09:20,841][12883] Updated weights for policy 0, policy_version 111303 (0.0036) +[2024-06-18 10:09:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 1823653888. 
Throughput: 0: 42647.0. Samples: 1823752680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:09:21,994][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 10:09:25,004][12883] Updated weights for policy 0, policy_version 111313 (0.0035) +[2024-06-18 10:09:26,994][12645] Fps is (10 sec: 47512.8, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 1823866880. Throughput: 0: 42587.7. Samples: 1824003600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:09:26,995][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 10:09:28,582][12883] Updated weights for policy 0, policy_version 111323 (0.0031) +[2024-06-18 10:09:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 1824047104. Throughput: 0: 42496.1. Samples: 1824130840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:09:31,994][12645] Avg episode reward: [(0, '0.562')] +[2024-06-18 10:09:32,711][12883] Updated weights for policy 0, policy_version 111333 (0.0037) +[2024-06-18 10:09:36,359][12883] Updated weights for policy 0, policy_version 111343 (0.0032) +[2024-06-18 10:09:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1824276480. Throughput: 0: 42575.3. Samples: 1824389040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:09:36,994][12645] Avg episode reward: [(0, '0.602')] +[2024-06-18 10:09:40,511][12883] Updated weights for policy 0, policy_version 111353 (0.0037) +[2024-06-18 10:09:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1824489472. Throughput: 0: 42392.4. Samples: 1824635060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:09:41,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 10:09:44,161][12883] Updated weights for policy 0, policy_version 111363 (0.0051) +[2024-06-18 10:09:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42329.6, 300 sec: 42653.9). 
Total num frames: 1824686080. Throughput: 0: 42549.7. Samples: 1824771760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:09:46,994][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 10:09:48,007][12883] Updated weights for policy 0, policy_version 111373 (0.0038) +[2024-06-18 10:09:51,994][12645] Fps is (10 sec: 39320.7, 60 sec: 42053.1, 300 sec: 42598.4). Total num frames: 1824882688. Throughput: 0: 42394.6. Samples: 1825020820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:09:51,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 10:09:52,130][12883] Updated weights for policy 0, policy_version 111383 (0.0036) +[2024-06-18 10:09:55,958][12883] Updated weights for policy 0, policy_version 111393 (0.0024) +[2024-06-18 10:09:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 1825128448. Throughput: 0: 42584.8. Samples: 1825278460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:09:56,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 10:09:59,733][12883] Updated weights for policy 0, policy_version 111403 (0.0029) +[2024-06-18 10:10:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1825325056. Throughput: 0: 42620.0. Samples: 1825410080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:10:01,995][12645] Avg episode reward: [(0, '0.562')] +[2024-06-18 10:10:03,372][12883] Updated weights for policy 0, policy_version 111413 (0.0034) +[2024-06-18 10:10:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1825538048. Throughput: 0: 42427.1. Samples: 1825661900. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:10:06,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 10:10:07,142][12883] Updated weights for policy 0, policy_version 111423 (0.0034) +[2024-06-18 10:10:10,866][12883] Updated weights for policy 0, policy_version 111433 (0.0041) +[2024-06-18 10:10:11,351][12862] Signal inference workers to stop experience collection... (26650 times) +[2024-06-18 10:10:11,351][12862] Signal inference workers to resume experience collection... (26650 times) +[2024-06-18 10:10:11,372][12883] InferenceWorker_p0-w0: stopping experience collection (26650 times) +[2024-06-18 10:10:11,372][12883] InferenceWorker_p0-w0: resuming experience collection (26650 times) +[2024-06-18 10:10:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1825767424. Throughput: 0: 42638.4. Samples: 1825922320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:10:11,994][12645] Avg episode reward: [(0, '0.480')] +[2024-06-18 10:10:14,658][12883] Updated weights for policy 0, policy_version 111443 (0.0028) +[2024-06-18 10:10:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1825980416. Throughput: 0: 42763.4. Samples: 1826055200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:10:16,994][12645] Avg episode reward: [(0, '0.548')] +[2024-06-18 10:10:18,632][12883] Updated weights for policy 0, policy_version 111453 (0.0037) +[2024-06-18 10:10:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1826177024. Throughput: 0: 42551.3. Samples: 1826303840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:10:21,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 10:10:22,238][12883] Updated weights for policy 0, policy_version 111463 (0.0037) +[2024-06-18 10:10:26,207][12883] Updated weights for policy 0, policy_version 111473 (0.0038) +[2024-06-18 10:10:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1826406400. Throughput: 0: 42847.3. Samples: 1826563200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:10:26,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 10:10:29,772][12883] Updated weights for policy 0, policy_version 111483 (0.0040) +[2024-06-18 10:10:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 1826586624. Throughput: 0: 42575.3. Samples: 1826687640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:10:31,994][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 10:10:34,305][12883] Updated weights for policy 0, policy_version 111493 (0.0033) +[2024-06-18 10:10:37,000][12645] Fps is (10 sec: 42572.3, 60 sec: 42594.0, 300 sec: 42708.6). Total num frames: 1826832384. Throughput: 0: 42610.6. Samples: 1826938560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:10:37,001][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 10:10:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000111501_1826832384.pth... +[2024-06-18 10:10:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000110876_1816592384.pth +[2024-06-18 10:10:37,774][12883] Updated weights for policy 0, policy_version 111503 (0.0027) +[2024-06-18 10:10:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1827012608. Throughput: 0: 42540.5. Samples: 1827192780. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:10:41,994][12645] Avg episode reward: [(0, '0.638')] +[2024-06-18 10:10:42,083][12883] Updated weights for policy 0, policy_version 111513 (0.0038) +[2024-06-18 10:10:45,319][12883] Updated weights for policy 0, policy_version 111523 (0.0027) +[2024-06-18 10:10:46,994][12645] Fps is (10 sec: 40985.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1827241984. Throughput: 0: 42363.9. Samples: 1827316460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:10:46,994][12645] Avg episode reward: [(0, '0.748')] +[2024-06-18 10:10:49,843][12883] Updated weights for policy 0, policy_version 111533 (0.0030) +[2024-06-18 10:10:51,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1827471360. Throughput: 0: 42524.9. Samples: 1827575520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:10:51,994][12645] Avg episode reward: [(0, '0.749')] +[2024-06-18 10:10:52,973][12883] Updated weights for policy 0, policy_version 111543 (0.0023) +[2024-06-18 10:10:56,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 42543.8). Total num frames: 1827651584. Throughput: 0: 42386.3. Samples: 1827829700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:10:56,994][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 10:10:57,646][12883] Updated weights for policy 0, policy_version 111553 (0.0043) +[2024-06-18 10:11:01,079][12883] Updated weights for policy 0, policy_version 111563 (0.0036) +[2024-06-18 10:11:01,995][12645] Fps is (10 sec: 42591.6, 60 sec: 42870.4, 300 sec: 42653.7). Total num frames: 1827897344. Throughput: 0: 42038.1. Samples: 1827946980. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:11:01,996][12645] Avg episode reward: [(0, '0.268')] +[2024-06-18 10:11:05,326][12883] Updated weights for policy 0, policy_version 111573 (0.0045) +[2024-06-18 10:11:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1828093952. Throughput: 0: 42266.2. Samples: 1828205820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:11:06,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 10:11:08,841][12883] Updated weights for policy 0, policy_version 111583 (0.0034) +[2024-06-18 10:11:11,994][12645] Fps is (10 sec: 39327.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1828290560. Throughput: 0: 42031.7. Samples: 1828454620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:11:11,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 10:11:12,963][12883] Updated weights for policy 0, policy_version 111593 (0.0028) +[2024-06-18 10:11:16,440][12883] Updated weights for policy 0, policy_version 111603 (0.0040) +[2024-06-18 10:11:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1828536320. Throughput: 0: 42132.8. Samples: 1828583620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 10:11:16,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 10:11:20,566][12883] Updated weights for policy 0, policy_version 111613 (0.0034) +[2024-06-18 10:11:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1828700160. Throughput: 0: 42220.6. Samples: 1828838220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 10:11:21,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 10:11:24,073][12883] Updated weights for policy 0, policy_version 111623 (0.0038) +[2024-06-18 10:11:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1828929536. Throughput: 0: 42206.7. 
Samples: 1829092080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 10:11:26,994][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 10:11:28,599][12883] Updated weights for policy 0, policy_version 111633 (0.0034) +[2024-06-18 10:11:31,527][12862] Signal inference workers to stop experience collection... (26700 times) +[2024-06-18 10:11:31,527][12862] Signal inference workers to resume experience collection... (26700 times) +[2024-06-18 10:11:31,552][12883] InferenceWorker_p0-w0: stopping experience collection (26700 times) +[2024-06-18 10:11:31,552][12883] InferenceWorker_p0-w0: resuming experience collection (26700 times) +[2024-06-18 10:11:31,666][12883] Updated weights for policy 0, policy_version 111643 (0.0030) +[2024-06-18 10:11:31,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1829158912. Throughput: 0: 42314.7. Samples: 1829220620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 10:11:31,994][12645] Avg episode reward: [(0, '0.598')] +[2024-06-18 10:11:36,055][12883] Updated weights for policy 0, policy_version 111653 (0.0034) +[2024-06-18 10:11:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42056.7, 300 sec: 42543.2). Total num frames: 1829355520. Throughput: 0: 42335.1. Samples: 1829480600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 10:11:36,994][12645] Avg episode reward: [(0, '0.681')] +[2024-06-18 10:11:39,765][12883] Updated weights for policy 0, policy_version 111663 (0.0044) +[2024-06-18 10:11:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1829568512. Throughput: 0: 42286.5. Samples: 1829732600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 10:11:41,994][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 10:11:43,609][12883] Updated weights for policy 0, policy_version 111673 (0.0043) +[2024-06-18 10:11:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42598.4). 
Total num frames: 1829797888. Throughput: 0: 42515.6. Samples: 1829860120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 10:11:46,994][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 10:11:47,302][12883] Updated weights for policy 0, policy_version 111683 (0.0030) +[2024-06-18 10:11:51,316][12883] Updated weights for policy 0, policy_version 111693 (0.0044) +[2024-06-18 10:11:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1829994496. Throughput: 0: 42537.3. Samples: 1830120000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 10:11:51,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 10:11:54,804][12883] Updated weights for policy 0, policy_version 111703 (0.0031) +[2024-06-18 10:11:56,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1830191104. Throughput: 0: 42695.6. Samples: 1830375920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 10:11:56,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 10:11:59,047][12883] Updated weights for policy 0, policy_version 111713 (0.0028) +[2024-06-18 10:12:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42599.5, 300 sec: 42598.4). Total num frames: 1830453248. Throughput: 0: 42664.9. Samples: 1830503540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 10:12:01,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 10:12:02,191][12883] Updated weights for policy 0, policy_version 111723 (0.0036) +[2024-06-18 10:12:06,695][12883] Updated weights for policy 0, policy_version 111733 (0.0042) +[2024-06-18 10:12:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1830633472. Throughput: 0: 42732.8. Samples: 1830761200. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 10:12:06,994][12645] Avg episode reward: [(0, '0.632')] +[2024-06-18 10:12:09,780][12883] Updated weights for policy 0, policy_version 111743 (0.0028) +[2024-06-18 10:12:11,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42432.5). Total num frames: 1830846464. Throughput: 0: 42788.5. Samples: 1831017560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 10:12:11,994][12645] Avg episode reward: [(0, '0.585')] +[2024-06-18 10:12:14,411][12883] Updated weights for policy 0, policy_version 111753 (0.0033) +[2024-06-18 10:12:16,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1831075840. Throughput: 0: 42731.2. Samples: 1831143520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 10:12:16,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 10:12:17,495][12883] Updated weights for policy 0, policy_version 111763 (0.0043) +[2024-06-18 10:12:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1831288832. Throughput: 0: 42657.8. Samples: 1831400200. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 10:12:21,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 10:12:21,999][12883] Updated weights for policy 0, policy_version 111773 (0.0033) +[2024-06-18 10:12:25,770][12883] Updated weights for policy 0, policy_version 111783 (0.0038) +[2024-06-18 10:12:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1831501824. Throughput: 0: 42851.7. Samples: 1831660920. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 10:12:26,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 10:12:29,472][12883] Updated weights for policy 0, policy_version 111793 (0.0036) +[2024-06-18 10:12:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1831731200. Throughput: 0: 42729.5. 
Samples: 1831782940. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 10:12:31,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 10:12:33,330][12883] Updated weights for policy 0, policy_version 111803 (0.0023) +[2024-06-18 10:12:36,994][12645] Fps is (10 sec: 42597.2, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 1831927808. Throughput: 0: 42770.5. Samples: 1832044680. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 10:12:36,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 10:12:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000111812_1831927808.pth... +[2024-06-18 10:12:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000111189_1821720576.pth +[2024-06-18 10:12:37,256][12883] Updated weights for policy 0, policy_version 111813 (0.0041) +[2024-06-18 10:12:40,983][12883] Updated weights for policy 0, policy_version 111823 (0.0027) +[2024-06-18 10:12:42,000][12645] Fps is (10 sec: 40934.1, 60 sec: 42867.1, 300 sec: 42541.9). Total num frames: 1832140800. Throughput: 0: 42809.5. Samples: 1832302620. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 10:12:42,000][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 10:12:44,868][12883] Updated weights for policy 0, policy_version 111833 (0.0036) +[2024-06-18 10:12:46,994][12645] Fps is (10 sec: 42599.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1832353792. Throughput: 0: 42737.9. Samples: 1832426740. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 10:12:46,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 10:12:48,763][12883] Updated weights for policy 0, policy_version 111843 (0.0033) +[2024-06-18 10:12:51,994][12645] Fps is (10 sec: 44263.9, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 1832583168. Throughput: 0: 42840.4. Samples: 1832689020. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 10:12:51,994][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 10:12:52,545][12883] Updated weights for policy 0, policy_version 111853 (0.0042) +[2024-06-18 10:12:56,422][12883] Updated weights for policy 0, policy_version 111863 (0.0037) +[2024-06-18 10:12:56,995][12645] Fps is (10 sec: 42591.2, 60 sec: 43143.3, 300 sec: 42542.6). Total num frames: 1832779776. Throughput: 0: 42663.3. Samples: 1832937480. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 10:12:56,996][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 10:13:00,414][12883] Updated weights for policy 0, policy_version 111873 (0.0034) +[2024-06-18 10:13:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1832992768. Throughput: 0: 42603.0. Samples: 1833060660. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 10:13:01,994][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 10:13:04,237][12883] Updated weights for policy 0, policy_version 111883 (0.0034) +[2024-06-18 10:13:06,994][12645] Fps is (10 sec: 42605.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1833205760. Throughput: 0: 42643.5. Samples: 1833319160. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 10:13:06,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 10:13:08,120][12883] Updated weights for policy 0, policy_version 111893 (0.0047) +[2024-06-18 10:13:08,465][12862] Signal inference workers to stop experience collection... (26750 times) +[2024-06-18 10:13:08,507][12883] InferenceWorker_p0-w0: stopping experience collection (26750 times) +[2024-06-18 10:13:08,531][12862] Signal inference workers to resume experience collection... 
(26750 times) +[2024-06-18 10:13:08,536][12883] InferenceWorker_p0-w0: resuming experience collection (26750 times) +[2024-06-18 10:13:11,822][12883] Updated weights for policy 0, policy_version 111903 (0.0028) +[2024-06-18 10:13:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1833418752. Throughput: 0: 42547.1. Samples: 1833575540. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 10:13:11,994][12645] Avg episode reward: [(0, '0.678')] +[2024-06-18 10:13:15,654][12883] Updated weights for policy 0, policy_version 111913 (0.0037) +[2024-06-18 10:13:17,000][12645] Fps is (10 sec: 42571.2, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 1833631744. Throughput: 0: 42723.2. Samples: 1833705760. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) +[2024-06-18 10:13:17,009][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 10:13:19,212][12883] Updated weights for policy 0, policy_version 111923 (0.0024) +[2024-06-18 10:13:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1833844736. Throughput: 0: 42632.6. Samples: 1833963140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:13:21,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 10:13:23,250][12883] Updated weights for policy 0, policy_version 111933 (0.0029) +[2024-06-18 10:13:26,994][12645] Fps is (10 sec: 42625.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1834057728. Throughput: 0: 42566.4. Samples: 1834217840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:13:26,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 10:13:27,058][12883] Updated weights for policy 0, policy_version 111943 (0.0032) +[2024-06-18 10:13:30,967][12883] Updated weights for policy 0, policy_version 111953 (0.0026) +[2024-06-18 10:13:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1834287104. Throughput: 0: 42731.4. 
Samples: 1834349660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:13:31,994][12645] Avg episode reward: [(0, '0.494')] +[2024-06-18 10:13:34,598][12883] Updated weights for policy 0, policy_version 111963 (0.0038) +[2024-06-18 10:13:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1834483712. Throughput: 0: 42544.1. Samples: 1834603500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:13:36,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 10:13:38,655][12883] Updated weights for policy 0, policy_version 111973 (0.0034) +[2024-06-18 10:13:41,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42602.9, 300 sec: 42543.8). Total num frames: 1834696704. Throughput: 0: 42695.9. Samples: 1834858720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:13:41,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 10:13:42,470][12883] Updated weights for policy 0, policy_version 111983 (0.0034) +[2024-06-18 10:13:46,374][12883] Updated weights for policy 0, policy_version 111993 (0.0028) +[2024-06-18 10:13:46,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42598.6). Total num frames: 1834926080. Throughput: 0: 42834.1. Samples: 1834988200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:13:46,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 10:13:50,242][12883] Updated weights for policy 0, policy_version 112003 (0.0031) +[2024-06-18 10:13:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.4, 300 sec: 42488.2). Total num frames: 1835106304. Throughput: 0: 42598.6. Samples: 1835236100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:13:51,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 10:13:54,035][12883] Updated weights for policy 0, policy_version 112013 (0.0033) +[2024-06-18 10:13:56,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42599.6, 300 sec: 42598.4). Total num frames: 1835335680. 
Throughput: 0: 42755.1. Samples: 1835499520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:13:56,994][12645] Avg episode reward: [(0, '0.265')] +[2024-06-18 10:13:57,695][12883] Updated weights for policy 0, policy_version 112023 (0.0034) +[2024-06-18 10:14:01,470][12883] Updated weights for policy 0, policy_version 112033 (0.0027) +[2024-06-18 10:14:01,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1835565056. Throughput: 0: 42831.8. Samples: 1835632920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:14:01,994][12645] Avg episode reward: [(0, '0.248')] +[2024-06-18 10:14:05,275][12883] Updated weights for policy 0, policy_version 112043 (0.0024) +[2024-06-18 10:14:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1835761664. Throughput: 0: 42772.4. Samples: 1835887900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:14:06,995][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 10:14:08,938][12883] Updated weights for policy 0, policy_version 112053 (0.0026) +[2024-06-18 10:14:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1835974656. Throughput: 0: 42847.9. Samples: 1836146000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:14:11,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 10:14:12,838][12883] Updated weights for policy 0, policy_version 112063 (0.0026) +[2024-06-18 10:14:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42602.9, 300 sec: 42487.3). Total num frames: 1836187648. Throughput: 0: 42797.4. Samples: 1836275540. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:14:16,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 10:14:17,048][12883] Updated weights for policy 0, policy_version 112073 (0.0041) +[2024-06-18 10:14:20,478][12883] Updated weights for policy 0, policy_version 112083 (0.0039) +[2024-06-18 10:14:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1836417024. Throughput: 0: 42749.9. Samples: 1836527240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 10:14:21,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 10:14:24,849][12883] Updated weights for policy 0, policy_version 112093 (0.0035) +[2024-06-18 10:14:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1836613632. Throughput: 0: 42756.8. Samples: 1836782780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 10:14:26,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 10:14:28,135][12883] Updated weights for policy 0, policy_version 112103 (0.0039) +[2024-06-18 10:14:31,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42596.9, 300 sec: 42598.1). Total num frames: 1836843008. Throughput: 0: 42719.4. Samples: 1836910660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 10:14:31,996][12645] Avg episode reward: [(0, '0.580')] +[2024-06-18 10:14:32,359][12883] Updated weights for policy 0, policy_version 112113 (0.0035) +[2024-06-18 10:14:35,044][12862] Signal inference workers to stop experience collection... (26800 times) +[2024-06-18 10:14:35,044][12862] Signal inference workers to resume experience collection... 
(26800 times) +[2024-06-18 10:14:35,075][12883] InferenceWorker_p0-w0: stopping experience collection (26800 times) +[2024-06-18 10:14:35,075][12883] InferenceWorker_p0-w0: resuming experience collection (26800 times) +[2024-06-18 10:14:35,697][12883] Updated weights for policy 0, policy_version 112123 (0.0035) +[2024-06-18 10:14:36,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1837056000. Throughput: 0: 42836.5. Samples: 1837163740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 10:14:36,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 10:14:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000112125_1837056000.pth... +[2024-06-18 10:14:37,070][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000111501_1826832384.pth +[2024-06-18 10:14:39,916][12883] Updated weights for policy 0, policy_version 112133 (0.0042) +[2024-06-18 10:14:41,996][12645] Fps is (10 sec: 40960.1, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1837252608. Throughput: 0: 42866.3. Samples: 1837428600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 10:14:41,996][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 10:14:43,252][12883] Updated weights for policy 0, policy_version 112143 (0.0024) +[2024-06-18 10:14:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1837481984. Throughput: 0: 42756.4. Samples: 1837556960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 10:14:46,994][12645] Avg episode reward: [(0, '0.508')] +[2024-06-18 10:14:47,439][12883] Updated weights for policy 0, policy_version 112153 (0.0040) +[2024-06-18 10:14:51,421][12883] Updated weights for policy 0, policy_version 112163 (0.0047) +[2024-06-18 10:14:51,994][12645] Fps is (10 sec: 44246.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1837694976. Throughput: 0: 42620.6. Samples: 1837805820. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 10:14:51,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 10:14:55,318][12883] Updated weights for policy 0, policy_version 112173 (0.0039) +[2024-06-18 10:14:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1837891584. Throughput: 0: 42485.2. Samples: 1838057840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 10:14:56,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 10:14:58,991][12883] Updated weights for policy 0, policy_version 112183 (0.0037) +[2024-06-18 10:15:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1838104576. Throughput: 0: 42512.4. Samples: 1838188600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 10:15:01,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 10:15:03,190][12883] Updated weights for policy 0, policy_version 112193 (0.0041) +[2024-06-18 10:15:06,612][12883] Updated weights for policy 0, policy_version 112203 (0.0039) +[2024-06-18 10:15:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1838333952. Throughput: 0: 42488.3. Samples: 1838439220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 10:15:06,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 10:15:11,079][12883] Updated weights for policy 0, policy_version 112213 (0.0041) +[2024-06-18 10:15:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1838530560. Throughput: 0: 42497.4. Samples: 1838695160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 10:15:11,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 10:15:14,293][12883] Updated weights for policy 0, policy_version 112223 (0.0039) +[2024-06-18 10:15:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1838743552. Throughput: 0: 42481.6. Samples: 1838822240. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 10:15:16,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 10:15:18,543][12883] Updated weights for policy 0, policy_version 112233 (0.0031) +[2024-06-18 10:15:21,925][12883] Updated weights for policy 0, policy_version 112243 (0.0035) +[2024-06-18 10:15:21,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1838989312. Throughput: 0: 42743.0. Samples: 1839087180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:15:21,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 10:15:25,993][12883] Updated weights for policy 0, policy_version 112253 (0.0040) +[2024-06-18 10:15:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1839185920. Throughput: 0: 42491.0. Samples: 1839340600. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:15:26,994][12645] Avg episode reward: [(0, '0.579')] +[2024-06-18 10:15:29,906][12883] Updated weights for policy 0, policy_version 112263 (0.0039) +[2024-06-18 10:15:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42326.9, 300 sec: 42543.8). Total num frames: 1839382528. Throughput: 0: 42331.2. Samples: 1839461860. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:15:31,994][12645] Avg episode reward: [(0, '0.547')] +[2024-06-18 10:15:33,803][12883] Updated weights for policy 0, policy_version 112273 (0.0040) +[2024-06-18 10:15:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1839611904. Throughput: 0: 42604.4. Samples: 1839723020. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:15:36,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 10:15:37,792][12883] Updated weights for policy 0, policy_version 112283 (0.0041) +[2024-06-18 10:15:41,482][12883] Updated weights for policy 0, policy_version 112293 (0.0026) +[2024-06-18 10:15:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42873.1, 300 sec: 42654.0). Total num frames: 1839824896. Throughput: 0: 42545.9. Samples: 1839972400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:15:41,994][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 10:15:45,411][12883] Updated weights for policy 0, policy_version 112303 (0.0043) +[2024-06-18 10:15:46,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1840037888. Throughput: 0: 42522.7. Samples: 1840102220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:15:46,997][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 10:15:49,080][12883] Updated weights for policy 0, policy_version 112313 (0.0033) +[2024-06-18 10:15:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1840250880. Throughput: 0: 42791.1. Samples: 1840364820. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:15:51,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 10:15:53,038][12883] Updated weights for policy 0, policy_version 112323 (0.0042) +[2024-06-18 10:15:56,723][12883] Updated weights for policy 0, policy_version 112333 (0.0031) +[2024-06-18 10:15:56,994][12645] Fps is (10 sec: 44247.2, 60 sec: 43144.6, 300 sec: 42654.2). Total num frames: 1840480256. Throughput: 0: 42643.1. Samples: 1840614100. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:15:56,994][12645] Avg episode reward: [(0, '0.555')] +[2024-06-18 10:15:59,244][12862] Signal inference workers to stop experience collection... 
(26850 times) +[2024-06-18 10:15:59,244][12862] Signal inference workers to resume experience collection... (26850 times) +[2024-06-18 10:15:59,298][12883] InferenceWorker_p0-w0: stopping experience collection (26850 times) +[2024-06-18 10:15:59,299][12883] InferenceWorker_p0-w0: resuming experience collection (26850 times) +[2024-06-18 10:16:00,796][12883] Updated weights for policy 0, policy_version 112343 (0.0028) +[2024-06-18 10:16:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1840660480. Throughput: 0: 42639.2. Samples: 1840741000. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:16:01,994][12645] Avg episode reward: [(0, '0.447')] +[2024-06-18 10:16:04,278][12883] Updated weights for policy 0, policy_version 112353 (0.0027) +[2024-06-18 10:16:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1840873472. Throughput: 0: 42570.6. Samples: 1841002860. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:16:06,998][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 10:16:08,405][12883] Updated weights for policy 0, policy_version 112363 (0.0036) +[2024-06-18 10:16:11,938][12883] Updated weights for policy 0, policy_version 112373 (0.0032) +[2024-06-18 10:16:11,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1841119232. Throughput: 0: 42479.5. Samples: 1841252180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:16:11,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 10:16:16,051][12883] Updated weights for policy 0, policy_version 112383 (0.0041) +[2024-06-18 10:16:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1841315840. Throughput: 0: 42804.0. Samples: 1841388040. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:16:16,994][12645] Avg episode reward: [(0, '0.517')] +[2024-06-18 10:16:19,455][12883] Updated weights for policy 0, policy_version 112393 (0.0031) +[2024-06-18 10:16:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1841512448. Throughput: 0: 42515.6. Samples: 1841636220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:16:21,994][12645] Avg episode reward: [(0, '0.480')] +[2024-06-18 10:16:23,739][12883] Updated weights for policy 0, policy_version 112403 (0.0023) +[2024-06-18 10:16:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1841741824. Throughput: 0: 42566.6. Samples: 1841887900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 10:16:26,994][12645] Avg episode reward: [(0, '0.314')] +[2024-06-18 10:16:27,615][12883] Updated weights for policy 0, policy_version 112413 (0.0034) +[2024-06-18 10:16:31,553][12883] Updated weights for policy 0, policy_version 112423 (0.0038) +[2024-06-18 10:16:31,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1841954816. Throughput: 0: 42719.9. Samples: 1842024520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 10:16:31,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 10:16:35,155][12883] Updated weights for policy 0, policy_version 112433 (0.0032) +[2024-06-18 10:16:36,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1842135040. Throughput: 0: 42473.9. Samples: 1842276140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 10:16:36,994][12645] Avg episode reward: [(0, '0.572')] +[2024-06-18 10:16:37,100][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000112436_1842151424.pth... 
+[2024-06-18 10:16:37,145][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000111812_1831927808.pth +[2024-06-18 10:16:39,102][12883] Updated weights for policy 0, policy_version 112443 (0.0027) +[2024-06-18 10:16:41,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1842380800. Throughput: 0: 42592.9. Samples: 1842530780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 10:16:41,994][12645] Avg episode reward: [(0, '0.668')] +[2024-06-18 10:16:42,878][12883] Updated weights for policy 0, policy_version 112453 (0.0037) +[2024-06-18 10:16:46,882][12883] Updated weights for policy 0, policy_version 112463 (0.0036) +[2024-06-18 10:16:46,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 1842593792. Throughput: 0: 42784.1. Samples: 1842666280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 10:16:46,994][12645] Avg episode reward: [(0, '0.675')] +[2024-06-18 10:16:50,914][12883] Updated weights for policy 0, policy_version 112473 (0.0034) +[2024-06-18 10:16:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1842790400. Throughput: 0: 42566.3. Samples: 1842918340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 10:16:51,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 10:16:54,497][12883] Updated weights for policy 0, policy_version 112483 (0.0039) +[2024-06-18 10:16:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1843036160. Throughput: 0: 42638.2. Samples: 1843170900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 10:16:56,994][12645] Avg episode reward: [(0, '0.412')] +[2024-06-18 10:16:58,624][12883] Updated weights for policy 0, policy_version 112493 (0.0035) +[2024-06-18 10:17:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1843232768. Throughput: 0: 42545.7. 
Samples: 1843302600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 10:17:01,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 10:17:02,131][12883] Updated weights for policy 0, policy_version 112503 (0.0041) +[2024-06-18 10:17:06,323][12883] Updated weights for policy 0, policy_version 112513 (0.0030) +[2024-06-18 10:17:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1843445760. Throughput: 0: 42612.0. Samples: 1843553760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 10:17:06,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 10:17:09,313][12862] Signal inference workers to stop experience collection... (26900 times) +[2024-06-18 10:17:09,368][12883] InferenceWorker_p0-w0: stopping experience collection (26900 times) +[2024-06-18 10:17:09,373][12862] Signal inference workers to resume experience collection... (26900 times) +[2024-06-18 10:17:09,381][12883] InferenceWorker_p0-w0: resuming experience collection (26900 times) +[2024-06-18 10:17:09,840][12883] Updated weights for policy 0, policy_version 112523 (0.0037) +[2024-06-18 10:17:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1843658752. Throughput: 0: 42696.5. Samples: 1843809240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 10:17:11,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 10:17:13,831][12883] Updated weights for policy 0, policy_version 112533 (0.0028) +[2024-06-18 10:17:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1843871744. Throughput: 0: 42481.4. Samples: 1843936180. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 10:17:16,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 10:17:17,452][12883] Updated weights for policy 0, policy_version 112543 (0.0032) +[2024-06-18 10:17:21,665][12883] Updated weights for policy 0, policy_version 112553 (0.0031) +[2024-06-18 10:17:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1844068352. Throughput: 0: 42487.2. Samples: 1844188060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 10:17:21,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 10:17:25,273][12883] Updated weights for policy 0, policy_version 112563 (0.0043) +[2024-06-18 10:17:26,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 1844281344. Throughput: 0: 42553.3. Samples: 1844445780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 10:17:26,997][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 10:17:29,298][12883] Updated weights for policy 0, policy_version 112573 (0.0034) +[2024-06-18 10:17:31,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1844494336. Throughput: 0: 42405.6. Samples: 1844574540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 10:17:31,995][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 10:17:33,034][12883] Updated weights for policy 0, policy_version 112583 (0.0037) +[2024-06-18 10:17:36,837][12883] Updated weights for policy 0, policy_version 112593 (0.0038) +[2024-06-18 10:17:36,994][12645] Fps is (10 sec: 44246.8, 60 sec: 43144.5, 300 sec: 42654.8). Total num frames: 1844723712. Throughput: 0: 42450.7. Samples: 1844828620. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 10:17:36,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 10:17:41,016][12883] Updated weights for policy 0, policy_version 112603 (0.0034) +[2024-06-18 10:17:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1844936704. Throughput: 0: 42486.6. Samples: 1845082800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 10:17:41,994][12645] Avg episode reward: [(0, '0.706')] +[2024-06-18 10:17:44,645][12883] Updated weights for policy 0, policy_version 112613 (0.0038) +[2024-06-18 10:17:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1845133312. Throughput: 0: 42504.5. Samples: 1845215300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 10:17:46,994][12645] Avg episode reward: [(0, '0.820')] +[2024-06-18 10:17:48,602][12883] Updated weights for policy 0, policy_version 112623 (0.0039) +[2024-06-18 10:17:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 1845346304. Throughput: 0: 42499.6. Samples: 1845466240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 10:17:51,994][12645] Avg episode reward: [(0, '0.623')] +[2024-06-18 10:17:52,469][12883] Updated weights for policy 0, policy_version 112633 (0.0031) +[2024-06-18 10:17:56,321][12883] Updated weights for policy 0, policy_version 112643 (0.0037) +[2024-06-18 10:17:56,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42323.8, 300 sec: 42653.6). Total num frames: 1845575680. Throughput: 0: 42565.8. Samples: 1845724800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 10:17:56,996][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 10:18:00,323][12883] Updated weights for policy 0, policy_version 112653 (0.0036) +[2024-06-18 10:18:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1845772288. Throughput: 0: 42585.4. Samples: 1845852520. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 10:18:01,994][12645] Avg episode reward: [(0, '0.508')] +[2024-06-18 10:18:04,042][12883] Updated weights for policy 0, policy_version 112663 (0.0036) +[2024-06-18 10:18:06,994][12645] Fps is (10 sec: 40968.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1845985280. Throughput: 0: 42817.1. Samples: 1846114840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 10:18:06,994][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 10:18:07,926][12883] Updated weights for policy 0, policy_version 112673 (0.0028) +[2024-06-18 10:18:11,687][12883] Updated weights for policy 0, policy_version 112683 (0.0032) +[2024-06-18 10:18:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 1846198272. Throughput: 0: 42772.9. Samples: 1846370460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 10:18:11,994][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 10:18:15,518][12883] Updated weights for policy 0, policy_version 112693 (0.0033) +[2024-06-18 10:18:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1846411264. Throughput: 0: 42746.7. Samples: 1846498140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 10:18:16,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 10:18:19,191][12883] Updated weights for policy 0, policy_version 112703 (0.0031) +[2024-06-18 10:18:21,240][12862] Signal inference workers to stop experience collection... (26950 times) +[2024-06-18 10:18:21,240][12862] Signal inference workers to resume experience collection... (26950 times) +[2024-06-18 10:18:21,284][12883] InferenceWorker_p0-w0: stopping experience collection (26950 times) +[2024-06-18 10:18:21,285][12883] InferenceWorker_p0-w0: resuming experience collection (26950 times) +[2024-06-18 10:18:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1846640640. 
Throughput: 0: 42818.6. Samples: 1846755460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) +[2024-06-18 10:18:21,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 10:18:23,513][12883] Updated weights for policy 0, policy_version 112713 (0.0038) +[2024-06-18 10:18:26,774][12883] Updated weights for policy 0, policy_version 112723 (0.0033) +[2024-06-18 10:18:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 1846853632. Throughput: 0: 42851.1. Samples: 1847011100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:18:26,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 10:18:31,095][12883] Updated weights for policy 0, policy_version 112733 (0.0042) +[2024-06-18 10:18:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1847066624. Throughput: 0: 42764.4. Samples: 1847139700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:18:31,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 10:18:34,421][12883] Updated weights for policy 0, policy_version 112743 (0.0046) +[2024-06-18 10:18:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1847279616. Throughput: 0: 42948.4. Samples: 1847398920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:18:36,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 10:18:37,026][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000112749_1847279616.pth... +[2024-06-18 10:18:37,092][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000112125_1837056000.pth +[2024-06-18 10:18:38,657][12883] Updated weights for policy 0, policy_version 112753 (0.0047) +[2024-06-18 10:18:41,932][12883] Updated weights for policy 0, policy_version 112763 (0.0041) +[2024-06-18 10:18:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1847508992. Throughput: 0: 42898.1. 
Samples: 1847655120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:18:41,994][12645] Avg episode reward: [(0, '0.216')] +[2024-06-18 10:18:46,198][12883] Updated weights for policy 0, policy_version 112773 (0.0034) +[2024-06-18 10:18:46,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1847705600. Throughput: 0: 42927.6. Samples: 1847784260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:18:46,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 10:18:49,646][12883] Updated weights for policy 0, policy_version 112783 (0.0025) +[2024-06-18 10:18:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1847918592. Throughput: 0: 42791.5. Samples: 1848040460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:18:51,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 10:18:53,697][12883] Updated weights for policy 0, policy_version 112793 (0.0037) +[2024-06-18 10:18:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42873.1, 300 sec: 42654.0). Total num frames: 1848147968. Throughput: 0: 42689.8. Samples: 1848291500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:18:56,994][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 10:18:57,271][12883] Updated weights for policy 0, policy_version 112803 (0.0033) +[2024-06-18 10:19:01,357][12883] Updated weights for policy 0, policy_version 112813 (0.0035) +[2024-06-18 10:19:01,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1848344576. Throughput: 0: 42861.5. Samples: 1848426900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:19:01,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 10:19:04,807][12883] Updated weights for policy 0, policy_version 112823 (0.0046) +[2024-06-18 10:19:06,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1848541184. 
Throughput: 0: 42829.3. Samples: 1848682780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:19:06,994][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 10:19:08,863][12883] Updated weights for policy 0, policy_version 112833 (0.0032) +[2024-06-18 10:19:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1848786944. Throughput: 0: 42827.6. Samples: 1848938340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:19:11,996][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 10:19:12,376][12883] Updated weights for policy 0, policy_version 112843 (0.0033) +[2024-06-18 10:19:16,922][12883] Updated weights for policy 0, policy_version 112853 (0.0034) +[2024-06-18 10:19:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1848983552. Throughput: 0: 43034.3. Samples: 1849076240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:19:16,994][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 10:19:20,060][12883] Updated weights for policy 0, policy_version 112863 (0.0031) +[2024-06-18 10:19:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1849196544. Throughput: 0: 42834.2. Samples: 1849326460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:19:21,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 10:19:24,447][12883] Updated weights for policy 0, policy_version 112873 (0.0041) +[2024-06-18 10:19:26,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 1849442304. Throughput: 0: 42853.9. Samples: 1849583540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:19:26,994][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 10:19:27,932][12883] Updated weights for policy 0, policy_version 112883 (0.0033) +[2024-06-18 10:19:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). 
Total num frames: 1849622528. Throughput: 0: 42954.5. Samples: 1849717220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:19:31,995][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 10:19:32,185][12883] Updated weights for policy 0, policy_version 112893 (0.0034) +[2024-06-18 10:19:35,925][12883] Updated weights for policy 0, policy_version 112903 (0.0038) +[2024-06-18 10:19:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1849851904. Throughput: 0: 42869.5. Samples: 1849969580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:19:36,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 10:19:39,842][12883] Updated weights for policy 0, policy_version 112913 (0.0036) +[2024-06-18 10:19:41,532][12862] Signal inference workers to stop experience collection... (27000 times) +[2024-06-18 10:19:41,532][12862] Signal inference workers to resume experience collection... (27000 times) +[2024-06-18 10:19:41,548][12883] InferenceWorker_p0-w0: stopping experience collection (27000 times) +[2024-06-18 10:19:41,548][12883] InferenceWorker_p0-w0: resuming experience collection (27000 times) +[2024-06-18 10:19:41,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1850081280. Throughput: 0: 42850.6. Samples: 1850219780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:19:41,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 10:19:43,650][12883] Updated weights for policy 0, policy_version 112923 (0.0031) +[2024-06-18 10:19:46,999][12645] Fps is (10 sec: 40939.2, 60 sec: 42594.8, 300 sec: 42597.7). Total num frames: 1850261504. Throughput: 0: 42768.0. Samples: 1850351680. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:19:46,999][12645] Avg episode reward: [(0, '0.717')] +[2024-06-18 10:19:47,444][12883] Updated weights for policy 0, policy_version 112933 (0.0057) +[2024-06-18 10:19:51,434][12883] Updated weights for policy 0, policy_version 112943 (0.0034) +[2024-06-18 10:19:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1850490880. Throughput: 0: 42613.8. Samples: 1850600400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:19:51,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 10:19:55,146][12883] Updated weights for policy 0, policy_version 112953 (0.0039) +[2024-06-18 10:19:56,994][12645] Fps is (10 sec: 45897.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1850720256. Throughput: 0: 42541.7. Samples: 1850852720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:19:56,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 10:19:58,977][12883] Updated weights for policy 0, policy_version 112963 (0.0029) +[2024-06-18 10:20:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1850900480. Throughput: 0: 42345.3. Samples: 1850981780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:20:01,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 10:20:02,799][12883] Updated weights for policy 0, policy_version 112973 (0.0042) +[2024-06-18 10:20:06,586][12883] Updated weights for policy 0, policy_version 112983 (0.0032) +[2024-06-18 10:20:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1851129856. Throughput: 0: 42498.4. Samples: 1851238880. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:20:06,994][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 10:20:10,565][12883] Updated weights for policy 0, policy_version 112993 (0.0027) +[2024-06-18 10:20:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1851342848. Throughput: 0: 42503.5. Samples: 1851496200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:20:11,995][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 10:20:14,162][12883] Updated weights for policy 0, policy_version 113003 (0.0023) +[2024-06-18 10:20:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1851539456. Throughput: 0: 42417.4. Samples: 1851626000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:20:16,994][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 10:20:18,292][12883] Updated weights for policy 0, policy_version 113013 (0.0030) +[2024-06-18 10:20:21,734][12883] Updated weights for policy 0, policy_version 113023 (0.0035) +[2024-06-18 10:20:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1851768832. Throughput: 0: 42447.9. Samples: 1851879740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:20:21,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 10:20:25,730][12883] Updated weights for policy 0, policy_version 113033 (0.0030) +[2024-06-18 10:20:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1851981824. Throughput: 0: 42618.3. Samples: 1852137600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 10:20:26,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 10:20:29,516][12883] Updated weights for policy 0, policy_version 113043 (0.0039) +[2024-06-18 10:20:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1852178432. Throughput: 0: 42503.7. 
Samples: 1852264140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 10:20:31,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 10:20:33,428][12883] Updated weights for policy 0, policy_version 113053 (0.0038) +[2024-06-18 10:20:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1852407808. Throughput: 0: 42683.2. Samples: 1852521140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 10:20:36,994][12645] Avg episode reward: [(0, '0.574')] +[2024-06-18 10:20:37,055][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000113063_1852424192.pth... +[2024-06-18 10:20:37,061][12883] Updated weights for policy 0, policy_version 113063 (0.0033) +[2024-06-18 10:20:37,121][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000112436_1842151424.pth +[2024-06-18 10:20:40,940][12883] Updated weights for policy 0, policy_version 113073 (0.0047) +[2024-06-18 10:20:41,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 1852620800. Throughput: 0: 42655.2. Samples: 1852772200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 10:20:41,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 10:20:44,792][12883] Updated weights for policy 0, policy_version 113083 (0.0034) +[2024-06-18 10:20:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42601.9, 300 sec: 42598.4). Total num frames: 1852817408. Throughput: 0: 42723.4. Samples: 1852904340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 10:20:46,994][12645] Avg episode reward: [(0, '0.598')] +[2024-06-18 10:20:48,821][12883] Updated weights for policy 0, policy_version 113093 (0.0034) +[2024-06-18 10:20:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1853046784. Throughput: 0: 42551.1. Samples: 1853153680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 10:20:51,994][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 10:20:52,438][12883] Updated weights for policy 0, policy_version 113103 (0.0035) +[2024-06-18 10:20:56,688][12883] Updated weights for policy 0, policy_version 113113 (0.0031) +[2024-06-18 10:20:56,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1853259776. Throughput: 0: 42531.2. Samples: 1853410100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 10:20:56,994][12645] Avg episode reward: [(0, '0.480')] +[2024-06-18 10:20:59,968][12883] Updated weights for policy 0, policy_version 113123 (0.0031) +[2024-06-18 10:21:00,700][12862] Signal inference workers to stop experience collection... (27050 times) +[2024-06-18 10:21:00,700][12862] Signal inference workers to resume experience collection... (27050 times) +[2024-06-18 10:21:00,749][12883] InferenceWorker_p0-w0: stopping experience collection (27050 times) +[2024-06-18 10:21:00,749][12883] InferenceWorker_p0-w0: resuming experience collection (27050 times) +[2024-06-18 10:21:01,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1853456384. Throughput: 0: 42387.5. Samples: 1853533440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 10:21:01,994][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 10:21:04,239][12883] Updated weights for policy 0, policy_version 113133 (0.0048) +[2024-06-18 10:21:06,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1853669376. Throughput: 0: 42505.3. Samples: 1853792480. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 10:21:06,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 10:21:07,746][12883] Updated weights for policy 0, policy_version 113143 (0.0033) +[2024-06-18 10:21:11,982][12883] Updated weights for policy 0, policy_version 113153 (0.0030) +[2024-06-18 10:21:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1853898752. Throughput: 0: 42556.4. Samples: 1854052640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 10:21:11,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 10:21:15,395][12883] Updated weights for policy 0, policy_version 113163 (0.0024) +[2024-06-18 10:21:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1854111744. Throughput: 0: 42579.2. Samples: 1854180200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 10:21:16,994][12645] Avg episode reward: [(0, '0.313')] +[2024-06-18 10:21:19,707][12883] Updated weights for policy 0, policy_version 113173 (0.0032) +[2024-06-18 10:21:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1854308352. Throughput: 0: 42335.5. Samples: 1854426240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 10:21:21,996][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 10:21:23,076][12883] Updated weights for policy 0, policy_version 113183 (0.0038) +[2024-06-18 10:21:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1854504960. Throughput: 0: 42639.4. Samples: 1854690980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 10:21:26,994][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 10:21:27,567][12883] Updated weights for policy 0, policy_version 113193 (0.0046) +[2024-06-18 10:21:31,051][12883] Updated weights for policy 0, policy_version 113203 (0.0025) +[2024-06-18 10:21:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1854750720. Throughput: 0: 42387.7. Samples: 1854811780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 10:21:31,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 10:21:35,282][12883] Updated weights for policy 0, policy_version 113213 (0.0047) +[2024-06-18 10:21:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1854947328. Throughput: 0: 42470.2. Samples: 1855064840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 10:21:36,994][12645] Avg episode reward: [(0, '0.138')] +[2024-06-18 10:21:38,880][12883] Updated weights for policy 0, policy_version 113223 (0.0041) +[2024-06-18 10:21:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1855160320. Throughput: 0: 42543.5. Samples: 1855324560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 10:21:41,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 10:21:42,926][12883] Updated weights for policy 0, policy_version 113233 (0.0041) +[2024-06-18 10:21:46,619][12883] Updated weights for policy 0, policy_version 113243 (0.0043) +[2024-06-18 10:21:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1855389696. Throughput: 0: 42688.0. Samples: 1855454400. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 10:21:46,994][12645] Avg episode reward: [(0, '0.463')] +[2024-06-18 10:21:50,624][12883] Updated weights for policy 0, policy_version 113253 (0.0030) +[2024-06-18 10:21:51,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1855586304. Throughput: 0: 42624.4. Samples: 1855710580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 10:21:51,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 10:21:54,245][12883] Updated weights for policy 0, policy_version 113263 (0.0049) +[2024-06-18 10:21:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1855799296. Throughput: 0: 42504.8. Samples: 1855965360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 10:21:56,994][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 10:21:58,225][12883] Updated weights for policy 0, policy_version 113273 (0.0033) +[2024-06-18 10:22:00,417][12862] Signal inference workers to stop experience collection... (27100 times) +[2024-06-18 10:22:00,456][12883] InferenceWorker_p0-w0: stopping experience collection (27100 times) +[2024-06-18 10:22:00,465][12862] Signal inference workers to resume experience collection... (27100 times) +[2024-06-18 10:22:00,476][12883] InferenceWorker_p0-w0: resuming experience collection (27100 times) +[2024-06-18 10:22:01,971][12883] Updated weights for policy 0, policy_version 113283 (0.0030) +[2024-06-18 10:22:01,996][12645] Fps is (10 sec: 44227.5, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1856028672. Throughput: 0: 42497.1. Samples: 1856092660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 10:22:01,996][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 10:22:05,803][12883] Updated weights for policy 0, policy_version 113293 (0.0038) +[2024-06-18 10:22:06,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1856241664. 
Throughput: 0: 42842.3. Samples: 1856354140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 10:22:06,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 10:22:09,595][12883] Updated weights for policy 0, policy_version 113303 (0.0025) +[2024-06-18 10:22:11,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1856438272. Throughput: 0: 42609.4. Samples: 1856608400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 10:22:11,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 10:22:13,442][12883] Updated weights for policy 0, policy_version 113313 (0.0046) +[2024-06-18 10:22:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1856634880. Throughput: 0: 42742.6. Samples: 1856735200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 10:22:16,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 10:22:17,337][12883] Updated weights for policy 0, policy_version 113323 (0.0039) +[2024-06-18 10:22:21,139][12883] Updated weights for policy 0, policy_version 113333 (0.0043) +[2024-06-18 10:22:21,996][12645] Fps is (10 sec: 45865.0, 60 sec: 43142.9, 300 sec: 42765.0). Total num frames: 1856897024. Throughput: 0: 42868.9. Samples: 1856994040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 10:22:21,997][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 10:22:24,999][12883] Updated weights for policy 0, policy_version 113343 (0.0021) +[2024-06-18 10:22:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1857077248. Throughput: 0: 42834.1. Samples: 1857252100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) +[2024-06-18 10:22:26,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 10:22:28,654][12883] Updated weights for policy 0, policy_version 113353 (0.0044) +[2024-06-18 10:22:31,994][12645] Fps is (10 sec: 39330.1, 60 sec: 42325.2, 300 sec: 42598.4). 
Total num frames: 1857290240. Throughput: 0: 42654.7. Samples: 1857373860. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 10:22:31,994][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 10:22:32,588][12883] Updated weights for policy 0, policy_version 113363 (0.0028) +[2024-06-18 10:22:36,284][12883] Updated weights for policy 0, policy_version 113373 (0.0027) +[2024-06-18 10:22:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1857536000. Throughput: 0: 42749.3. Samples: 1857634300. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 10:22:36,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 10:22:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000113375_1857536000.pth... +[2024-06-18 10:22:37,055][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000112749_1847279616.pth +[2024-06-18 10:22:40,311][12883] Updated weights for policy 0, policy_version 113383 (0.0036) +[2024-06-18 10:22:41,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1857732608. Throughput: 0: 42869.1. Samples: 1857894460. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 10:22:41,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 10:22:43,736][12883] Updated weights for policy 0, policy_version 113393 (0.0032) +[2024-06-18 10:22:46,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1857945600. Throughput: 0: 42797.2. Samples: 1858018440. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 10:22:46,994][12645] Avg episode reward: [(0, '0.431')] +[2024-06-18 10:22:47,800][12883] Updated weights for policy 0, policy_version 113403 (0.0036) +[2024-06-18 10:22:51,404][12883] Updated weights for policy 0, policy_version 113413 (0.0038) +[2024-06-18 10:22:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.7, 300 sec: 42709.8). 
Total num frames: 1858174976. Throughput: 0: 42854.2. Samples: 1858282580. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 10:22:51,994][12645] Avg episode reward: [(0, '0.615')] +[2024-06-18 10:22:55,814][12883] Updated weights for policy 0, policy_version 113423 (0.0047) +[2024-06-18 10:22:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1858371584. Throughput: 0: 42637.0. Samples: 1858527060. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 10:22:56,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 10:22:59,035][12883] Updated weights for policy 0, policy_version 113433 (0.0030) +[2024-06-18 10:23:01,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 1858568192. Throughput: 0: 42684.4. Samples: 1858656000. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 10:23:01,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 10:23:03,292][12883] Updated weights for policy 0, policy_version 113443 (0.0027) +[2024-06-18 10:23:06,630][12862] Signal inference workers to stop experience collection... (27150 times) +[2024-06-18 10:23:06,686][12862] Signal inference workers to resume experience collection... (27150 times) +[2024-06-18 10:23:06,688][12883] InferenceWorker_p0-w0: stopping experience collection (27150 times) +[2024-06-18 10:23:06,709][12883] InferenceWorker_p0-w0: resuming experience collection (27150 times) +[2024-06-18 10:23:06,823][12883] Updated weights for policy 0, policy_version 113453 (0.0029) +[2024-06-18 10:23:06,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1858813952. Throughput: 0: 42733.5. Samples: 1858916960. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 10:23:06,995][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 10:23:11,193][12883] Updated weights for policy 0, policy_version 113463 (0.0032) +[2024-06-18 10:23:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1858994176. Throughput: 0: 42693.9. Samples: 1859173320. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 10:23:11,994][12645] Avg episode reward: [(0, '0.252')] +[2024-06-18 10:23:14,386][12883] Updated weights for policy 0, policy_version 113473 (0.0031) +[2024-06-18 10:23:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1859207168. Throughput: 0: 42686.6. Samples: 1859294760. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 10:23:16,994][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 10:23:18,819][12883] Updated weights for policy 0, policy_version 113483 (0.0032) +[2024-06-18 10:23:21,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1859452928. Throughput: 0: 42699.2. Samples: 1859555760. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 10:23:21,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 10:23:22,046][12883] Updated weights for policy 0, policy_version 113493 (0.0038) +[2024-06-18 10:23:26,310][12883] Updated weights for policy 0, policy_version 113503 (0.0028) +[2024-06-18 10:23:26,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1859649536. Throughput: 0: 42629.3. Samples: 1859812780. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) +[2024-06-18 10:23:26,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 10:23:29,630][12883] Updated weights for policy 0, policy_version 113513 (0.0027) +[2024-06-18 10:23:31,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1859862528. Throughput: 0: 42650.8. 
Samples: 1859937820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:23:31,996][12645] Avg episode reward: [(0, '0.561')] +[2024-06-18 10:23:33,709][12883] Updated weights for policy 0, policy_version 113523 (0.0041) +[2024-06-18 10:23:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 1860091904. Throughput: 0: 42730.2. Samples: 1860205440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:23:36,994][12645] Avg episode reward: [(0, '0.668')] +[2024-06-18 10:23:37,441][12883] Updated weights for policy 0, policy_version 113533 (0.0028) +[2024-06-18 10:23:41,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1860288512. Throughput: 0: 42993.2. Samples: 1860461760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:23:41,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 10:23:42,001][12883] Updated weights for policy 0, policy_version 113543 (0.0041) +[2024-06-18 10:23:44,932][12883] Updated weights for policy 0, policy_version 113553 (0.0042) +[2024-06-18 10:23:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1860517888. Throughput: 0: 42795.5. Samples: 1860581800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:23:46,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 10:23:49,474][12883] Updated weights for policy 0, policy_version 113563 (0.0034) +[2024-06-18 10:23:51,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 1860730880. Throughput: 0: 42842.0. Samples: 1860844940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:23:51,996][12645] Avg episode reward: [(0, '0.219')] +[2024-06-18 10:23:52,907][12883] Updated weights for policy 0, policy_version 113573 (0.0022) +[2024-06-18 10:23:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1860911104. 
Throughput: 0: 42856.9. Samples: 1861101880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:23:56,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 10:23:57,177][12883] Updated weights for policy 0, policy_version 113583 (0.0029) +[2024-06-18 10:24:00,316][12883] Updated weights for policy 0, policy_version 113593 (0.0045) +[2024-06-18 10:24:01,994][12645] Fps is (10 sec: 42607.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1861156864. Throughput: 0: 42872.1. Samples: 1861224000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:24:01,994][12645] Avg episode reward: [(0, '0.689')] +[2024-06-18 10:24:04,811][12883] Updated weights for policy 0, policy_version 113603 (0.0032) +[2024-06-18 10:24:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1861369856. Throughput: 0: 42734.2. Samples: 1861478800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:24:06,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 10:24:08,165][12883] Updated weights for policy 0, policy_version 113613 (0.0026) +[2024-06-18 10:24:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1861566464. Throughput: 0: 42757.7. Samples: 1861736880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:24:11,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 10:24:12,320][12883] Updated weights for policy 0, policy_version 113623 (0.0040) +[2024-06-18 10:24:15,717][12883] Updated weights for policy 0, policy_version 113633 (0.0036) +[2024-06-18 10:24:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 1861795840. Throughput: 0: 42712.4. Samples: 1861859780. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:24:16,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 10:24:20,417][12883] Updated weights for policy 0, policy_version 113643 (0.0028) +[2024-06-18 10:24:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1862008832. Throughput: 0: 42545.6. Samples: 1862120000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:24:21,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 10:24:23,379][12883] Updated weights for policy 0, policy_version 113653 (0.0032) +[2024-06-18 10:24:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1862205440. Throughput: 0: 42512.5. Samples: 1862374820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 10:24:26,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 10:24:27,989][12883] Updated weights for policy 0, policy_version 113663 (0.0037) +[2024-06-18 10:24:31,194][12883] Updated weights for policy 0, policy_version 113673 (0.0043) +[2024-06-18 10:24:31,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 1862451200. Throughput: 0: 42648.6. Samples: 1862500980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 10:24:31,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 10:24:35,548][12883] Updated weights for policy 0, policy_version 113683 (0.0037) +[2024-06-18 10:24:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 1862631424. Throughput: 0: 42402.9. Samples: 1862752980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 10:24:36,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 10:24:37,151][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000113687_1862647808.pth... 
+[2024-06-18 10:24:37,216][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000113063_1852424192.pth +[2024-06-18 10:24:38,603][12862] Signal inference workers to stop experience collection... (27200 times) +[2024-06-18 10:24:38,603][12862] Signal inference workers to resume experience collection... (27200 times) +[2024-06-18 10:24:38,647][12883] InferenceWorker_p0-w0: stopping experience collection (27200 times) +[2024-06-18 10:24:38,647][12883] InferenceWorker_p0-w0: resuming experience collection (27200 times) +[2024-06-18 10:24:39,039][12883] Updated weights for policy 0, policy_version 113693 (0.0044) +[2024-06-18 10:24:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42654.7). Total num frames: 1862844416. Throughput: 0: 42496.1. Samples: 1863014200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 10:24:41,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 10:24:43,438][12883] Updated weights for policy 0, policy_version 113703 (0.0037) +[2024-06-18 10:24:46,632][12883] Updated weights for policy 0, policy_version 113713 (0.0039) +[2024-06-18 10:24:46,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1863090176. Throughput: 0: 42516.9. Samples: 1863137260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 10:24:46,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 10:24:50,999][12883] Updated weights for policy 0, policy_version 113723 (0.0032) +[2024-06-18 10:24:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 1863286784. Throughput: 0: 42622.7. Samples: 1863396820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 10:24:51,994][12645] Avg episode reward: [(0, '0.435')] +[2024-06-18 10:24:54,605][12883] Updated weights for policy 0, policy_version 113733 (0.0028) +[2024-06-18 10:24:56,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42653.9). 
Total num frames: 1863483392. Throughput: 0: 42666.8. Samples: 1863656880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 10:24:56,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 10:24:58,533][12883] Updated weights for policy 0, policy_version 113743 (0.0039) +[2024-06-18 10:25:01,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 1863696384. Throughput: 0: 42509.8. Samples: 1863772820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 10:25:01,996][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 10:25:02,386][12883] Updated weights for policy 0, policy_version 113753 (0.0049) +[2024-06-18 10:25:06,061][12883] Updated weights for policy 0, policy_version 113763 (0.0047) +[2024-06-18 10:25:06,996][12645] Fps is (10 sec: 45864.7, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 1863942144. Throughput: 0: 42428.6. Samples: 1864029380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 10:25:06,997][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 10:25:10,502][12883] Updated weights for policy 0, policy_version 113773 (0.0037) +[2024-06-18 10:25:11,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1864105984. Throughput: 0: 42596.9. Samples: 1864291680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 10:25:11,994][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 10:25:13,628][12883] Updated weights for policy 0, policy_version 113783 (0.0041) +[2024-06-18 10:25:16,994][12645] Fps is (10 sec: 39330.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1864335360. Throughput: 0: 42397.2. Samples: 1864408860. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 10:25:16,994][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 10:25:18,157][12883] Updated weights for policy 0, policy_version 113793 (0.0033) +[2024-06-18 10:25:21,265][12883] Updated weights for policy 0, policy_version 113803 (0.0035) +[2024-06-18 10:25:21,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1864581120. Throughput: 0: 42648.1. Samples: 1864672140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 10:25:21,994][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 10:25:25,642][12883] Updated weights for policy 0, policy_version 113813 (0.0031) +[2024-06-18 10:25:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1864744960. Throughput: 0: 42523.8. Samples: 1864927780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 10:25:26,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 10:25:29,122][12883] Updated weights for policy 0, policy_version 113823 (0.0035) +[2024-06-18 10:25:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1864990720. Throughput: 0: 42517.9. Samples: 1865050560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) +[2024-06-18 10:25:31,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 10:25:33,183][12883] Updated weights for policy 0, policy_version 113833 (0.0038) +[2024-06-18 10:25:36,711][12883] Updated weights for policy 0, policy_version 113843 (0.0038) +[2024-06-18 10:25:36,994][12645] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1865220096. Throughput: 0: 42663.8. Samples: 1865316700. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 10:25:36,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 10:25:40,737][12883] Updated weights for policy 0, policy_version 113853 (0.0041) +[2024-06-18 10:25:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 1865400320. Throughput: 0: 42643.5. Samples: 1865575840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 10:25:41,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 10:25:44,409][12883] Updated weights for policy 0, policy_version 113863 (0.0030) +[2024-06-18 10:25:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1865629696. Throughput: 0: 42756.2. Samples: 1865696760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 10:25:46,994][12645] Avg episode reward: [(0, '0.603')] +[2024-06-18 10:25:48,271][12883] Updated weights for policy 0, policy_version 113873 (0.0030) +[2024-06-18 10:25:51,971][12883] Updated weights for policy 0, policy_version 113883 (0.0037) +[2024-06-18 10:25:51,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1865859072. Throughput: 0: 42994.6. Samples: 1865964040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 10:25:51,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 10:25:52,063][12862] Signal inference workers to stop experience collection... (27250 times) +[2024-06-18 10:25:52,110][12883] InferenceWorker_p0-w0: stopping experience collection (27250 times) +[2024-06-18 10:25:52,121][12862] Signal inference workers to resume experience collection... (27250 times) +[2024-06-18 10:25:52,140][12883] InferenceWorker_p0-w0: resuming experience collection (27250 times) +[2024-06-18 10:25:56,282][12883] Updated weights for policy 0, policy_version 113893 (0.0035) +[2024-06-18 10:25:56,996][12645] Fps is (10 sec: 39313.3, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 1866022912. 
Throughput: 0: 42735.2. Samples: 1866214860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 10:25:56,997][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 10:25:59,570][12883] Updated weights for policy 0, policy_version 113903 (0.0044) +[2024-06-18 10:26:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1866268672. Throughput: 0: 42852.6. Samples: 1866337220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 10:26:01,994][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 10:26:03,835][12883] Updated weights for policy 0, policy_version 113913 (0.0027) +[2024-06-18 10:26:06,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 1866481664. Throughput: 0: 42918.2. Samples: 1866603460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 10:26:06,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 10:26:07,311][12883] Updated weights for policy 0, policy_version 113923 (0.0050) +[2024-06-18 10:26:11,666][12883] Updated weights for policy 0, policy_version 113933 (0.0033) +[2024-06-18 10:26:11,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1866678272. Throughput: 0: 42958.3. Samples: 1866860900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 10:26:11,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 10:26:14,962][12883] Updated weights for policy 0, policy_version 113943 (0.0030) +[2024-06-18 10:26:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1866924032. Throughput: 0: 42952.4. Samples: 1866983420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 10:26:16,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 10:26:19,392][12883] Updated weights for policy 0, policy_version 113953 (0.0041) +[2024-06-18 10:26:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42765.0). 
Total num frames: 1867120640. Throughput: 0: 42700.6. Samples: 1867238220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 10:26:21,994][12645] Avg episode reward: [(0, '0.754')] +[2024-06-18 10:26:22,823][12883] Updated weights for policy 0, policy_version 113963 (0.0051) +[2024-06-18 10:26:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1867317248. Throughput: 0: 42731.0. Samples: 1867498740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 10:26:26,994][12645] Avg episode reward: [(0, '0.635')] +[2024-06-18 10:26:27,160][12883] Updated weights for policy 0, policy_version 113973 (0.0040) +[2024-06-18 10:26:30,462][12883] Updated weights for policy 0, policy_version 113983 (0.0042) +[2024-06-18 10:26:31,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1867563008. Throughput: 0: 42825.8. Samples: 1867623920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 10:26:31,994][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 10:26:34,865][12883] Updated weights for policy 0, policy_version 113993 (0.0039) +[2024-06-18 10:26:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1867759616. Throughput: 0: 42638.6. Samples: 1867882780. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 10:26:36,994][12645] Avg episode reward: [(0, '0.287')] +[2024-06-18 10:26:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114000_1867776000.pth... +[2024-06-18 10:26:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000113375_1857536000.pth +[2024-06-18 10:26:38,089][12883] Updated weights for policy 0, policy_version 114003 (0.0042) +[2024-06-18 10:26:41,996][12645] Fps is (10 sec: 39313.1, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1867956224. Throughput: 0: 42766.7. Samples: 1868139360. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 10:26:41,997][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 10:26:42,587][12883] Updated weights for policy 0, policy_version 114013 (0.0027) +[2024-06-18 10:26:46,106][12883] Updated weights for policy 0, policy_version 114023 (0.0035) +[2024-06-18 10:26:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1868201984. Throughput: 0: 42812.7. Samples: 1868263800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 10:26:46,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 10:26:50,141][12883] Updated weights for policy 0, policy_version 114033 (0.0032) +[2024-06-18 10:26:51,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1868398592. Throughput: 0: 42660.9. Samples: 1868523200. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 10:26:51,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 10:26:53,732][12883] Updated weights for policy 0, policy_version 114043 (0.0036) +[2024-06-18 10:26:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 43146.2, 300 sec: 42654.3). Total num frames: 1868611584. Throughput: 0: 42577.9. Samples: 1868776900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 10:26:56,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 10:26:57,572][12883] Updated weights for policy 0, policy_version 114053 (0.0034) +[2024-06-18 10:27:01,233][12883] Updated weights for policy 0, policy_version 114063 (0.0024) +[2024-06-18 10:27:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1868857344. Throughput: 0: 42797.3. Samples: 1868909300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 10:27:01,994][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 10:27:04,959][12862] Signal inference workers to stop experience collection... 
(27300 times) +[2024-06-18 10:27:04,959][12862] Signal inference workers to resume experience collection... (27300 times) +[2024-06-18 10:27:05,002][12883] InferenceWorker_p0-w0: stopping experience collection (27300 times) +[2024-06-18 10:27:05,002][12883] InferenceWorker_p0-w0: resuming experience collection (27300 times) +[2024-06-18 10:27:05,090][12883] Updated weights for policy 0, policy_version 114073 (0.0034) +[2024-06-18 10:27:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1869037568. Throughput: 0: 42846.5. Samples: 1869166320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 10:27:06,994][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 10:27:08,967][12883] Updated weights for policy 0, policy_version 114083 (0.0030) +[2024-06-18 10:27:11,996][12645] Fps is (10 sec: 39313.1, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 1869250560. Throughput: 0: 42669.1. Samples: 1869418940. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 10:27:11,996][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 10:27:12,899][12883] Updated weights for policy 0, policy_version 114093 (0.0041) +[2024-06-18 10:27:16,569][12883] Updated weights for policy 0, policy_version 114103 (0.0035) +[2024-06-18 10:27:16,994][12645] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 1869479936. Throughput: 0: 42798.8. Samples: 1869549860. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 10:27:16,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 10:27:20,539][12883] Updated weights for policy 0, policy_version 114113 (0.0042) +[2024-06-18 10:27:21,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1869692928. Throughput: 0: 42705.0. Samples: 1869804500. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 10:27:21,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 10:27:24,146][12883] Updated weights for policy 0, policy_version 114123 (0.0030) +[2024-06-18 10:27:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1869905920. Throughput: 0: 42759.9. Samples: 1870063460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 10:27:26,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 10:27:28,382][12883] Updated weights for policy 0, policy_version 114133 (0.0044) +[2024-06-18 10:27:31,795][12883] Updated weights for policy 0, policy_version 114143 (0.0030) +[2024-06-18 10:27:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1870118912. Throughput: 0: 42850.8. Samples: 1870192080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) +[2024-06-18 10:27:31,994][12645] Avg episode reward: [(0, '0.615')] +[2024-06-18 10:27:35,773][12883] Updated weights for policy 0, policy_version 114153 (0.0034) +[2024-06-18 10:27:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1870331904. Throughput: 0: 42805.4. Samples: 1870449440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:27:36,994][12645] Avg episode reward: [(0, '0.552')] +[2024-06-18 10:27:39,585][12883] Updated weights for policy 0, policy_version 114163 (0.0035) +[2024-06-18 10:27:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 1870544896. Throughput: 0: 42926.7. Samples: 1870708600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:27:41,994][12645] Avg episode reward: [(0, '0.535')] +[2024-06-18 10:27:43,116][12883] Updated weights for policy 0, policy_version 114173 (0.0028) +[2024-06-18 10:27:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1870757888. Throughput: 0: 42819.5. 
Samples: 1870836180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:27:46,994][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 10:27:47,119][12883] Updated weights for policy 0, policy_version 114183 (0.0033) +[2024-06-18 10:27:50,611][12883] Updated weights for policy 0, policy_version 114193 (0.0040) +[2024-06-18 10:27:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1870987264. Throughput: 0: 42769.9. Samples: 1871090960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:27:51,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 10:27:54,753][12883] Updated weights for policy 0, policy_version 114203 (0.0033) +[2024-06-18 10:27:56,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 1871183872. Throughput: 0: 42862.7. Samples: 1871347760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:27:56,996][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 10:27:58,170][12883] Updated weights for policy 0, policy_version 114213 (0.0037) +[2024-06-18 10:28:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1871396864. Throughput: 0: 42968.0. Samples: 1871483420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:28:01,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 10:28:02,375][12883] Updated weights for policy 0, policy_version 114223 (0.0032) +[2024-06-18 10:28:05,765][12883] Updated weights for policy 0, policy_version 114233 (0.0029) +[2024-06-18 10:28:06,994][12645] Fps is (10 sec: 44246.5, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 1871626240. Throughput: 0: 43016.4. Samples: 1871740240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:28:06,994][12645] Avg episode reward: [(0, '0.359')] +[2024-06-18 10:28:10,050][12883] Updated weights for policy 0, policy_version 114243 (0.0030) +[2024-06-18 10:28:11,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43146.1, 300 sec: 42820.6). Total num frames: 1871839232. Throughput: 0: 42841.7. Samples: 1871991340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:28:11,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 10:28:13,377][12883] Updated weights for policy 0, policy_version 114253 (0.0030) +[2024-06-18 10:28:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1872052224. Throughput: 0: 42913.3. Samples: 1872123180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:28:16,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 10:28:17,505][12883] Updated weights for policy 0, policy_version 114263 (0.0036) +[2024-06-18 10:28:21,213][12883] Updated weights for policy 0, policy_version 114273 (0.0042) +[2024-06-18 10:28:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1872281600. Throughput: 0: 42942.6. Samples: 1872381860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:28:21,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 10:28:25,222][12883] Updated weights for policy 0, policy_version 114283 (0.0042) +[2024-06-18 10:28:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 1872494592. Throughput: 0: 42846.6. Samples: 1872636700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:28:26,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 10:28:28,772][12883] Updated weights for policy 0, policy_version 114293 (0.0035) +[2024-06-18 10:28:31,143][12862] Signal inference workers to stop experience collection... 
(27350 times) +[2024-06-18 10:28:31,143][12862] Signal inference workers to resume experience collection... (27350 times) +[2024-06-18 10:28:31,191][12883] InferenceWorker_p0-w0: stopping experience collection (27350 times) +[2024-06-18 10:28:31,191][12883] InferenceWorker_p0-w0: resuming experience collection (27350 times) +[2024-06-18 10:28:31,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 1872691200. Throughput: 0: 42912.6. Samples: 1872767340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 10:28:31,996][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 10:28:32,763][12883] Updated weights for policy 0, policy_version 114303 (0.0027) +[2024-06-18 10:28:36,310][12883] Updated weights for policy 0, policy_version 114313 (0.0026) +[2024-06-18 10:28:37,000][12645] Fps is (10 sec: 40934.6, 60 sec: 42867.0, 300 sec: 42764.1). Total num frames: 1872904192. Throughput: 0: 42916.7. Samples: 1873022480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 10:28:37,000][12645] Avg episode reward: [(0, '0.775')] +[2024-06-18 10:28:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114313_1872904192.pth... +[2024-06-18 10:28:37,108][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000113687_1862647808.pth +[2024-06-18 10:28:40,427][12883] Updated weights for policy 0, policy_version 114323 (0.0040) +[2024-06-18 10:28:41,994][12645] Fps is (10 sec: 44246.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1873133568. Throughput: 0: 43020.3. Samples: 1873283580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 10:28:41,994][12645] Avg episode reward: [(0, '0.753')] +[2024-06-18 10:28:44,034][12883] Updated weights for policy 0, policy_version 114333 (0.0039) +[2024-06-18 10:28:46,994][12645] Fps is (10 sec: 40985.6, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 1873313792. Throughput: 0: 42963.0. Samples: 1873416760. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 10:28:46,994][12645] Avg episode reward: [(0, '0.786')] +[2024-06-18 10:28:48,207][12883] Updated weights for policy 0, policy_version 114343 (0.0033) +[2024-06-18 10:28:51,554][12883] Updated weights for policy 0, policy_version 114353 (0.0027) +[2024-06-18 10:28:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1873559552. Throughput: 0: 42839.7. Samples: 1873668020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 10:28:51,994][12645] Avg episode reward: [(0, '0.435')] +[2024-06-18 10:28:55,829][12883] Updated weights for policy 0, policy_version 114363 (0.0040) +[2024-06-18 10:28:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42873.0, 300 sec: 42709.5). Total num frames: 1873756160. Throughput: 0: 43076.4. Samples: 1873929780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 10:28:56,994][12645] Avg episode reward: [(0, '0.587')] +[2024-06-18 10:28:59,057][12883] Updated weights for policy 0, policy_version 114373 (0.0037) +[2024-06-18 10:29:01,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 1873969152. Throughput: 0: 42881.9. Samples: 1874052960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 10:29:01,996][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 10:29:03,506][12883] Updated weights for policy 0, policy_version 114383 (0.0035) +[2024-06-18 10:29:06,555][12883] Updated weights for policy 0, policy_version 114393 (0.0046) +[2024-06-18 10:29:06,994][12645] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 1874231296. Throughput: 0: 42981.0. Samples: 1874316000. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 10:29:06,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 10:29:11,167][12883] Updated weights for policy 0, policy_version 114403 (0.0038) +[2024-06-18 10:29:11,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1874395136. Throughput: 0: 43106.8. Samples: 1874576500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 10:29:11,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 10:29:14,334][12883] Updated weights for policy 0, policy_version 114413 (0.0037) +[2024-06-18 10:29:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1874608128. Throughput: 0: 42857.3. Samples: 1874695820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 10:29:16,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 10:29:18,754][12883] Updated weights for policy 0, policy_version 114423 (0.0040) +[2024-06-18 10:29:21,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1874853888. Throughput: 0: 43050.0. Samples: 1874959460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 10:29:21,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 10:29:22,051][12883] Updated weights for policy 0, policy_version 114433 (0.0032) +[2024-06-18 10:29:26,486][12883] Updated weights for policy 0, policy_version 114443 (0.0034) +[2024-06-18 10:29:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1875050496. Throughput: 0: 43067.6. Samples: 1875221620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 10:29:26,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 10:29:29,443][12883] Updated weights for policy 0, policy_version 114453 (0.0037) +[2024-06-18 10:29:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 1875247104. Throughput: 0: 42813.4. 
Samples: 1875343360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 10:29:31,994][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 10:29:34,146][12883] Updated weights for policy 0, policy_version 114463 (0.0027) +[2024-06-18 10:29:36,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43422.2, 300 sec: 42931.6). Total num frames: 1875509248. Throughput: 0: 43017.8. Samples: 1875603820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 10:29:36,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 10:29:37,029][12883] Updated weights for policy 0, policy_version 114473 (0.0036) +[2024-06-18 10:29:37,497][12862] Signal inference workers to stop experience collection... (27400 times) +[2024-06-18 10:29:37,497][12862] Signal inference workers to resume experience collection... (27400 times) +[2024-06-18 10:29:37,540][12883] InferenceWorker_p0-w0: stopping experience collection (27400 times) +[2024-06-18 10:29:37,540][12883] InferenceWorker_p0-w0: resuming experience collection (27400 times) +[2024-06-18 10:29:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1875673088. Throughput: 0: 43118.7. Samples: 1875870120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) +[2024-06-18 10:29:41,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 10:29:42,048][12883] Updated weights for policy 0, policy_version 114483 (0.0033) +[2024-06-18 10:29:44,452][12883] Updated weights for policy 0, policy_version 114493 (0.0040) +[2024-06-18 10:29:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1875902464. Throughput: 0: 42911.5. Samples: 1875983880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) +[2024-06-18 10:29:46,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 10:29:49,470][12883] Updated weights for policy 0, policy_version 114503 (0.0031) +[2024-06-18 10:29:51,994][12645] Fps is (10 sec: 49152.4, 60 sec: 43417.6, 300 sec: 42987.2). 
Total num frames: 1876164608. Throughput: 0: 42996.5. Samples: 1876250840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) +[2024-06-18 10:29:51,994][12645] Avg episode reward: [(0, '0.764')] +[2024-06-18 10:29:52,368][12883] Updated weights for policy 0, policy_version 114513 (0.0028) +[2024-06-18 10:29:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 1876328448. Throughput: 0: 42977.2. Samples: 1876510480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) +[2024-06-18 10:29:56,994][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 10:29:57,014][12883] Updated weights for policy 0, policy_version 114523 (0.0032) +[2024-06-18 10:29:59,962][12883] Updated weights for policy 0, policy_version 114533 (0.0033) +[2024-06-18 10:30:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43146.1, 300 sec: 42765.3). Total num frames: 1876557824. Throughput: 0: 42964.4. Samples: 1876629220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) +[2024-06-18 10:30:01,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 10:30:04,604][12883] Updated weights for policy 0, policy_version 114543 (0.0027) +[2024-06-18 10:30:06,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 1876787200. Throughput: 0: 42993.4. Samples: 1876894160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) +[2024-06-18 10:30:06,994][12645] Avg episode reward: [(0, '0.849')] +[2024-06-18 10:30:07,092][12862] Saving new best policy, reward=0.849! +[2024-06-18 10:30:07,532][12883] Updated weights for policy 0, policy_version 114553 (0.0036) +[2024-06-18 10:30:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1876983808. Throughput: 0: 42942.6. Samples: 1877154040. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) +[2024-06-18 10:30:11,994][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 10:30:12,327][12883] Updated weights for policy 0, policy_version 114563 (0.0038) +[2024-06-18 10:30:15,276][12883] Updated weights for policy 0, policy_version 114573 (0.0047) +[2024-06-18 10:30:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 1877213184. Throughput: 0: 42930.2. Samples: 1877275220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) +[2024-06-18 10:30:16,994][12645] Avg episode reward: [(0, '0.276')] +[2024-06-18 10:30:20,177][12883] Updated weights for policy 0, policy_version 114583 (0.0039) +[2024-06-18 10:30:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 1877409792. Throughput: 0: 42882.2. Samples: 1877533520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) +[2024-06-18 10:30:21,994][12645] Avg episode reward: [(0, '0.253')] +[2024-06-18 10:30:22,966][12883] Updated weights for policy 0, policy_version 114593 (0.0028) +[2024-06-18 10:30:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1877606400. Throughput: 0: 42636.4. Samples: 1877788760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) +[2024-06-18 10:30:26,994][12645] Avg episode reward: [(0, '0.728')] +[2024-06-18 10:30:27,949][12883] Updated weights for policy 0, policy_version 114603 (0.0034) +[2024-06-18 10:30:30,589][12883] Updated weights for policy 0, policy_version 114613 (0.0041) +[2024-06-18 10:30:31,996][12645] Fps is (10 sec: 45864.6, 60 sec: 43689.0, 300 sec: 42875.8). Total num frames: 1877868544. Throughput: 0: 42874.7. Samples: 1877913340. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) +[2024-06-18 10:30:31,997][12645] Avg episode reward: [(0, '0.654')] +[2024-06-18 10:30:35,543][12883] Updated weights for policy 0, policy_version 114623 (0.0036) +[2024-06-18 10:30:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 1878048768. Throughput: 0: 42671.0. Samples: 1878171040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) +[2024-06-18 10:30:36,994][12645] Avg episode reward: [(0, '0.435')] +[2024-06-18 10:30:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114627_1878048768.pth... +[2024-06-18 10:30:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114000_1867776000.pth +[2024-06-18 10:30:38,539][12883] Updated weights for policy 0, policy_version 114633 (0.0042) +[2024-06-18 10:30:41,994][12645] Fps is (10 sec: 37691.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1878245376. Throughput: 0: 42661.3. Samples: 1878430240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 10:30:41,994][12645] Avg episode reward: [(0, '0.235')] +[2024-06-18 10:30:43,317][12883] Updated weights for policy 0, policy_version 114643 (0.0039) +[2024-06-18 10:30:46,128][12883] Updated weights for policy 0, policy_version 114653 (0.0035) +[2024-06-18 10:30:47,000][12645] Fps is (10 sec: 47484.0, 60 sec: 43686.0, 300 sec: 42930.7). Total num frames: 1878523904. Throughput: 0: 42756.6. Samples: 1878553540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 10:30:47,001][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 10:30:51,010][12883] Updated weights for policy 0, policy_version 114663 (0.0033) +[2024-06-18 10:30:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42932.0). Total num frames: 1878687744. Throughput: 0: 42680.8. Samples: 1878814800. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 10:30:51,994][12645] Avg episode reward: [(0, '0.618')] +[2024-06-18 10:30:53,987][12883] Updated weights for policy 0, policy_version 114673 (0.0042) +[2024-06-18 10:30:56,994][12645] Fps is (10 sec: 37706.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1878900736. Throughput: 0: 42515.6. Samples: 1879067240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 10:30:56,994][12645] Avg episode reward: [(0, '0.735')] +[2024-06-18 10:30:58,658][12883] Updated weights for policy 0, policy_version 114683 (0.0041) +[2024-06-18 10:31:00,324][12862] Signal inference workers to stop experience collection... (27450 times) +[2024-06-18 10:31:00,359][12883] InferenceWorker_p0-w0: stopping experience collection (27450 times) +[2024-06-18 10:31:00,379][12862] Signal inference workers to resume experience collection... (27450 times) +[2024-06-18 10:31:00,380][12883] InferenceWorker_p0-w0: resuming experience collection (27450 times) +[2024-06-18 10:31:01,595][12883] Updated weights for policy 0, policy_version 114693 (0.0038) +[2024-06-18 10:31:01,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1879130112. Throughput: 0: 42575.5. Samples: 1879191120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 10:31:01,995][12645] Avg episode reward: [(0, '0.762')] +[2024-06-18 10:31:06,652][12883] Updated weights for policy 0, policy_version 114703 (0.0035) +[2024-06-18 10:31:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 1879310336. Throughput: 0: 42634.6. Samples: 1879452080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 10:31:06,994][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 10:31:09,121][12883] Updated weights for policy 0, policy_version 114713 (0.0034) +[2024-06-18 10:31:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1879539712. 
Throughput: 0: 42548.1. Samples: 1879703420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 10:31:11,994][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 10:31:14,501][12883] Updated weights for policy 0, policy_version 114723 (0.0045) +[2024-06-18 10:31:16,793][12883] Updated weights for policy 0, policy_version 114733 (0.0029) +[2024-06-18 10:31:16,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1879785472. Throughput: 0: 42691.4. Samples: 1879834360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 10:31:16,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 10:31:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 1879932928. Throughput: 0: 42624.1. Samples: 1880089120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 10:31:21,994][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 10:31:22,031][12883] Updated weights for policy 0, policy_version 114743 (0.0031) +[2024-06-18 10:31:24,674][12883] Updated weights for policy 0, policy_version 114753 (0.0033) +[2024-06-18 10:31:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1880195072. Throughput: 0: 42465.8. Samples: 1880341200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 10:31:26,994][12645] Avg episode reward: [(0, '0.435')] +[2024-06-18 10:31:29,570][12883] Updated weights for policy 0, policy_version 114763 (0.0030) +[2024-06-18 10:31:31,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42326.9, 300 sec: 42876.1). Total num frames: 1880408064. Throughput: 0: 42711.3. Samples: 1880475280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 10:31:31,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 10:31:32,310][12883] Updated weights for policy 0, policy_version 114773 (0.0037) +[2024-06-18 10:31:36,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42820.9). 
Total num frames: 1880588288. Throughput: 0: 42521.8. Samples: 1880728280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 10:31:36,994][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 10:31:37,129][12883] Updated weights for policy 0, policy_version 114783 (0.0044) +[2024-06-18 10:31:39,912][12883] Updated weights for policy 0, policy_version 114793 (0.0040) +[2024-06-18 10:31:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1880850432. Throughput: 0: 42528.1. Samples: 1880981000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 10:31:41,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 10:31:44,635][12883] Updated weights for policy 0, policy_version 114803 (0.0032) +[2024-06-18 10:31:46,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42056.8, 300 sec: 42876.1). Total num frames: 1881047040. Throughput: 0: 42733.5. Samples: 1881114120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 10:31:46,994][12645] Avg episode reward: [(0, '0.707')] +[2024-06-18 10:31:47,887][12883] Updated weights for policy 0, policy_version 114813 (0.0029) +[2024-06-18 10:31:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1881243648. Throughput: 0: 42562.7. Samples: 1881367400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 10:31:51,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 10:31:52,271][12883] Updated weights for policy 0, policy_version 114823 (0.0028) +[2024-06-18 10:31:55,509][12883] Updated weights for policy 0, policy_version 114833 (0.0040) +[2024-06-18 10:31:56,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1881456640. Throughput: 0: 42655.0. Samples: 1881622900. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 10:31:56,994][12645] Avg episode reward: [(0, '0.319')] +[2024-06-18 10:32:00,143][12883] Updated weights for policy 0, policy_version 114843 (0.0046) +[2024-06-18 10:32:01,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1881686016. Throughput: 0: 42729.3. Samples: 1881757180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 10:32:02,003][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 10:32:03,091][12883] Updated weights for policy 0, policy_version 114853 (0.0027) +[2024-06-18 10:32:06,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 1881866240. Throughput: 0: 42687.2. Samples: 1882010040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 10:32:06,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 10:32:07,637][12883] Updated weights for policy 0, policy_version 114863 (0.0031) +[2024-06-18 10:32:07,654][12862] Signal inference workers to stop experience collection... (27500 times) +[2024-06-18 10:32:07,655][12862] Signal inference workers to resume experience collection... (27500 times) +[2024-06-18 10:32:07,673][12883] InferenceWorker_p0-w0: stopping experience collection (27500 times) +[2024-06-18 10:32:07,673][12883] InferenceWorker_p0-w0: resuming experience collection (27500 times) +[2024-06-18 10:32:10,662][12883] Updated weights for policy 0, policy_version 114873 (0.0028) +[2024-06-18 10:32:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1882112000. Throughput: 0: 42753.3. Samples: 1882265100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 10:32:11,994][12645] Avg episode reward: [(0, '0.535')] +[2024-06-18 10:32:15,174][12883] Updated weights for policy 0, policy_version 114883 (0.0033) +[2024-06-18 10:32:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1882324992. 
Throughput: 0: 42773.8. Samples: 1882400100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 10:32:17,002][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 10:32:18,393][12883] Updated weights for policy 0, policy_version 114893 (0.0039) +[2024-06-18 10:32:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1882521600. Throughput: 0: 42700.7. Samples: 1882649820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 10:32:21,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 10:32:22,825][12883] Updated weights for policy 0, policy_version 114903 (0.0038) +[2024-06-18 10:32:26,157][12883] Updated weights for policy 0, policy_version 114913 (0.0041) +[2024-06-18 10:32:26,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1882750976. Throughput: 0: 42702.0. Samples: 1882902600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 10:32:26,994][12645] Avg episode reward: [(0, '0.671')] +[2024-06-18 10:32:30,622][12883] Updated weights for policy 0, policy_version 114923 (0.0026) +[2024-06-18 10:32:31,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1882963968. Throughput: 0: 42725.7. Samples: 1883036780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 10:32:31,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 10:32:33,980][12883] Updated weights for policy 0, policy_version 114933 (0.0027) +[2024-06-18 10:32:36,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1883160576. Throughput: 0: 42752.9. Samples: 1883291280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 10:32:36,994][12645] Avg episode reward: [(0, '0.473')] +[2024-06-18 10:32:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114939_1883160576.pth... 
+[2024-06-18 10:32:37,052][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114313_1872904192.pth +[2024-06-18 10:32:38,312][12883] Updated weights for policy 0, policy_version 114943 (0.0032) +[2024-06-18 10:32:41,504][12883] Updated weights for policy 0, policy_version 114953 (0.0036) +[2024-06-18 10:32:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1883406336. Throughput: 0: 42624.1. Samples: 1883540980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:32:41,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 10:32:46,038][12883] Updated weights for policy 0, policy_version 114963 (0.0031) +[2024-06-18 10:32:46,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1883619328. Throughput: 0: 42725.0. Samples: 1883679800. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:32:46,994][12645] Avg episode reward: [(0, '0.551')] +[2024-06-18 10:32:49,347][12883] Updated weights for policy 0, policy_version 114973 (0.0030) +[2024-06-18 10:32:51,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 1883783168. Throughput: 0: 42589.6. Samples: 1883926580. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:32:51,994][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 10:32:53,553][12883] Updated weights for policy 0, policy_version 114983 (0.0036) +[2024-06-18 10:32:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1884028928. Throughput: 0: 42523.5. Samples: 1884178660. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:32:56,997][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 10:32:57,079][12883] Updated weights for policy 0, policy_version 114993 (0.0032) +[2024-06-18 10:33:01,348][12883] Updated weights for policy 0, policy_version 115003 (0.0034) +[2024-06-18 10:33:01,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1884241920. Throughput: 0: 42491.1. Samples: 1884312200. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:33:01,994][12645] Avg episode reward: [(0, '0.706')] +[2024-06-18 10:33:04,722][12883] Updated weights for policy 0, policy_version 115013 (0.0033) +[2024-06-18 10:33:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1884438528. Throughput: 0: 42432.5. Samples: 1884559280. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:33:06,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 10:33:08,966][12883] Updated weights for policy 0, policy_version 115023 (0.0026) +[2024-06-18 10:33:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1884667904. Throughput: 0: 42685.9. Samples: 1884823460. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:33:11,994][12645] Avg episode reward: [(0, '0.571')] +[2024-06-18 10:33:12,485][12883] Updated weights for policy 0, policy_version 115033 (0.0036) +[2024-06-18 10:33:16,605][12883] Updated weights for policy 0, policy_version 115043 (0.0029) +[2024-06-18 10:33:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1884897280. Throughput: 0: 42527.9. Samples: 1884950540. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:33:16,994][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 10:33:20,179][12883] Updated weights for policy 0, policy_version 115053 (0.0027) +[2024-06-18 10:33:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1885077504. Throughput: 0: 42419.0. Samples: 1885200140. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:33:21,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 10:33:23,457][12862] Signal inference workers to stop experience collection... (27550 times) +[2024-06-18 10:33:23,457][12862] Signal inference workers to resume experience collection... (27550 times) +[2024-06-18 10:33:23,469][12883] InferenceWorker_p0-w0: stopping experience collection (27550 times) +[2024-06-18 10:33:23,502][12883] InferenceWorker_p0-w0: resuming experience collection (27550 times) +[2024-06-18 10:33:24,182][12883] Updated weights for policy 0, policy_version 115063 (0.0022) +[2024-06-18 10:33:26,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.6, 300 sec: 42765.4). Total num frames: 1885306880. Throughput: 0: 42642.3. Samples: 1885459880. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:33:26,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 10:33:27,826][12883] Updated weights for policy 0, policy_version 115073 (0.0027) +[2024-06-18 10:33:31,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 1885503488. Throughput: 0: 42498.3. Samples: 1885592220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:33:31,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 10:33:32,073][12883] Updated weights for policy 0, policy_version 115083 (0.0037) +[2024-06-18 10:33:35,555][12883] Updated weights for policy 0, policy_version 115093 (0.0039) +[2024-06-18 10:33:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1885732864. 
Throughput: 0: 42593.4. Samples: 1885843280. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) +[2024-06-18 10:33:36,994][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 10:33:39,832][12883] Updated weights for policy 0, policy_version 115103 (0.0036) +[2024-06-18 10:33:41,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1885962240. Throughput: 0: 42680.9. Samples: 1886099300. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 10:33:41,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 10:33:43,185][12883] Updated weights for policy 0, policy_version 115113 (0.0031) +[2024-06-18 10:33:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1886142464. Throughput: 0: 42565.3. Samples: 1886227640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 10:33:46,994][12645] Avg episode reward: [(0, '0.070')] +[2024-06-18 10:33:47,290][12883] Updated weights for policy 0, policy_version 115123 (0.0032) +[2024-06-18 10:33:50,847][12883] Updated weights for policy 0, policy_version 115133 (0.0029) +[2024-06-18 10:33:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1886388224. Throughput: 0: 42864.6. Samples: 1886488180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 10:33:51,994][12645] Avg episode reward: [(0, '0.280')] +[2024-06-18 10:33:54,993][12883] Updated weights for policy 0, policy_version 115143 (0.0021) +[2024-06-18 10:33:56,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 1886601216. Throughput: 0: 42587.0. Samples: 1886739880. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 10:33:56,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 10:33:58,423][12883] Updated weights for policy 0, policy_version 115153 (0.0028) +[2024-06-18 10:34:01,996][12645] Fps is (10 sec: 40950.3, 60 sec: 42596.8, 300 sec: 42598.1). 
Total num frames: 1886797824. Throughput: 0: 42620.2. Samples: 1886868540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 10:34:01,997][12645] Avg episode reward: [(0, '0.207')] +[2024-06-18 10:34:02,619][12883] Updated weights for policy 0, policy_version 115163 (0.0037) +[2024-06-18 10:34:06,119][12883] Updated weights for policy 0, policy_version 115173 (0.0025) +[2024-06-18 10:34:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1887027200. Throughput: 0: 42786.2. Samples: 1887125520. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 10:34:06,995][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 10:34:10,347][12883] Updated weights for policy 0, policy_version 115183 (0.0028) +[2024-06-18 10:34:11,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1887223808. Throughput: 0: 42868.8. Samples: 1887388980. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 10:34:11,994][12645] Avg episode reward: [(0, '0.146')] +[2024-06-18 10:34:13,911][12883] Updated weights for policy 0, policy_version 115193 (0.0028) +[2024-06-18 10:34:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1887436800. Throughput: 0: 42587.9. Samples: 1887508680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 10:34:16,994][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 10:34:18,033][12883] Updated weights for policy 0, policy_version 115203 (0.0039) +[2024-06-18 10:34:21,542][12883] Updated weights for policy 0, policy_version 115213 (0.0030) +[2024-06-18 10:34:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1887666176. Throughput: 0: 42804.4. Samples: 1887769480. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 10:34:21,994][12645] Avg episode reward: [(0, '0.569')] +[2024-06-18 10:34:25,594][12883] Updated weights for policy 0, policy_version 115223 (0.0023) +[2024-06-18 10:34:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1887862784. Throughput: 0: 42980.4. Samples: 1888033420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 10:34:26,994][12645] Avg episode reward: [(0, '0.693')] +[2024-06-18 10:34:28,993][12883] Updated weights for policy 0, policy_version 115233 (0.0032) +[2024-06-18 10:34:31,994][12645] Fps is (10 sec: 42599.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1888092160. Throughput: 0: 42882.3. Samples: 1888157340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 10:34:31,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 10:34:33,135][12883] Updated weights for policy 0, policy_version 115243 (0.0050) +[2024-06-18 10:34:36,654][12883] Updated weights for policy 0, policy_version 115253 (0.0030) +[2024-06-18 10:34:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1888305152. Throughput: 0: 42790.5. Samples: 1888413760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 10:34:36,995][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 10:34:37,096][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000115254_1888321536.pth... +[2024-06-18 10:34:37,145][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114627_1878048768.pth +[2024-06-18 10:34:40,743][12883] Updated weights for policy 0, policy_version 115263 (0.0032) +[2024-06-18 10:34:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1888501760. Throughput: 0: 42967.3. Samples: 1888673400. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 10:34:41,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 10:34:44,185][12883] Updated weights for policy 0, policy_version 115273 (0.0038) +[2024-06-18 10:34:46,526][12862] Signal inference workers to stop experience collection... (27600 times) +[2024-06-18 10:34:46,526][12862] Signal inference workers to resume experience collection... (27600 times) +[2024-06-18 10:34:46,565][12883] InferenceWorker_p0-w0: stopping experience collection (27600 times) +[2024-06-18 10:34:46,565][12883] InferenceWorker_p0-w0: resuming experience collection (27600 times) +[2024-06-18 10:34:46,994][12645] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1888731136. Throughput: 0: 42876.9. Samples: 1888797900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:34:46,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 10:34:48,246][12883] Updated weights for policy 0, policy_version 115283 (0.0035) +[2024-06-18 10:34:51,824][12883] Updated weights for policy 0, policy_version 115293 (0.0025) +[2024-06-18 10:34:51,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1888960512. Throughput: 0: 43010.0. Samples: 1889060960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:34:51,994][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 10:34:56,300][12883] Updated weights for policy 0, policy_version 115303 (0.0044) +[2024-06-18 10:34:56,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1889157120. Throughput: 0: 42850.6. Samples: 1889317260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:34:56,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 10:34:59,834][12883] Updated weights for policy 0, policy_version 115313 (0.0038) +[2024-06-18 10:35:01,994][12645] Fps is (10 sec: 40958.9, 60 sec: 42872.9, 300 sec: 42653.9). Total num frames: 1889370112. 
Throughput: 0: 42895.4. Samples: 1889438980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:35:01,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 10:35:04,090][12883] Updated weights for policy 0, policy_version 115323 (0.0028) +[2024-06-18 10:35:06,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1889599488. Throughput: 0: 42949.9. Samples: 1889702220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:35:06,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 10:35:07,433][12883] Updated weights for policy 0, policy_version 115333 (0.0034) +[2024-06-18 10:35:11,543][12883] Updated weights for policy 0, policy_version 115343 (0.0045) +[2024-06-18 10:35:11,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1889779712. Throughput: 0: 42772.5. Samples: 1889958180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:35:11,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 10:35:15,073][12883] Updated weights for policy 0, policy_version 115353 (0.0037) +[2024-06-18 10:35:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1890025472. Throughput: 0: 42793.6. Samples: 1890083060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:35:16,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 10:35:18,968][12883] Updated weights for policy 0, policy_version 115363 (0.0034) +[2024-06-18 10:35:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1890222080. Throughput: 0: 42965.0. Samples: 1890347180. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:35:21,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 10:35:22,646][12883] Updated weights for policy 0, policy_version 115373 (0.0039) +[2024-06-18 10:35:26,503][12883] Updated weights for policy 0, policy_version 115383 (0.0029) +[2024-06-18 10:35:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 1890451456. Throughput: 0: 42736.9. Samples: 1890596560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:35:26,994][12645] Avg episode reward: [(0, '0.573')] +[2024-06-18 10:35:30,362][12883] Updated weights for policy 0, policy_version 115393 (0.0028) +[2024-06-18 10:35:31,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1890680832. Throughput: 0: 42973.8. Samples: 1890731720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:35:31,994][12645] Avg episode reward: [(0, '0.564')] +[2024-06-18 10:35:34,102][12883] Updated weights for policy 0, policy_version 115403 (0.0038) +[2024-06-18 10:35:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1890861056. Throughput: 0: 42945.7. Samples: 1890993520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:35:36,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 10:35:38,153][12883] Updated weights for policy 0, policy_version 115413 (0.0033) +[2024-06-18 10:35:41,731][12883] Updated weights for policy 0, policy_version 115423 (0.0032) +[2024-06-18 10:35:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42654.9). Total num frames: 1891106816. Throughput: 0: 42930.8. Samples: 1891249140. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:35:41,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 10:35:45,801][12883] Updated weights for policy 0, policy_version 115433 (0.0038) +[2024-06-18 10:35:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1891303424. Throughput: 0: 43066.8. Samples: 1891376980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 10:35:46,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 10:35:49,345][12883] Updated weights for policy 0, policy_version 115443 (0.0034) +[2024-06-18 10:35:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1891516416. Throughput: 0: 43001.4. Samples: 1891637280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 10:35:51,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 10:35:53,227][12883] Updated weights for policy 0, policy_version 115453 (0.0037) +[2024-06-18 10:35:56,807][12883] Updated weights for policy 0, policy_version 115463 (0.0023) +[2024-06-18 10:35:56,996][12645] Fps is (10 sec: 44227.2, 60 sec: 43143.0, 300 sec: 42764.7). Total num frames: 1891745792. Throughput: 0: 42946.7. Samples: 1891890880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 10:35:56,996][12645] Avg episode reward: [(0, '0.556')] +[2024-06-18 10:36:01,089][12883] Updated weights for policy 0, policy_version 115473 (0.0028) +[2024-06-18 10:36:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1891942400. Throughput: 0: 43068.1. Samples: 1892021120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 10:36:01,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 10:36:04,336][12883] Updated weights for policy 0, policy_version 115483 (0.0028) +[2024-06-18 10:36:06,994][12645] Fps is (10 sec: 39330.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1892139008. Throughput: 0: 42788.1. 
Samples: 1892272640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 10:36:06,994][12645] Avg episode reward: [(0, '0.527')] +[2024-06-18 10:36:07,463][12862] Signal inference workers to stop experience collection... (27650 times) +[2024-06-18 10:36:07,463][12862] Signal inference workers to resume experience collection... (27650 times) +[2024-06-18 10:36:07,487][12883] InferenceWorker_p0-w0: stopping experience collection (27650 times) +[2024-06-18 10:36:07,488][12883] InferenceWorker_p0-w0: resuming experience collection (27650 times) +[2024-06-18 10:36:08,753][12883] Updated weights for policy 0, policy_version 115493 (0.0037) +[2024-06-18 10:36:11,994][12645] Fps is (10 sec: 45875.7, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 1892401152. Throughput: 0: 42803.2. Samples: 1892522700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 10:36:11,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 10:36:11,998][12883] Updated weights for policy 0, policy_version 115503 (0.0037) +[2024-06-18 10:36:16,450][12883] Updated weights for policy 0, policy_version 115513 (0.0039) +[2024-06-18 10:36:16,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1892597760. Throughput: 0: 42944.8. Samples: 1892664240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 10:36:16,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 10:36:19,779][12883] Updated weights for policy 0, policy_version 115523 (0.0027) +[2024-06-18 10:36:21,994][12645] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1892777984. Throughput: 0: 42524.4. Samples: 1892907120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 10:36:21,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 10:36:24,126][12883] Updated weights for policy 0, policy_version 115533 (0.0027) +[2024-06-18 10:36:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). 
Total num frames: 1893023744. Throughput: 0: 42623.5. Samples: 1893167200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 10:36:26,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 10:36:27,331][12883] Updated weights for policy 0, policy_version 115543 (0.0038) +[2024-06-18 10:36:31,660][12883] Updated weights for policy 0, policy_version 115553 (0.0032) +[2024-06-18 10:36:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1893236736. Throughput: 0: 42725.0. Samples: 1893299600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 10:36:31,994][12645] Avg episode reward: [(0, '0.278')] +[2024-06-18 10:36:35,664][12883] Updated weights for policy 0, policy_version 115563 (0.0031) +[2024-06-18 10:36:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1893433344. Throughput: 0: 42593.6. Samples: 1893554000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 10:36:36,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 10:36:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000115566_1893433344.pth... +[2024-06-18 10:36:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114939_1883160576.pth +[2024-06-18 10:36:39,237][12883] Updated weights for policy 0, policy_version 115573 (0.0023) +[2024-06-18 10:36:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1893662720. Throughput: 0: 42522.6. Samples: 1893804300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 10:36:41,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 10:36:43,146][12883] Updated weights for policy 0, policy_version 115583 (0.0033) +[2024-06-18 10:36:46,968][12883] Updated weights for policy 0, policy_version 115593 (0.0044) +[2024-06-18 10:36:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1893875712. 
Throughput: 0: 42542.2. Samples: 1893935520. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) +[2024-06-18 10:36:46,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 10:36:50,667][12883] Updated weights for policy 0, policy_version 115603 (0.0032) +[2024-06-18 10:36:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1894072320. Throughput: 0: 42534.1. Samples: 1894186680. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) +[2024-06-18 10:36:51,994][12645] Avg episode reward: [(0, '0.431')] +[2024-06-18 10:36:54,634][12883] Updated weights for policy 0, policy_version 115613 (0.0035) +[2024-06-18 10:36:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 1894301696. Throughput: 0: 42745.7. Samples: 1894446260. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) +[2024-06-18 10:36:56,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 10:36:58,141][12883] Updated weights for policy 0, policy_version 115623 (0.0041) +[2024-06-18 10:37:01,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 1894498304. Throughput: 0: 42504.1. Samples: 1894577020. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) +[2024-06-18 10:37:01,996][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 10:37:02,415][12883] Updated weights for policy 0, policy_version 115633 (0.0045) +[2024-06-18 10:37:05,881][12883] Updated weights for policy 0, policy_version 115643 (0.0031) +[2024-06-18 10:37:06,994][12645] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1894711296. Throughput: 0: 42767.4. Samples: 1894831660. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) +[2024-06-18 10:37:06,994][12645] Avg episode reward: [(0, '0.216')] +[2024-06-18 10:37:10,283][12883] Updated weights for policy 0, policy_version 115653 (0.0034) +[2024-06-18 10:37:11,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1894940672. 
Throughput: 0: 42736.4. Samples: 1895090340. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) +[2024-06-18 10:37:11,994][12645] Avg episode reward: [(0, '0.216')] +[2024-06-18 10:37:13,745][12883] Updated weights for policy 0, policy_version 115663 (0.0033) +[2024-06-18 10:37:16,996][12645] Fps is (10 sec: 42589.7, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 1895137280. Throughput: 0: 42700.0. Samples: 1895221200. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) +[2024-06-18 10:37:16,996][12645] Avg episode reward: [(0, '0.572')] +[2024-06-18 10:37:17,822][12883] Updated weights for policy 0, policy_version 115673 (0.0046) +[2024-06-18 10:37:21,968][12883] Updated weights for policy 0, policy_version 115683 (0.0030) +[2024-06-18 10:37:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1895350272. Throughput: 0: 42494.8. Samples: 1895466260. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) +[2024-06-18 10:37:21,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 10:37:25,142][12862] Signal inference workers to stop experience collection... (27700 times) +[2024-06-18 10:37:25,142][12862] Signal inference workers to resume experience collection... (27700 times) +[2024-06-18 10:37:25,168][12883] InferenceWorker_p0-w0: stopping experience collection (27700 times) +[2024-06-18 10:37:25,168][12883] InferenceWorker_p0-w0: resuming experience collection (27700 times) +[2024-06-18 10:37:25,439][12883] Updated weights for policy 0, policy_version 115693 (0.0043) +[2024-06-18 10:37:26,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1895563264. Throughput: 0: 42653.7. Samples: 1895723720. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) +[2024-06-18 10:37:26,994][12645] Avg episode reward: [(0, '0.283')] +[2024-06-18 10:37:29,844][12883] Updated weights for policy 0, policy_version 115703 (0.0037) +[2024-06-18 10:37:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1895759872. Throughput: 0: 42540.9. Samples: 1895849860. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) +[2024-06-18 10:37:31,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 10:37:33,625][12883] Updated weights for policy 0, policy_version 115713 (0.0034) +[2024-06-18 10:37:36,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42596.9, 300 sec: 42653.6). Total num frames: 1895989248. Throughput: 0: 42350.4. Samples: 1896092540. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) +[2024-06-18 10:37:36,996][12645] Avg episode reward: [(0, '0.435')] +[2024-06-18 10:37:37,487][12883] Updated weights for policy 0, policy_version 115723 (0.0041) +[2024-06-18 10:37:41,382][12883] Updated weights for policy 0, policy_version 115733 (0.0033) +[2024-06-18 10:37:41,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 1896202240. Throughput: 0: 42411.2. Samples: 1896354860. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) +[2024-06-18 10:37:41,996][12645] Avg episode reward: [(0, '0.533')] +[2024-06-18 10:37:44,951][12883] Updated weights for policy 0, policy_version 115743 (0.0031) +[2024-06-18 10:37:46,994][12645] Fps is (10 sec: 39330.4, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 1896382464. Throughput: 0: 42283.9. Samples: 1896479700. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) +[2024-06-18 10:37:46,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 10:37:49,112][12883] Updated weights for policy 0, policy_version 115753 (0.0041) +[2024-06-18 10:37:51,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1896628224. Throughput: 0: 42219.7. Samples: 1896731540. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 10:37:51,994][12645] Avg episode reward: [(0, '0.663')] +[2024-06-18 10:37:52,567][12883] Updated weights for policy 0, policy_version 115763 (0.0042) +[2024-06-18 10:37:56,894][12883] Updated weights for policy 0, policy_version 115773 (0.0029) +[2024-06-18 10:37:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1896824832. Throughput: 0: 42274.3. Samples: 1896992680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 10:37:56,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 10:38:00,086][12883] Updated weights for policy 0, policy_version 115783 (0.0032) +[2024-06-18 10:38:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 1897037824. Throughput: 0: 42142.1. Samples: 1897117500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 10:38:01,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 10:38:04,419][12883] Updated weights for policy 0, policy_version 115793 (0.0033) +[2024-06-18 10:38:06,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1897283584. Throughput: 0: 42357.2. Samples: 1897372340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 10:38:06,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 10:38:07,607][12883] Updated weights for policy 0, policy_version 115803 (0.0042) +[2024-06-18 10:38:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1897463808. Throughput: 0: 42564.0. Samples: 1897639100. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 10:38:11,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 10:38:12,086][12883] Updated weights for policy 0, policy_version 115813 (0.0031) +[2024-06-18 10:38:15,324][12883] Updated weights for policy 0, policy_version 115823 (0.0047) +[2024-06-18 10:38:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42326.8, 300 sec: 42709.5). Total num frames: 1897676800. Throughput: 0: 42327.5. Samples: 1897754600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 10:38:16,994][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 10:38:19,657][12883] Updated weights for policy 0, policy_version 115833 (0.0038) +[2024-06-18 10:38:21,994][12645] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1897938944. Throughput: 0: 42671.5. Samples: 1898012660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 10:38:21,994][12645] Avg episode reward: [(0, '0.719')] +[2024-06-18 10:38:23,766][12883] Updated weights for policy 0, policy_version 115843 (0.0032) +[2024-06-18 10:38:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1898102784. Throughput: 0: 42572.3. Samples: 1898270520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 10:38:26,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 10:38:27,502][12883] Updated weights for policy 0, policy_version 115853 (0.0042) +[2024-06-18 10:38:31,357][12883] Updated weights for policy 0, policy_version 115863 (0.0032) +[2024-06-18 10:38:31,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1898315776. Throughput: 0: 42552.5. Samples: 1898394560. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 10:38:31,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 10:38:35,066][12883] Updated weights for policy 0, policy_version 115873 (0.0040) +[2024-06-18 10:38:35,793][12862] Signal inference workers to stop experience collection... (27750 times) +[2024-06-18 10:38:35,847][12883] InferenceWorker_p0-w0: stopping experience collection (27750 times) +[2024-06-18 10:38:35,916][12862] Signal inference workers to resume experience collection... (27750 times) +[2024-06-18 10:38:35,916][12883] InferenceWorker_p0-w0: resuming experience collection (27750 times) +[2024-06-18 10:38:36,994][12645] Fps is (10 sec: 47514.1, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 1898577920. Throughput: 0: 42817.4. Samples: 1898658320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 10:38:36,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 10:38:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000115880_1898577920.pth... +[2024-06-18 10:38:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000115254_1888321536.pth +[2024-06-18 10:38:39,063][12883] Updated weights for policy 0, policy_version 115883 (0.0031) +[2024-06-18 10:38:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 1898741760. Throughput: 0: 42807.0. Samples: 1898919000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 10:38:41,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 10:38:42,883][12883] Updated weights for policy 0, policy_version 115893 (0.0024) +[2024-06-18 10:38:46,691][12883] Updated weights for policy 0, policy_version 115903 (0.0024) +[2024-06-18 10:38:46,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1898954752. Throughput: 0: 42645.7. Samples: 1899036560. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 10:38:46,994][12645] Avg episode reward: [(0, '0.516')] +[2024-06-18 10:38:50,433][12883] Updated weights for policy 0, policy_version 115913 (0.0041) +[2024-06-18 10:38:51,994][12645] Fps is (10 sec: 47514.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1899216896. Throughput: 0: 42983.7. Samples: 1899306600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 10:38:51,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 10:38:54,785][12883] Updated weights for policy 0, policy_version 115923 (0.0037) +[2024-06-18 10:38:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1899397120. Throughput: 0: 42789.4. Samples: 1899564620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 10:38:56,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 10:38:58,020][12883] Updated weights for policy 0, policy_version 115933 (0.0037) +[2024-06-18 10:39:01,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1899593728. Throughput: 0: 42821.0. Samples: 1899681540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 10:39:01,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 10:39:02,503][12883] Updated weights for policy 0, policy_version 115943 (0.0027) +[2024-06-18 10:39:05,609][12883] Updated weights for policy 0, policy_version 115953 (0.0036) +[2024-06-18 10:39:06,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1899855872. Throughput: 0: 42944.7. Samples: 1899945180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 10:39:06,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 10:39:09,979][12883] Updated weights for policy 0, policy_version 115963 (0.0037) +[2024-06-18 10:39:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1900019712. Throughput: 0: 43003.2. Samples: 1900205660. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 10:39:11,994][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 10:39:13,464][12883] Updated weights for policy 0, policy_version 115973 (0.0029) +[2024-06-18 10:39:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1900249088. Throughput: 0: 42889.7. Samples: 1900324600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 10:39:16,994][12645] Avg episode reward: [(0, '0.642')] +[2024-06-18 10:39:17,669][12883] Updated weights for policy 0, policy_version 115983 (0.0036) +[2024-06-18 10:39:21,047][12883] Updated weights for policy 0, policy_version 115993 (0.0029) +[2024-06-18 10:39:21,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1900494848. Throughput: 0: 42914.3. Samples: 1900589460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 10:39:21,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 10:39:25,283][12883] Updated weights for policy 0, policy_version 116003 (0.0031) +[2024-06-18 10:39:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1900675072. Throughput: 0: 42868.5. Samples: 1900848080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 10:39:26,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 10:39:28,462][12862] Signal inference workers to stop experience collection... (27800 times) +[2024-06-18 10:39:28,495][12883] InferenceWorker_p0-w0: stopping experience collection (27800 times) +[2024-06-18 10:39:28,515][12862] Signal inference workers to resume experience collection... (27800 times) +[2024-06-18 10:39:28,517][12883] InferenceWorker_p0-w0: resuming experience collection (27800 times) +[2024-06-18 10:39:28,911][12883] Updated weights for policy 0, policy_version 116013 (0.0031) +[2024-06-18 10:39:31,994][12645] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1900904448. 
Throughput: 0: 42787.5. Samples: 1900962000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 10:39:31,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 10:39:33,191][12883] Updated weights for policy 0, policy_version 116023 (0.0031) +[2024-06-18 10:39:36,413][12883] Updated weights for policy 0, policy_version 116033 (0.0044) +[2024-06-18 10:39:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1901117440. Throughput: 0: 42712.2. Samples: 1901228660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 10:39:36,994][12645] Avg episode reward: [(0, '0.259')] +[2024-06-18 10:39:40,733][12883] Updated weights for policy 0, policy_version 116043 (0.0038) +[2024-06-18 10:39:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1901297664. Throughput: 0: 42801.3. Samples: 1901490680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 10:39:41,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 10:39:44,059][12883] Updated weights for policy 0, policy_version 116053 (0.0043) +[2024-06-18 10:39:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1901543424. Throughput: 0: 42795.4. Samples: 1901607340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 10:39:46,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 10:39:48,307][12883] Updated weights for policy 0, policy_version 116063 (0.0024) +[2024-06-18 10:39:51,744][12883] Updated weights for policy 0, policy_version 116073 (0.0047) +[2024-06-18 10:39:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1901756416. Throughput: 0: 42664.2. Samples: 1901865060. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 10:39:51,994][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 10:39:55,748][12883] Updated weights for policy 0, policy_version 116083 (0.0040) +[2024-06-18 10:39:56,994][12645] Fps is (10 sec: 37684.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1901920256. Throughput: 0: 42577.9. Samples: 1902121660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 10:39:56,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 10:39:59,346][12883] Updated weights for policy 0, policy_version 116093 (0.0031) +[2024-06-18 10:40:01,994][12645] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 1902198784. Throughput: 0: 42627.1. Samples: 1902242820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 10:40:01,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 10:40:03,246][12883] Updated weights for policy 0, policy_version 116103 (0.0035) +[2024-06-18 10:40:06,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1902379008. Throughput: 0: 42684.4. Samples: 1902510260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 10:40:06,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 10:40:07,021][12883] Updated weights for policy 0, policy_version 116113 (0.0032) +[2024-06-18 10:40:10,858][12883] Updated weights for policy 0, policy_version 116123 (0.0026) +[2024-06-18 10:40:11,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1902575616. Throughput: 0: 42452.0. Samples: 1902758420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 10:40:11,994][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 10:40:14,676][12883] Updated weights for policy 0, policy_version 116133 (0.0027) +[2024-06-18 10:40:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 1902837760. Throughput: 0: 42677.9. Samples: 1902882500. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 10:40:16,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 10:40:18,322][12883] Updated weights for policy 0, policy_version 116143 (0.0032) +[2024-06-18 10:40:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 1903001600. Throughput: 0: 42602.0. Samples: 1903145740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 10:40:21,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 10:40:22,652][12883] Updated weights for policy 0, policy_version 116153 (0.0035) +[2024-06-18 10:40:24,090][12862] Signal inference workers to stop experience collection... (27850 times) +[2024-06-18 10:40:24,139][12883] InferenceWorker_p0-w0: stopping experience collection (27850 times) +[2024-06-18 10:40:24,146][12862] Signal inference workers to resume experience collection... (27850 times) +[2024-06-18 10:40:24,162][12883] InferenceWorker_p0-w0: resuming experience collection (27850 times) +[2024-06-18 10:40:26,022][12883] Updated weights for policy 0, policy_version 116163 (0.0030) +[2024-06-18 10:40:26,994][12645] Fps is (10 sec: 37682.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1903214592. Throughput: 0: 42215.0. Samples: 1903390360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 10:40:26,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 10:40:30,235][12883] Updated weights for policy 0, policy_version 116173 (0.0021) +[2024-06-18 10:40:31,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1903460352. Throughput: 0: 42524.6. Samples: 1903520940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 10:40:31,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 10:40:34,198][12883] Updated weights for policy 0, policy_version 116183 (0.0022) +[2024-06-18 10:40:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 1903624192. 
Throughput: 0: 42496.4. Samples: 1903777400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 10:40:36,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 10:40:37,021][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000116189_1903640576.pth... +[2024-06-18 10:40:37,072][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000115566_1893433344.pth +[2024-06-18 10:40:37,899][12883] Updated weights for policy 0, policy_version 116193 (0.0033) +[2024-06-18 10:40:41,887][12883] Updated weights for policy 0, policy_version 116203 (0.0027) +[2024-06-18 10:40:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1903869952. Throughput: 0: 42389.1. Samples: 1904029180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 10:40:41,995][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 10:40:45,512][12883] Updated weights for policy 0, policy_version 116213 (0.0041) +[2024-06-18 10:40:46,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1904099328. Throughput: 0: 42651.2. Samples: 1904162120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) +[2024-06-18 10:40:46,994][12645] Avg episode reward: [(0, '0.241')] +[2024-06-18 10:40:49,479][12883] Updated weights for policy 0, policy_version 116223 (0.0037) +[2024-06-18 10:40:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41779.0, 300 sec: 42432.1). Total num frames: 1904263168. Throughput: 0: 42241.2. Samples: 1904411120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 10:40:51,995][12645] Avg episode reward: [(0, '0.623')] +[2024-06-18 10:40:53,322][12883] Updated weights for policy 0, policy_version 116233 (0.0038) +[2024-06-18 10:40:56,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 1904492544. Throughput: 0: 42168.3. Samples: 1904656000. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 10:40:56,994][12645] Avg episode reward: [(0, '0.713')] +[2024-06-18 10:40:57,630][12883] Updated weights for policy 0, policy_version 116243 (0.0037) +[2024-06-18 10:41:01,103][12883] Updated weights for policy 0, policy_version 116253 (0.0029) +[2024-06-18 10:41:01,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1904738304. Throughput: 0: 42458.1. Samples: 1904793120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 10:41:01,994][12645] Avg episode reward: [(0, '0.682')] +[2024-06-18 10:41:05,208][12883] Updated weights for policy 0, policy_version 116263 (0.0026) +[2024-06-18 10:41:06,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1904902144. Throughput: 0: 42247.5. Samples: 1905046880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 10:41:06,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 10:41:08,687][12883] Updated weights for policy 0, policy_version 116273 (0.0036) +[2024-06-18 10:41:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1905147904. Throughput: 0: 42261.0. Samples: 1905292100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 10:41:11,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 10:41:13,209][12883] Updated weights for policy 0, policy_version 116283 (0.0046) +[2024-06-18 10:41:16,435][12883] Updated weights for policy 0, policy_version 116293 (0.0030) +[2024-06-18 10:41:16,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1905360896. Throughput: 0: 42376.4. Samples: 1905427880. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 10:41:16,994][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 10:41:20,780][12883] Updated weights for policy 0, policy_version 116303 (0.0037) +[2024-06-18 10:41:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1905541120. Throughput: 0: 42329.8. Samples: 1905682240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 10:41:21,994][12645] Avg episode reward: [(0, '0.670')] +[2024-06-18 10:41:24,085][12883] Updated weights for policy 0, policy_version 116313 (0.0031) +[2024-06-18 10:41:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1905786880. Throughput: 0: 42445.0. Samples: 1905939200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 10:41:26,994][12645] Avg episode reward: [(0, '0.621')] +[2024-06-18 10:41:28,423][12883] Updated weights for policy 0, policy_version 116323 (0.0028) +[2024-06-18 10:41:31,671][12883] Updated weights for policy 0, policy_version 116333 (0.0036) +[2024-06-18 10:41:31,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1905999872. Throughput: 0: 42328.9. Samples: 1906066920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 10:41:31,994][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 10:41:36,143][12883] Updated weights for policy 0, policy_version 116343 (0.0029) +[2024-06-18 10:41:36,607][12862] Signal inference workers to stop experience collection... (27900 times) +[2024-06-18 10:41:36,607][12862] Signal inference workers to resume experience collection... (27900 times) +[2024-06-18 10:41:36,649][12883] InferenceWorker_p0-w0: stopping experience collection (27900 times) +[2024-06-18 10:41:36,649][12883] InferenceWorker_p0-w0: resuming experience collection (27900 times) +[2024-06-18 10:41:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 1906212864. 
Throughput: 0: 42550.3. Samples: 1906325880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 10:41:36,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 10:41:39,268][12883] Updated weights for policy 0, policy_version 116353 (0.0028) +[2024-06-18 10:41:41,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1906425856. Throughput: 0: 42744.4. Samples: 1906579500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 10:41:41,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 10:41:43,936][12883] Updated weights for policy 0, policy_version 116363 (0.0036) +[2024-06-18 10:41:46,778][12883] Updated weights for policy 0, policy_version 116373 (0.0045) +[2024-06-18 10:41:46,997][12645] Fps is (10 sec: 44223.3, 60 sec: 42596.2, 300 sec: 42653.5). Total num frames: 1906655232. Throughput: 0: 42639.3. Samples: 1906712020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 10:41:46,997][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 10:41:51,555][12883] Updated weights for policy 0, policy_version 116383 (0.0033) +[2024-06-18 10:41:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.7, 300 sec: 42542.9). Total num frames: 1906851840. Throughput: 0: 42659.5. Samples: 1906966560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 10:41:51,994][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 10:41:54,635][12883] Updated weights for policy 0, policy_version 116393 (0.0033) +[2024-06-18 10:41:56,994][12645] Fps is (10 sec: 40972.7, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 1907064832. Throughput: 0: 42834.6. Samples: 1907219660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 10:41:56,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 10:41:59,009][12883] Updated weights for policy 0, policy_version 116403 (0.0031) +[2024-06-18 10:42:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42654.0). 
Total num frames: 1907294208. Throughput: 0: 42876.5. Samples: 1907357320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 10:42:01,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 10:42:02,134][12883] Updated weights for policy 0, policy_version 116413 (0.0024) +[2024-06-18 10:42:06,561][12883] Updated weights for policy 0, policy_version 116423 (0.0045) +[2024-06-18 10:42:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 1907490816. Throughput: 0: 42992.8. Samples: 1907616920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 10:42:06,994][12645] Avg episode reward: [(0, '0.451')] +[2024-06-18 10:42:09,827][12883] Updated weights for policy 0, policy_version 116433 (0.0022) +[2024-06-18 10:42:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 1907720192. Throughput: 0: 42768.0. Samples: 1907863760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 10:42:11,994][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 10:42:14,057][12883] Updated weights for policy 0, policy_version 116443 (0.0025) +[2024-06-18 10:42:16,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1907916800. Throughput: 0: 42997.8. Samples: 1908001820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 10:42:16,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 10:42:17,715][12883] Updated weights for policy 0, policy_version 116453 (0.0027) +[2024-06-18 10:42:21,519][12883] Updated weights for policy 0, policy_version 116463 (0.0042) +[2024-06-18 10:42:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1908129792. Throughput: 0: 42981.4. Samples: 1908260040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 10:42:21,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 10:42:25,210][12883] Updated weights for policy 0, policy_version 116473 (0.0032) +[2024-06-18 10:42:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1908359168. Throughput: 0: 42856.6. Samples: 1908508040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 10:42:26,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 10:42:29,293][12883] Updated weights for policy 0, policy_version 116483 (0.0037) +[2024-06-18 10:42:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 1908555776. Throughput: 0: 42940.8. Samples: 1908644220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 10:42:31,994][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 10:42:32,747][12883] Updated weights for policy 0, policy_version 116493 (0.0029) +[2024-06-18 10:42:36,597][12883] Updated weights for policy 0, policy_version 116503 (0.0032) +[2024-06-18 10:42:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 1908785152. Throughput: 0: 43205.8. Samples: 1908910820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 10:42:36,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 10:42:37,137][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000116505_1908817920.pth... +[2024-06-18 10:42:37,195][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000115880_1898577920.pth +[2024-06-18 10:42:40,219][12883] Updated weights for policy 0, policy_version 116513 (0.0026) +[2024-06-18 10:42:41,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1909014528. Throughput: 0: 43240.4. Samples: 1909165480. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 10:42:41,994][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 10:42:44,306][12883] Updated weights for policy 0, policy_version 116523 (0.0038) +[2024-06-18 10:42:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42327.6, 300 sec: 42598.4). Total num frames: 1909194752. Throughput: 0: 42952.0. Samples: 1909290160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 10:42:46,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 10:42:47,916][12883] Updated weights for policy 0, policy_version 116533 (0.0027) +[2024-06-18 10:42:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1909424128. Throughput: 0: 42999.7. Samples: 1909551900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 10:42:51,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 10:42:52,019][12883] Updated weights for policy 0, policy_version 116543 (0.0038) +[2024-06-18 10:42:55,567][12883] Updated weights for policy 0, policy_version 116553 (0.0033) +[2024-06-18 10:42:56,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1909653504. Throughput: 0: 43107.6. Samples: 1909803600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) +[2024-06-18 10:42:56,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 10:42:59,478][12862] Signal inference workers to stop experience collection... (27950 times) +[2024-06-18 10:42:59,519][12883] InferenceWorker_p0-w0: stopping experience collection (27950 times) +[2024-06-18 10:42:59,524][12862] Signal inference workers to resume experience collection... (27950 times) +[2024-06-18 10:42:59,540][12883] InferenceWorker_p0-w0: resuming experience collection (27950 times) +[2024-06-18 10:42:59,543][12883] Updated weights for policy 0, policy_version 116563 (0.0032) +[2024-06-18 10:43:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1909850112. 
Throughput: 0: 42850.1. Samples: 1909930080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) +[2024-06-18 10:43:01,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 10:43:03,662][12883] Updated weights for policy 0, policy_version 116573 (0.0035) +[2024-06-18 10:43:06,951][12883] Updated weights for policy 0, policy_version 116583 (0.0033) +[2024-06-18 10:43:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 1910095872. Throughput: 0: 42958.6. Samples: 1910193180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) +[2024-06-18 10:43:06,994][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 10:43:11,264][12883] Updated weights for policy 0, policy_version 116593 (0.0044) +[2024-06-18 10:43:11,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1910308864. Throughput: 0: 43051.6. Samples: 1910445360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) +[2024-06-18 10:43:11,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 10:43:14,586][12883] Updated weights for policy 0, policy_version 116603 (0.0031) +[2024-06-18 10:43:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1910489088. Throughput: 0: 42950.2. Samples: 1910576980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) +[2024-06-18 10:43:16,994][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 10:43:18,846][12883] Updated weights for policy 0, policy_version 116613 (0.0025) +[2024-06-18 10:43:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1910718464. Throughput: 0: 42724.9. Samples: 1910833440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) +[2024-06-18 10:43:21,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 10:43:22,346][12883] Updated weights for policy 0, policy_version 116623 (0.0028) +[2024-06-18 10:43:26,248][12883] Updated weights for policy 0, policy_version 116633 (0.0041) +[2024-06-18 10:43:26,994][12645] Fps is (10 sec: 47513.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1910964224. Throughput: 0: 42735.6. Samples: 1911088580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) +[2024-06-18 10:43:26,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 10:43:29,914][12883] Updated weights for policy 0, policy_version 116643 (0.0023) +[2024-06-18 10:43:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1911128064. Throughput: 0: 42921.2. Samples: 1911221620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) +[2024-06-18 10:43:31,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 10:43:33,937][12883] Updated weights for policy 0, policy_version 116653 (0.0034) +[2024-06-18 10:43:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1911373824. Throughput: 0: 42875.5. Samples: 1911481300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) +[2024-06-18 10:43:36,994][12645] Avg episode reward: [(0, '0.608')] +[2024-06-18 10:43:37,653][12883] Updated weights for policy 0, policy_version 116663 (0.0043) +[2024-06-18 10:43:41,433][12883] Updated weights for policy 0, policy_version 116673 (0.0038) +[2024-06-18 10:43:41,994][12645] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1911603200. Throughput: 0: 42900.4. Samples: 1911734120. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) +[2024-06-18 10:43:41,994][12645] Avg episode reward: [(0, '0.528')] +[2024-06-18 10:43:45,162][12883] Updated weights for policy 0, policy_version 116683 (0.0032) +[2024-06-18 10:43:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1911767040. Throughput: 0: 43024.1. Samples: 1911866160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) +[2024-06-18 10:43:46,994][12645] Avg episode reward: [(0, '0.538')] +[2024-06-18 10:43:49,083][12883] Updated weights for policy 0, policy_version 116693 (0.0026) +[2024-06-18 10:43:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 1912029184. Throughput: 0: 43019.9. Samples: 1912129080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) +[2024-06-18 10:43:51,994][12645] Avg episode reward: [(0, '0.528')] +[2024-06-18 10:43:52,758][12883] Updated weights for policy 0, policy_version 116703 (0.0043) +[2024-06-18 10:43:56,672][12883] Updated weights for policy 0, policy_version 116713 (0.0032) +[2024-06-18 10:43:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1912225792. Throughput: 0: 43115.0. Samples: 1912385540. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) +[2024-06-18 10:43:56,994][12645] Avg episode reward: [(0, '0.510')] +[2024-06-18 10:44:00,584][12883] Updated weights for policy 0, policy_version 116723 (0.0049) +[2024-06-18 10:44:01,997][12645] Fps is (10 sec: 39309.8, 60 sec: 42869.3, 300 sec: 42598.0). Total num frames: 1912422400. Throughput: 0: 42939.2. Samples: 1912509380. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) +[2024-06-18 10:44:01,997][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 10:44:04,234][12883] Updated weights for policy 0, policy_version 116733 (0.0036) +[2024-06-18 10:44:07,000][12645] Fps is (10 sec: 44209.4, 60 sec: 42867.1, 300 sec: 42875.2). Total num frames: 1912668160. Throughput: 0: 43124.2. Samples: 1912774300. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) +[2024-06-18 10:44:07,000][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 10:44:08,234][12883] Updated weights for policy 0, policy_version 116743 (0.0038) +[2024-06-18 10:44:11,951][12883] Updated weights for policy 0, policy_version 116753 (0.0036) +[2024-06-18 10:44:11,994][12645] Fps is (10 sec: 45889.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1912881152. Throughput: 0: 43113.3. Samples: 1913028680. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) +[2024-06-18 10:44:11,994][12645] Avg episode reward: [(0, '0.697')] +[2024-06-18 10:44:15,804][12883] Updated weights for policy 0, policy_version 116763 (0.0032) +[2024-06-18 10:44:16,994][12645] Fps is (10 sec: 40985.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1913077760. Throughput: 0: 42938.2. Samples: 1913153840. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) +[2024-06-18 10:44:16,994][12645] Avg episode reward: [(0, '0.596')] +[2024-06-18 10:44:19,517][12883] Updated weights for policy 0, policy_version 116773 (0.0044) +[2024-06-18 10:44:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43417.4, 300 sec: 42876.1). Total num frames: 1913323520. Throughput: 0: 43077.2. Samples: 1913419780. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) +[2024-06-18 10:44:21,994][12645] Avg episode reward: [(0, '0.207')] +[2024-06-18 10:44:23,294][12883] Updated weights for policy 0, policy_version 116783 (0.0033) +[2024-06-18 10:44:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1913503744. Throughput: 0: 43091.2. Samples: 1913673220. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) +[2024-06-18 10:44:26,994][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 10:44:27,476][12883] Updated weights for policy 0, policy_version 116793 (0.0032) +[2024-06-18 10:44:29,637][12862] Signal inference workers to stop experience collection... 
(28000 times) +[2024-06-18 10:44:29,638][12862] Signal inference workers to resume experience collection... (28000 times) +[2024-06-18 10:44:29,671][12883] InferenceWorker_p0-w0: stopping experience collection (28000 times) +[2024-06-18 10:44:29,671][12883] InferenceWorker_p0-w0: resuming experience collection (28000 times) +[2024-06-18 10:44:30,752][12883] Updated weights for policy 0, policy_version 116803 (0.0033) +[2024-06-18 10:44:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1913733120. Throughput: 0: 42958.1. Samples: 1913799280. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) +[2024-06-18 10:44:31,996][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 10:44:35,380][12883] Updated weights for policy 0, policy_version 116813 (0.0026) +[2024-06-18 10:44:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1913929728. Throughput: 0: 42852.6. Samples: 1914057440. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) +[2024-06-18 10:44:36,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 10:44:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000116817_1913929728.pth... +[2024-06-18 10:44:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000116189_1903640576.pth +[2024-06-18 10:44:38,423][12883] Updated weights for policy 0, policy_version 116823 (0.0034) +[2024-06-18 10:44:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1914142720. Throughput: 0: 42673.3. Samples: 1914305840. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) +[2024-06-18 10:44:41,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 10:44:43,276][12883] Updated weights for policy 0, policy_version 116833 (0.0029) +[2024-06-18 10:44:46,126][12883] Updated weights for policy 0, policy_version 116843 (0.0047) +[2024-06-18 10:44:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1914372096. Throughput: 0: 42855.9. Samples: 1914437760. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) +[2024-06-18 10:44:46,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 10:44:50,999][12883] Updated weights for policy 0, policy_version 116853 (0.0036) +[2024-06-18 10:44:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1914568704. Throughput: 0: 42716.1. Samples: 1914696260. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) +[2024-06-18 10:44:51,994][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 10:44:53,750][12883] Updated weights for policy 0, policy_version 116863 (0.0057) +[2024-06-18 10:44:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1914781696. Throughput: 0: 42663.1. Samples: 1914948520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:44:56,994][12645] Avg episode reward: [(0, '0.447')] +[2024-06-18 10:44:58,456][12883] Updated weights for policy 0, policy_version 116873 (0.0036) +[2024-06-18 10:45:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42873.6, 300 sec: 42765.0). Total num frames: 1914994688. Throughput: 0: 42719.9. Samples: 1915076240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:45:01,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 10:45:02,137][12883] Updated weights for policy 0, policy_version 116883 (0.0023) +[2024-06-18 10:45:06,076][12883] Updated weights for policy 0, policy_version 116893 (0.0036) +[2024-06-18 10:45:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42329.7, 300 sec: 42820.5). Total num frames: 1915207680. Throughput: 0: 42522.3. Samples: 1915333280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:45:06,994][12645] Avg episode reward: [(0, '0.662')] +[2024-06-18 10:45:10,279][12883] Updated weights for policy 0, policy_version 116903 (0.0038) +[2024-06-18 10:45:11,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1915420672. Throughput: 0: 42437.9. Samples: 1915582920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:45:11,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 10:45:14,292][12883] Updated weights for policy 0, policy_version 116913 (0.0038) +[2024-06-18 10:45:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1915633664. Throughput: 0: 42544.9. Samples: 1915713800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:45:16,994][12645] Avg episode reward: [(0, '0.296')] +[2024-06-18 10:45:17,857][12883] Updated weights for policy 0, policy_version 116923 (0.0037) +[2024-06-18 10:45:21,900][12883] Updated weights for policy 0, policy_version 116933 (0.0033) +[2024-06-18 10:45:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 42765.0). Total num frames: 1915830272. Throughput: 0: 42264.5. Samples: 1915959340. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:45:21,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 10:45:25,789][12883] Updated weights for policy 0, policy_version 116943 (0.0036) +[2024-06-18 10:45:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1916043264. Throughput: 0: 42350.7. Samples: 1916211620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:45:26,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 10:45:29,555][12883] Updated weights for policy 0, policy_version 116953 (0.0027) +[2024-06-18 10:45:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 1916289024. Throughput: 0: 42294.7. Samples: 1916341020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:45:31,994][12645] Avg episode reward: [(0, '0.508')] +[2024-06-18 10:45:33,314][12883] Updated weights for policy 0, policy_version 116963 (0.0035) +[2024-06-18 10:45:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 1916452864. Throughput: 0: 42157.4. Samples: 1916593340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:45:36,994][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 10:45:37,283][12883] Updated weights for policy 0, policy_version 116973 (0.0053) +[2024-06-18 10:45:38,963][12862] Signal inference workers to stop experience collection... (28050 times) +[2024-06-18 10:45:38,963][12862] Signal inference workers to resume experience collection... (28050 times) +[2024-06-18 10:45:38,986][12883] InferenceWorker_p0-w0: stopping experience collection (28050 times) +[2024-06-18 10:45:38,987][12883] InferenceWorker_p0-w0: resuming experience collection (28050 times) +[2024-06-18 10:45:40,834][12883] Updated weights for policy 0, policy_version 116983 (0.0027) +[2024-06-18 10:45:41,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1916665856. 
Throughput: 0: 42336.5. Samples: 1916853660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:45:41,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 10:45:45,125][12883] Updated weights for policy 0, policy_version 116993 (0.0030) +[2024-06-18 10:45:46,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 1916928000. Throughput: 0: 42226.0. Samples: 1916976400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:45:46,994][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 10:45:49,004][12883] Updated weights for policy 0, policy_version 117003 (0.0033) +[2024-06-18 10:45:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1917091840. Throughput: 0: 42174.2. Samples: 1917231120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:45:51,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 10:45:52,888][12883] Updated weights for policy 0, policy_version 117013 (0.0033) +[2024-06-18 10:45:56,600][12883] Updated weights for policy 0, policy_version 117023 (0.0049) +[2024-06-18 10:45:56,998][12645] Fps is (10 sec: 37665.8, 60 sec: 42049.1, 300 sec: 42597.8). Total num frames: 1917304832. Throughput: 0: 42251.2. Samples: 1917484420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) +[2024-06-18 10:45:56,999][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 10:46:00,554][12883] Updated weights for policy 0, policy_version 117033 (0.0044) +[2024-06-18 10:46:01,994][12645] Fps is (10 sec: 49152.6, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 1917583360. Throughput: 0: 42214.8. Samples: 1917613460. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 10:46:01,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 10:46:04,183][12883] Updated weights for policy 0, policy_version 117043 (0.0036) +[2024-06-18 10:46:06,994][12645] Fps is (10 sec: 44256.9, 60 sec: 42325.4, 300 sec: 42709.5). 
Total num frames: 1917747200. Throughput: 0: 42420.0. Samples: 1917868240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 10:46:06,994][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 10:46:08,240][12883] Updated weights for policy 0, policy_version 117053 (0.0033) +[2024-06-18 10:46:11,890][12883] Updated weights for policy 0, policy_version 117063 (0.0032) +[2024-06-18 10:46:11,996][12645] Fps is (10 sec: 37674.4, 60 sec: 42323.7, 300 sec: 42709.2). Total num frames: 1917960192. Throughput: 0: 42515.7. Samples: 1918124920. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 10:46:11,996][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 10:46:15,725][12883] Updated weights for policy 0, policy_version 117073 (0.0024) +[2024-06-18 10:46:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1918205952. Throughput: 0: 42464.0. Samples: 1918251900. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 10:46:16,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 10:46:19,901][12883] Updated weights for policy 0, policy_version 117083 (0.0039) +[2024-06-18 10:46:21,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1918369792. Throughput: 0: 42505.7. Samples: 1918506100. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 10:46:21,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 10:46:23,541][12883] Updated weights for policy 0, policy_version 117093 (0.0040) +[2024-06-18 10:46:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1918599168. Throughput: 0: 42373.3. Samples: 1918760460. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 10:46:26,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 10:46:27,407][12883] Updated weights for policy 0, policy_version 117103 (0.0033) +[2024-06-18 10:46:31,033][12883] Updated weights for policy 0, policy_version 117113 (0.0030) +[2024-06-18 10:46:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1918828544. Throughput: 0: 42592.7. Samples: 1918893080. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 10:46:31,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 10:46:34,879][12883] Updated weights for policy 0, policy_version 117123 (0.0038) +[2024-06-18 10:46:36,994][12645] Fps is (10 sec: 40958.9, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1919008768. Throughput: 0: 42504.3. Samples: 1919143820. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 10:46:36,995][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 10:46:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000117127_1919008768.pth... +[2024-06-18 10:46:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000116505_1908817920.pth +[2024-06-18 10:46:38,822][12883] Updated weights for policy 0, policy_version 117133 (0.0037) +[2024-06-18 10:46:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42654.4). Total num frames: 1919238144. Throughput: 0: 42575.4. Samples: 1919400120. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 10:46:41,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 10:46:42,346][12883] Updated weights for policy 0, policy_version 117143 (0.0032) +[2024-06-18 10:46:46,554][12883] Updated weights for policy 0, policy_version 117153 (0.0035) +[2024-06-18 10:46:46,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1919467520. Throughput: 0: 42620.3. Samples: 1919531380. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 10:46:46,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 10:46:50,059][12883] Updated weights for policy 0, policy_version 117163 (0.0037) +[2024-06-18 10:46:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1919664128. Throughput: 0: 42664.4. Samples: 1919788140. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 10:46:51,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 10:46:54,062][12883] Updated weights for policy 0, policy_version 117173 (0.0032) +[2024-06-18 10:46:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43147.7, 300 sec: 42709.5). Total num frames: 1919893504. Throughput: 0: 42633.6. Samples: 1920043340. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) +[2024-06-18 10:46:56,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 10:46:57,657][12883] Updated weights for policy 0, policy_version 117183 (0.0042) +[2024-06-18 10:46:58,623][12862] Signal inference workers to stop experience collection... (28100 times) +[2024-06-18 10:46:58,624][12862] Signal inference workers to resume experience collection... (28100 times) +[2024-06-18 10:46:58,683][12883] InferenceWorker_p0-w0: stopping experience collection (28100 times) +[2024-06-18 10:46:58,683][12883] InferenceWorker_p0-w0: resuming experience collection (28100 times) +[2024-06-18 10:47:01,545][12883] Updated weights for policy 0, policy_version 117193 (0.0036) +[2024-06-18 10:47:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 1920106496. Throughput: 0: 42802.7. Samples: 1920178020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 10:47:01,994][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 10:47:05,257][12883] Updated weights for policy 0, policy_version 117203 (0.0042) +[2024-06-18 10:47:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1920303104. 
Throughput: 0: 42803.1. Samples: 1920432240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 10:47:06,994][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 10:47:09,043][12883] Updated weights for policy 0, policy_version 117213 (0.0026) +[2024-06-18 10:47:11,994][12645] Fps is (10 sec: 44235.9, 60 sec: 43146.0, 300 sec: 42820.5). Total num frames: 1920548864. Throughput: 0: 42795.8. Samples: 1920686280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 10:47:11,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 10:47:12,727][12883] Updated weights for policy 0, policy_version 117223 (0.0037) +[2024-06-18 10:47:16,859][12883] Updated weights for policy 0, policy_version 117233 (0.0032) +[2024-06-18 10:47:16,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1920745472. Throughput: 0: 42989.0. Samples: 1920827580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 10:47:16,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 10:47:20,648][12883] Updated weights for policy 0, policy_version 117243 (0.0030) +[2024-06-18 10:47:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1920958464. Throughput: 0: 43021.5. Samples: 1921079780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 10:47:21,994][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 10:47:24,635][12883] Updated weights for policy 0, policy_version 117253 (0.0028) +[2024-06-18 10:47:26,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1921204224. Throughput: 0: 42816.9. Samples: 1921326880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 10:47:26,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 10:47:28,153][12883] Updated weights for policy 0, policy_version 117263 (0.0033) +[2024-06-18 10:47:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1921384448. 
Throughput: 0: 42977.0. Samples: 1921465340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 10:47:31,994][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 10:47:32,011][12883] Updated weights for policy 0, policy_version 117273 (0.0042) +[2024-06-18 10:47:35,749][12883] Updated weights for policy 0, policy_version 117283 (0.0043) +[2024-06-18 10:47:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 1921597440. Throughput: 0: 42899.1. Samples: 1921718600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 10:47:36,994][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 10:47:39,753][12883] Updated weights for policy 0, policy_version 117293 (0.0045) +[2024-06-18 10:47:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1921826816. Throughput: 0: 42910.4. Samples: 1921974300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 10:47:41,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 10:47:43,346][12883] Updated weights for policy 0, policy_version 117303 (0.0027) +[2024-06-18 10:47:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1922007040. Throughput: 0: 42802.7. Samples: 1922104140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 10:47:46,994][12645] Avg episode reward: [(0, '0.778')] +[2024-06-18 10:47:47,576][12883] Updated weights for policy 0, policy_version 117313 (0.0036) +[2024-06-18 10:47:50,943][12883] Updated weights for policy 0, policy_version 117323 (0.0040) +[2024-06-18 10:47:51,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1922220032. Throughput: 0: 42796.5. Samples: 1922358080. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 10:47:51,994][12645] Avg episode reward: [(0, '0.485')] +[2024-06-18 10:47:55,127][12883] Updated weights for policy 0, policy_version 117333 (0.0030) +[2024-06-18 10:47:56,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1922465792. Throughput: 0: 42745.9. Samples: 1922609840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 10:47:56,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 10:47:58,448][12883] Updated weights for policy 0, policy_version 117343 (0.0033) +[2024-06-18 10:48:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1922662400. Throughput: 0: 42685.2. Samples: 1922748420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 10:48:01,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 10:48:02,703][12883] Updated weights for policy 0, policy_version 117353 (0.0035) +[2024-06-18 10:48:06,414][12883] Updated weights for policy 0, policy_version 117363 (0.0044) +[2024-06-18 10:48:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1922875392. Throughput: 0: 42629.4. Samples: 1922998100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 10:48:06,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 10:48:10,362][12883] Updated weights for policy 0, policy_version 117373 (0.0040) +[2024-06-18 10:48:11,629][12862] Signal inference workers to stop experience collection... (28150 times) +[2024-06-18 10:48:11,669][12883] InferenceWorker_p0-w0: stopping experience collection (28150 times) +[2024-06-18 10:48:11,688][12862] Signal inference workers to resume experience collection... (28150 times) +[2024-06-18 10:48:11,691][12883] InferenceWorker_p0-w0: resuming experience collection (28150 times) +[2024-06-18 10:48:11,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 1923104768. 
Throughput: 0: 42801.4. Samples: 1923252940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 10:48:11,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 10:48:14,134][12883] Updated weights for policy 0, policy_version 117383 (0.0028) +[2024-06-18 10:48:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1923301376. Throughput: 0: 42517.3. Samples: 1923378620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 10:48:16,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 10:48:18,146][12883] Updated weights for policy 0, policy_version 117393 (0.0033) +[2024-06-18 10:48:21,870][12883] Updated weights for policy 0, policy_version 117403 (0.0026) +[2024-06-18 10:48:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1923530752. Throughput: 0: 42524.5. Samples: 1923632200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 10:48:21,994][12645] Avg episode reward: [(0, '0.278')] +[2024-06-18 10:48:25,940][12883] Updated weights for policy 0, policy_version 117413 (0.0028) +[2024-06-18 10:48:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1923743744. Throughput: 0: 42547.9. Samples: 1923888960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 10:48:26,996][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 10:48:29,474][12883] Updated weights for policy 0, policy_version 117423 (0.0023) +[2024-06-18 10:48:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1923923968. Throughput: 0: 42434.2. Samples: 1924013680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 10:48:31,994][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 10:48:33,847][12883] Updated weights for policy 0, policy_version 117433 (0.0033) +[2024-06-18 10:48:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1924169728. 
Throughput: 0: 42566.3. Samples: 1924273560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 10:48:36,994][12645] Avg episode reward: [(0, '0.613')] +[2024-06-18 10:48:37,025][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000117443_1924186112.pth... +[2024-06-18 10:48:37,033][12883] Updated weights for policy 0, policy_version 117443 (0.0044) +[2024-06-18 10:48:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000116817_1913929728.pth +[2024-06-18 10:48:41,447][12883] Updated weights for policy 0, policy_version 117453 (0.0035) +[2024-06-18 10:48:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1924382720. Throughput: 0: 42702.4. Samples: 1924531440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 10:48:41,994][12645] Avg episode reward: [(0, '0.613')] +[2024-06-18 10:48:44,844][12883] Updated weights for policy 0, policy_version 117463 (0.0030) +[2024-06-18 10:48:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1924579328. Throughput: 0: 42441.4. Samples: 1924658280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 10:48:46,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 10:48:49,018][12883] Updated weights for policy 0, policy_version 117473 (0.0026) +[2024-06-18 10:48:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1924808704. Throughput: 0: 42661.8. Samples: 1924917880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 10:48:51,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 10:48:52,459][12883] Updated weights for policy 0, policy_version 117483 (0.0036) +[2024-06-18 10:48:56,867][12883] Updated weights for policy 0, policy_version 117493 (0.0039) +[2024-06-18 10:48:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42654.4). Total num frames: 1925005312. Throughput: 0: 42715.9. 
Samples: 1925175160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 10:48:56,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 10:48:59,963][12883] Updated weights for policy 0, policy_version 117503 (0.0036) +[2024-06-18 10:49:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42543.8). Total num frames: 1925218304. Throughput: 0: 42680.5. Samples: 1925299240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 10:49:01,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 10:49:04,313][12883] Updated weights for policy 0, policy_version 117513 (0.0031) +[2024-06-18 10:49:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1925447680. Throughput: 0: 42816.7. Samples: 1925558960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 10:49:06,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 10:49:07,959][12883] Updated weights for policy 0, policy_version 117523 (0.0036) +[2024-06-18 10:49:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1925627904. Throughput: 0: 42762.8. Samples: 1925813280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 10:49:11,994][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 10:49:12,312][12883] Updated weights for policy 0, policy_version 117533 (0.0033) +[2024-06-18 10:49:15,923][12883] Updated weights for policy 0, policy_version 117543 (0.0029) +[2024-06-18 10:49:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1925840896. Throughput: 0: 42662.6. Samples: 1925933500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 10:49:16,994][12645] Avg episode reward: [(0, '0.397')] +[2024-06-18 10:49:19,999][12883] Updated weights for policy 0, policy_version 117553 (0.0034) +[2024-06-18 10:49:21,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1926103040. Throughput: 0: 42655.2. 
Samples: 1926193040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 10:49:21,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 10:49:23,505][12883] Updated weights for policy 0, policy_version 117563 (0.0042) +[2024-06-18 10:49:26,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1926283264. Throughput: 0: 42665.1. Samples: 1926451380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 10:49:26,994][12645] Avg episode reward: [(0, '0.274')] +[2024-06-18 10:49:27,658][12883] Updated weights for policy 0, policy_version 117573 (0.0040) +[2024-06-18 10:49:31,517][12883] Updated weights for policy 0, policy_version 117583 (0.0030) +[2024-06-18 10:49:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1926496256. Throughput: 0: 42469.7. Samples: 1926569420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 10:49:31,995][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 10:49:35,420][12883] Updated weights for policy 0, policy_version 117593 (0.0035) +[2024-06-18 10:49:36,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1926742016. Throughput: 0: 42482.2. Samples: 1926829580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 10:49:36,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 10:49:39,026][12883] Updated weights for policy 0, policy_version 117603 (0.0048) +[2024-06-18 10:49:41,546][12862] Signal inference workers to stop experience collection... (28200 times) +[2024-06-18 10:49:41,547][12862] Signal inference workers to resume experience collection... (28200 times) +[2024-06-18 10:49:41,574][12883] InferenceWorker_p0-w0: stopping experience collection (28200 times) +[2024-06-18 10:49:41,575][12883] InferenceWorker_p0-w0: resuming experience collection (28200 times) +[2024-06-18 10:49:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42542.9). 
Total num frames: 1926922240. Throughput: 0: 42438.3. Samples: 1927084880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 10:49:41,994][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 10:49:43,232][12883] Updated weights for policy 0, policy_version 117613 (0.0025) +[2024-06-18 10:49:46,515][12883] Updated weights for policy 0, policy_version 117623 (0.0036) +[2024-06-18 10:49:46,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1927135232. Throughput: 0: 42328.0. Samples: 1927204000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 10:49:46,994][12645] Avg episode reward: [(0, '0.627')] +[2024-06-18 10:49:51,132][12883] Updated weights for policy 0, policy_version 117633 (0.0030) +[2024-06-18 10:49:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1927364608. Throughput: 0: 42511.2. Samples: 1927471960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 10:49:51,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 10:49:54,130][12883] Updated weights for policy 0, policy_version 117643 (0.0033) +[2024-06-18 10:49:56,997][12645] Fps is (10 sec: 40947.6, 60 sec: 42323.3, 300 sec: 42542.5). Total num frames: 1927544832. Throughput: 0: 42318.5. Samples: 1927717740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 10:49:56,997][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 10:49:59,058][12883] Updated weights for policy 0, policy_version 117653 (0.0042) +[2024-06-18 10:50:01,588][12883] Updated weights for policy 0, policy_version 117663 (0.0040) +[2024-06-18 10:50:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1927790592. Throughput: 0: 42492.3. Samples: 1927845660. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 10:50:01,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 10:50:06,589][12883] Updated weights for policy 0, policy_version 117673 (0.0042) +[2024-06-18 10:50:06,996][12645] Fps is (10 sec: 42601.3, 60 sec: 42050.7, 300 sec: 42542.5). Total num frames: 1927970816. Throughput: 0: 42613.8. Samples: 1928110760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 10:50:06,997][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 10:50:09,053][12883] Updated weights for policy 0, policy_version 117683 (0.0046) +[2024-06-18 10:50:11,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 1928200192. Throughput: 0: 42393.6. Samples: 1928359180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 10:50:11,996][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 10:50:14,166][12883] Updated weights for policy 0, policy_version 117693 (0.0034) +[2024-06-18 10:50:16,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1928413184. Throughput: 0: 42754.6. Samples: 1928493380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 10:50:16,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 10:50:17,418][12883] Updated weights for policy 0, policy_version 117703 (0.0030) +[2024-06-18 10:50:21,944][12883] Updated weights for policy 0, policy_version 117713 (0.0031) +[2024-06-18 10:50:21,994][12645] Fps is (10 sec: 40968.7, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 1928609792. Throughput: 0: 42584.4. Samples: 1928745880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 10:50:21,995][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 10:50:25,131][12883] Updated weights for policy 0, policy_version 117723 (0.0037) +[2024-06-18 10:50:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1928839168. Throughput: 0: 42510.5. 
Samples: 1928997860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 10:50:26,994][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 10:50:29,846][12883] Updated weights for policy 0, policy_version 117733 (0.0026) +[2024-06-18 10:50:31,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1929068544. Throughput: 0: 42812.8. Samples: 1929130580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 10:50:31,994][12645] Avg episode reward: [(0, '0.473')] +[2024-06-18 10:50:32,760][12883] Updated weights for policy 0, policy_version 117743 (0.0038) +[2024-06-18 10:50:36,996][12645] Fps is (10 sec: 39313.1, 60 sec: 41504.6, 300 sec: 42598.1). Total num frames: 1929232384. Throughput: 0: 42480.4. Samples: 1929383680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 10:50:36,997][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 10:50:37,021][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000117751_1929232384.pth... +[2024-06-18 10:50:37,095][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000117127_1919008768.pth +[2024-06-18 10:50:37,390][12883] Updated weights for policy 0, policy_version 117753 (0.0045) +[2024-06-18 10:50:40,447][12883] Updated weights for policy 0, policy_version 117763 (0.0033) +[2024-06-18 10:50:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1929478144. Throughput: 0: 42613.0. Samples: 1929635200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 10:50:41,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 10:50:44,980][12862] Signal inference workers to stop experience collection... (28250 times) +[2024-06-18 10:50:44,981][12862] Signal inference workers to resume experience collection... 
(28250 times) +[2024-06-18 10:50:45,014][12883] InferenceWorker_p0-w0: stopping experience collection (28250 times) +[2024-06-18 10:50:45,014][12883] InferenceWorker_p0-w0: resuming experience collection (28250 times) +[2024-06-18 10:50:45,142][12883] Updated weights for policy 0, policy_version 117773 (0.0033) +[2024-06-18 10:50:46,994][12645] Fps is (10 sec: 49162.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 1929723904. Throughput: 0: 42745.7. Samples: 1929769220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 10:50:46,994][12645] Avg episode reward: [(0, '0.562')] +[2024-06-18 10:50:47,987][12883] Updated weights for policy 0, policy_version 117783 (0.0023) +[2024-06-18 10:50:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42599.1). Total num frames: 1929871360. Throughput: 0: 42378.2. Samples: 1930017680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 10:50:51,994][12645] Avg episode reward: [(0, '0.191')] +[2024-06-18 10:50:52,805][12883] Updated weights for policy 0, policy_version 117793 (0.0038) +[2024-06-18 10:50:55,948][12883] Updated weights for policy 0, policy_version 117803 (0.0043) +[2024-06-18 10:50:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43146.5, 300 sec: 42542.8). Total num frames: 1930133504. Throughput: 0: 42428.2. Samples: 1930268360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 10:50:56,995][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 10:51:00,246][12883] Updated weights for policy 0, policy_version 117813 (0.0040) +[2024-06-18 10:51:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1930330112. Throughput: 0: 42509.9. Samples: 1930406320. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 10:51:01,994][12645] Avg episode reward: [(0, '0.672')] +[2024-06-18 10:51:03,466][12883] Updated weights for policy 0, policy_version 117823 (0.0026) +[2024-06-18 10:51:06,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42600.0, 300 sec: 42598.7). Total num frames: 1930526720. Throughput: 0: 42502.8. Samples: 1930658500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:51:06,994][12645] Avg episode reward: [(0, '0.723')] +[2024-06-18 10:51:08,218][12883] Updated weights for policy 0, policy_version 117833 (0.0035) +[2024-06-18 10:51:11,005][12883] Updated weights for policy 0, policy_version 117843 (0.0024) +[2024-06-18 10:51:11,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 1930788864. Throughput: 0: 42547.6. Samples: 1930912500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:51:11,995][12645] Avg episode reward: [(0, '0.486')] +[2024-06-18 10:51:15,768][12883] Updated weights for policy 0, policy_version 117853 (0.0043) +[2024-06-18 10:51:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1930985472. Throughput: 0: 42643.9. Samples: 1931049560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:51:16,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 10:51:18,650][12883] Updated weights for policy 0, policy_version 117863 (0.0042) +[2024-06-18 10:51:21,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1931165696. Throughput: 0: 42489.7. Samples: 1931295620. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:51:21,994][12645] Avg episode reward: [(0, '0.485')] +[2024-06-18 10:51:23,499][12883] Updated weights for policy 0, policy_version 117873 (0.0041) +[2024-06-18 10:51:26,340][12883] Updated weights for policy 0, policy_version 117883 (0.0021) +[2024-06-18 10:51:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1931427840. Throughput: 0: 42615.6. Samples: 1931552900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:51:26,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 10:51:31,258][12883] Updated weights for policy 0, policy_version 117893 (0.0053) +[2024-06-18 10:51:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1931608064. Throughput: 0: 42748.5. Samples: 1931692900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:51:31,995][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 10:51:34,085][12883] Updated weights for policy 0, policy_version 117903 (0.0035) +[2024-06-18 10:51:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42873.0, 300 sec: 42598.4). Total num frames: 1931804672. Throughput: 0: 42547.9. Samples: 1931932340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:51:36,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 10:51:38,793][12883] Updated weights for policy 0, policy_version 117913 (0.0038) +[2024-06-18 10:51:41,797][12883] Updated weights for policy 0, policy_version 117923 (0.0035) +[2024-06-18 10:51:41,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1932066816. Throughput: 0: 42788.1. Samples: 1932193820. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:51:41,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 10:51:46,354][12883] Updated weights for policy 0, policy_version 117933 (0.0046) +[2024-06-18 10:51:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1932247040. Throughput: 0: 42695.9. Samples: 1932327640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:51:46,994][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 10:51:49,395][12883] Updated weights for policy 0, policy_version 117943 (0.0035) +[2024-06-18 10:51:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1932460032. Throughput: 0: 42490.2. Samples: 1932570560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:51:51,994][12645] Avg episode reward: [(0, '0.205')] +[2024-06-18 10:51:53,728][12862] Signal inference workers to stop experience collection... (28300 times) +[2024-06-18 10:51:53,771][12883] InferenceWorker_p0-w0: stopping experience collection (28300 times) +[2024-06-18 10:51:53,793][12862] Signal inference workers to resume experience collection... (28300 times) +[2024-06-18 10:51:53,794][12883] InferenceWorker_p0-w0: resuming experience collection (28300 times) +[2024-06-18 10:51:54,225][12883] Updated weights for policy 0, policy_version 117953 (0.0039) +[2024-06-18 10:51:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 1932673024. Throughput: 0: 42585.4. Samples: 1932828840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:51:56,994][12645] Avg episode reward: [(0, '0.866')] +[2024-06-18 10:51:57,128][12862] Saving new best policy, reward=0.866! +[2024-06-18 10:51:57,501][12883] Updated weights for policy 0, policy_version 117963 (0.0041) +[2024-06-18 10:52:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1932853248. Throughput: 0: 42281.0. 
Samples: 1932952200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 10:52:01,994][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 10:52:02,086][12883] Updated weights for policy 0, policy_version 117973 (0.0039) +[2024-06-18 10:52:04,987][12883] Updated weights for policy 0, policy_version 117983 (0.0045) +[2024-06-18 10:52:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1933115392. Throughput: 0: 42315.6. Samples: 1933199820. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) +[2024-06-18 10:52:06,994][12645] Avg episode reward: [(0, '0.807')] +[2024-06-18 10:52:09,643][12883] Updated weights for policy 0, policy_version 117993 (0.0032) +[2024-06-18 10:52:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1933312000. Throughput: 0: 42420.9. Samples: 1933461840. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) +[2024-06-18 10:52:11,994][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 10:52:12,783][12883] Updated weights for policy 0, policy_version 118003 (0.0028) +[2024-06-18 10:52:16,994][12645] Fps is (10 sec: 37682.7, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 1933492224. Throughput: 0: 42049.3. Samples: 1933585120. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) +[2024-06-18 10:52:16,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 10:52:17,419][12883] Updated weights for policy 0, policy_version 118013 (0.0042) +[2024-06-18 10:52:20,308][12883] Updated weights for policy 0, policy_version 118023 (0.0036) +[2024-06-18 10:52:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 1933754368. Throughput: 0: 42423.5. Samples: 1933841400. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) +[2024-06-18 10:52:21,994][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 10:52:25,123][12883] Updated weights for policy 0, policy_version 118033 (0.0030) +[2024-06-18 10:52:26,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1933950976. Throughput: 0: 42454.2. Samples: 1934104260. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) +[2024-06-18 10:52:26,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 10:52:27,885][12883] Updated weights for policy 0, policy_version 118043 (0.0041) +[2024-06-18 10:52:31,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1934131200. Throughput: 0: 42138.7. Samples: 1934223880. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) +[2024-06-18 10:52:31,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 10:52:32,917][12883] Updated weights for policy 0, policy_version 118053 (0.0038) +[2024-06-18 10:52:35,858][12883] Updated weights for policy 0, policy_version 118063 (0.0028) +[2024-06-18 10:52:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1934393344. Throughput: 0: 42514.7. Samples: 1934483720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) +[2024-06-18 10:52:36,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 10:52:37,030][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118067_1934409728.pth... +[2024-06-18 10:52:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000117443_1924186112.pth +[2024-06-18 10:52:40,663][12883] Updated weights for policy 0, policy_version 118073 (0.0037) +[2024-06-18 10:52:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 1934573568. Throughput: 0: 42610.7. Samples: 1934746320. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) +[2024-06-18 10:52:41,994][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 10:52:43,761][12883] Updated weights for policy 0, policy_version 118083 (0.0028) +[2024-06-18 10:52:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1934786560. Throughput: 0: 42372.8. Samples: 1934858980. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) +[2024-06-18 10:52:46,994][12645] Avg episode reward: [(0, '0.179')] +[2024-06-18 10:52:48,499][12883] Updated weights for policy 0, policy_version 118093 (0.0042) +[2024-06-18 10:52:51,494][12883] Updated weights for policy 0, policy_version 118103 (0.0032) +[2024-06-18 10:52:51,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1935032320. Throughput: 0: 42703.2. Samples: 1935121460. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) +[2024-06-18 10:52:51,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 10:52:56,034][12883] Updated weights for policy 0, policy_version 118113 (0.0034) +[2024-06-18 10:52:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1935212544. Throughput: 0: 42588.9. Samples: 1935378340. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) +[2024-06-18 10:52:56,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 10:52:59,023][12883] Updated weights for policy 0, policy_version 118123 (0.0024) +[2024-06-18 10:53:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1935409152. Throughput: 0: 42521.1. Samples: 1935498560. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) +[2024-06-18 10:53:01,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 10:53:03,581][12883] Updated weights for policy 0, policy_version 118133 (0.0035) +[2024-06-18 10:53:06,636][12883] Updated weights for policy 0, policy_version 118143 (0.0024) +[2024-06-18 10:53:06,676][12862] Signal inference workers to stop experience collection... (28350 times) +[2024-06-18 10:53:06,736][12883] InferenceWorker_p0-w0: stopping experience collection (28350 times) +[2024-06-18 10:53:06,792][12862] Signal inference workers to resume experience collection... (28350 times) +[2024-06-18 10:53:06,792][12883] InferenceWorker_p0-w0: resuming experience collection (28350 times) +[2024-06-18 10:53:06,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1935687680. Throughput: 0: 42745.0. Samples: 1935764920. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) +[2024-06-18 10:53:06,994][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 10:53:11,258][12883] Updated weights for policy 0, policy_version 118153 (0.0035) +[2024-06-18 10:53:11,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1935835136. Throughput: 0: 42471.9. Samples: 1936015500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 10:53:11,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 10:53:14,587][12883] Updated weights for policy 0, policy_version 118163 (0.0043) +[2024-06-18 10:53:16,994][12645] Fps is (10 sec: 37682.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1936064512. Throughput: 0: 42458.6. Samples: 1936134520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 10:53:16,994][12645] Avg episode reward: [(0, '0.473')] +[2024-06-18 10:53:18,917][12883] Updated weights for policy 0, policy_version 118173 (0.0037) +[2024-06-18 10:53:21,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 1936293888. 
Throughput: 0: 42616.4. Samples: 1936401460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 10:53:21,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 10:53:22,158][12883] Updated weights for policy 0, policy_version 118183 (0.0030) +[2024-06-18 10:53:26,819][12883] Updated weights for policy 0, policy_version 118193 (0.0033) +[2024-06-18 10:53:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1936474112. Throughput: 0: 42399.1. Samples: 1936654280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 10:53:26,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 10:53:29,775][12883] Updated weights for policy 0, policy_version 118203 (0.0030) +[2024-06-18 10:53:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1936703488. Throughput: 0: 42588.0. Samples: 1936775440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 10:53:31,994][12645] Avg episode reward: [(0, '0.528')] +[2024-06-18 10:53:34,554][12883] Updated weights for policy 0, policy_version 118213 (0.0046) +[2024-06-18 10:53:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 1936916480. Throughput: 0: 42493.2. Samples: 1937033660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 10:53:36,994][12645] Avg episode reward: [(0, '0.285')] +[2024-06-18 10:53:37,499][12883] Updated weights for policy 0, policy_version 118223 (0.0029) +[2024-06-18 10:53:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1937096704. Throughput: 0: 42472.5. Samples: 1937289600. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 10:53:41,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 10:53:42,158][12883] Updated weights for policy 0, policy_version 118233 (0.0032) +[2024-06-18 10:53:45,173][12883] Updated weights for policy 0, policy_version 118243 (0.0038) +[2024-06-18 10:53:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1937342464. Throughput: 0: 42608.8. Samples: 1937415960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 10:53:46,994][12645] Avg episode reward: [(0, '0.728')] +[2024-06-18 10:53:49,795][12883] Updated weights for policy 0, policy_version 118253 (0.0040) +[2024-06-18 10:53:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 1937539072. Throughput: 0: 42277.2. Samples: 1937667400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 10:53:51,994][12645] Avg episode reward: [(0, '0.219')] +[2024-06-18 10:53:52,921][12883] Updated weights for policy 0, policy_version 118263 (0.0040) +[2024-06-18 10:53:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1937752064. Throughput: 0: 42462.0. Samples: 1937926280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 10:53:56,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 10:53:57,417][12883] Updated weights for policy 0, policy_version 118273 (0.0040) +[2024-06-18 10:54:00,588][12883] Updated weights for policy 0, policy_version 118283 (0.0045) +[2024-06-18 10:54:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1937981440. Throughput: 0: 42667.6. Samples: 1938054560. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 10:54:01,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 10:54:05,006][12883] Updated weights for policy 0, policy_version 118293 (0.0029) +[2024-06-18 10:54:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 42542.8). Total num frames: 1938178048. Throughput: 0: 42446.6. Samples: 1938311560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 10:54:06,994][12645] Avg episode reward: [(0, '0.152')] +[2024-06-18 10:54:08,321][12883] Updated weights for policy 0, policy_version 118303 (0.0036) +[2024-06-18 10:54:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1938391040. Throughput: 0: 42439.6. Samples: 1938564060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 10:54:11,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 10:54:12,647][12883] Updated weights for policy 0, policy_version 118313 (0.0053) +[2024-06-18 10:54:15,113][12862] Signal inference workers to stop experience collection... (28400 times) +[2024-06-18 10:54:15,165][12862] Signal inference workers to resume experience collection... (28400 times) +[2024-06-18 10:54:15,168][12883] InferenceWorker_p0-w0: stopping experience collection (28400 times) +[2024-06-18 10:54:15,183][12883] InferenceWorker_p0-w0: resuming experience collection (28400 times) +[2024-06-18 10:54:16,091][12883] Updated weights for policy 0, policy_version 118323 (0.0036) +[2024-06-18 10:54:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1938636800. Throughput: 0: 42651.1. Samples: 1938694740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 10:54:16,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 10:54:20,159][12883] Updated weights for policy 0, policy_version 118333 (0.0037) +[2024-06-18 10:54:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 1938800640. 
Throughput: 0: 42469.8. Samples: 1938944800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 10:54:21,994][12645] Avg episode reward: [(0, '0.413')] +[2024-06-18 10:54:23,828][12883] Updated weights for policy 0, policy_version 118343 (0.0027) +[2024-06-18 10:54:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1939046400. Throughput: 0: 42320.4. Samples: 1939194020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 10:54:26,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 10:54:28,648][12883] Updated weights for policy 0, policy_version 118353 (0.0047) +[2024-06-18 10:54:31,396][12883] Updated weights for policy 0, policy_version 118363 (0.0044) +[2024-06-18 10:54:31,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1939275776. Throughput: 0: 42513.8. Samples: 1939329080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 10:54:31,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 10:54:36,150][12883] Updated weights for policy 0, policy_version 118373 (0.0027) +[2024-06-18 10:54:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1939456000. Throughput: 0: 42444.0. Samples: 1939577380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 10:54:36,994][12645] Avg episode reward: [(0, '0.657')] +[2024-06-18 10:54:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118375_1939456000.pth... +[2024-06-18 10:54:37,080][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000117751_1929232384.pth +[2024-06-18 10:54:39,386][12883] Updated weights for policy 0, policy_version 118383 (0.0036) +[2024-06-18 10:54:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1939668992. Throughput: 0: 42288.9. Samples: 1939829280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 10:54:41,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 10:54:43,749][12883] Updated weights for policy 0, policy_version 118393 (0.0032) +[2024-06-18 10:54:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1939898368. Throughput: 0: 42169.7. Samples: 1939952200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 10:54:46,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 10:54:47,145][12883] Updated weights for policy 0, policy_version 118403 (0.0038) +[2024-06-18 10:54:51,302][12883] Updated weights for policy 0, policy_version 118413 (0.0029) +[2024-06-18 10:54:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 1940111360. Throughput: 0: 42240.9. Samples: 1940212400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 10:54:51,994][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 10:54:55,370][12883] Updated weights for policy 0, policy_version 118423 (0.0026) +[2024-06-18 10:54:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1940291584. Throughput: 0: 42287.5. Samples: 1940467000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 10:54:56,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 10:54:58,827][12883] Updated weights for policy 0, policy_version 118433 (0.0033) +[2024-06-18 10:55:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 1940520960. Throughput: 0: 42150.2. Samples: 1940591500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 10:55:01,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 10:55:02,852][12883] Updated weights for policy 0, policy_version 118443 (0.0028) +[2024-06-18 10:55:06,450][12883] Updated weights for policy 0, policy_version 118453 (0.0041) +[2024-06-18 10:55:06,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 1940750336. Throughput: 0: 42413.9. Samples: 1940853420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 10:55:06,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 10:55:10,861][12883] Updated weights for policy 0, policy_version 118463 (0.0036) +[2024-06-18 10:55:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1940946944. Throughput: 0: 42461.0. Samples: 1941104760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 10:55:11,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 10:55:14,375][12883] Updated weights for policy 0, policy_version 118473 (0.0041) +[2024-06-18 10:55:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1941176320. Throughput: 0: 42212.4. Samples: 1941228640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 10:55:16,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 10:55:18,428][12883] Updated weights for policy 0, policy_version 118483 (0.0032) +[2024-06-18 10:55:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1941356544. Throughput: 0: 42381.0. Samples: 1941484520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 10:55:21,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 10:55:22,467][12883] Updated weights for policy 0, policy_version 118493 (0.0027) +[2024-06-18 10:55:25,888][12883] Updated weights for policy 0, policy_version 118503 (0.0028) +[2024-06-18 10:55:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1941602304. Throughput: 0: 42440.4. Samples: 1941739100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 10:55:26,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 10:55:30,262][12883] Updated weights for policy 0, policy_version 118513 (0.0032) +[2024-06-18 10:55:31,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 1941798912. Throughput: 0: 42600.4. Samples: 1941869220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 10:55:31,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 10:55:33,855][12883] Updated weights for policy 0, policy_version 118523 (0.0029) +[2024-06-18 10:55:36,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1941995520. Throughput: 0: 42441.3. Samples: 1942122260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 10:55:36,994][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 10:55:38,073][12883] Updated weights for policy 0, policy_version 118533 (0.0037) +[2024-06-18 10:55:41,431][12883] Updated weights for policy 0, policy_version 118543 (0.0046) +[2024-06-18 10:55:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.3, 300 sec: 42431.8). Total num frames: 1942241280. Throughput: 0: 42332.4. Samples: 1942371960. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 10:55:41,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 10:55:45,635][12883] Updated weights for policy 0, policy_version 118553 (0.0041) +[2024-06-18 10:55:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 1942421504. Throughput: 0: 42552.0. Samples: 1942506340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 10:55:46,994][12645] Avg episode reward: [(0, '0.473')] +[2024-06-18 10:55:48,926][12883] Updated weights for policy 0, policy_version 118563 (0.0042) +[2024-06-18 10:55:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1942634496. Throughput: 0: 42413.3. Samples: 1942762020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 10:55:51,994][12645] Avg episode reward: [(0, '0.473')] +[2024-06-18 10:55:53,146][12883] Updated weights for policy 0, policy_version 118573 (0.0050) +[2024-06-18 10:55:56,614][12862] Signal inference workers to stop experience collection... (28450 times) +[2024-06-18 10:55:56,614][12862] Signal inference workers to resume experience collection... (28450 times) +[2024-06-18 10:55:56,623][12883] InferenceWorker_p0-w0: stopping experience collection (28450 times) +[2024-06-18 10:55:56,624][12883] InferenceWorker_p0-w0: resuming experience collection (28450 times) +[2024-06-18 10:55:56,767][12883] Updated weights for policy 0, policy_version 118583 (0.0042) +[2024-06-18 10:55:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1942863872. Throughput: 0: 42470.6. Samples: 1943015940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 10:55:56,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 10:56:00,847][12883] Updated weights for policy 0, policy_version 118593 (0.0028) +[2024-06-18 10:56:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1943060480. 
Throughput: 0: 42635.6. Samples: 1943147240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 10:56:01,994][12645] Avg episode reward: [(0, '0.267')] +[2024-06-18 10:56:04,334][12883] Updated weights for policy 0, policy_version 118603 (0.0024) +[2024-06-18 10:56:06,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1943273472. Throughput: 0: 42384.9. Samples: 1943391840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 10:56:06,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 10:56:08,459][12883] Updated weights for policy 0, policy_version 118613 (0.0036) +[2024-06-18 10:56:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1943486464. Throughput: 0: 42324.0. Samples: 1943643680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 10:56:11,994][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 10:56:12,139][12883] Updated weights for policy 0, policy_version 118623 (0.0029) +[2024-06-18 10:56:16,812][12883] Updated weights for policy 0, policy_version 118633 (0.0031) +[2024-06-18 10:56:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1943699456. Throughput: 0: 42310.7. Samples: 1943773200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 10:56:16,998][12645] Avg episode reward: [(0, '0.620')] +[2024-06-18 10:56:19,815][12883] Updated weights for policy 0, policy_version 118643 (0.0043) +[2024-06-18 10:56:21,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 1943912448. Throughput: 0: 42138.6. Samples: 1944018500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 10:56:21,994][12645] Avg episode reward: [(0, '0.453')] +[2024-06-18 10:56:24,462][12883] Updated weights for policy 0, policy_version 118653 (0.0036) +[2024-06-18 10:56:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42487.3). 
Total num frames: 1944141824. Throughput: 0: 42362.8. Samples: 1944278280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 10:56:26,994][12645] Avg episode reward: [(0, '0.248')] +[2024-06-18 10:56:27,547][12883] Updated weights for policy 0, policy_version 118663 (0.0031) +[2024-06-18 10:56:31,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 1944322048. Throughput: 0: 42265.4. Samples: 1944408280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 10:56:31,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 10:56:32,052][12883] Updated weights for policy 0, policy_version 118673 (0.0030) +[2024-06-18 10:56:35,357][12883] Updated weights for policy 0, policy_version 118683 (0.0037) +[2024-06-18 10:56:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1944567808. Throughput: 0: 42324.8. Samples: 1944666640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 10:56:36,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 10:56:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118687_1944567808.pth... +[2024-06-18 10:56:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118067_1934409728.pth +[2024-06-18 10:56:39,659][12883] Updated weights for policy 0, policy_version 118693 (0.0038) +[2024-06-18 10:56:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1944764416. Throughput: 0: 42437.4. Samples: 1944925620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 10:56:41,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 10:56:43,222][12883] Updated weights for policy 0, policy_version 118703 (0.0031) +[2024-06-18 10:56:46,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1944961024. Throughput: 0: 42290.5. Samples: 1945050320. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 10:56:46,995][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 10:56:47,501][12883] Updated weights for policy 0, policy_version 118713 (0.0044) +[2024-06-18 10:56:51,115][12883] Updated weights for policy 0, policy_version 118723 (0.0034) +[2024-06-18 10:56:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1945190400. Throughput: 0: 42655.6. Samples: 1945311340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 10:56:51,994][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 10:56:54,915][12883] Updated weights for policy 0, policy_version 118733 (0.0027) +[2024-06-18 10:56:56,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1945403392. Throughput: 0: 42661.7. Samples: 1945563460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 10:56:56,994][12645] Avg episode reward: [(0, '0.518')] +[2024-06-18 10:56:58,749][12883] Updated weights for policy 0, policy_version 118743 (0.0034) +[2024-06-18 10:57:01,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1945600000. Throughput: 0: 42699.1. Samples: 1945694660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 10:57:01,994][12645] Avg episode reward: [(0, '0.688')] +[2024-06-18 10:57:02,726][12883] Updated weights for policy 0, policy_version 118753 (0.0033) +[2024-06-18 10:57:06,324][12883] Updated weights for policy 0, policy_version 118763 (0.0042) +[2024-06-18 10:57:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1945845760. Throughput: 0: 42970.3. Samples: 1945952160. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 10:57:06,994][12645] Avg episode reward: [(0, '0.689')] +[2024-06-18 10:57:10,317][12883] Updated weights for policy 0, policy_version 118773 (0.0035) +[2024-06-18 10:57:11,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1946058752. Throughput: 0: 42822.1. Samples: 1946205280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 10:57:11,994][12645] Avg episode reward: [(0, '0.558')] +[2024-06-18 10:57:14,203][12883] Updated weights for policy 0, policy_version 118783 (0.0029) +[2024-06-18 10:57:14,499][12862] Signal inference workers to stop experience collection... (28500 times) +[2024-06-18 10:57:14,499][12862] Signal inference workers to resume experience collection... (28500 times) +[2024-06-18 10:57:14,547][12883] InferenceWorker_p0-w0: stopping experience collection (28500 times) +[2024-06-18 10:57:14,547][12883] InferenceWorker_p0-w0: resuming experience collection (28500 times) +[2024-06-18 10:57:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 1946255360. Throughput: 0: 42856.0. Samples: 1946336800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 10:57:16,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 10:57:17,872][12883] Updated weights for policy 0, policy_version 118793 (0.0036) +[2024-06-18 10:57:21,862][12883] Updated weights for policy 0, policy_version 118803 (0.0036) +[2024-06-18 10:57:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1946468352. Throughput: 0: 42849.0. Samples: 1946594840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 10:57:21,994][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 10:57:25,441][12883] Updated weights for policy 0, policy_version 118813 (0.0034) +[2024-06-18 10:57:26,996][12645] Fps is (10 sec: 45861.9, 60 sec: 42869.4, 300 sec: 42653.5). Total num frames: 1946714112. 
Throughput: 0: 42777.4. Samples: 1946850720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 10:57:26,997][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 10:57:29,529][12883] Updated weights for policy 0, policy_version 118823 (0.0035) +[2024-06-18 10:57:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 1946910720. Throughput: 0: 43022.9. Samples: 1946986340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 10:57:31,994][12645] Avg episode reward: [(0, '0.268')] +[2024-06-18 10:57:32,863][12883] Updated weights for policy 0, policy_version 118833 (0.0036) +[2024-06-18 10:57:36,994][12645] Fps is (10 sec: 37693.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1947090944. Throughput: 0: 42867.4. Samples: 1947240380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 10:57:36,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 10:57:37,211][12883] Updated weights for policy 0, policy_version 118843 (0.0037) +[2024-06-18 10:57:40,430][12883] Updated weights for policy 0, policy_version 118853 (0.0037) +[2024-06-18 10:57:41,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42869.9, 300 sec: 42542.5). Total num frames: 1947336704. Throughput: 0: 42880.1. Samples: 1947493160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 10:57:41,996][12645] Avg episode reward: [(0, '0.380')] +[2024-06-18 10:57:44,817][12883] Updated weights for policy 0, policy_version 118863 (0.0043) +[2024-06-18 10:57:46,994][12645] Fps is (10 sec: 47513.9, 60 sec: 43417.7, 300 sec: 42487.3). Total num frames: 1947566080. Throughput: 0: 43112.1. Samples: 1947634700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 10:57:46,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 10:57:47,913][12883] Updated weights for policy 0, policy_version 118873 (0.0043) +[2024-06-18 10:57:51,994][12645] Fps is (10 sec: 39330.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1947729920. 
Throughput: 0: 42809.9. Samples: 1947878600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 10:57:51,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 10:57:52,499][12883] Updated weights for policy 0, policy_version 118883 (0.0030) +[2024-06-18 10:57:55,550][12883] Updated weights for policy 0, policy_version 118893 (0.0039) +[2024-06-18 10:57:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1947992064. Throughput: 0: 42866.3. Samples: 1948134260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 10:57:57,003][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 10:58:00,183][12883] Updated weights for policy 0, policy_version 118903 (0.0044) +[2024-06-18 10:58:01,994][12645] Fps is (10 sec: 49151.0, 60 sec: 43690.6, 300 sec: 42487.3). Total num frames: 1948221440. Throughput: 0: 42994.9. Samples: 1948271580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 10:58:01,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 10:58:03,167][12883] Updated weights for policy 0, policy_version 118913 (0.0030) +[2024-06-18 10:58:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1948401664. Throughput: 0: 42895.5. Samples: 1948525140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 10:58:06,994][12645] Avg episode reward: [(0, '0.236')] +[2024-06-18 10:58:07,624][12883] Updated weights for policy 0, policy_version 118923 (0.0047) +[2024-06-18 10:58:11,040][12883] Updated weights for policy 0, policy_version 118933 (0.0036) +[2024-06-18 10:58:11,994][12645] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1948647424. Throughput: 0: 42793.8. Samples: 1948776320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) +[2024-06-18 10:58:11,994][12645] Avg episode reward: [(0, '0.573')] +[2024-06-18 10:58:15,249][12883] Updated weights for policy 0, policy_version 118943 (0.0032) +[2024-06-18 10:58:16,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 1948844032. Throughput: 0: 42756.3. Samples: 1948910380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 10:58:16,994][12645] Avg episode reward: [(0, '0.710')] +[2024-06-18 10:58:18,638][12883] Updated weights for policy 0, policy_version 118953 (0.0049) +[2024-06-18 10:58:21,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1949024256. Throughput: 0: 42784.9. Samples: 1949165700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 10:58:21,994][12645] Avg episode reward: [(0, '0.229')] +[2024-06-18 10:58:23,163][12883] Updated weights for policy 0, policy_version 118963 (0.0023) +[2024-06-18 10:58:26,210][12862] Signal inference workers to stop experience collection... (28550 times) +[2024-06-18 10:58:26,210][12862] Signal inference workers to resume experience collection... (28550 times) +[2024-06-18 10:58:26,214][12883] Updated weights for policy 0, policy_version 118973 (0.0036) +[2024-06-18 10:58:26,228][12883] InferenceWorker_p0-w0: stopping experience collection (28550 times) +[2024-06-18 10:58:26,228][12883] InferenceWorker_p0-w0: resuming experience collection (28550 times) +[2024-06-18 10:58:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42873.4, 300 sec: 42653.9). Total num frames: 1949286400. Throughput: 0: 42602.5. Samples: 1949410180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 10:58:26,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 10:58:31,045][12883] Updated weights for policy 0, policy_version 118983 (0.0042) +[2024-06-18 10:58:31,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1949466624. 
Throughput: 0: 42605.9. Samples: 1949551960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 10:58:31,994][12645] Avg episode reward: [(0, '0.652')] +[2024-06-18 10:58:33,807][12883] Updated weights for policy 0, policy_version 118993 (0.0034) +[2024-06-18 10:58:36,994][12645] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1949679616. Throughput: 0: 42787.0. Samples: 1949804020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 10:58:36,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 10:58:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118999_1949679616.pth... +[2024-06-18 10:58:37,053][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118375_1939456000.pth +[2024-06-18 10:58:38,613][12883] Updated weights for policy 0, policy_version 119003 (0.0045) +[2024-06-18 10:58:41,521][12883] Updated weights for policy 0, policy_version 119013 (0.0038) +[2024-06-18 10:58:41,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 1949925376. Throughput: 0: 42628.0. Samples: 1950052520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 10:58:41,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 10:58:46,270][12883] Updated weights for policy 0, policy_version 119023 (0.0034) +[2024-06-18 10:58:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1950121984. Throughput: 0: 42611.2. Samples: 1950189080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 10:58:46,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 10:58:49,362][12883] Updated weights for policy 0, policy_version 119033 (0.0044) +[2024-06-18 10:58:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 1950318592. Throughput: 0: 42535.5. Samples: 1950439240. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 10:58:51,995][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 10:58:53,918][12883] Updated weights for policy 0, policy_version 119043 (0.0037) +[2024-06-18 10:58:56,993][12883] Updated weights for policy 0, policy_version 119053 (0.0041) +[2024-06-18 10:58:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1950564352. Throughput: 0: 42692.0. Samples: 1950697460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 10:58:56,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 10:59:01,434][12883] Updated weights for policy 0, policy_version 119063 (0.0030) +[2024-06-18 10:59:01,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1950777344. Throughput: 0: 42770.7. Samples: 1950835060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 10:59:01,998][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 10:59:04,620][12883] Updated weights for policy 0, policy_version 119073 (0.0040) +[2024-06-18 10:59:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1950957568. Throughput: 0: 42668.4. Samples: 1951085780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 10:59:06,994][12645] Avg episode reward: [(0, '0.598')] +[2024-06-18 10:59:08,850][12883] Updated weights for policy 0, policy_version 119083 (0.0029) +[2024-06-18 10:59:11,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1951203328. Throughput: 0: 42926.4. Samples: 1951341860. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 10:59:11,994][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 10:59:12,199][12883] Updated weights for policy 0, policy_version 119093 (0.0035) +[2024-06-18 10:59:16,536][12883] Updated weights for policy 0, policy_version 119103 (0.0030) +[2024-06-18 10:59:16,996][12645] Fps is (10 sec: 45866.8, 60 sec: 42870.2, 300 sec: 42764.8). Total num frames: 1951416320. Throughput: 0: 42862.1. Samples: 1951480840. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) +[2024-06-18 10:59:16,996][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 10:59:19,826][12883] Updated weights for policy 0, policy_version 119113 (0.0037) +[2024-06-18 10:59:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1951596544. Throughput: 0: 42713.0. Samples: 1951726100. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) +[2024-06-18 10:59:21,994][12645] Avg episode reward: [(0, '0.253')] +[2024-06-18 10:59:24,206][12883] Updated weights for policy 0, policy_version 119123 (0.0024) +[2024-06-18 10:59:26,994][12645] Fps is (10 sec: 42606.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1951842304. Throughput: 0: 42902.2. Samples: 1951983120. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) +[2024-06-18 10:59:26,994][12645] Avg episode reward: [(0, '0.494')] +[2024-06-18 10:59:27,471][12883] Updated weights for policy 0, policy_version 119133 (0.0041) +[2024-06-18 10:59:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1952022528. Throughput: 0: 42895.7. Samples: 1952119380. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) +[2024-06-18 10:59:31,994][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 10:59:32,031][12883] Updated weights for policy 0, policy_version 119143 (0.0037) +[2024-06-18 10:59:35,060][12883] Updated weights for policy 0, policy_version 119153 (0.0027) +[2024-06-18 10:59:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1952251904. Throughput: 0: 42738.3. Samples: 1952362460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) +[2024-06-18 10:59:36,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 10:59:39,652][12883] Updated weights for policy 0, policy_version 119163 (0.0047) +[2024-06-18 10:59:41,690][12862] Signal inference workers to stop experience collection... (28600 times) +[2024-06-18 10:59:41,696][12862] Signal inference workers to resume experience collection... (28600 times) +[2024-06-18 10:59:41,748][12883] InferenceWorker_p0-w0: stopping experience collection (28600 times) +[2024-06-18 10:59:41,748][12883] InferenceWorker_p0-w0: resuming experience collection (28600 times) +[2024-06-18 10:59:41,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1952481280. Throughput: 0: 42888.4. Samples: 1952627440. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) +[2024-06-18 10:59:41,994][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 10:59:42,926][12883] Updated weights for policy 0, policy_version 119173 (0.0038) +[2024-06-18 10:59:46,996][12645] Fps is (10 sec: 40950.3, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 1952661504. Throughput: 0: 42684.1. Samples: 1952755940. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) +[2024-06-18 10:59:46,997][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 10:59:47,194][12883] Updated weights for policy 0, policy_version 119183 (0.0036) +[2024-06-18 10:59:50,423][12883] Updated weights for policy 0, policy_version 119193 (0.0029) +[2024-06-18 10:59:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1952907264. Throughput: 0: 42712.6. Samples: 1953007840. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) +[2024-06-18 10:59:51,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 10:59:55,006][12883] Updated weights for policy 0, policy_version 119203 (0.0036) +[2024-06-18 10:59:56,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1953103872. Throughput: 0: 42992.3. Samples: 1953276520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) +[2024-06-18 10:59:56,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 10:59:57,879][12883] Updated weights for policy 0, policy_version 119213 (0.0035) +[2024-06-18 11:00:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1953316864. Throughput: 0: 42616.9. Samples: 1953398520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) +[2024-06-18 11:00:01,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 11:00:02,666][12883] Updated weights for policy 0, policy_version 119223 (0.0041) +[2024-06-18 11:00:05,397][12883] Updated weights for policy 0, policy_version 119233 (0.0038) +[2024-06-18 11:00:06,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1953562624. Throughput: 0: 42713.7. Samples: 1953648220. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) +[2024-06-18 11:00:06,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 11:00:10,257][12883] Updated weights for policy 0, policy_version 119243 (0.0045) +[2024-06-18 11:00:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1953759232. Throughput: 0: 43013.4. Samples: 1953918720. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) +[2024-06-18 11:00:11,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 11:00:13,350][12883] Updated weights for policy 0, policy_version 119253 (0.0027) +[2024-06-18 11:00:16,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42326.5, 300 sec: 42709.4). Total num frames: 1953955840. Throughput: 0: 42658.4. Samples: 1954039020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) +[2024-06-18 11:00:16,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 11:00:17,809][12883] Updated weights for policy 0, policy_version 119263 (0.0043) +[2024-06-18 11:00:20,838][12883] Updated weights for policy 0, policy_version 119273 (0.0033) +[2024-06-18 11:00:21,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 1954217984. Throughput: 0: 43026.7. Samples: 1954298660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 11:00:21,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 11:00:25,574][12883] Updated weights for policy 0, policy_version 119283 (0.0033) +[2024-06-18 11:00:26,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1954381824. Throughput: 0: 42911.6. Samples: 1954558460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 11:00:26,994][12645] Avg episode reward: [(0, '0.545')] +[2024-06-18 11:00:28,490][12883] Updated weights for policy 0, policy_version 119293 (0.0043) +[2024-06-18 11:00:31,994][12645] Fps is (10 sec: 34405.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1954562048. Throughput: 0: 42718.1. 
Samples: 1954678160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 11:00:31,994][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 11:00:33,203][12883] Updated weights for policy 0, policy_version 119303 (0.0032) +[2024-06-18 11:00:36,283][12883] Updated weights for policy 0, policy_version 119313 (0.0027) +[2024-06-18 11:00:36,994][12645] Fps is (10 sec: 47513.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1954856960. Throughput: 0: 42807.5. Samples: 1954934180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 11:00:36,994][12645] Avg episode reward: [(0, '0.635')] +[2024-06-18 11:00:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000119315_1954856960.pth... +[2024-06-18 11:00:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118687_1944567808.pth +[2024-06-18 11:00:40,905][12883] Updated weights for policy 0, policy_version 119323 (0.0039) +[2024-06-18 11:00:41,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1955020800. Throughput: 0: 42440.1. Samples: 1955186320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 11:00:41,994][12645] Avg episode reward: [(0, '0.431')] +[2024-06-18 11:00:44,287][12883] Updated weights for policy 0, policy_version 119333 (0.0032) +[2024-06-18 11:00:46,994][12645] Fps is (10 sec: 36045.0, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 1955217408. Throughput: 0: 42470.7. Samples: 1955309700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 11:00:47,000][12645] Avg episode reward: [(0, '0.517')] +[2024-06-18 11:00:48,395][12883] Updated weights for policy 0, policy_version 119343 (0.0041) +[2024-06-18 11:00:50,885][12862] Signal inference workers to stop experience collection... (28650 times) +[2024-06-18 11:00:50,885][12862] Signal inference workers to resume experience collection... 
(28650 times) +[2024-06-18 11:00:50,930][12883] InferenceWorker_p0-w0: stopping experience collection (28650 times) +[2024-06-18 11:00:50,930][12883] InferenceWorker_p0-w0: resuming experience collection (28650 times) +[2024-06-18 11:00:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1955463168. Throughput: 0: 42790.3. Samples: 1955573780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 11:00:51,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 11:00:52,078][12883] Updated weights for policy 0, policy_version 119353 (0.0030) +[2024-06-18 11:00:56,108][12883] Updated weights for policy 0, policy_version 119363 (0.0034) +[2024-06-18 11:00:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1955676160. Throughput: 0: 42366.1. Samples: 1955825200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 11:00:56,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 11:00:59,699][12883] Updated weights for policy 0, policy_version 119373 (0.0034) +[2024-06-18 11:01:01,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1955872768. Throughput: 0: 42571.6. Samples: 1955954740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 11:01:01,994][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 11:01:03,933][12883] Updated weights for policy 0, policy_version 119383 (0.0039) +[2024-06-18 11:01:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1956102144. Throughput: 0: 42544.4. Samples: 1956213160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 11:01:06,994][12645] Avg episode reward: [(0, '0.690')] +[2024-06-18 11:01:07,518][12883] Updated weights for policy 0, policy_version 119393 (0.0028) +[2024-06-18 11:01:11,475][12883] Updated weights for policy 0, policy_version 119403 (0.0038) +[2024-06-18 11:01:11,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1956315136. Throughput: 0: 42454.7. Samples: 1956468920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 11:01:11,994][12645] Avg episode reward: [(0, '0.734')] +[2024-06-18 11:01:15,314][12883] Updated weights for policy 0, policy_version 119413 (0.0028) +[2024-06-18 11:01:16,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42596.9, 300 sec: 42709.2). Total num frames: 1956511744. Throughput: 0: 42585.0. Samples: 1956594580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 11:01:16,997][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 11:01:19,218][12883] Updated weights for policy 0, policy_version 119423 (0.0039) +[2024-06-18 11:01:21,996][12645] Fps is (10 sec: 40951.0, 60 sec: 41777.6, 300 sec: 42653.6). Total num frames: 1956724736. Throughput: 0: 42401.0. Samples: 1956842320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 11:01:21,996][12645] Avg episode reward: [(0, '0.528')] +[2024-06-18 11:01:23,047][12883] Updated weights for policy 0, policy_version 119433 (0.0023) +[2024-06-18 11:01:26,975][12883] Updated weights for policy 0, policy_version 119443 (0.0036) +[2024-06-18 11:01:26,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1956954112. Throughput: 0: 42527.0. Samples: 1957100040. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 11:01:26,994][12645] Avg episode reward: [(0, '0.574')] +[2024-06-18 11:01:30,903][12883] Updated weights for policy 0, policy_version 119453 (0.0039) +[2024-06-18 11:01:31,994][12645] Fps is (10 sec: 42608.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1957150720. Throughput: 0: 42591.1. Samples: 1957226300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 11:01:31,994][12645] Avg episode reward: [(0, '0.691')] +[2024-06-18 11:01:34,589][12883] Updated weights for policy 0, policy_version 119463 (0.0036) +[2024-06-18 11:01:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 1957363712. Throughput: 0: 42334.6. Samples: 1957478840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 11:01:36,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 11:01:38,518][12883] Updated weights for policy 0, policy_version 119473 (0.0039) +[2024-06-18 11:01:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1957593088. Throughput: 0: 42465.4. Samples: 1957736140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 11:01:41,994][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 11:01:42,437][12883] Updated weights for policy 0, policy_version 119483 (0.0027) +[2024-06-18 11:01:46,166][12883] Updated weights for policy 0, policy_version 119493 (0.0042) +[2024-06-18 11:01:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1957789696. Throughput: 0: 42353.3. Samples: 1957860640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 11:01:46,994][12645] Avg episode reward: [(0, '0.206')] +[2024-06-18 11:01:50,361][12883] Updated weights for policy 0, policy_version 119503 (0.0037) +[2024-06-18 11:01:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1958002688. Throughput: 0: 42304.5. 
Samples: 1958116860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 11:01:51,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 11:01:53,782][12883] Updated weights for policy 0, policy_version 119513 (0.0029) +[2024-06-18 11:01:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1958215680. Throughput: 0: 42443.1. Samples: 1958378860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 11:01:56,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 11:01:57,952][12883] Updated weights for policy 0, policy_version 119523 (0.0045) +[2024-06-18 11:02:01,443][12883] Updated weights for policy 0, policy_version 119533 (0.0036) +[2024-06-18 11:02:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1958428672. Throughput: 0: 42397.7. Samples: 1958502380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 11:02:01,994][12645] Avg episode reward: [(0, '0.596')] +[2024-06-18 11:02:05,471][12883] Updated weights for policy 0, policy_version 119543 (0.0034) +[2024-06-18 11:02:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1958641664. Throughput: 0: 42536.3. Samples: 1958756360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 11:02:06,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 11:02:09,793][12883] Updated weights for policy 0, policy_version 119553 (0.0042) +[2024-06-18 11:02:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1958838272. Throughput: 0: 42531.7. Samples: 1959013960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 11:02:11,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 11:02:13,106][12883] Updated weights for policy 0, policy_version 119563 (0.0037) +[2024-06-18 11:02:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1959067648. 
Throughput: 0: 42445.7. Samples: 1959136360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 11:02:16,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 11:02:17,349][12883] Updated weights for policy 0, policy_version 119573 (0.0041) +[2024-06-18 11:02:20,168][12862] Signal inference workers to stop experience collection... (28700 times) +[2024-06-18 11:02:20,169][12862] Signal inference workers to resume experience collection... (28700 times) +[2024-06-18 11:02:20,191][12883] InferenceWorker_p0-w0: stopping experience collection (28700 times) +[2024-06-18 11:02:20,191][12883] InferenceWorker_p0-w0: resuming experience collection (28700 times) +[2024-06-18 11:02:20,783][12883] Updated weights for policy 0, policy_version 119583 (0.0029) +[2024-06-18 11:02:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42599.9, 300 sec: 42598.8). Total num frames: 1959280640. Throughput: 0: 42657.3. Samples: 1959398420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:02:21,996][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 11:02:24,847][12883] Updated weights for policy 0, policy_version 119593 (0.0031) +[2024-06-18 11:02:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1959477248. Throughput: 0: 42670.6. Samples: 1959656320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:02:26,994][12645] Avg episode reward: [(0, '0.556')] +[2024-06-18 11:02:28,510][12883] Updated weights for policy 0, policy_version 119603 (0.0032) +[2024-06-18 11:02:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1959706624. Throughput: 0: 42657.3. Samples: 1959780220. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:02:31,997][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 11:02:32,789][12883] Updated weights for policy 0, policy_version 119613 (0.0030) +[2024-06-18 11:02:36,199][12883] Updated weights for policy 0, policy_version 119623 (0.0027) +[2024-06-18 11:02:36,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1959936000. Throughput: 0: 42714.6. Samples: 1960039020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:02:36,994][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 11:02:37,030][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000119625_1959936000.pth... +[2024-06-18 11:02:37,085][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118999_1949679616.pth +[2024-06-18 11:02:40,446][12883] Updated weights for policy 0, policy_version 119633 (0.0036) +[2024-06-18 11:02:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1960116224. Throughput: 0: 42569.3. Samples: 1960294480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:02:41,995][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 11:02:44,082][12883] Updated weights for policy 0, policy_version 119643 (0.0032) +[2024-06-18 11:02:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1960361984. Throughput: 0: 42501.4. Samples: 1960414940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:02:46,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 11:02:48,172][12883] Updated weights for policy 0, policy_version 119653 (0.0040) +[2024-06-18 11:02:51,926][12883] Updated weights for policy 0, policy_version 119663 (0.0025) +[2024-06-18 11:02:51,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1960558592. Throughput: 0: 42766.8. Samples: 1960680860. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:02:51,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 11:02:55,697][12883] Updated weights for policy 0, policy_version 119673 (0.0027) +[2024-06-18 11:02:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1960771584. Throughput: 0: 42712.0. Samples: 1960936000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:02:56,994][12645] Avg episode reward: [(0, '0.215')] +[2024-06-18 11:02:59,518][12883] Updated weights for policy 0, policy_version 119683 (0.0049) +[2024-06-18 11:03:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1961000960. Throughput: 0: 42732.0. Samples: 1961059300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:03:01,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 11:03:03,529][12883] Updated weights for policy 0, policy_version 119693 (0.0037) +[2024-06-18 11:03:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1961181184. Throughput: 0: 42568.5. Samples: 1961314000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:03:06,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 11:03:07,228][12883] Updated weights for policy 0, policy_version 119703 (0.0041) +[2024-06-18 11:03:11,105][12883] Updated weights for policy 0, policy_version 119713 (0.0033) +[2024-06-18 11:03:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1961394176. Throughput: 0: 42595.1. Samples: 1961573100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:03:11,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 11:03:14,858][12883] Updated weights for policy 0, policy_version 119723 (0.0042) +[2024-06-18 11:03:16,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1961639936. Throughput: 0: 42627.1. 
Samples: 1961698440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:03:16,994][12645] Avg episode reward: [(0, '0.447')] +[2024-06-18 11:03:18,659][12883] Updated weights for policy 0, policy_version 119733 (0.0033) +[2024-06-18 11:03:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1961820160. Throughput: 0: 42702.2. Samples: 1961960620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:03:21,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 11:03:22,339][12883] Updated weights for policy 0, policy_version 119743 (0.0025) +[2024-06-18 11:03:26,504][12883] Updated weights for policy 0, policy_version 119753 (0.0038) +[2024-06-18 11:03:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1962033152. Throughput: 0: 42546.2. Samples: 1962209060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:03:26,998][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 11:03:30,374][12883] Updated weights for policy 0, policy_version 119763 (0.0037) +[2024-06-18 11:03:31,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1962278912. Throughput: 0: 42773.3. Samples: 1962339740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:03:31,994][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 11:03:34,004][12883] Updated weights for policy 0, policy_version 119773 (0.0030) +[2024-06-18 11:03:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 1962442752. Throughput: 0: 42727.4. Samples: 1962603600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:03:36,994][12645] Avg episode reward: [(0, '0.691')] +[2024-06-18 11:03:37,813][12883] Updated weights for policy 0, policy_version 119783 (0.0034) +[2024-06-18 11:03:39,830][12862] Signal inference workers to stop experience collection... 
(28750 times) +[2024-06-18 11:03:39,830][12862] Signal inference workers to resume experience collection... (28750 times) +[2024-06-18 11:03:39,873][12883] InferenceWorker_p0-w0: stopping experience collection (28750 times) +[2024-06-18 11:03:39,873][12883] InferenceWorker_p0-w0: resuming experience collection (28750 times) +[2024-06-18 11:03:41,767][12883] Updated weights for policy 0, policy_version 119793 (0.0040) +[2024-06-18 11:03:41,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1962688512. Throughput: 0: 42563.4. Samples: 1962851360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:03:41,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 11:03:45,535][12883] Updated weights for policy 0, policy_version 119803 (0.0035) +[2024-06-18 11:03:46,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1962917888. Throughput: 0: 42783.5. Samples: 1962984560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:03:46,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 11:03:49,284][12883] Updated weights for policy 0, policy_version 119813 (0.0035) +[2024-06-18 11:03:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1963098112. Throughput: 0: 42889.8. Samples: 1963244040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:03:51,994][12645] Avg episode reward: [(0, '0.447')] +[2024-06-18 11:03:53,212][12883] Updated weights for policy 0, policy_version 119823 (0.0031) +[2024-06-18 11:03:56,854][12883] Updated weights for policy 0, policy_version 119833 (0.0026) +[2024-06-18 11:03:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1963343872. Throughput: 0: 42704.4. Samples: 1963494800. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:03:56,994][12645] Avg episode reward: [(0, '0.620')] +[2024-06-18 11:04:00,859][12883] Updated weights for policy 0, policy_version 119843 (0.0030) +[2024-06-18 11:04:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1963556864. Throughput: 0: 42863.1. Samples: 1963627280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:04:01,994][12645] Avg episode reward: [(0, '0.628')] +[2024-06-18 11:04:04,486][12883] Updated weights for policy 0, policy_version 119853 (0.0030) +[2024-06-18 11:04:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1963737088. Throughput: 0: 42753.7. Samples: 1963884540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:04:06,994][12645] Avg episode reward: [(0, '0.669')] +[2024-06-18 11:04:08,517][12883] Updated weights for policy 0, policy_version 119863 (0.0033) +[2024-06-18 11:04:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42598.7). Total num frames: 1963982848. Throughput: 0: 42844.5. Samples: 1964137060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:04:11,994][12645] Avg episode reward: [(0, '0.547')] +[2024-06-18 11:04:12,082][12883] Updated weights for policy 0, policy_version 119873 (0.0022) +[2024-06-18 11:04:16,011][12883] Updated weights for policy 0, policy_version 119883 (0.0038) +[2024-06-18 11:04:16,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1964195840. Throughput: 0: 42854.1. Samples: 1964268180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:04:16,994][12645] Avg episode reward: [(0, '0.564')] +[2024-06-18 11:04:20,046][12883] Updated weights for policy 0, policy_version 119893 (0.0034) +[2024-06-18 11:04:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1964392448. Throughput: 0: 42633.9. 
Samples: 1964522120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:04:21,994][12645] Avg episode reward: [(0, '0.319')] +[2024-06-18 11:04:23,894][12883] Updated weights for policy 0, policy_version 119903 (0.0030) +[2024-06-18 11:04:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1964621824. Throughput: 0: 42632.1. Samples: 1964769800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 11:04:26,994][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 11:04:27,811][12883] Updated weights for policy 0, policy_version 119913 (0.0053) +[2024-06-18 11:04:31,585][12883] Updated weights for policy 0, policy_version 119923 (0.0043) +[2024-06-18 11:04:31,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1964851200. Throughput: 0: 42706.3. Samples: 1964906340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 11:04:31,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 11:04:35,443][12883] Updated weights for policy 0, policy_version 119933 (0.0041) +[2024-06-18 11:04:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1965031424. Throughput: 0: 42521.7. Samples: 1965157520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 11:04:36,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 11:04:37,108][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000119937_1965047808.pth... +[2024-06-18 11:04:37,167][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000119315_1954856960.pth +[2024-06-18 11:04:39,207][12883] Updated weights for policy 0, policy_version 119943 (0.0036) +[2024-06-18 11:04:41,995][12645] Fps is (10 sec: 40952.7, 60 sec: 42870.2, 300 sec: 42709.5). Total num frames: 1965260800. Throughput: 0: 42510.7. Samples: 1965407860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 11:04:41,996][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 11:04:43,582][12883] Updated weights for policy 0, policy_version 119953 (0.0044) +[2024-06-18 11:04:46,812][12883] Updated weights for policy 0, policy_version 119963 (0.0029) +[2024-06-18 11:04:46,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1965490176. Throughput: 0: 42590.7. Samples: 1965543860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 11:04:46,994][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 11:04:51,066][12883] Updated weights for policy 0, policy_version 119973 (0.0037) +[2024-06-18 11:04:51,994][12645] Fps is (10 sec: 40967.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1965670400. Throughput: 0: 42705.4. Samples: 1965806280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 11:04:51,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 11:04:52,200][12862] Signal inference workers to stop experience collection... (28800 times) +[2024-06-18 11:04:52,200][12862] Signal inference workers to resume experience collection... (28800 times) +[2024-06-18 11:04:52,237][12883] InferenceWorker_p0-w0: stopping experience collection (28800 times) +[2024-06-18 11:04:52,237][12883] InferenceWorker_p0-w0: resuming experience collection (28800 times) +[2024-06-18 11:04:54,279][12883] Updated weights for policy 0, policy_version 119983 (0.0026) +[2024-06-18 11:04:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1965916160. Throughput: 0: 42734.2. Samples: 1966060100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 11:04:56,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 11:04:58,569][12883] Updated weights for policy 0, policy_version 119993 (0.0037) +[2024-06-18 11:05:01,762][12883] Updated weights for policy 0, policy_version 120003 (0.0034) +[2024-06-18 11:05:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1966129152. Throughput: 0: 42651.6. Samples: 1966187500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 11:05:01,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 11:05:06,068][12883] Updated weights for policy 0, policy_version 120013 (0.0029) +[2024-06-18 11:05:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 1966342144. Throughput: 0: 42780.3. Samples: 1966447240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 11:05:06,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 11:05:09,541][12883] Updated weights for policy 0, policy_version 120023 (0.0044) +[2024-06-18 11:05:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1966538752. Throughput: 0: 42896.3. Samples: 1966700140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 11:05:11,994][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 11:05:13,678][12883] Updated weights for policy 0, policy_version 120033 (0.0024) +[2024-06-18 11:05:16,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1966751744. Throughput: 0: 42707.0. Samples: 1966828160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 11:05:16,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 11:05:17,293][12883] Updated weights for policy 0, policy_version 120043 (0.0029) +[2024-06-18 11:05:21,399][12883] Updated weights for policy 0, policy_version 120053 (0.0038) +[2024-06-18 11:05:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1966981120. Throughput: 0: 42891.1. Samples: 1967087620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) +[2024-06-18 11:05:21,994][12645] Avg episode reward: [(0, '0.494')] +[2024-06-18 11:05:24,791][12883] Updated weights for policy 0, policy_version 120063 (0.0047) +[2024-06-18 11:05:26,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1967177728. Throughput: 0: 43021.3. Samples: 1967343740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 11:05:26,994][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 11:05:29,139][12883] Updated weights for policy 0, policy_version 120073 (0.0028) +[2024-06-18 11:05:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1967390720. Throughput: 0: 42801.8. Samples: 1967469940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 11:05:31,994][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 11:05:32,404][12883] Updated weights for policy 0, policy_version 120083 (0.0032) +[2024-06-18 11:05:36,655][12883] Updated weights for policy 0, policy_version 120093 (0.0026) +[2024-06-18 11:05:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1967620096. Throughput: 0: 42798.1. Samples: 1967732200. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 11:05:36,994][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 11:05:39,838][12883] Updated weights for policy 0, policy_version 120103 (0.0023) +[2024-06-18 11:05:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42599.7, 300 sec: 42709.5). Total num frames: 1967816704. Throughput: 0: 42911.1. Samples: 1967991100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 11:05:41,994][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 11:05:44,342][12883] Updated weights for policy 0, policy_version 120113 (0.0023) +[2024-06-18 11:05:46,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1968029696. Throughput: 0: 42891.1. Samples: 1968117600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 11:05:46,994][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 11:05:47,596][12883] Updated weights for policy 0, policy_version 120123 (0.0032) +[2024-06-18 11:05:51,847][12883] Updated weights for policy 0, policy_version 120133 (0.0031) +[2024-06-18 11:05:52,000][12645] Fps is (10 sec: 44209.2, 60 sec: 43140.0, 300 sec: 42653.0). Total num frames: 1968259072. Throughput: 0: 42791.4. Samples: 1968373120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 11:05:52,001][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 11:05:55,167][12883] Updated weights for policy 0, policy_version 120143 (0.0030) +[2024-06-18 11:05:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1968455680. Throughput: 0: 42836.1. Samples: 1968627760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 11:05:56,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 11:05:59,623][12883] Updated weights for policy 0, policy_version 120153 (0.0032) +[2024-06-18 11:06:01,983][12862] Signal inference workers to stop experience collection... 
(28850 times) +[2024-06-18 11:06:01,994][12645] Fps is (10 sec: 40986.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1968668672. Throughput: 0: 42847.3. Samples: 1968756280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 11:06:01,994][12645] Avg episode reward: [(0, '0.674')] +[2024-06-18 11:06:02,025][12883] InferenceWorker_p0-w0: stopping experience collection (28850 times) +[2024-06-18 11:06:02,106][12862] Signal inference workers to resume experience collection... (28850 times) +[2024-06-18 11:06:02,106][12883] InferenceWorker_p0-w0: resuming experience collection (28850 times) +[2024-06-18 11:06:03,342][12883] Updated weights for policy 0, policy_version 120163 (0.0029) +[2024-06-18 11:06:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1968898048. Throughput: 0: 42928.1. Samples: 1969019380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 11:06:06,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 11:06:07,087][12883] Updated weights for policy 0, policy_version 120173 (0.0038) +[2024-06-18 11:06:10,943][12883] Updated weights for policy 0, policy_version 120183 (0.0032) +[2024-06-18 11:06:12,000][12645] Fps is (10 sec: 44208.7, 60 sec: 42867.0, 300 sec: 42708.9). Total num frames: 1969111040. Throughput: 0: 42543.0. Samples: 1969258440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 11:06:12,001][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 11:06:15,169][12883] Updated weights for policy 0, policy_version 120193 (0.0037) +[2024-06-18 11:06:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 1969324032. Throughput: 0: 42746.3. Samples: 1969393520. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 11:06:16,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 11:06:18,502][12883] Updated weights for policy 0, policy_version 120203 (0.0028) +[2024-06-18 11:06:21,994][12645] Fps is (10 sec: 42625.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1969537024. Throughput: 0: 42792.5. Samples: 1969657860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 11:06:21,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 11:06:22,827][12883] Updated weights for policy 0, policy_version 120213 (0.0027) +[2024-06-18 11:06:26,004][12883] Updated weights for policy 0, policy_version 120223 (0.0030) +[2024-06-18 11:06:26,994][12645] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1969766400. Throughput: 0: 42488.4. Samples: 1969903080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 11:06:26,994][12645] Avg episode reward: [(0, '0.210')] +[2024-06-18 11:06:30,275][12883] Updated weights for policy 0, policy_version 120233 (0.0037) +[2024-06-18 11:06:31,996][12645] Fps is (10 sec: 44226.9, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 1969979392. Throughput: 0: 42734.6. Samples: 1970040760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 11:06:31,997][12645] Avg episode reward: [(0, '0.687')] +[2024-06-18 11:06:33,698][12883] Updated weights for policy 0, policy_version 120243 (0.0026) +[2024-06-18 11:06:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1970176000. Throughput: 0: 42921.6. Samples: 1970304320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 11:06:36,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 11:06:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000120250_1970176000.pth... 
+[2024-06-18 11:06:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000119625_1959936000.pth +[2024-06-18 11:06:37,777][12883] Updated weights for policy 0, policy_version 120253 (0.0035) +[2024-06-18 11:06:41,275][12883] Updated weights for policy 0, policy_version 120263 (0.0038) +[2024-06-18 11:06:41,994][12645] Fps is (10 sec: 44247.2, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1970421760. Throughput: 0: 42740.1. Samples: 1970551060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 11:06:41,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 11:06:45,495][12883] Updated weights for policy 0, policy_version 120273 (0.0038) +[2024-06-18 11:06:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1970618368. Throughput: 0: 42846.1. Samples: 1970684360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 11:06:46,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 11:06:49,080][12883] Updated weights for policy 0, policy_version 120283 (0.0046) +[2024-06-18 11:06:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42875.9, 300 sec: 42765.0). Total num frames: 1970831360. Throughput: 0: 42763.1. Samples: 1970943720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 11:06:51,994][12645] Avg episode reward: [(0, '0.627')] +[2024-06-18 11:06:53,297][12883] Updated weights for policy 0, policy_version 120293 (0.0035) +[2024-06-18 11:06:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1971027968. Throughput: 0: 43019.3. Samples: 1971194040. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 11:06:56,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 11:06:57,073][12883] Updated weights for policy 0, policy_version 120303 (0.0036) +[2024-06-18 11:07:00,821][12883] Updated weights for policy 0, policy_version 120313 (0.0041) +[2024-06-18 11:07:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1971273728. Throughput: 0: 42865.8. Samples: 1971322480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 11:07:01,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 11:07:04,617][12883] Updated weights for policy 0, policy_version 120323 (0.0028) +[2024-06-18 11:07:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1971453952. Throughput: 0: 42801.8. Samples: 1971583940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 11:07:06,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 11:07:08,366][12883] Updated weights for policy 0, policy_version 120333 (0.0037) +[2024-06-18 11:07:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 1971666944. Throughput: 0: 43021.5. Samples: 1971839040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 11:07:11,994][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 11:07:12,228][12883] Updated weights for policy 0, policy_version 120343 (0.0032) +[2024-06-18 11:07:15,953][12883] Updated weights for policy 0, policy_version 120353 (0.0040) +[2024-06-18 11:07:16,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1971912704. Throughput: 0: 42836.3. Samples: 1971968300. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 11:07:16,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 11:07:19,940][12883] Updated weights for policy 0, policy_version 120363 (0.0045) +[2024-06-18 11:07:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1972092928. Throughput: 0: 42613.7. Samples: 1972221940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 11:07:21,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 11:07:23,334][12862] Signal inference workers to stop experience collection... (28900 times) +[2024-06-18 11:07:23,334][12862] Signal inference workers to resume experience collection... (28900 times) +[2024-06-18 11:07:23,351][12883] InferenceWorker_p0-w0: stopping experience collection (28900 times) +[2024-06-18 11:07:23,351][12883] InferenceWorker_p0-w0: resuming experience collection (28900 times) +[2024-06-18 11:07:23,477][12883] Updated weights for policy 0, policy_version 120373 (0.0025) +[2024-06-18 11:07:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1972322304. Throughput: 0: 42842.1. Samples: 1972478960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 11:07:26,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 11:07:27,521][12883] Updated weights for policy 0, policy_version 120383 (0.0035) +[2024-06-18 11:07:30,930][12883] Updated weights for policy 0, policy_version 120393 (0.0031) +[2024-06-18 11:07:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1972535296. Throughput: 0: 42985.0. Samples: 1972618680. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 11:07:31,994][12645] Avg episode reward: [(0, '0.581')] +[2024-06-18 11:07:35,518][12883] Updated weights for policy 0, policy_version 120403 (0.0033) +[2024-06-18 11:07:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1972731904. 
Throughput: 0: 42870.3. Samples: 1972872880. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 11:07:36,994][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 11:07:38,466][12883] Updated weights for policy 0, policy_version 120413 (0.0036) +[2024-06-18 11:07:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1972977664. Throughput: 0: 42937.8. Samples: 1973126240. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 11:07:41,994][12645] Avg episode reward: [(0, '0.520')] +[2024-06-18 11:07:43,212][12883] Updated weights for policy 0, policy_version 120423 (0.0040) +[2024-06-18 11:07:46,298][12883] Updated weights for policy 0, policy_version 120433 (0.0029) +[2024-06-18 11:07:46,994][12645] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1973207040. Throughput: 0: 43059.1. Samples: 1973260140. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 11:07:46,994][12645] Avg episode reward: [(0, '0.569')] +[2024-06-18 11:07:50,774][12883] Updated weights for policy 0, policy_version 120443 (0.0045) +[2024-06-18 11:07:51,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1973370880. Throughput: 0: 42939.1. Samples: 1973516200. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 11:07:51,994][12645] Avg episode reward: [(0, '0.637')] +[2024-06-18 11:07:53,783][12883] Updated weights for policy 0, policy_version 120453 (0.0031) +[2024-06-18 11:07:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1973633024. Throughput: 0: 42799.9. Samples: 1973765040. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 11:07:56,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 11:07:58,475][12883] Updated weights for policy 0, policy_version 120463 (0.0042) +[2024-06-18 11:08:01,492][12883] Updated weights for policy 0, policy_version 120473 (0.0035) +[2024-06-18 11:08:01,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1973829632. Throughput: 0: 43045.0. Samples: 1973905320. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 11:08:01,994][12645] Avg episode reward: [(0, '0.163')] +[2024-06-18 11:08:06,214][12883] Updated weights for policy 0, policy_version 120483 (0.0037) +[2024-06-18 11:08:06,996][12645] Fps is (10 sec: 36036.7, 60 sec: 42323.8, 300 sec: 42709.2). Total num frames: 1973993472. Throughput: 0: 43063.6. Samples: 1974159900. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 11:08:06,997][12645] Avg episode reward: [(0, '0.157')] +[2024-06-18 11:08:09,213][12883] Updated weights for policy 0, policy_version 120493 (0.0031) +[2024-06-18 11:08:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1974272000. Throughput: 0: 42886.8. Samples: 1974408860. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 11:08:11,994][12645] Avg episode reward: [(0, '0.244')] +[2024-06-18 11:08:13,936][12883] Updated weights for policy 0, policy_version 120503 (0.0032) +[2024-06-18 11:08:16,994][12645] Fps is (10 sec: 47524.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1974468608. Throughput: 0: 42893.3. Samples: 1974548880. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 11:08:16,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 11:08:17,041][12883] Updated weights for policy 0, policy_version 120513 (0.0051) +[2024-06-18 11:08:21,763][12883] Updated weights for policy 0, policy_version 120523 (0.0030) +[2024-06-18 11:08:21,994][12645] Fps is (10 sec: 37682.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1974648832. Throughput: 0: 42682.6. Samples: 1974793600. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 11:08:21,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 11:08:24,758][12883] Updated weights for policy 0, policy_version 120533 (0.0029) +[2024-06-18 11:08:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 1974910976. Throughput: 0: 42560.4. Samples: 1975041460. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) +[2024-06-18 11:08:26,994][12645] Avg episode reward: [(0, '0.639')] +[2024-06-18 11:08:29,420][12883] Updated weights for policy 0, policy_version 120543 (0.0040) +[2024-06-18 11:08:31,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1975107584. Throughput: 0: 42540.1. Samples: 1975174440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 11:08:31,994][12645] Avg episode reward: [(0, '0.651')] +[2024-06-18 11:08:32,367][12883] Updated weights for policy 0, policy_version 120553 (0.0033) +[2024-06-18 11:08:36,946][12883] Updated weights for policy 0, policy_version 120563 (0.0028) +[2024-06-18 11:08:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1975304192. Throughput: 0: 42350.3. Samples: 1975421960. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 11:08:36,994][12645] Avg episode reward: [(0, '0.773')] +[2024-06-18 11:08:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000120563_1975304192.pth... 
+[2024-06-18 11:08:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000119937_1965047808.pth +[2024-06-18 11:08:39,770][12862] Signal inference workers to stop experience collection... (28950 times) +[2024-06-18 11:08:39,806][12883] InferenceWorker_p0-w0: stopping experience collection (28950 times) +[2024-06-18 11:08:39,833][12862] Signal inference workers to resume experience collection... (28950 times) +[2024-06-18 11:08:39,834][12883] InferenceWorker_p0-w0: resuming experience collection (28950 times) +[2024-06-18 11:08:39,974][12883] Updated weights for policy 0, policy_version 120573 (0.0038) +[2024-06-18 11:08:41,995][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1975549952. Throughput: 0: 42499.1. Samples: 1975677500. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 11:08:41,996][12645] Avg episode reward: [(0, '0.673')] +[2024-06-18 11:08:45,123][12883] Updated weights for policy 0, policy_version 120583 (0.0036) +[2024-06-18 11:08:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 1975713792. Throughput: 0: 42371.1. Samples: 1975812020. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 11:08:46,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 11:08:47,744][12883] Updated weights for policy 0, policy_version 120593 (0.0026) +[2024-06-18 11:08:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1975926784. Throughput: 0: 42109.6. Samples: 1976054740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 11:08:51,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 11:08:52,823][12883] Updated weights for policy 0, policy_version 120603 (0.0029) +[2024-06-18 11:08:55,698][12883] Updated weights for policy 0, policy_version 120613 (0.0046) +[2024-06-18 11:08:56,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42765.0). 
Total num frames: 1976172544. Throughput: 0: 42222.6. Samples: 1976308880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 11:08:56,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 11:09:00,455][12883] Updated weights for policy 0, policy_version 120623 (0.0033) +[2024-06-18 11:09:01,994][12645] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 1976336384. Throughput: 0: 42232.5. Samples: 1976449340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 11:09:01,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 11:09:03,372][12883] Updated weights for policy 0, policy_version 120633 (0.0031) +[2024-06-18 11:09:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43146.1, 300 sec: 42709.5). Total num frames: 1976582144. Throughput: 0: 42315.1. Samples: 1976697780. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 11:09:06,994][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 11:09:08,024][12883] Updated weights for policy 0, policy_version 120643 (0.0032) +[2024-06-18 11:09:10,924][12883] Updated weights for policy 0, policy_version 120653 (0.0035) +[2024-06-18 11:09:11,994][12645] Fps is (10 sec: 49151.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1976827904. Throughput: 0: 42286.7. Samples: 1976944360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 11:09:11,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 11:09:15,875][12883] Updated weights for policy 0, policy_version 120663 (0.0031) +[2024-06-18 11:09:16,994][12645] Fps is (10 sec: 39318.8, 60 sec: 41778.6, 300 sec: 42653.8). Total num frames: 1976975360. Throughput: 0: 42322.3. Samples: 1977078980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 11:09:16,995][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 11:09:18,639][12883] Updated weights for policy 0, policy_version 120673 (0.0028) +[2024-06-18 11:09:21,994][12645] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42765.0). 
Total num frames: 1977237504. Throughput: 0: 42414.2. Samples: 1977330600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 11:09:21,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 11:09:23,466][12883] Updated weights for policy 0, policy_version 120683 (0.0033) +[2024-06-18 11:09:26,557][12883] Updated weights for policy 0, policy_version 120693 (0.0028) +[2024-06-18 11:09:26,994][12645] Fps is (10 sec: 47517.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1977450496. Throughput: 0: 42490.7. Samples: 1977589580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 11:09:26,994][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 11:09:31,676][12883] Updated weights for policy 0, policy_version 120703 (0.0047) +[2024-06-18 11:09:31,996][12645] Fps is (10 sec: 37675.2, 60 sec: 41777.6, 300 sec: 42653.6). Total num frames: 1977614336. Throughput: 0: 42332.6. Samples: 1977717080. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 11:09:31,996][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 11:09:34,249][12883] Updated weights for policy 0, policy_version 120713 (0.0038) +[2024-06-18 11:09:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 1977876480. Throughput: 0: 42437.4. Samples: 1977964420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 11:09:36,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 11:09:39,348][12883] Updated weights for policy 0, policy_version 120723 (0.0036) +[2024-06-18 11:09:40,668][12862] Signal inference workers to stop experience collection... (29000 times) +[2024-06-18 11:09:40,668][12862] Signal inference workers to resume experience collection... 
(29000 times) +[2024-06-18 11:09:40,695][12883] InferenceWorker_p0-w0: stopping experience collection (29000 times) +[2024-06-18 11:09:40,695][12883] InferenceWorker_p0-w0: resuming experience collection (29000 times) +[2024-06-18 11:09:41,936][12883] Updated weights for policy 0, policy_version 120733 (0.0029) +[2024-06-18 11:09:41,994][12645] Fps is (10 sec: 47524.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1978089472. Throughput: 0: 42633.4. Samples: 1978227380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 11:09:41,994][12645] Avg episode reward: [(0, '0.713')] +[2024-06-18 11:09:46,994][12645] Fps is (10 sec: 36045.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1978236928. Throughput: 0: 42233.3. Samples: 1978349840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 11:09:46,994][12645] Avg episode reward: [(0, '0.654')] +[2024-06-18 11:09:47,018][12883] Updated weights for policy 0, policy_version 120743 (0.0027) +[2024-06-18 11:09:49,775][12883] Updated weights for policy 0, policy_version 120753 (0.0027) +[2024-06-18 11:09:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1978499072. Throughput: 0: 42161.0. Samples: 1978595020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 11:09:51,994][12645] Avg episode reward: [(0, '0.697')] +[2024-06-18 11:09:55,079][12883] Updated weights for policy 0, policy_version 120763 (0.0038) +[2024-06-18 11:09:56,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1978695680. Throughput: 0: 42656.5. Samples: 1978863900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 11:09:56,994][12645] Avg episode reward: [(0, '0.699')] +[2024-06-18 11:09:57,431][12883] Updated weights for policy 0, policy_version 120773 (0.0029) +[2024-06-18 11:10:01,994][12645] Fps is (10 sec: 37682.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1978875904. Throughput: 0: 42270.9. 
Samples: 1978981140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 11:10:01,994][12645] Avg episode reward: [(0, '0.551')] +[2024-06-18 11:10:02,719][12883] Updated weights for policy 0, policy_version 120783 (0.0035) +[2024-06-18 11:10:04,994][12883] Updated weights for policy 0, policy_version 120793 (0.0035) +[2024-06-18 11:10:06,994][12645] Fps is (10 sec: 45874.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1979154432. Throughput: 0: 42347.5. Samples: 1979236240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 11:10:06,994][12645] Avg episode reward: [(0, '0.397')] +[2024-06-18 11:10:10,411][12883] Updated weights for policy 0, policy_version 120803 (0.0034) +[2024-06-18 11:10:11,994][12645] Fps is (10 sec: 45875.1, 60 sec: 41779.1, 300 sec: 42654.0). Total num frames: 1979334656. Throughput: 0: 42529.3. Samples: 1979503400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 11:10:11,994][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 11:10:12,574][12883] Updated weights for policy 0, policy_version 120813 (0.0030) +[2024-06-18 11:10:16,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42598.9, 300 sec: 42542.9). Total num frames: 1979531264. Throughput: 0: 42481.6. Samples: 1979628660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 11:10:16,994][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 11:10:17,930][12883] Updated weights for policy 0, policy_version 120823 (0.0034) +[2024-06-18 11:10:20,128][12883] Updated weights for policy 0, policy_version 120833 (0.0026) +[2024-06-18 11:10:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1979777024. Throughput: 0: 42595.1. Samples: 1979881200. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 11:10:21,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 11:10:25,414][12883] Updated weights for policy 0, policy_version 120843 (0.0043) +[2024-06-18 11:10:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1979973632. Throughput: 0: 42733.2. Samples: 1980150380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 11:10:26,994][12645] Avg episode reward: [(0, '0.463')] +[2024-06-18 11:10:27,748][12883] Updated weights for policy 0, policy_version 120853 (0.0033) +[2024-06-18 11:10:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 1980170240. Throughput: 0: 42631.1. Samples: 1980268240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) +[2024-06-18 11:10:31,994][12645] Avg episode reward: [(0, '0.646')] +[2024-06-18 11:10:32,943][12883] Updated weights for policy 0, policy_version 120863 (0.0027) +[2024-06-18 11:10:35,349][12883] Updated weights for policy 0, policy_version 120873 (0.0030) +[2024-06-18 11:10:36,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1980432384. Throughput: 0: 42960.0. Samples: 1980528220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 11:10:36,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 11:10:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000120876_1980432384.pth... +[2024-06-18 11:10:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000120250_1970176000.pth +[2024-06-18 11:10:40,559][12883] Updated weights for policy 0, policy_version 120883 (0.0046) +[2024-06-18 11:10:41,667][12862] Signal inference workers to stop experience collection... (29050 times) +[2024-06-18 11:10:41,667][12862] Signal inference workers to resume experience collection... 
(29050 times) +[2024-06-18 11:10:41,701][12883] InferenceWorker_p0-w0: stopping experience collection (29050 times) +[2024-06-18 11:10:41,701][12883] InferenceWorker_p0-w0: resuming experience collection (29050 times) +[2024-06-18 11:10:41,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42325.2, 300 sec: 42709.4). Total num frames: 1980628992. Throughput: 0: 42862.0. Samples: 1980792700. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 11:10:41,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 11:10:43,051][12883] Updated weights for policy 0, policy_version 120893 (0.0026) +[2024-06-18 11:10:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42599.3). Total num frames: 1980825600. Throughput: 0: 43005.8. Samples: 1980916400. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 11:10:46,994][12645] Avg episode reward: [(0, '0.259')] +[2024-06-18 11:10:48,123][12883] Updated weights for policy 0, policy_version 120903 (0.0033) +[2024-06-18 11:10:50,689][12883] Updated weights for policy 0, policy_version 120913 (0.0035) +[2024-06-18 11:10:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1981071360. Throughput: 0: 42919.5. Samples: 1981167620. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 11:10:51,994][12645] Avg episode reward: [(0, '0.302')] +[2024-06-18 11:10:55,713][12883] Updated weights for policy 0, policy_version 120923 (0.0031) +[2024-06-18 11:10:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1981267968. Throughput: 0: 42929.0. Samples: 1981435200. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 11:10:56,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 11:10:58,091][12883] Updated weights for policy 0, policy_version 120933 (0.0034) +[2024-06-18 11:11:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 1981480960. Throughput: 0: 42825.3. 
Samples: 1981555800. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 11:11:01,994][12645] Avg episode reward: [(0, '0.380')] +[2024-06-18 11:11:03,314][12883] Updated weights for policy 0, policy_version 120943 (0.0038) +[2024-06-18 11:11:06,314][12883] Updated weights for policy 0, policy_version 120953 (0.0039) +[2024-06-18 11:11:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42765.9). Total num frames: 1981726720. Throughput: 0: 42846.3. Samples: 1981809280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 11:11:06,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 11:11:11,000][12883] Updated weights for policy 0, policy_version 120963 (0.0044) +[2024-06-18 11:11:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1981906944. Throughput: 0: 42742.6. Samples: 1982073800. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 11:11:11,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 11:11:13,980][12883] Updated weights for policy 0, policy_version 120973 (0.0052) +[2024-06-18 11:11:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1982119936. Throughput: 0: 42768.4. Samples: 1982192820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 11:11:16,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 11:11:18,669][12883] Updated weights for policy 0, policy_version 120983 (0.0041) +[2024-06-18 11:11:21,717][12883] Updated weights for policy 0, policy_version 120993 (0.0027) +[2024-06-18 11:11:21,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1982365696. Throughput: 0: 42681.3. Samples: 1982448880. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 11:11:21,994][12645] Avg episode reward: [(0, '0.212')] +[2024-06-18 11:11:26,161][12883] Updated weights for policy 0, policy_version 121003 (0.0035) +[2024-06-18 11:11:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 1982562304. Throughput: 0: 42707.7. Samples: 1982714540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 11:11:26,994][12645] Avg episode reward: [(0, '0.380')] +[2024-06-18 11:11:29,489][12883] Updated weights for policy 0, policy_version 121013 (0.0033) +[2024-06-18 11:11:31,994][12645] Fps is (10 sec: 39320.9, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1982758912. Throughput: 0: 42611.9. Samples: 1982833940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) +[2024-06-18 11:11:31,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 11:11:33,661][12883] Updated weights for policy 0, policy_version 121023 (0.0027) +[2024-06-18 11:11:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1982988288. Throughput: 0: 42882.8. Samples: 1983097340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 11:11:36,994][12645] Avg episode reward: [(0, '0.574')] +[2024-06-18 11:11:37,011][12883] Updated weights for policy 0, policy_version 121033 (0.0033) +[2024-06-18 11:11:41,531][12883] Updated weights for policy 0, policy_version 121043 (0.0038) +[2024-06-18 11:11:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1983168512. Throughput: 0: 42560.8. Samples: 1983350440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 11:11:41,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 11:11:44,933][12883] Updated weights for policy 0, policy_version 121053 (0.0039) +[2024-06-18 11:11:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1983397888. Throughput: 0: 42607.6. 
Samples: 1983473140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 11:11:46,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 11:11:49,138][12883] Updated weights for policy 0, policy_version 121063 (0.0039) +[2024-06-18 11:11:51,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1983627264. Throughput: 0: 42734.6. Samples: 1983732340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 11:11:51,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 11:11:52,591][12883] Updated weights for policy 0, policy_version 121073 (0.0030) +[2024-06-18 11:11:56,712][12883] Updated weights for policy 0, policy_version 121083 (0.0037) +[2024-06-18 11:11:56,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 1983823872. Throughput: 0: 42508.7. Samples: 1983986700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 11:11:56,994][12645] Avg episode reward: [(0, '0.558')] +[2024-06-18 11:12:00,497][12883] Updated weights for policy 0, policy_version 121093 (0.0046) +[2024-06-18 11:12:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1984036864. Throughput: 0: 42657.3. Samples: 1984112400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 11:12:01,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 11:12:03,190][12862] Signal inference workers to stop experience collection... (29100 times) +[2024-06-18 11:12:03,222][12883] InferenceWorker_p0-w0: stopping experience collection (29100 times) +[2024-06-18 11:12:03,247][12862] Signal inference workers to resume experience collection... (29100 times) +[2024-06-18 11:12:03,248][12883] InferenceWorker_p0-w0: resuming experience collection (29100 times) +[2024-06-18 11:12:04,638][12883] Updated weights for policy 0, policy_version 121103 (0.0033) +[2024-06-18 11:12:06,994][12645] Fps is (10 sec: 44238.1, 60 sec: 42325.3, 300 sec: 42709.5). 
Total num frames: 1984266240. Throughput: 0: 42631.5. Samples: 1984367300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 11:12:06,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 11:12:08,307][12883] Updated weights for policy 0, policy_version 121113 (0.0027) +[2024-06-18 11:12:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1984446464. Throughput: 0: 42385.3. Samples: 1984621880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 11:12:11,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 11:12:12,476][12883] Updated weights for policy 0, policy_version 121123 (0.0034) +[2024-06-18 11:12:15,826][12883] Updated weights for policy 0, policy_version 121133 (0.0034) +[2024-06-18 11:12:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1984675840. Throughput: 0: 42457.0. Samples: 1984744500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 11:12:16,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 11:12:20,032][12883] Updated weights for policy 0, policy_version 121143 (0.0053) +[2024-06-18 11:12:21,995][12645] Fps is (10 sec: 44229.7, 60 sec: 42051.0, 300 sec: 42598.2). Total num frames: 1984888832. Throughput: 0: 42420.2. Samples: 1985006320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 11:12:21,996][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 11:12:23,391][12883] Updated weights for policy 0, policy_version 121153 (0.0037) +[2024-06-18 11:12:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1985101824. Throughput: 0: 42413.8. Samples: 1985259060. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 11:12:26,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 11:12:27,457][12883] Updated weights for policy 0, policy_version 121163 (0.0032) +[2024-06-18 11:12:31,089][12883] Updated weights for policy 0, policy_version 121173 (0.0037) +[2024-06-18 11:12:31,994][12645] Fps is (10 sec: 42606.2, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 1985314816. Throughput: 0: 42507.7. Samples: 1985385980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 11:12:31,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 11:12:34,823][12883] Updated weights for policy 0, policy_version 121183 (0.0029) +[2024-06-18 11:12:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1985511424. Throughput: 0: 42433.8. Samples: 1985641860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) +[2024-06-18 11:12:36,994][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 11:12:37,040][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000121187_1985527808.pth... +[2024-06-18 11:12:37,095][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000120563_1975304192.pth +[2024-06-18 11:12:38,761][12883] Updated weights for policy 0, policy_version 121193 (0.0042) +[2024-06-18 11:12:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 1985740800. Throughput: 0: 42459.9. Samples: 1985897380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 11:12:41,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 11:12:42,898][12883] Updated weights for policy 0, policy_version 121203 (0.0033) +[2024-06-18 11:12:46,449][12883] Updated weights for policy 0, policy_version 121213 (0.0025) +[2024-06-18 11:12:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1985953792. Throughput: 0: 42613.8. Samples: 1986030020. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 11:12:46,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 11:12:50,535][12883] Updated weights for policy 0, policy_version 121223 (0.0023) +[2024-06-18 11:12:52,000][12645] Fps is (10 sec: 42571.4, 60 sec: 42320.9, 300 sec: 42486.4). Total num frames: 1986166784. Throughput: 0: 42607.4. Samples: 1986284900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 11:12:52,001][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 11:12:54,438][12883] Updated weights for policy 0, policy_version 121233 (0.0024) +[2024-06-18 11:12:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 1986412544. Throughput: 0: 42673.8. Samples: 1986542200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 11:12:56,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 11:12:58,223][12883] Updated weights for policy 0, policy_version 121243 (0.0034) +[2024-06-18 11:13:01,994][12645] Fps is (10 sec: 42625.6, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 1986592768. Throughput: 0: 42752.1. Samples: 1986668340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 11:13:01,994][12645] Avg episode reward: [(0, '0.250')] +[2024-06-18 11:13:02,033][12883] Updated weights for policy 0, policy_version 121253 (0.0039) +[2024-06-18 11:13:05,793][12883] Updated weights for policy 0, policy_version 121263 (0.0036) +[2024-06-18 11:13:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1986805760. Throughput: 0: 42635.9. Samples: 1986924860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 11:13:06,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 11:13:09,703][12883] Updated weights for policy 0, policy_version 121273 (0.0029) +[2024-06-18 11:13:11,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1987018752. Throughput: 0: 42876.8. 
Samples: 1987188520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 11:13:11,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 11:13:13,291][12883] Updated weights for policy 0, policy_version 121283 (0.0037) +[2024-06-18 11:13:16,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 1987231744. Throughput: 0: 42800.9. Samples: 1987312120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 11:13:16,996][12645] Avg episode reward: [(0, '0.527')] +[2024-06-18 11:13:17,338][12883] Updated weights for policy 0, policy_version 121293 (0.0040) +[2024-06-18 11:13:20,849][12883] Updated weights for policy 0, policy_version 121303 (0.0032) +[2024-06-18 11:13:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42872.7, 300 sec: 42542.9). Total num frames: 1987461120. Throughput: 0: 42752.4. Samples: 1987565720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 11:13:21,994][12645] Avg episode reward: [(0, '0.615')] +[2024-06-18 11:13:25,036][12883] Updated weights for policy 0, policy_version 121313 (0.0033) +[2024-06-18 11:13:26,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1987674112. Throughput: 0: 42883.5. Samples: 1987827140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 11:13:26,994][12645] Avg episode reward: [(0, '0.691')] +[2024-06-18 11:13:28,439][12883] Updated weights for policy 0, policy_version 121323 (0.0032) +[2024-06-18 11:13:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1987870720. Throughput: 0: 42605.3. Samples: 1987947260. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 11:13:31,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 11:13:32,867][12883] Updated weights for policy 0, policy_version 121333 (0.0033) +[2024-06-18 11:13:35,893][12883] Updated weights for policy 0, policy_version 121343 (0.0036) +[2024-06-18 11:13:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 1988116480. Throughput: 0: 42623.6. Samples: 1988202700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 11:13:36,994][12645] Avg episode reward: [(0, '0.719')] +[2024-06-18 11:13:40,515][12883] Updated weights for policy 0, policy_version 121353 (0.0032) +[2024-06-18 11:13:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1988296704. Throughput: 0: 42773.5. Samples: 1988467000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 11:13:41,994][12645] Avg episode reward: [(0, '0.719')] +[2024-06-18 11:13:42,088][12862] Signal inference workers to stop experience collection... (29150 times) +[2024-06-18 11:13:42,088][12862] Signal inference workers to resume experience collection... (29150 times) +[2024-06-18 11:13:42,118][12883] InferenceWorker_p0-w0: stopping experience collection (29150 times) +[2024-06-18 11:13:42,118][12883] InferenceWorker_p0-w0: resuming experience collection (29150 times) +[2024-06-18 11:13:43,623][12883] Updated weights for policy 0, policy_version 121363 (0.0028) +[2024-06-18 11:13:46,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1988493312. Throughput: 0: 42664.8. Samples: 1988588260. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 11:13:46,994][12645] Avg episode reward: [(0, '0.537')] +[2024-06-18 11:13:48,394][12883] Updated weights for policy 0, policy_version 121373 (0.0037) +[2024-06-18 11:13:51,086][12883] Updated weights for policy 0, policy_version 121383 (0.0032) +[2024-06-18 11:13:51,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43149.0, 300 sec: 42653.9). Total num frames: 1988755456. Throughput: 0: 42735.5. Samples: 1988847960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 11:13:51,996][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 11:13:56,068][12883] Updated weights for policy 0, policy_version 121393 (0.0034) +[2024-06-18 11:13:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1988952064. Throughput: 0: 42608.9. Samples: 1989105920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 11:13:56,994][12645] Avg episode reward: [(0, '0.642')] +[2024-06-18 11:13:58,685][12883] Updated weights for policy 0, policy_version 121403 (0.0043) +[2024-06-18 11:14:01,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 1989148672. Throughput: 0: 42626.4. Samples: 1989230220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 11:14:01,994][12645] Avg episode reward: [(0, '0.756')] +[2024-06-18 11:14:03,830][12883] Updated weights for policy 0, policy_version 121413 (0.0039) +[2024-06-18 11:14:06,928][12883] Updated weights for policy 0, policy_version 121423 (0.0034) +[2024-06-18 11:14:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1989394432. Throughput: 0: 42631.5. Samples: 1989484140. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 11:14:06,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 11:14:11,617][12883] Updated weights for policy 0, policy_version 121433 (0.0040) +[2024-06-18 11:14:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.1). Total num frames: 1989591040. Throughput: 0: 42589.3. Samples: 1989743660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 11:14:11,994][12645] Avg episode reward: [(0, '0.682')] +[2024-06-18 11:14:14,577][12883] Updated weights for policy 0, policy_version 121443 (0.0033) +[2024-06-18 11:14:16,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 1989771264. Throughput: 0: 42581.0. Samples: 1989863400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 11:14:16,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 11:14:19,127][12883] Updated weights for policy 0, policy_version 121453 (0.0033) +[2024-06-18 11:14:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1990033408. Throughput: 0: 42713.0. Samples: 1990124780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 11:14:21,994][12645] Avg episode reward: [(0, '0.336')] +[2024-06-18 11:14:22,250][12883] Updated weights for policy 0, policy_version 121463 (0.0044) +[2024-06-18 11:14:26,816][12883] Updated weights for policy 0, policy_version 121473 (0.0032) +[2024-06-18 11:14:26,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 1990230016. Throughput: 0: 42620.4. Samples: 1990384920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 11:14:26,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 11:14:29,946][12883] Updated weights for policy 0, policy_version 121483 (0.0031) +[2024-06-18 11:14:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1990426624. Throughput: 0: 42716.1. Samples: 1990510480. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 11:14:31,994][12645] Avg episode reward: [(0, '0.337')] +[2024-06-18 11:14:34,334][12883] Updated weights for policy 0, policy_version 121493 (0.0027) +[2024-06-18 11:14:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1990672384. Throughput: 0: 42736.1. Samples: 1990771080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 11:14:36,994][12645] Avg episode reward: [(0, '0.337')] +[2024-06-18 11:14:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000121501_1990672384.pth... +[2024-06-18 11:14:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000120876_1980432384.pth +[2024-06-18 11:14:37,565][12883] Updated weights for policy 0, policy_version 121503 (0.0037) +[2024-06-18 11:14:41,931][12883] Updated weights for policy 0, policy_version 121513 (0.0024) +[2024-06-18 11:14:41,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 1990868992. Throughput: 0: 42887.0. Samples: 1991035840. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 11:14:41,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 11:14:45,127][12883] Updated weights for policy 0, policy_version 121523 (0.0035) +[2024-06-18 11:14:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1991065600. Throughput: 0: 42717.1. Samples: 1991152480. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 11:14:46,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 11:14:49,623][12883] Updated weights for policy 0, policy_version 121533 (0.0048) +[2024-06-18 11:14:51,206][12862] Signal inference workers to stop experience collection... (29200 times) +[2024-06-18 11:14:51,206][12862] Signal inference workers to resume experience collection... 
(29200 times) +[2024-06-18 11:14:51,251][12883] InferenceWorker_p0-w0: stopping experience collection (29200 times) +[2024-06-18 11:14:51,251][12883] InferenceWorker_p0-w0: resuming experience collection (29200 times) +[2024-06-18 11:14:51,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1991311360. Throughput: 0: 42743.1. Samples: 1991407580. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 11:14:51,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 11:14:52,849][12883] Updated weights for policy 0, policy_version 121543 (0.0030) +[2024-06-18 11:14:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1991491584. Throughput: 0: 42740.9. Samples: 1991667000. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 11:14:56,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 11:14:57,307][12883] Updated weights for policy 0, policy_version 121553 (0.0044) +[2024-06-18 11:15:00,443][12883] Updated weights for policy 0, policy_version 121563 (0.0048) +[2024-06-18 11:15:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1991720960. Throughput: 0: 42769.8. Samples: 1991788040. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 11:15:01,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 11:15:05,059][12883] Updated weights for policy 0, policy_version 121573 (0.0037) +[2024-06-18 11:15:06,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1991966720. Throughput: 0: 42789.3. Samples: 1992050300. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 11:15:06,994][12645] Avg episode reward: [(0, '0.587')] +[2024-06-18 11:15:08,254][12883] Updated weights for policy 0, policy_version 121583 (0.0022) +[2024-06-18 11:15:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1992130560. Throughput: 0: 42695.5. 
Samples: 1992306220. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 11:15:11,994][12645] Avg episode reward: [(0, '0.568')] +[2024-06-18 11:15:13,305][12883] Updated weights for policy 0, policy_version 121593 (0.0033) +[2024-06-18 11:15:15,898][12883] Updated weights for policy 0, policy_version 121603 (0.0039) +[2024-06-18 11:15:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1992359936. Throughput: 0: 42514.6. Samples: 1992423640. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 11:15:16,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 11:15:21,062][12883] Updated weights for policy 0, policy_version 121613 (0.0028) +[2024-06-18 11:15:21,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 1992589312. Throughput: 0: 42698.3. Samples: 1992692600. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 11:15:21,997][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 11:15:23,955][12883] Updated weights for policy 0, policy_version 121623 (0.0029) +[2024-06-18 11:15:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1992785920. Throughput: 0: 42437.0. Samples: 1992945500. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 11:15:26,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 11:15:28,580][12883] Updated weights for policy 0, policy_version 121633 (0.0039) +[2024-06-18 11:15:31,525][12883] Updated weights for policy 0, policy_version 121643 (0.0039) +[2024-06-18 11:15:31,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1992998912. Throughput: 0: 42588.4. Samples: 1993068960. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 11:15:31,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 11:15:36,114][12883] Updated weights for policy 0, policy_version 121653 (0.0031) +[2024-06-18 11:15:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1993211904. Throughput: 0: 42690.7. Samples: 1993328660. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 11:15:36,994][12645] Avg episode reward: [(0, '0.623')] +[2024-06-18 11:15:39,341][12883] Updated weights for policy 0, policy_version 121663 (0.0033) +[2024-06-18 11:15:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1993424896. Throughput: 0: 42535.5. Samples: 1993581100. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) +[2024-06-18 11:15:41,994][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 11:15:43,629][12883] Updated weights for policy 0, policy_version 121673 (0.0036) +[2024-06-18 11:15:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1993637888. Throughput: 0: 42806.5. Samples: 1993714340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 11:15:46,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 11:15:47,371][12883] Updated weights for policy 0, policy_version 121683 (0.0033) +[2024-06-18 11:15:51,090][12883] Updated weights for policy 0, policy_version 121693 (0.0029) +[2024-06-18 11:15:51,997][12645] Fps is (10 sec: 42582.8, 60 sec: 42322.7, 300 sec: 42653.4). Total num frames: 1993850880. Throughput: 0: 42773.4. Samples: 1993975260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 11:15:51,998][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 11:15:54,902][12883] Updated weights for policy 0, policy_version 121703 (0.0038) +[2024-06-18 11:15:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1994080256. Throughput: 0: 42627.9. Samples: 1994224480. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 11:15:56,994][12645] Avg episode reward: [(0, '0.609')] +[2024-06-18 11:15:58,681][12883] Updated weights for policy 0, policy_version 121713 (0.0033) +[2024-06-18 11:15:59,299][12862] Signal inference workers to stop experience collection... (29250 times) +[2024-06-18 11:15:59,307][12862] Signal inference workers to resume experience collection... (29250 times) +[2024-06-18 11:15:59,353][12883] InferenceWorker_p0-w0: stopping experience collection (29250 times) +[2024-06-18 11:15:59,353][12883] InferenceWorker_p0-w0: resuming experience collection (29250 times) +[2024-06-18 11:16:01,994][12645] Fps is (10 sec: 42614.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1994276864. Throughput: 0: 42953.4. Samples: 1994356540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 11:16:01,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 11:16:02,780][12883] Updated weights for policy 0, policy_version 121723 (0.0027) +[2024-06-18 11:16:06,467][12883] Updated weights for policy 0, policy_version 121733 (0.0030) +[2024-06-18 11:16:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 1994473472. Throughput: 0: 42738.1. Samples: 1994615720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 11:16:06,994][12645] Avg episode reward: [(0, '0.619')] +[2024-06-18 11:16:10,395][12883] Updated weights for policy 0, policy_version 121743 (0.0031) +[2024-06-18 11:16:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1994719232. Throughput: 0: 42645.8. Samples: 1994864560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 11:16:11,994][12645] Avg episode reward: [(0, '0.274')] +[2024-06-18 11:16:14,162][12883] Updated weights for policy 0, policy_version 121753 (0.0037) +[2024-06-18 11:16:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1994915840. 
Throughput: 0: 42850.6. Samples: 1994997240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 11:16:16,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 11:16:17,944][12883] Updated weights for policy 0, policy_version 121763 (0.0037) +[2024-06-18 11:16:21,612][12883] Updated weights for policy 0, policy_version 121773 (0.0037) +[2024-06-18 11:16:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 1995128832. Throughput: 0: 42723.1. Samples: 1995251200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 11:16:21,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 11:16:25,593][12883] Updated weights for policy 0, policy_version 121783 (0.0034) +[2024-06-18 11:16:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1995358208. Throughput: 0: 42715.7. Samples: 1995503300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 11:16:26,994][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 11:16:29,330][12883] Updated weights for policy 0, policy_version 121793 (0.0036) +[2024-06-18 11:16:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1995554816. Throughput: 0: 42694.4. Samples: 1995635580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 11:16:31,994][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 11:16:33,460][12883] Updated weights for policy 0, policy_version 121803 (0.0029) +[2024-06-18 11:16:36,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 1995767808. Throughput: 0: 42498.3. Samples: 1995887620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 11:16:36,996][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 11:16:37,021][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000121812_1995767808.pth... 
+[2024-06-18 11:16:37,095][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000121187_1985527808.pth +[2024-06-18 11:16:37,271][12883] Updated weights for policy 0, policy_version 121813 (0.0041) +[2024-06-18 11:16:40,985][12883] Updated weights for policy 0, policy_version 121823 (0.0023) +[2024-06-18 11:16:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1995980800. Throughput: 0: 42693.5. Samples: 1996145680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 11:16:41,994][12645] Avg episode reward: [(0, '0.670')] +[2024-06-18 11:16:44,968][12883] Updated weights for policy 0, policy_version 121833 (0.0036) +[2024-06-18 11:16:46,994][12645] Fps is (10 sec: 42607.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1996193792. Throughput: 0: 42786.1. Samples: 1996281920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:16:46,994][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 11:16:48,353][12883] Updated weights for policy 0, policy_version 121843 (0.0032) +[2024-06-18 11:16:51,996][12645] Fps is (10 sec: 42588.4, 60 sec: 42599.5, 300 sec: 42653.6). Total num frames: 1996406784. Throughput: 0: 42519.7. Samples: 1996529200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:16:51,997][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 11:16:52,529][12883] Updated weights for policy 0, policy_version 121853 (0.0026) +[2024-06-18 11:16:55,994][12883] Updated weights for policy 0, policy_version 121863 (0.0035) +[2024-06-18 11:16:56,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1996636160. Throughput: 0: 42684.1. Samples: 1996785340. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:16:56,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 11:17:00,480][12883] Updated weights for policy 0, policy_version 121873 (0.0028) +[2024-06-18 11:17:01,996][12645] Fps is (10 sec: 44236.8, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1996849152. Throughput: 0: 42650.8. Samples: 1996916620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:17:01,997][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 11:17:03,695][12883] Updated weights for policy 0, policy_version 121883 (0.0044) +[2024-06-18 11:17:06,994][12645] Fps is (10 sec: 42597.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1997062144. Throughput: 0: 42538.1. Samples: 1997165420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:17:06,994][12645] Avg episode reward: [(0, '0.324')] +[2024-06-18 11:17:08,101][12883] Updated weights for policy 0, policy_version 121893 (0.0047) +[2024-06-18 11:17:11,441][12883] Updated weights for policy 0, policy_version 121903 (0.0040) +[2024-06-18 11:17:11,994][12645] Fps is (10 sec: 40969.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1997258752. Throughput: 0: 42704.0. Samples: 1997424980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:17:11,994][12645] Avg episode reward: [(0, '0.242')] +[2024-06-18 11:17:15,998][12883] Updated weights for policy 0, policy_version 121913 (0.0042) +[2024-06-18 11:17:16,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42596.8, 300 sec: 42653.9). Total num frames: 1997471744. Throughput: 0: 42566.7. Samples: 1997551180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:17:16,997][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 11:17:19,422][12883] Updated weights for policy 0, policy_version 121923 (0.0031) +[2024-06-18 11:17:20,435][12862] Signal inference workers to stop experience collection... 
(29300 times) +[2024-06-18 11:17:20,435][12862] Signal inference workers to resume experience collection... (29300 times) +[2024-06-18 11:17:20,469][12883] InferenceWorker_p0-w0: stopping experience collection (29300 times) +[2024-06-18 11:17:20,469][12883] InferenceWorker_p0-w0: resuming experience collection (29300 times) +[2024-06-18 11:17:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1997701120. Throughput: 0: 42540.7. Samples: 1997801860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:17:21,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 11:17:23,621][12883] Updated weights for policy 0, policy_version 121933 (0.0031) +[2024-06-18 11:17:26,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1997897728. Throughput: 0: 42623.9. Samples: 1998063760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:17:26,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 11:17:27,017][12883] Updated weights for policy 0, policy_version 121943 (0.0041) +[2024-06-18 11:17:31,088][12883] Updated weights for policy 0, policy_version 121953 (0.0041) +[2024-06-18 11:17:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1998094336. Throughput: 0: 42366.8. Samples: 1998188420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:17:31,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 11:17:34,637][12883] Updated weights for policy 0, policy_version 121963 (0.0045) +[2024-06-18 11:17:36,994][12645] Fps is (10 sec: 45874.2, 60 sec: 43146.0, 300 sec: 42765.0). Total num frames: 1998356480. Throughput: 0: 42631.7. Samples: 1998447540. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:17:36,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 11:17:38,549][12883] Updated weights for policy 0, policy_version 121973 (0.0045) +[2024-06-18 11:17:41,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1998553088. Throughput: 0: 42600.8. Samples: 1998702380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:17:41,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 11:17:42,221][12883] Updated weights for policy 0, policy_version 121983 (0.0041) +[2024-06-18 11:17:46,314][12883] Updated weights for policy 0, policy_version 121993 (0.0040) +[2024-06-18 11:17:46,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42654.8). Total num frames: 1998749696. Throughput: 0: 42444.7. Samples: 1998826540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:17:46,994][12645] Avg episode reward: [(0, '0.622')] +[2024-06-18 11:17:50,232][12883] Updated weights for policy 0, policy_version 122003 (0.0041) +[2024-06-18 11:17:51,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42871.5, 300 sec: 42598.1). Total num frames: 1998979072. Throughput: 0: 42628.2. Samples: 1999083780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:17:51,996][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 11:17:53,959][12883] Updated weights for policy 0, policy_version 122013 (0.0032) +[2024-06-18 11:17:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1999192064. Throughput: 0: 42552.4. Samples: 1999339840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:17:56,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 11:17:57,922][12883] Updated weights for policy 0, policy_version 122023 (0.0029) +[2024-06-18 11:18:01,691][12883] Updated weights for policy 0, policy_version 122033 (0.0040) +[2024-06-18 11:18:01,994][12645] Fps is (10 sec: 40969.5, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 1999388672. Throughput: 0: 42532.5. Samples: 1999465040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:18:01,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 11:18:05,775][12883] Updated weights for policy 0, policy_version 122043 (0.0037) +[2024-06-18 11:18:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1999634432. Throughput: 0: 42777.4. Samples: 1999726840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:18:06,994][12645] Avg episode reward: [(0, '0.638')] +[2024-06-18 11:18:09,308][12883] Updated weights for policy 0, policy_version 122053 (0.0036) +[2024-06-18 11:18:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 1999814656. Throughput: 0: 42687.6. Samples: 1999984700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:18:11,994][12645] Avg episode reward: [(0, '0.620')] +[2024-06-18 11:18:13,301][12883] Updated weights for policy 0, policy_version 122063 (0.0027) +[2024-06-18 11:18:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 2000027648. Throughput: 0: 42540.1. Samples: 2000102720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:18:16,994][12645] Avg episode reward: [(0, '0.438')] +[2024-06-18 11:18:17,034][12883] Updated weights for policy 0, policy_version 122073 (0.0035) +[2024-06-18 11:18:21,085][12883] Updated weights for policy 0, policy_version 122083 (0.0035) +[2024-06-18 11:18:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2000257024. Throughput: 0: 42541.0. Samples: 2000361880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:18:21,994][12645] Avg episode reward: [(0, '0.438')] +[2024-06-18 11:18:24,782][12883] Updated weights for policy 0, policy_version 122093 (0.0036) +[2024-06-18 11:18:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2000453632. Throughput: 0: 42516.5. Samples: 2000615620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:18:26,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 11:18:28,652][12883] Updated weights for policy 0, policy_version 122103 (0.0039) +[2024-06-18 11:18:31,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2000650240. Throughput: 0: 42468.5. Samples: 2000737620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:18:31,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 11:18:32,417][12883] Updated weights for policy 0, policy_version 122113 (0.0034) +[2024-06-18 11:18:36,138][12883] Updated weights for policy 0, policy_version 122123 (0.0027) +[2024-06-18 11:18:36,996][12645] Fps is (10 sec: 42590.0, 60 sec: 42051.0, 300 sec: 42653.6). Total num frames: 2000879616. Throughput: 0: 42486.0. Samples: 2000995640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:18:36,996][12645] Avg episode reward: [(0, '0.650')] +[2024-06-18 11:18:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000122124_2000879616.pth... 
+[2024-06-18 11:18:37,090][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000121501_1990672384.pth +[2024-06-18 11:18:40,205][12883] Updated weights for policy 0, policy_version 122133 (0.0032) +[2024-06-18 11:18:41,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2001076224. Throughput: 0: 42615.4. Samples: 2001257540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:18:41,994][12645] Avg episode reward: [(0, '0.636')] +[2024-06-18 11:18:44,162][12883] Updated weights for policy 0, policy_version 122143 (0.0035) +[2024-06-18 11:18:46,994][12645] Fps is (10 sec: 42606.3, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2001305600. Throughput: 0: 42624.2. Samples: 2001383140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:18:46,994][12645] Avg episode reward: [(0, '0.551')] +[2024-06-18 11:18:47,912][12883] Updated weights for policy 0, policy_version 122153 (0.0038) +[2024-06-18 11:18:51,697][12883] Updated weights for policy 0, policy_version 122163 (0.0033) +[2024-06-18 11:18:51,994][12645] Fps is (10 sec: 45876.4, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 2001534976. Throughput: 0: 42482.3. Samples: 2001638540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 11:18:51,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 11:18:55,569][12883] Updated weights for policy 0, policy_version 122173 (0.0028) +[2024-06-18 11:18:56,994][12645] Fps is (10 sec: 40961.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2001715200. Throughput: 0: 42447.6. Samples: 2001894840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 11:18:56,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 11:18:59,242][12883] Updated weights for policy 0, policy_version 122183 (0.0027) +[2024-06-18 11:19:01,995][12645] Fps is (10 sec: 39317.1, 60 sec: 42324.5, 300 sec: 42487.2). Total num frames: 2001928192. Throughput: 0: 42522.9. 
Samples: 2002016300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 11:19:01,995][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 11:19:03,373][12883] Updated weights for policy 0, policy_version 122193 (0.0035) +[2024-06-18 11:19:05,444][12862] Signal inference workers to stop experience collection... (29350 times) +[2024-06-18 11:19:05,444][12862] Signal inference workers to resume experience collection... (29350 times) +[2024-06-18 11:19:05,481][12883] InferenceWorker_p0-w0: stopping experience collection (29350 times) +[2024-06-18 11:19:05,481][12883] InferenceWorker_p0-w0: resuming experience collection (29350 times) +[2024-06-18 11:19:06,801][12883] Updated weights for policy 0, policy_version 122203 (0.0035) +[2024-06-18 11:19:06,994][12645] Fps is (10 sec: 45874.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2002173952. Throughput: 0: 42558.2. Samples: 2002277000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 11:19:06,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 11:19:11,102][12883] Updated weights for policy 0, policy_version 122213 (0.0040) +[2024-06-18 11:19:11,994][12645] Fps is (10 sec: 44241.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2002370560. Throughput: 0: 42456.5. Samples: 2002526160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 11:19:11,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 11:19:14,710][12883] Updated weights for policy 0, policy_version 122223 (0.0022) +[2024-06-18 11:19:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2002583552. Throughput: 0: 42588.3. Samples: 2002654100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 11:19:16,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 11:19:18,862][12883] Updated weights for policy 0, policy_version 122233 (0.0037) +[2024-06-18 11:19:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42542.8). 
Total num frames: 2002780160. Throughput: 0: 42598.3. Samples: 2002912480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 11:19:21,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 11:19:22,527][12883] Updated weights for policy 0, policy_version 122243 (0.0027) +[2024-06-18 11:19:26,460][12883] Updated weights for policy 0, policy_version 122253 (0.0033) +[2024-06-18 11:19:26,995][12645] Fps is (10 sec: 44229.5, 60 sec: 42870.3, 300 sec: 42709.2). Total num frames: 2003025920. Throughput: 0: 42448.8. Samples: 2003167800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 11:19:26,996][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 11:19:30,058][12883] Updated weights for policy 0, policy_version 122263 (0.0058) +[2024-06-18 11:19:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2003222528. Throughput: 0: 42592.1. Samples: 2003299780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 11:19:31,994][12645] Avg episode reward: [(0, '0.276')] +[2024-06-18 11:19:34,043][12883] Updated weights for policy 0, policy_version 122273 (0.0023) +[2024-06-18 11:19:36,994][12645] Fps is (10 sec: 40966.7, 60 sec: 42599.8, 300 sec: 42598.4). Total num frames: 2003435520. Throughput: 0: 42630.5. Samples: 2003556920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 11:19:36,994][12645] Avg episode reward: [(0, '0.158')] +[2024-06-18 11:19:37,650][12883] Updated weights for policy 0, policy_version 122283 (0.0028) +[2024-06-18 11:19:41,698][12883] Updated weights for policy 0, policy_version 122293 (0.0042) +[2024-06-18 11:19:41,999][12645] Fps is (10 sec: 42576.7, 60 sec: 42867.9, 300 sec: 42653.2). Total num frames: 2003648512. Throughput: 0: 42509.7. Samples: 2003808000. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 11:19:41,999][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 11:19:45,632][12883] Updated weights for policy 0, policy_version 122303 (0.0033) +[2024-06-18 11:19:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2003861504. Throughput: 0: 42727.7. Samples: 2003939000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 11:19:46,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 11:19:49,381][12883] Updated weights for policy 0, policy_version 122313 (0.0036) +[2024-06-18 11:19:51,994][12645] Fps is (10 sec: 40981.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2004058112. Throughput: 0: 42569.9. Samples: 2004192640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 11:19:51,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 11:19:53,121][12883] Updated weights for policy 0, policy_version 122323 (0.0036) +[2024-06-18 11:19:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2004287488. Throughput: 0: 42774.3. Samples: 2004451000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 11:19:56,994][12645] Avg episode reward: [(0, '0.503')] +[2024-06-18 11:19:57,037][12883] Updated weights for policy 0, policy_version 122333 (0.0041) +[2024-06-18 11:20:00,824][12883] Updated weights for policy 0, policy_version 122343 (0.0036) +[2024-06-18 11:20:01,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42872.2, 300 sec: 42487.3). Total num frames: 2004500480. Throughput: 0: 42856.0. Samples: 2004582620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 11:20:01,994][12645] Avg episode reward: [(0, '0.748')] +[2024-06-18 11:20:04,725][12883] Updated weights for policy 0, policy_version 122353 (0.0038) +[2024-06-18 11:20:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2004713472. Throughput: 0: 42666.8. 
Samples: 2004832480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 11:20:06,994][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 11:20:08,407][12883] Updated weights for policy 0, policy_version 122363 (0.0033) +[2024-06-18 11:20:12,000][12645] Fps is (10 sec: 42572.0, 60 sec: 42593.9, 300 sec: 42597.5). Total num frames: 2004926464. Throughput: 0: 42753.5. Samples: 2005091900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 11:20:12,000][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 11:20:12,313][12883] Updated weights for policy 0, policy_version 122373 (0.0028) +[2024-06-18 11:20:16,024][12883] Updated weights for policy 0, policy_version 122383 (0.0047) +[2024-06-18 11:20:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 2005155840. Throughput: 0: 42740.9. Samples: 2005223120. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 11:20:16,994][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 11:20:19,890][12883] Updated weights for policy 0, policy_version 122393 (0.0037) +[2024-06-18 11:20:21,994][12645] Fps is (10 sec: 40984.8, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2005336064. Throughput: 0: 42527.9. Samples: 2005470680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 11:20:21,994][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 11:20:23,765][12883] Updated weights for policy 0, policy_version 122403 (0.0040) +[2024-06-18 11:20:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42326.5, 300 sec: 42598.4). Total num frames: 2005565440. Throughput: 0: 42692.8. Samples: 2005728960. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 11:20:26,994][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 11:20:27,753][12883] Updated weights for policy 0, policy_version 122413 (0.0029) +[2024-06-18 11:20:31,557][12883] Updated weights for policy 0, policy_version 122423 (0.0044) +[2024-06-18 11:20:31,998][12645] Fps is (10 sec: 45858.0, 60 sec: 42868.7, 300 sec: 42653.4). Total num frames: 2005794816. Throughput: 0: 42568.7. Samples: 2005854760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 11:20:31,998][12645] Avg episode reward: [(0, '0.701')] +[2024-06-18 11:20:35,384][12883] Updated weights for policy 0, policy_version 122433 (0.0028) +[2024-06-18 11:20:36,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 2005991424. Throughput: 0: 42508.5. Samples: 2006105620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 11:20:36,997][12645] Avg episode reward: [(0, '0.628')] +[2024-06-18 11:20:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000122436_2005991424.pth... +[2024-06-18 11:20:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000121812_1995767808.pth +[2024-06-18 11:20:39,381][12883] Updated weights for policy 0, policy_version 122443 (0.0041) +[2024-06-18 11:20:41,994][12645] Fps is (10 sec: 40976.1, 60 sec: 42602.1, 300 sec: 42598.4). Total num frames: 2006204416. Throughput: 0: 42451.9. Samples: 2006361340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 11:20:41,994][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 11:20:43,735][12883] Updated weights for policy 0, policy_version 122453 (0.0035) +[2024-06-18 11:20:43,771][12862] Signal inference workers to stop experience collection... (29400 times) +[2024-06-18 11:20:43,772][12862] Signal inference workers to resume experience collection... 
(29400 times) +[2024-06-18 11:20:43,816][12883] InferenceWorker_p0-w0: stopping experience collection (29400 times) +[2024-06-18 11:20:43,816][12883] InferenceWorker_p0-w0: resuming experience collection (29400 times) +[2024-06-18 11:20:46,996][12645] Fps is (10 sec: 42598.5, 60 sec: 42596.8, 300 sec: 42598.6). Total num frames: 2006417408. Throughput: 0: 42430.3. Samples: 2006492080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) +[2024-06-18 11:20:46,996][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 11:20:47,073][12883] Updated weights for policy 0, policy_version 122463 (0.0042) +[2024-06-18 11:20:51,368][12883] Updated weights for policy 0, policy_version 122473 (0.0043) +[2024-06-18 11:20:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2006614016. Throughput: 0: 42449.4. Samples: 2006742700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 11:20:51,994][12645] Avg episode reward: [(0, '0.510')] +[2024-06-18 11:20:54,762][12883] Updated weights for policy 0, policy_version 122483 (0.0036) +[2024-06-18 11:20:56,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2006843392. Throughput: 0: 42239.2. Samples: 2006992400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 11:20:56,994][12645] Avg episode reward: [(0, '0.587')] +[2024-06-18 11:20:59,090][12883] Updated weights for policy 0, policy_version 122493 (0.0032) +[2024-06-18 11:21:01,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2007056384. Throughput: 0: 42128.9. Samples: 2007118920. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 11:21:01,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 11:21:02,661][12883] Updated weights for policy 0, policy_version 122503 (0.0048) +[2024-06-18 11:21:06,760][12883] Updated weights for policy 0, policy_version 122513 (0.0047) +[2024-06-18 11:21:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2007252992. Throughput: 0: 42360.6. Samples: 2007376900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 11:21:06,994][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 11:21:10,401][12883] Updated weights for policy 0, policy_version 122523 (0.0039) +[2024-06-18 11:21:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42329.8, 300 sec: 42542.9). Total num frames: 2007465984. Throughput: 0: 42157.9. Samples: 2007626060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 11:21:11,994][12645] Avg episode reward: [(0, '0.657')] +[2024-06-18 11:21:14,690][12883] Updated weights for policy 0, policy_version 122533 (0.0035) +[2024-06-18 11:21:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2007695360. Throughput: 0: 42256.2. Samples: 2007756120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 11:21:16,994][12645] Avg episode reward: [(0, '0.607')] +[2024-06-18 11:21:18,076][12883] Updated weights for policy 0, policy_version 122543 (0.0039) +[2024-06-18 11:21:21,995][12645] Fps is (10 sec: 42592.4, 60 sec: 42597.6, 300 sec: 42487.1). Total num frames: 2007891968. Throughput: 0: 42371.1. Samples: 2008012280. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 11:21:21,996][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 11:21:22,207][12883] Updated weights for policy 0, policy_version 122553 (0.0032) +[2024-06-18 11:21:25,702][12883] Updated weights for policy 0, policy_version 122563 (0.0034) +[2024-06-18 11:21:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 2008104960. Throughput: 0: 42243.9. Samples: 2008262320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 11:21:26,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 11:21:30,013][12883] Updated weights for policy 0, policy_version 122573 (0.0032) +[2024-06-18 11:21:31,994][12645] Fps is (10 sec: 42603.7, 60 sec: 42054.9, 300 sec: 42543.2). Total num frames: 2008317952. Throughput: 0: 42208.2. Samples: 2008391360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 11:21:31,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 11:21:33,321][12883] Updated weights for policy 0, policy_version 122583 (0.0038) +[2024-06-18 11:21:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42053.9, 300 sec: 42487.3). Total num frames: 2008514560. Throughput: 0: 42304.9. Samples: 2008646420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 11:21:36,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 11:21:37,743][12883] Updated weights for policy 0, policy_version 122593 (0.0028) +[2024-06-18 11:21:41,693][12883] Updated weights for policy 0, policy_version 122603 (0.0034) +[2024-06-18 11:21:41,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2008743936. Throughput: 0: 42367.2. Samples: 2008898920. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 11:21:41,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 11:21:45,529][12883] Updated weights for policy 0, policy_version 122613 (0.0037) +[2024-06-18 11:21:46,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42326.9, 300 sec: 42543.2). Total num frames: 2008956928. Throughput: 0: 42423.5. Samples: 2009027980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 11:21:46,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 11:21:49,143][12883] Updated weights for policy 0, policy_version 122623 (0.0037) +[2024-06-18 11:21:51,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 2009137152. Throughput: 0: 42384.4. Samples: 2009284200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 11:21:51,994][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 11:21:53,118][12883] Updated weights for policy 0, policy_version 122633 (0.0030) +[2024-06-18 11:21:56,649][12883] Updated weights for policy 0, policy_version 122643 (0.0038) +[2024-06-18 11:21:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 2009382912. Throughput: 0: 42461.7. Samples: 2009536840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:21:56,994][12645] Avg episode reward: [(0, '0.594')] +[2024-06-18 11:22:01,250][12883] Updated weights for policy 0, policy_version 122653 (0.0029) +[2024-06-18 11:22:02,000][12645] Fps is (10 sec: 45846.4, 60 sec: 42320.9, 300 sec: 42486.4). Total num frames: 2009595904. Throughput: 0: 42500.7. Samples: 2009668920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:22:02,001][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 11:22:04,303][12883] Updated weights for policy 0, policy_version 122663 (0.0029) +[2024-06-18 11:22:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2009792512. Throughput: 0: 42346.5. Samples: 2009917820. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:22:06,996][12645] Avg episode reward: [(0, '0.558')] +[2024-06-18 11:22:08,849][12883] Updated weights for policy 0, policy_version 122673 (0.0040) +[2024-06-18 11:22:11,847][12862] Signal inference workers to stop experience collection... (29450 times) +[2024-06-18 11:22:11,847][12862] Signal inference workers to resume experience collection... (29450 times) +[2024-06-18 11:22:11,891][12883] InferenceWorker_p0-w0: stopping experience collection (29450 times) +[2024-06-18 11:22:11,891][12883] InferenceWorker_p0-w0: resuming experience collection (29450 times) +[2024-06-18 11:22:11,989][12883] Updated weights for policy 0, policy_version 122683 (0.0032) +[2024-06-18 11:22:11,994][12645] Fps is (10 sec: 44264.9, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 2010038272. Throughput: 0: 42332.5. Samples: 2010167280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:22:11,994][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 11:22:16,390][12883] Updated weights for policy 0, policy_version 122693 (0.0046) +[2024-06-18 11:22:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2010234880. Throughput: 0: 42493.4. Samples: 2010303560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:22:16,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 11:22:19,982][12883] Updated weights for policy 0, policy_version 122703 (0.0031) +[2024-06-18 11:22:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42326.4, 300 sec: 42487.3). Total num frames: 2010431488. Throughput: 0: 42311.6. Samples: 2010550440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:22:21,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 11:22:24,225][12883] Updated weights for policy 0, policy_version 122713 (0.0029) +[2024-06-18 11:22:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2010660864. 
Throughput: 0: 42512.8. Samples: 2010812000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:22:26,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 11:22:27,816][12883] Updated weights for policy 0, policy_version 122723 (0.0046) +[2024-06-18 11:22:31,810][12883] Updated weights for policy 0, policy_version 122733 (0.0034) +[2024-06-18 11:22:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 2010857472. Throughput: 0: 42606.3. Samples: 2010945260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:22:31,994][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 11:22:35,341][12883] Updated weights for policy 0, policy_version 122743 (0.0030) +[2024-06-18 11:22:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 2011070464. Throughput: 0: 42554.1. Samples: 2011199140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:22:36,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 11:22:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000122746_2011070464.pth... +[2024-06-18 11:22:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000122124_2000879616.pth +[2024-06-18 11:22:39,271][12883] Updated weights for policy 0, policy_version 122753 (0.0038) +[2024-06-18 11:22:41,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 2011316224. Throughput: 0: 42560.9. Samples: 2011452080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:22:41,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 11:22:42,929][12883] Updated weights for policy 0, policy_version 122763 (0.0027) +[2024-06-18 11:22:46,943][12883] Updated weights for policy 0, policy_version 122773 (0.0027) +[2024-06-18 11:22:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 2011512832. Throughput: 0: 42666.8. 
Samples: 2011588660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:22:46,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 11:22:51,101][12883] Updated weights for policy 0, policy_version 122783 (0.0043) +[2024-06-18 11:22:52,000][12645] Fps is (10 sec: 40934.9, 60 sec: 43140.1, 300 sec: 42486.4). Total num frames: 2011725824. Throughput: 0: 42816.8. Samples: 2011844840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:22:52,000][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 11:22:54,428][12883] Updated weights for policy 0, policy_version 122793 (0.0049) +[2024-06-18 11:22:56,998][12645] Fps is (10 sec: 44216.3, 60 sec: 42868.2, 300 sec: 42597.7). Total num frames: 2011955200. Throughput: 0: 42834.2. Samples: 2012095020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:22:56,999][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 11:22:58,474][12883] Updated weights for policy 0, policy_version 122803 (0.0048) +[2024-06-18 11:23:01,994][12645] Fps is (10 sec: 42624.9, 60 sec: 42602.9, 300 sec: 42431.8). Total num frames: 2012151808. Throughput: 0: 42815.6. Samples: 2012230260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:23:01,994][12645] Avg episode reward: [(0, '0.453')] +[2024-06-18 11:23:02,132][12883] Updated weights for policy 0, policy_version 122813 (0.0047) +[2024-06-18 11:23:05,958][12883] Updated weights for policy 0, policy_version 122823 (0.0032) +[2024-06-18 11:23:06,994][12645] Fps is (10 sec: 40979.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2012364800. Throughput: 0: 43105.7. Samples: 2012490200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:23:06,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 11:23:09,593][12883] Updated weights for policy 0, policy_version 122833 (0.0036) +[2024-06-18 11:23:11,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2012610560. Throughput: 0: 42913.6. 
Samples: 2012743120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:23:11,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 11:23:13,527][12883] Updated weights for policy 0, policy_version 122843 (0.0044) +[2024-06-18 11:23:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2012790784. Throughput: 0: 42912.4. Samples: 2012876320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:23:16,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 11:23:17,157][12883] Updated weights for policy 0, policy_version 122853 (0.0028) +[2024-06-18 11:23:21,261][12883] Updated weights for policy 0, policy_version 122863 (0.0031) +[2024-06-18 11:23:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2013020160. Throughput: 0: 42899.7. Samples: 2013129620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:23:21,994][12645] Avg episode reward: [(0, '0.555')] +[2024-06-18 11:23:24,680][12862] Signal inference workers to stop experience collection... (29500 times) +[2024-06-18 11:23:24,680][12862] Signal inference workers to resume experience collection... (29500 times) +[2024-06-18 11:23:24,721][12883] InferenceWorker_p0-w0: stopping experience collection (29500 times) +[2024-06-18 11:23:24,721][12883] InferenceWorker_p0-w0: resuming experience collection (29500 times) +[2024-06-18 11:23:24,832][12883] Updated weights for policy 0, policy_version 122873 (0.0045) +[2024-06-18 11:23:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2013249536. Throughput: 0: 42905.8. Samples: 2013382840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:23:26,994][12645] Avg episode reward: [(0, '0.698')] +[2024-06-18 11:23:28,746][12883] Updated weights for policy 0, policy_version 122883 (0.0040) +[2024-06-18 11:23:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42543.1). 
Total num frames: 2013429760. Throughput: 0: 42860.0. Samples: 2013517360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:23:31,994][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 11:23:32,397][12883] Updated weights for policy 0, policy_version 122893 (0.0038) +[2024-06-18 11:23:36,412][12883] Updated weights for policy 0, policy_version 122903 (0.0040) +[2024-06-18 11:23:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 2013659136. Throughput: 0: 42581.9. Samples: 2013760760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:23:36,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 11:23:40,570][12883] Updated weights for policy 0, policy_version 122913 (0.0035) +[2024-06-18 11:23:41,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2013872128. Throughput: 0: 42735.6. Samples: 2014017920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:23:41,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 11:23:44,004][12883] Updated weights for policy 0, policy_version 122923 (0.0042) +[2024-06-18 11:23:46,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2014052352. Throughput: 0: 42508.8. Samples: 2014143160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:23:46,994][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 11:23:48,114][12883] Updated weights for policy 0, policy_version 122933 (0.0031) +[2024-06-18 11:23:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42602.9, 300 sec: 42598.4). Total num frames: 2014281728. Throughput: 0: 42411.2. Samples: 2014398700. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 11:23:51,994][12645] Avg episode reward: [(0, '0.695')] +[2024-06-18 11:23:52,026][12883] Updated weights for policy 0, policy_version 122943 (0.0030) +[2024-06-18 11:23:55,952][12883] Updated weights for policy 0, policy_version 122953 (0.0031) +[2024-06-18 11:23:56,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42328.7, 300 sec: 42598.6). Total num frames: 2014494720. Throughput: 0: 42434.8. Samples: 2014652680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 11:23:56,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 11:23:59,709][12883] Updated weights for policy 0, policy_version 122963 (0.0031) +[2024-06-18 11:24:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 2014674944. Throughput: 0: 42340.9. Samples: 2014781660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 11:24:01,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 11:24:03,469][12883] Updated weights for policy 0, policy_version 122973 (0.0029) +[2024-06-18 11:24:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2014920704. Throughput: 0: 42395.1. Samples: 2015037400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 11:24:06,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 11:24:07,254][12883] Updated weights for policy 0, policy_version 122983 (0.0033) +[2024-06-18 11:24:11,349][12883] Updated weights for policy 0, policy_version 122993 (0.0026) +[2024-06-18 11:24:11,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2015150080. Throughput: 0: 42353.4. Samples: 2015288740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 11:24:11,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 11:24:15,367][12883] Updated weights for policy 0, policy_version 123003 (0.0024) +[2024-06-18 11:24:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2015330304. Throughput: 0: 42223.7. Samples: 2015417420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 11:24:16,994][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 11:24:19,197][12883] Updated weights for policy 0, policy_version 123013 (0.0027) +[2024-06-18 11:24:21,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42596.8, 300 sec: 42542.8). Total num frames: 2015576064. Throughput: 0: 42476.9. Samples: 2015672320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 11:24:21,996][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 11:24:23,102][12883] Updated weights for policy 0, policy_version 123023 (0.0030) +[2024-06-18 11:24:26,860][12883] Updated weights for policy 0, policy_version 123033 (0.0036) +[2024-06-18 11:24:26,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2015772672. Throughput: 0: 42412.8. Samples: 2015926500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 11:24:26,999][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 11:24:30,654][12883] Updated weights for policy 0, policy_version 123043 (0.0036) +[2024-06-18 11:24:31,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2015969280. Throughput: 0: 42287.6. Samples: 2016046100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 11:24:31,994][12645] Avg episode reward: [(0, '0.604')] +[2024-06-18 11:24:34,618][12883] Updated weights for policy 0, policy_version 123053 (0.0031) +[2024-06-18 11:24:36,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42599.1). Total num frames: 2016215040. Throughput: 0: 42311.5. 
Samples: 2016302720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 11:24:36,994][12645] Avg episode reward: [(0, '0.614')] +[2024-06-18 11:24:37,111][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123061_2016231424.pth... +[2024-06-18 11:24:37,161][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000122436_2005991424.pth +[2024-06-18 11:24:38,659][12883] Updated weights for policy 0, policy_version 123063 (0.0034) +[2024-06-18 11:24:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2016395264. Throughput: 0: 42377.8. Samples: 2016559680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 11:24:41,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 11:24:42,523][12883] Updated weights for policy 0, policy_version 123073 (0.0032) +[2024-06-18 11:24:46,266][12883] Updated weights for policy 0, policy_version 123083 (0.0031) +[2024-06-18 11:24:46,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2016591872. Throughput: 0: 42292.0. Samples: 2016684800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 11:24:46,994][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 11:24:50,143][12883] Updated weights for policy 0, policy_version 123093 (0.0030) +[2024-06-18 11:24:50,769][12862] Signal inference workers to stop experience collection... (29550 times) +[2024-06-18 11:24:50,770][12862] Signal inference workers to resume experience collection... (29550 times) +[2024-06-18 11:24:50,782][12883] InferenceWorker_p0-w0: stopping experience collection (29550 times) +[2024-06-18 11:24:50,782][12883] InferenceWorker_p0-w0: resuming experience collection (29550 times) +[2024-06-18 11:24:51,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42323.7, 300 sec: 42487.0). Total num frames: 2016821248. Throughput: 0: 42245.8. Samples: 2016938560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 11:24:51,997][12645] Avg episode reward: [(0, '0.591')] +[2024-06-18 11:24:53,968][12883] Updated weights for policy 0, policy_version 123103 (0.0043) +[2024-06-18 11:24:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2017034240. Throughput: 0: 42486.3. Samples: 2017200620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 11:24:56,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 11:24:57,753][12883] Updated weights for policy 0, policy_version 123113 (0.0035) +[2024-06-18 11:25:01,486][12883] Updated weights for policy 0, policy_version 123123 (0.0037) +[2024-06-18 11:25:01,994][12645] Fps is (10 sec: 42608.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2017247232. Throughput: 0: 42444.0. Samples: 2017327400. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 11:25:01,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 11:25:05,398][12883] Updated weights for policy 0, policy_version 123133 (0.0040) +[2024-06-18 11:25:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42488.2). Total num frames: 2017460224. Throughput: 0: 42420.7. Samples: 2017581160. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 11:25:06,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 11:25:09,026][12883] Updated weights for policy 0, policy_version 123143 (0.0029) +[2024-06-18 11:25:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 2017673216. Throughput: 0: 42588.0. Samples: 2017842960. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 11:25:11,995][12645] Avg episode reward: [(0, '0.571')] +[2024-06-18 11:25:13,131][12883] Updated weights for policy 0, policy_version 123153 (0.0026) +[2024-06-18 11:25:16,636][12883] Updated weights for policy 0, policy_version 123163 (0.0043) +[2024-06-18 11:25:16,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2017902592. Throughput: 0: 42655.2. Samples: 2017965580. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 11:25:16,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 11:25:20,699][12883] Updated weights for policy 0, policy_version 123173 (0.0028) +[2024-06-18 11:25:21,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 2018115584. Throughput: 0: 42724.5. Samples: 2018225320. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 11:25:21,994][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 11:25:24,483][12883] Updated weights for policy 0, policy_version 123183 (0.0033) +[2024-06-18 11:25:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.4, 300 sec: 42376.8). Total num frames: 2018295808. Throughput: 0: 42800.5. Samples: 2018485700. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 11:25:26,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 11:25:28,230][12883] Updated weights for policy 0, policy_version 123193 (0.0031) +[2024-06-18 11:25:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 2018541568. Throughput: 0: 42726.2. Samples: 2018607480. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 11:25:31,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 11:25:32,203][12883] Updated weights for policy 0, policy_version 123203 (0.0042) +[2024-06-18 11:25:35,869][12883] Updated weights for policy 0, policy_version 123213 (0.0043) +[2024-06-18 11:25:36,994][12645] Fps is (10 sec: 47512.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2018770944. Throughput: 0: 42933.2. Samples: 2018870460. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 11:25:36,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 11:25:39,799][12883] Updated weights for policy 0, policy_version 123223 (0.0038) +[2024-06-18 11:25:41,994][12645] Fps is (10 sec: 39320.3, 60 sec: 42325.0, 300 sec: 42432.1). Total num frames: 2018934784. Throughput: 0: 42874.2. Samples: 2019129980. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 11:25:41,995][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 11:25:43,573][12883] Updated weights for policy 0, policy_version 123233 (0.0036) +[2024-06-18 11:25:47,000][12645] Fps is (10 sec: 40934.8, 60 sec: 43140.0, 300 sec: 42597.5). Total num frames: 2019180544. Throughput: 0: 42732.7. Samples: 2019250640. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 11:25:47,000][12645] Avg episode reward: [(0, '0.508')] +[2024-06-18 11:25:47,562][12883] Updated weights for policy 0, policy_version 123243 (0.0033) +[2024-06-18 11:25:51,444][12883] Updated weights for policy 0, policy_version 123253 (0.0029) +[2024-06-18 11:25:51,996][12645] Fps is (10 sec: 45866.7, 60 sec: 42871.5, 300 sec: 42542.5). Total num frames: 2019393536. Throughput: 0: 42791.7. Samples: 2019506880. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 11:25:51,996][12645] Avg episode reward: [(0, '0.580')] +[2024-06-18 11:25:55,434][12883] Updated weights for policy 0, policy_version 123263 (0.0034) +[2024-06-18 11:25:56,994][12645] Fps is (10 sec: 40985.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2019590144. Throughput: 0: 42605.4. Samples: 2019760200. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) +[2024-06-18 11:25:56,994][12645] Avg episode reward: [(0, '0.654')] +[2024-06-18 11:25:57,958][12862] Signal inference workers to stop experience collection... (29600 times) +[2024-06-18 11:25:57,990][12883] InferenceWorker_p0-w0: stopping experience collection (29600 times) +[2024-06-18 11:25:58,014][12862] Signal inference workers to resume experience collection... (29600 times) +[2024-06-18 11:25:58,015][12883] InferenceWorker_p0-w0: resuming experience collection (29600 times) +[2024-06-18 11:25:59,092][12883] Updated weights for policy 0, policy_version 123273 (0.0029) +[2024-06-18 11:26:01,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2019819520. Throughput: 0: 42637.3. Samples: 2019884260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 11:26:01,994][12645] Avg episode reward: [(0, '0.740')] +[2024-06-18 11:26:03,162][12883] Updated weights for policy 0, policy_version 123283 (0.0040) +[2024-06-18 11:26:06,985][12883] Updated weights for policy 0, policy_version 123293 (0.0044) +[2024-06-18 11:26:06,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2020032512. Throughput: 0: 42676.9. Samples: 2020145780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 11:26:06,994][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 11:26:10,723][12883] Updated weights for policy 0, policy_version 123303 (0.0034) +[2024-06-18 11:26:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 2020229120. 
Throughput: 0: 42375.6. Samples: 2020392600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 11:26:11,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 11:26:14,803][12883] Updated weights for policy 0, policy_version 123313 (0.0039) +[2024-06-18 11:26:16,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42596.8, 300 sec: 42598.3). Total num frames: 2020458496. Throughput: 0: 42619.7. Samples: 2020525460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 11:26:16,996][12645] Avg episode reward: [(0, '0.314')] +[2024-06-18 11:26:18,242][12883] Updated weights for policy 0, policy_version 123323 (0.0035) +[2024-06-18 11:26:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2020638720. Throughput: 0: 42359.6. Samples: 2020776640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 11:26:21,998][12645] Avg episode reward: [(0, '0.518')] +[2024-06-18 11:26:22,471][12883] Updated weights for policy 0, policy_version 123333 (0.0042) +[2024-06-18 11:26:25,931][12883] Updated weights for policy 0, policy_version 123343 (0.0038) +[2024-06-18 11:26:26,994][12645] Fps is (10 sec: 39330.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2020851712. Throughput: 0: 42153.3. Samples: 2021026860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 11:26:26,994][12645] Avg episode reward: [(0, '0.619')] +[2024-06-18 11:26:30,222][12883] Updated weights for policy 0, policy_version 123353 (0.0028) +[2024-06-18 11:26:31,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2021097472. Throughput: 0: 42382.7. Samples: 2021157600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 11:26:31,994][12645] Avg episode reward: [(0, '0.714')] +[2024-06-18 11:26:33,501][12883] Updated weights for policy 0, policy_version 123363 (0.0033) +[2024-06-18 11:26:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 42542.8). 
Total num frames: 2021294080. Throughput: 0: 42434.0. Samples: 2021416320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 11:26:36,994][12645] Avg episode reward: [(0, '0.537')] +[2024-06-18 11:26:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123370_2021294080.pth... +[2024-06-18 11:26:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000122746_2011070464.pth +[2024-06-18 11:26:38,110][12883] Updated weights for policy 0, policy_version 123373 (0.0043) +[2024-06-18 11:26:41,370][12883] Updated weights for policy 0, policy_version 123383 (0.0038) +[2024-06-18 11:26:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.8, 300 sec: 42542.9). Total num frames: 2021507072. Throughput: 0: 42155.7. Samples: 2021657200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 11:26:41,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 11:26:46,145][12883] Updated weights for policy 0, policy_version 123393 (0.0035) +[2024-06-18 11:26:46,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42602.6, 300 sec: 42709.4). Total num frames: 2021736448. Throughput: 0: 42430.4. Samples: 2021793640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 11:26:46,994][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 11:26:48,934][12883] Updated weights for policy 0, policy_version 123403 (0.0036) +[2024-06-18 11:26:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42053.9, 300 sec: 42487.3). Total num frames: 2021916672. Throughput: 0: 42391.5. Samples: 2022053400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 11:26:51,994][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 11:26:53,710][12883] Updated weights for policy 0, policy_version 123413 (0.0032) +[2024-06-18 11:26:56,406][12883] Updated weights for policy 0, policy_version 123423 (0.0026) +[2024-06-18 11:26:56,998][12645] Fps is (10 sec: 42581.8, 60 sec: 42868.5, 300 sec: 42598.7). 
Total num frames: 2022162432. Throughput: 0: 42274.6. Samples: 2022295140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 11:26:56,998][12645] Avg episode reward: [(0, '0.644')] +[2024-06-18 11:27:01,457][12883] Updated weights for policy 0, policy_version 123433 (0.0033) +[2024-06-18 11:27:01,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2022359040. Throughput: 0: 42431.8. Samples: 2022434800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 11:27:01,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 11:27:04,715][12883] Updated weights for policy 0, policy_version 123443 (0.0039) +[2024-06-18 11:27:06,997][12645] Fps is (10 sec: 39324.2, 60 sec: 42049.7, 300 sec: 42431.3). Total num frames: 2022555648. Throughput: 0: 42341.1. Samples: 2022682140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 11:27:06,998][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 11:27:09,255][12883] Updated weights for policy 0, policy_version 123453 (0.0029) +[2024-06-18 11:27:11,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 2022785024. Throughput: 0: 42251.8. Samples: 2022928200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 11:27:11,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 11:27:12,371][12883] Updated weights for policy 0, policy_version 123463 (0.0034) +[2024-06-18 11:27:16,988][12883] Updated weights for policy 0, policy_version 123473 (0.0036) +[2024-06-18 11:27:16,994][12645] Fps is (10 sec: 42613.6, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 2022981632. Throughput: 0: 42393.4. Samples: 2023065300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 11:27:16,994][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 11:27:17,300][12862] Signal inference workers to stop experience collection... 
(29650 times) +[2024-06-18 11:27:17,300][12862] Signal inference workers to resume experience collection... (29650 times) +[2024-06-18 11:27:17,331][12883] InferenceWorker_p0-w0: stopping experience collection (29650 times) +[2024-06-18 11:27:17,331][12883] InferenceWorker_p0-w0: resuming experience collection (29650 times) +[2024-06-18 11:27:19,815][12883] Updated weights for policy 0, policy_version 123483 (0.0026) +[2024-06-18 11:27:21,996][12645] Fps is (10 sec: 40951.5, 60 sec: 42596.8, 300 sec: 42487.0). Total num frames: 2023194624. Throughput: 0: 42378.9. Samples: 2023323460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 11:27:21,996][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 11:27:24,551][12883] Updated weights for policy 0, policy_version 123493 (0.0027) +[2024-06-18 11:27:26,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2023440384. Throughput: 0: 42607.4. Samples: 2023574540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 11:27:26,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 11:27:27,303][12883] Updated weights for policy 0, policy_version 123503 (0.0041) +[2024-06-18 11:27:31,994][12645] Fps is (10 sec: 40969.1, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 2023604224. Throughput: 0: 42672.3. Samples: 2023713880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 11:27:31,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 11:27:32,217][12883] Updated weights for policy 0, policy_version 123513 (0.0046) +[2024-06-18 11:27:34,848][12883] Updated weights for policy 0, policy_version 123523 (0.0042) +[2024-06-18 11:27:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2023849984. Throughput: 0: 42477.7. Samples: 2023964900. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 11:27:36,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 11:27:39,868][12883] Updated weights for policy 0, policy_version 123533 (0.0038) +[2024-06-18 11:27:41,994][12645] Fps is (10 sec: 49151.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2024095744. Throughput: 0: 42696.4. Samples: 2024216300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 11:27:41,994][12645] Avg episode reward: [(0, '0.162')] +[2024-06-18 11:27:43,152][12883] Updated weights for policy 0, policy_version 123543 (0.0037) +[2024-06-18 11:27:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.4, 300 sec: 42432.7). Total num frames: 2024243200. Throughput: 0: 42415.1. Samples: 2024343480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 11:27:46,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 11:27:47,750][12883] Updated weights for policy 0, policy_version 123553 (0.0034) +[2024-06-18 11:27:50,742][12883] Updated weights for policy 0, policy_version 123563 (0.0032) +[2024-06-18 11:27:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42543.5). Total num frames: 2024505344. Throughput: 0: 42649.9. Samples: 2024601240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 11:27:51,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 11:27:55,297][12883] Updated weights for policy 0, policy_version 123573 (0.0039) +[2024-06-18 11:27:56,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42328.3, 300 sec: 42542.9). Total num frames: 2024701952. Throughput: 0: 42894.9. Samples: 2024858460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 11:27:56,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 11:27:58,695][12883] Updated weights for policy 0, policy_version 123583 (0.0047) +[2024-06-18 11:28:01,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2024898560. Throughput: 0: 42652.9. 
Samples: 2024984680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 11:28:01,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 11:28:02,883][12883] Updated weights for policy 0, policy_version 123593 (0.0034) +[2024-06-18 11:28:06,181][12883] Updated weights for policy 0, policy_version 123603 (0.0025) +[2024-06-18 11:28:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43147.1, 300 sec: 42487.3). Total num frames: 2025144320. Throughput: 0: 42607.9. Samples: 2025240720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) +[2024-06-18 11:28:06,994][12645] Avg episode reward: [(0, '0.533')] +[2024-06-18 11:28:10,519][12883] Updated weights for policy 0, policy_version 123613 (0.0024) +[2024-06-18 11:28:11,994][12645] Fps is (10 sec: 42597.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2025324544. Throughput: 0: 42734.1. Samples: 2025497580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) +[2024-06-18 11:28:11,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 11:28:13,759][12883] Updated weights for policy 0, policy_version 123623 (0.0034) +[2024-06-18 11:28:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2025553920. Throughput: 0: 42414.2. Samples: 2025622520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) +[2024-06-18 11:28:16,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 11:28:18,030][12883] Updated weights for policy 0, policy_version 123633 (0.0031) +[2024-06-18 11:28:21,901][12883] Updated weights for policy 0, policy_version 123643 (0.0031) +[2024-06-18 11:28:21,996][12645] Fps is (10 sec: 44227.8, 60 sec: 42871.5, 300 sec: 42431.5). Total num frames: 2025766912. Throughput: 0: 42502.8. Samples: 2025877620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) +[2024-06-18 11:28:21,996][12645] Avg episode reward: [(0, '0.612')] +[2024-06-18 11:28:24,329][12862] Signal inference workers to stop experience collection... 
(29700 times) +[2024-06-18 11:28:24,329][12862] Signal inference workers to resume experience collection... (29700 times) +[2024-06-18 11:28:24,372][12883] InferenceWorker_p0-w0: stopping experience collection (29700 times) +[2024-06-18 11:28:24,372][12883] InferenceWorker_p0-w0: resuming experience collection (29700 times) +[2024-06-18 11:28:25,644][12883] Updated weights for policy 0, policy_version 123653 (0.0028) +[2024-06-18 11:28:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2025963520. Throughput: 0: 42632.9. Samples: 2026134780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) +[2024-06-18 11:28:26,994][12645] Avg episode reward: [(0, '0.656')] +[2024-06-18 11:28:29,462][12883] Updated weights for policy 0, policy_version 123663 (0.0038) +[2024-06-18 11:28:31,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2026176512. Throughput: 0: 42606.7. Samples: 2026260780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) +[2024-06-18 11:28:31,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 11:28:33,341][12883] Updated weights for policy 0, policy_version 123673 (0.0054) +[2024-06-18 11:28:36,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2026405888. Throughput: 0: 42677.5. Samples: 2026521720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) +[2024-06-18 11:28:36,994][12645] Avg episode reward: [(0, '0.628')] +[2024-06-18 11:28:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123683_2026422272.pth... 
+[2024-06-18 11:28:37,019][12883] Updated weights for policy 0, policy_version 123683 (0.0029) +[2024-06-18 11:28:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123061_2016231424.pth +[2024-06-18 11:28:40,928][12883] Updated weights for policy 0, policy_version 123693 (0.0038) +[2024-06-18 11:28:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2026618880. Throughput: 0: 42577.3. Samples: 2026774440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) +[2024-06-18 11:28:41,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 11:28:44,534][12883] Updated weights for policy 0, policy_version 123703 (0.0039) +[2024-06-18 11:28:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2026815488. Throughput: 0: 42632.8. Samples: 2026903160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) +[2024-06-18 11:28:46,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 11:28:48,630][12883] Updated weights for policy 0, policy_version 123713 (0.0037) +[2024-06-18 11:28:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2027044864. Throughput: 0: 42757.0. Samples: 2027164780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) +[2024-06-18 11:28:51,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 11:28:52,331][12883] Updated weights for policy 0, policy_version 123723 (0.0031) +[2024-06-18 11:28:56,156][12883] Updated weights for policy 0, policy_version 123733 (0.0026) +[2024-06-18 11:28:56,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2027274240. Throughput: 0: 42736.7. Samples: 2027420720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) +[2024-06-18 11:28:56,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 11:29:00,430][12883] Updated weights for policy 0, policy_version 123743 (0.0034) +[2024-06-18 11:29:01,994][12645] Fps is (10 sec: 42595.2, 60 sec: 42871.0, 300 sec: 42542.8). Total num frames: 2027470848. Throughput: 0: 42883.8. Samples: 2027552320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) +[2024-06-18 11:29:01,995][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 11:29:03,696][12883] Updated weights for policy 0, policy_version 123753 (0.0043) +[2024-06-18 11:29:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2027683840. Throughput: 0: 42895.1. Samples: 2027807800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 11:29:06,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 11:29:08,063][12883] Updated weights for policy 0, policy_version 123763 (0.0048) +[2024-06-18 11:29:11,619][12883] Updated weights for policy 0, policy_version 123773 (0.0051) +[2024-06-18 11:29:11,994][12645] Fps is (10 sec: 42600.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2027896832. Throughput: 0: 42737.7. Samples: 2028057980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 11:29:11,994][12645] Avg episode reward: [(0, '0.510')] +[2024-06-18 11:29:16,130][12883] Updated weights for policy 0, policy_version 123783 (0.0033) +[2024-06-18 11:29:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 2028109824. Throughput: 0: 42704.5. Samples: 2028182480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 11:29:16,994][12645] Avg episode reward: [(0, '0.510')] +[2024-06-18 11:29:19,215][12883] Updated weights for policy 0, policy_version 123793 (0.0036) +[2024-06-18 11:29:21,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 2028339200. Throughput: 0: 42643.5. 
Samples: 2028440680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 11:29:21,994][12645] Avg episode reward: [(0, '0.608')] +[2024-06-18 11:29:23,524][12883] Updated weights for policy 0, policy_version 123803 (0.0033) +[2024-06-18 11:29:26,757][12883] Updated weights for policy 0, policy_version 123813 (0.0039) +[2024-06-18 11:29:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2028552192. Throughput: 0: 42662.7. Samples: 2028694260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 11:29:26,994][12645] Avg episode reward: [(0, '0.651')] +[2024-06-18 11:29:30,931][12883] Updated weights for policy 0, policy_version 123823 (0.0030) +[2024-06-18 11:29:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2028748800. Throughput: 0: 42800.0. Samples: 2028829160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 11:29:31,994][12645] Avg episode reward: [(0, '0.547')] +[2024-06-18 11:29:34,475][12883] Updated weights for policy 0, policy_version 123833 (0.0040) +[2024-06-18 11:29:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2028945408. Throughput: 0: 42663.5. Samples: 2029084640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 11:29:36,994][12645] Avg episode reward: [(0, '0.186')] +[2024-06-18 11:29:38,650][12883] Updated weights for policy 0, policy_version 123843 (0.0030) +[2024-06-18 11:29:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2029174784. Throughput: 0: 42493.8. Samples: 2029332940. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 11:29:41,994][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 11:29:42,441][12883] Updated weights for policy 0, policy_version 123853 (0.0043) +[2024-06-18 11:29:46,710][12883] Updated weights for policy 0, policy_version 123863 (0.0046) +[2024-06-18 11:29:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 2029387776. Throughput: 0: 42449.5. Samples: 2029462520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 11:29:46,994][12645] Avg episode reward: [(0, '0.684')] +[2024-06-18 11:29:49,965][12883] Updated weights for policy 0, policy_version 123873 (0.0036) +[2024-06-18 11:29:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2029584384. Throughput: 0: 42383.1. Samples: 2029715040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 11:29:51,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 11:29:54,353][12883] Updated weights for policy 0, policy_version 123883 (0.0041) +[2024-06-18 11:29:54,620][12862] Signal inference workers to stop experience collection... (29750 times) +[2024-06-18 11:29:54,620][12862] Signal inference workers to resume experience collection... (29750 times) +[2024-06-18 11:29:54,649][12883] InferenceWorker_p0-w0: stopping experience collection (29750 times) +[2024-06-18 11:29:54,649][12883] InferenceWorker_p0-w0: resuming experience collection (29750 times) +[2024-06-18 11:29:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2029830144. Throughput: 0: 42471.7. Samples: 2029969200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 11:29:56,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 11:29:57,519][12883] Updated weights for policy 0, policy_version 123893 (0.0031) +[2024-06-18 11:30:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.8, 300 sec: 42542.9). Total num frames: 2030010368. 
Throughput: 0: 42531.6. Samples: 2030096400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 11:30:01,994][12645] Avg episode reward: [(0, '0.590')] +[2024-06-18 11:30:02,103][12883] Updated weights for policy 0, policy_version 123903 (0.0034) +[2024-06-18 11:30:05,223][12883] Updated weights for policy 0, policy_version 123913 (0.0035) +[2024-06-18 11:30:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 2030223360. Throughput: 0: 42404.4. Samples: 2030348880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 11:30:06,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 11:30:09,795][12883] Updated weights for policy 0, policy_version 123923 (0.0048) +[2024-06-18 11:30:11,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2030469120. Throughput: 0: 42506.1. Samples: 2030607040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 11:30:11,994][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 11:30:13,238][12883] Updated weights for policy 0, policy_version 123933 (0.0046) +[2024-06-18 11:30:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2030649344. Throughput: 0: 42372.0. Samples: 2030735900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 11:30:16,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 11:30:17,374][12883] Updated weights for policy 0, policy_version 123943 (0.0027) +[2024-06-18 11:30:20,921][12883] Updated weights for policy 0, policy_version 123953 (0.0047) +[2024-06-18 11:30:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2030878720. Throughput: 0: 42377.2. Samples: 2030991620. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 11:30:21,994][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 11:30:24,966][12883] Updated weights for policy 0, policy_version 123963 (0.0024) +[2024-06-18 11:30:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2031091712. Throughput: 0: 42495.5. Samples: 2031245240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 11:30:26,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 11:30:28,708][12883] Updated weights for policy 0, policy_version 123973 (0.0028) +[2024-06-18 11:30:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2031304704. Throughput: 0: 42472.0. Samples: 2031373760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 11:30:31,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 11:30:32,582][12883] Updated weights for policy 0, policy_version 123983 (0.0029) +[2024-06-18 11:30:36,267][12883] Updated weights for policy 0, policy_version 123993 (0.0043) +[2024-06-18 11:30:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.5). Total num frames: 2031501312. Throughput: 0: 42532.0. Samples: 2031628980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 11:30:36,994][12645] Avg episode reward: [(0, '0.718')] +[2024-06-18 11:30:37,097][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123994_2031517696.pth... +[2024-06-18 11:30:37,158][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123370_2021294080.pth +[2024-06-18 11:30:40,704][12883] Updated weights for policy 0, policy_version 124003 (0.0035) +[2024-06-18 11:30:41,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42598.2, 300 sec: 42543.7). Total num frames: 2031730688. Throughput: 0: 42491.3. Samples: 2031881320. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 11:30:41,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 11:30:44,234][12883] Updated weights for policy 0, policy_version 124013 (0.0038) +[2024-06-18 11:30:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 2031943680. Throughput: 0: 42621.3. Samples: 2032014360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 11:30:46,994][12645] Avg episode reward: [(0, '0.319')] +[2024-06-18 11:30:48,171][12883] Updated weights for policy 0, policy_version 124023 (0.0030) +[2024-06-18 11:30:51,902][12883] Updated weights for policy 0, policy_version 124033 (0.0052) +[2024-06-18 11:30:51,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2032156672. Throughput: 0: 42535.1. Samples: 2032262960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 11:30:51,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 11:30:55,693][12883] Updated weights for policy 0, policy_version 124043 (0.0036) +[2024-06-18 11:30:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2032369664. Throughput: 0: 42621.9. Samples: 2032525020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 11:30:56,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 11:30:59,409][12883] Updated weights for policy 0, policy_version 124053 (0.0033) +[2024-06-18 11:31:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2032566272. Throughput: 0: 42578.7. Samples: 2032651940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 11:31:01,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 11:31:03,462][12883] Updated weights for policy 0, policy_version 124063 (0.0031) +[2024-06-18 11:31:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2032795648. Throughput: 0: 42502.7. Samples: 2032904240. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 11:31:06,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 11:31:07,297][12883] Updated weights for policy 0, policy_version 124073 (0.0027) +[2024-06-18 11:31:11,288][12883] Updated weights for policy 0, policy_version 124083 (0.0034) +[2024-06-18 11:31:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42487.6). Total num frames: 2032992256. Throughput: 0: 42616.1. Samples: 2033162960. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) +[2024-06-18 11:31:11,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 11:31:15,029][12883] Updated weights for policy 0, policy_version 124093 (0.0034) +[2024-06-18 11:31:15,283][12862] Signal inference workers to stop experience collection... (29800 times) +[2024-06-18 11:31:15,283][12862] Signal inference workers to resume experience collection... (29800 times) +[2024-06-18 11:31:15,307][12883] InferenceWorker_p0-w0: stopping experience collection (29800 times) +[2024-06-18 11:31:15,307][12883] InferenceWorker_p0-w0: resuming experience collection (29800 times) +[2024-06-18 11:31:16,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2033205248. Throughput: 0: 42638.6. Samples: 2033292500. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) +[2024-06-18 11:31:16,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 11:31:18,816][12883] Updated weights for policy 0, policy_version 124103 (0.0032) +[2024-06-18 11:31:21,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2033434624. Throughput: 0: 42503.8. Samples: 2033541660. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) +[2024-06-18 11:31:21,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 11:31:22,646][12883] Updated weights for policy 0, policy_version 124113 (0.0031) +[2024-06-18 11:31:26,420][12883] Updated weights for policy 0, policy_version 124123 (0.0037) +[2024-06-18 11:31:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2033647616. Throughput: 0: 42618.0. Samples: 2033799120. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) +[2024-06-18 11:31:26,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 11:31:30,298][12883] Updated weights for policy 0, policy_version 124133 (0.0026) +[2024-06-18 11:31:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2033844224. Throughput: 0: 42588.3. Samples: 2033930840. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) +[2024-06-18 11:31:31,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 11:31:33,982][12883] Updated weights for policy 0, policy_version 124143 (0.0036) +[2024-06-18 11:31:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2034073600. Throughput: 0: 42679.5. Samples: 2034183540. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) +[2024-06-18 11:31:36,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 11:31:38,067][12883] Updated weights for policy 0, policy_version 124153 (0.0026) +[2024-06-18 11:31:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42487.4). Total num frames: 2034270208. Throughput: 0: 42412.8. Samples: 2034433600. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) +[2024-06-18 11:31:41,994][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 11:31:42,194][12883] Updated weights for policy 0, policy_version 124163 (0.0027) +[2024-06-18 11:31:46,070][12883] Updated weights for policy 0, policy_version 124173 (0.0023) +[2024-06-18 11:31:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2034483200. Throughput: 0: 42425.3. Samples: 2034561080. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) +[2024-06-18 11:31:46,994][12645] Avg episode reward: [(0, '0.556')] +[2024-06-18 11:31:49,731][12883] Updated weights for policy 0, policy_version 124183 (0.0046) +[2024-06-18 11:31:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42543.5). Total num frames: 2034712576. Throughput: 0: 42542.1. Samples: 2034818640. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) +[2024-06-18 11:31:51,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 11:31:53,850][12883] Updated weights for policy 0, policy_version 124193 (0.0030) +[2024-06-18 11:31:56,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 2034909184. Throughput: 0: 42344.5. Samples: 2035068560. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) +[2024-06-18 11:31:56,996][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 11:31:57,374][12883] Updated weights for policy 0, policy_version 124203 (0.0028) +[2024-06-18 11:32:01,822][12883] Updated weights for policy 0, policy_version 124213 (0.0038) +[2024-06-18 11:32:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.9). Total num frames: 2035122176. Throughput: 0: 42308.0. Samples: 2035196360. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) +[2024-06-18 11:32:01,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 11:32:04,942][12883] Updated weights for policy 0, policy_version 124223 (0.0025) +[2024-06-18 11:32:06,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2035351552. Throughput: 0: 42709.0. Samples: 2035463560. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) +[2024-06-18 11:32:06,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 11:32:09,299][12883] Updated weights for policy 0, policy_version 124233 (0.0049) +[2024-06-18 11:32:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2035564544. Throughput: 0: 42435.0. Samples: 2035708700. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) +[2024-06-18 11:32:11,994][12645] Avg episode reward: [(0, '0.707')] +[2024-06-18 11:32:12,542][12883] Updated weights for policy 0, policy_version 124243 (0.0028) +[2024-06-18 11:32:16,837][12883] Updated weights for policy 0, policy_version 124253 (0.0032) +[2024-06-18 11:32:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42598.7). Total num frames: 2035761152. Throughput: 0: 42463.6. Samples: 2035841700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:32:16,994][12645] Avg episode reward: [(0, '0.686')] +[2024-06-18 11:32:20,209][12883] Updated weights for policy 0, policy_version 124263 (0.0044) +[2024-06-18 11:32:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2035974144. Throughput: 0: 42537.4. Samples: 2036097720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:32:21,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 11:32:24,334][12883] Updated weights for policy 0, policy_version 124273 (0.0041) +[2024-06-18 11:32:26,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2036203520. Throughput: 0: 42702.6. 
Samples: 2036355220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:32:26,995][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 11:32:27,868][12883] Updated weights for policy 0, policy_version 124283 (0.0042) +[2024-06-18 11:32:31,959][12883] Updated weights for policy 0, policy_version 124293 (0.0030) +[2024-06-18 11:32:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2036416512. Throughput: 0: 42729.9. Samples: 2036483920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:32:31,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 11:32:35,634][12883] Updated weights for policy 0, policy_version 124303 (0.0031) +[2024-06-18 11:32:36,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2036613120. Throughput: 0: 42587.7. Samples: 2036735080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:32:36,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 11:32:37,077][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000124306_2036629504.pth... +[2024-06-18 11:32:37,128][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123683_2026422272.pth +[2024-06-18 11:32:39,628][12883] Updated weights for policy 0, policy_version 124313 (0.0043) +[2024-06-18 11:32:41,301][12862] Signal inference workers to stop experience collection... (29850 times) +[2024-06-18 11:32:41,302][12862] Signal inference workers to resume experience collection... (29850 times) +[2024-06-18 11:32:41,348][12883] InferenceWorker_p0-w0: stopping experience collection (29850 times) +[2024-06-18 11:32:41,348][12883] InferenceWorker_p0-w0: resuming experience collection (29850 times) +[2024-06-18 11:32:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2036842496. Throughput: 0: 42720.3. Samples: 2036990880. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:32:41,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 11:32:43,259][12883] Updated weights for policy 0, policy_version 124323 (0.0029) +[2024-06-18 11:32:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2037039104. Throughput: 0: 42709.9. Samples: 2037118300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:32:46,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 11:32:47,221][12883] Updated weights for policy 0, policy_version 124333 (0.0034) +[2024-06-18 11:32:50,868][12883] Updated weights for policy 0, policy_version 124343 (0.0033) +[2024-06-18 11:32:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 2037252096. Throughput: 0: 42484.0. Samples: 2037375340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:32:51,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 11:32:54,894][12883] Updated weights for policy 0, policy_version 124353 (0.0025) +[2024-06-18 11:32:56,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43146.1, 300 sec: 42709.5). Total num frames: 2037497856. Throughput: 0: 42711.6. Samples: 2037630720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:32:56,996][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 11:32:58,507][12883] Updated weights for policy 0, policy_version 124363 (0.0035) +[2024-06-18 11:33:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2037661696. Throughput: 0: 42712.8. Samples: 2037763780. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:33:01,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 11:33:02,852][12883] Updated weights for policy 0, policy_version 124373 (0.0034) +[2024-06-18 11:33:06,157][12883] Updated weights for policy 0, policy_version 124383 (0.0036) +[2024-06-18 11:33:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2037891072. Throughput: 0: 42571.2. Samples: 2038013420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:33:06,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 11:33:10,686][12883] Updated weights for policy 0, policy_version 124393 (0.0026) +[2024-06-18 11:33:11,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2038120448. Throughput: 0: 42366.4. Samples: 2038261700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 11:33:11,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 11:33:13,903][12883] Updated weights for policy 0, policy_version 124403 (0.0026) +[2024-06-18 11:33:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 2038300672. Throughput: 0: 42480.0. Samples: 2038395520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 11:33:16,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 11:33:18,626][12883] Updated weights for policy 0, policy_version 124413 (0.0032) +[2024-06-18 11:33:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2038530048. Throughput: 0: 42370.2. Samples: 2038641740. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 11:33:21,994][12645] Avg episode reward: [(0, '0.586')] +[2024-06-18 11:33:22,059][12883] Updated weights for policy 0, policy_version 124423 (0.0038) +[2024-06-18 11:33:26,313][12883] Updated weights for policy 0, policy_version 124433 (0.0045) +[2024-06-18 11:33:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2038743040. Throughput: 0: 42447.2. Samples: 2038901000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 11:33:26,994][12645] Avg episode reward: [(0, '0.579')] +[2024-06-18 11:33:29,561][12883] Updated weights for policy 0, policy_version 124443 (0.0024) +[2024-06-18 11:33:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 2038923264. Throughput: 0: 42431.5. Samples: 2039027720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 11:33:31,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 11:33:33,957][12883] Updated weights for policy 0, policy_version 124453 (0.0036) +[2024-06-18 11:33:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2039169024. Throughput: 0: 42447.6. Samples: 2039285480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 11:33:36,994][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 11:33:37,202][12883] Updated weights for policy 0, policy_version 124463 (0.0036) +[2024-06-18 11:33:41,511][12883] Updated weights for policy 0, policy_version 124473 (0.0026) +[2024-06-18 11:33:41,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2039398400. Throughput: 0: 42519.5. Samples: 2039544100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 11:33:41,994][12645] Avg episode reward: [(0, '0.602')] +[2024-06-18 11:33:44,793][12883] Updated weights for policy 0, policy_version 124483 (0.0044) +[2024-06-18 11:33:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2039578624. Throughput: 0: 42424.9. Samples: 2039672900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 11:33:46,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 11:33:49,163][12883] Updated weights for policy 0, policy_version 124493 (0.0045) +[2024-06-18 11:33:50,259][12862] Signal inference workers to stop experience collection... (29900 times) +[2024-06-18 11:33:50,259][12862] Signal inference workers to resume experience collection... (29900 times) +[2024-06-18 11:33:50,297][12883] InferenceWorker_p0-w0: stopping experience collection (29900 times) +[2024-06-18 11:33:50,297][12883] InferenceWorker_p0-w0: resuming experience collection (29900 times) +[2024-06-18 11:33:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2039824384. Throughput: 0: 42507.1. Samples: 2039926240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 11:33:51,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 11:33:52,290][12883] Updated weights for policy 0, policy_version 124503 (0.0036) +[2024-06-18 11:33:56,770][12883] Updated weights for policy 0, policy_version 124513 (0.0042) +[2024-06-18 11:33:56,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42598.5). Total num frames: 2040037376. Throughput: 0: 42794.3. Samples: 2040187440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 11:33:56,994][12645] Avg episode reward: [(0, '0.613')] +[2024-06-18 11:33:59,885][12883] Updated weights for policy 0, policy_version 124523 (0.0033) +[2024-06-18 11:34:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2040217600. 
Throughput: 0: 42552.3. Samples: 2040310380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 11:34:01,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 11:34:04,518][12883] Updated weights for policy 0, policy_version 124533 (0.0050) +[2024-06-18 11:34:06,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2040463360. Throughput: 0: 42635.5. Samples: 2040560340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 11:34:06,994][12645] Avg episode reward: [(0, '0.548')] +[2024-06-18 11:34:07,502][12883] Updated weights for policy 0, policy_version 124543 (0.0040) +[2024-06-18 11:34:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2040643584. Throughput: 0: 42648.9. Samples: 2040820200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 11:34:11,994][12645] Avg episode reward: [(0, '0.639')] +[2024-06-18 11:34:12,385][12883] Updated weights for policy 0, policy_version 124553 (0.0032) +[2024-06-18 11:34:15,601][12883] Updated weights for policy 0, policy_version 124563 (0.0027) +[2024-06-18 11:34:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2040856576. Throughput: 0: 42571.6. Samples: 2040943440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 11:34:16,994][12645] Avg episode reward: [(0, '0.585')] +[2024-06-18 11:34:19,997][12883] Updated weights for policy 0, policy_version 124573 (0.0031) +[2024-06-18 11:34:21,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42596.8, 300 sec: 42487.0). Total num frames: 2041085952. Throughput: 0: 42545.9. Samples: 2041200140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 11:34:21,997][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 11:34:23,383][12883] Updated weights for policy 0, policy_version 124583 (0.0030) +[2024-06-18 11:34:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.3). 
Total num frames: 2041282560. Throughput: 0: 42522.3. Samples: 2041457600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 11:34:26,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 11:34:28,026][12883] Updated weights for policy 0, policy_version 124593 (0.0033) +[2024-06-18 11:34:30,968][12883] Updated weights for policy 0, policy_version 124603 (0.0045) +[2024-06-18 11:34:31,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2041495552. Throughput: 0: 42422.2. Samples: 2041581900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 11:34:31,994][12645] Avg episode reward: [(0, '0.574')] +[2024-06-18 11:34:35,514][12883] Updated weights for policy 0, policy_version 124613 (0.0037) +[2024-06-18 11:34:36,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2041741312. Throughput: 0: 42540.8. Samples: 2041840580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 11:34:36,994][12645] Avg episode reward: [(0, '0.773')] +[2024-06-18 11:34:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000124618_2041741312.pth... +[2024-06-18 11:34:37,057][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123994_2031517696.pth +[2024-06-18 11:34:38,640][12883] Updated weights for policy 0, policy_version 124623 (0.0038) +[2024-06-18 11:34:41,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 2041937920. Throughput: 0: 42415.1. Samples: 2042096220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 11:34:41,997][12645] Avg episode reward: [(0, '0.773')] +[2024-06-18 11:34:43,494][12883] Updated weights for policy 0, policy_version 124633 (0.0033) +[2024-06-18 11:34:46,825][12883] Updated weights for policy 0, policy_version 124643 (0.0030) +[2024-06-18 11:34:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). 
Total num frames: 2042150912. Throughput: 0: 42503.2. Samples: 2042223020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 11:34:46,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 11:34:51,301][12883] Updated weights for policy 0, policy_version 124653 (0.0040) +[2024-06-18 11:34:51,994][12645] Fps is (10 sec: 40969.8, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 2042347520. Throughput: 0: 42671.3. Samples: 2042480540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 11:34:51,994][12645] Avg episode reward: [(0, '0.597')] +[2024-06-18 11:34:54,447][12883] Updated weights for policy 0, policy_version 124663 (0.0043) +[2024-06-18 11:34:56,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2042576896. Throughput: 0: 42497.2. Samples: 2042732580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 11:34:56,994][12645] Avg episode reward: [(0, '0.772')] +[2024-06-18 11:34:58,751][12883] Updated weights for policy 0, policy_version 124673 (0.0036) +[2024-06-18 11:35:01,978][12883] Updated weights for policy 0, policy_version 124683 (0.0035) +[2024-06-18 11:35:01,994][12645] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2042806272. Throughput: 0: 42737.7. Samples: 2042866640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 11:35:01,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 11:35:06,086][12883] Updated weights for policy 0, policy_version 124693 (0.0027) +[2024-06-18 11:35:06,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 2042986496. Throughput: 0: 42718.2. Samples: 2043122360. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 11:35:06,994][12645] Avg episode reward: [(0, '0.698')] +[2024-06-18 11:35:09,479][12883] Updated weights for policy 0, policy_version 124703 (0.0038) +[2024-06-18 11:35:11,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2043215872. Throughput: 0: 42660.9. Samples: 2043377340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 11:35:11,994][12645] Avg episode reward: [(0, '0.677')] +[2024-06-18 11:35:13,537][12883] Updated weights for policy 0, policy_version 124713 (0.0024) +[2024-06-18 11:35:16,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2043445248. Throughput: 0: 42909.3. Samples: 2043512820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 11:35:16,994][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 11:35:17,058][12883] Updated weights for policy 0, policy_version 124723 (0.0028) +[2024-06-18 11:35:19,560][12862] Signal inference workers to stop experience collection... (29950 times) +[2024-06-18 11:35:19,592][12883] InferenceWorker_p0-w0: stopping experience collection (29950 times) +[2024-06-18 11:35:19,612][12862] Signal inference workers to resume experience collection... (29950 times) +[2024-06-18 11:35:19,616][12883] InferenceWorker_p0-w0: resuming experience collection (29950 times) +[2024-06-18 11:35:21,074][12883] Updated weights for policy 0, policy_version 124733 (0.0043) +[2024-06-18 11:35:22,000][12645] Fps is (10 sec: 42571.4, 60 sec: 42595.6, 300 sec: 42542.0). Total num frames: 2043641856. Throughput: 0: 42701.7. Samples: 2043762420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 11:35:22,000][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 11:35:24,835][12883] Updated weights for policy 0, policy_version 124743 (0.0038) +[2024-06-18 11:35:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2043871232. 
Throughput: 0: 42692.0. Samples: 2044017260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 11:35:26,994][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 11:35:28,900][12883] Updated weights for policy 0, policy_version 124753 (0.0036) +[2024-06-18 11:35:31,994][12645] Fps is (10 sec: 45903.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2044100608. Throughput: 0: 42821.8. Samples: 2044150000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 11:35:31,994][12645] Avg episode reward: [(0, '0.626')] +[2024-06-18 11:35:32,809][12883] Updated weights for policy 0, policy_version 124763 (0.0036) +[2024-06-18 11:35:36,394][12883] Updated weights for policy 0, policy_version 124773 (0.0028) +[2024-06-18 11:35:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2044280832. Throughput: 0: 42680.7. Samples: 2044401180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 11:35:36,994][12645] Avg episode reward: [(0, '0.620')] +[2024-06-18 11:35:40,832][12883] Updated weights for policy 0, policy_version 124783 (0.0037) +[2024-06-18 11:35:41,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42871.5, 300 sec: 42598.1). Total num frames: 2044510208. Throughput: 0: 42794.0. Samples: 2044658400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 11:35:41,996][12645] Avg episode reward: [(0, '0.431')] +[2024-06-18 11:35:44,016][12883] Updated weights for policy 0, policy_version 124793 (0.0040) +[2024-06-18 11:35:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2044690432. Throughput: 0: 42687.6. Samples: 2044787580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 11:35:46,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 11:35:48,423][12883] Updated weights for policy 0, policy_version 124803 (0.0032) +[2024-06-18 11:35:51,545][12883] Updated weights for policy 0, policy_version 124813 (0.0031) +[2024-06-18 11:35:51,994][12645] Fps is (10 sec: 42607.8, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2044936192. Throughput: 0: 42595.0. Samples: 2045039140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 11:35:51,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 11:35:56,266][12883] Updated weights for policy 0, policy_version 124823 (0.0034) +[2024-06-18 11:35:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2045132800. Throughput: 0: 42662.1. Samples: 2045297140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 11:35:56,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 11:35:59,151][12883] Updated weights for policy 0, policy_version 124833 (0.0028) +[2024-06-18 11:36:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2045329408. Throughput: 0: 42503.5. Samples: 2045425480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 11:36:01,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 11:36:03,910][12883] Updated weights for policy 0, policy_version 124843 (0.0035) +[2024-06-18 11:36:06,755][12883] Updated weights for policy 0, policy_version 124853 (0.0026) +[2024-06-18 11:36:06,994][12645] Fps is (10 sec: 45874.3, 60 sec: 43417.4, 300 sec: 42709.4). Total num frames: 2045591552. Throughput: 0: 42483.0. Samples: 2045673900. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 11:36:06,994][12645] Avg episode reward: [(0, '0.672')] +[2024-06-18 11:36:11,559][12883] Updated weights for policy 0, policy_version 124863 (0.0029) +[2024-06-18 11:36:11,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2045771776. Throughput: 0: 42570.6. Samples: 2045932940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 11:36:11,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 11:36:14,873][12883] Updated weights for policy 0, policy_version 124873 (0.0029) +[2024-06-18 11:36:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2045984768. Throughput: 0: 42298.5. Samples: 2046053440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 11:36:16,994][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 11:36:19,315][12883] Updated weights for policy 0, policy_version 124883 (0.0033) +[2024-06-18 11:36:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42876.0, 300 sec: 42598.4). Total num frames: 2046214144. Throughput: 0: 42533.9. Samples: 2046315200. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 11:36:21,994][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 11:36:22,570][12883] Updated weights for policy 0, policy_version 124893 (0.0040) +[2024-06-18 11:36:26,928][12883] Updated weights for policy 0, policy_version 124903 (0.0046) +[2024-06-18 11:36:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2046410752. Throughput: 0: 42589.6. Samples: 2046574840. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 11:36:26,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 11:36:30,210][12883] Updated weights for policy 0, policy_version 124913 (0.0035) +[2024-06-18 11:36:31,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 2046623744. Throughput: 0: 42347.5. 
Samples: 2046693220. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 11:36:31,994][12645] Avg episode reward: [(0, '0.431')] +[2024-06-18 11:36:33,827][12862] Signal inference workers to stop experience collection... (30000 times) +[2024-06-18 11:36:33,827][12862] Signal inference workers to resume experience collection... (30000 times) +[2024-06-18 11:36:33,846][12883] InferenceWorker_p0-w0: stopping experience collection (30000 times) +[2024-06-18 11:36:33,846][12883] InferenceWorker_p0-w0: resuming experience collection (30000 times) +[2024-06-18 11:36:34,578][12883] Updated weights for policy 0, policy_version 124923 (0.0046) +[2024-06-18 11:36:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2046836736. Throughput: 0: 42636.4. Samples: 2046957780. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 11:36:36,994][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 11:36:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000124930_2046853120.pth... +[2024-06-18 11:36:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000124306_2036629504.pth +[2024-06-18 11:36:37,861][12883] Updated weights for policy 0, policy_version 124933 (0.0022) +[2024-06-18 11:36:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 2047049728. Throughput: 0: 42650.2. Samples: 2047216400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 11:36:41,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 11:36:42,171][12883] Updated weights for policy 0, policy_version 124943 (0.0033) +[2024-06-18 11:36:45,564][12883] Updated weights for policy 0, policy_version 124953 (0.0034) +[2024-06-18 11:36:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2047279104. Throughput: 0: 42592.0. Samples: 2047342120. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 11:36:46,994][12645] Avg episode reward: [(0, '0.656')] +[2024-06-18 11:36:49,714][12883] Updated weights for policy 0, policy_version 124963 (0.0025) +[2024-06-18 11:36:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 2047475712. Throughput: 0: 42775.8. Samples: 2047598800. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 11:36:51,994][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 11:36:53,061][12883] Updated weights for policy 0, policy_version 124973 (0.0035) +[2024-06-18 11:36:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2047672320. Throughput: 0: 42640.0. Samples: 2047851740. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 11:36:56,994][12645] Avg episode reward: [(0, '0.621')] +[2024-06-18 11:36:57,762][12883] Updated weights for policy 0, policy_version 124983 (0.0036) +[2024-06-18 11:37:00,812][12883] Updated weights for policy 0, policy_version 124993 (0.0030) +[2024-06-18 11:37:01,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2047918080. Throughput: 0: 42807.2. Samples: 2047979760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 11:37:01,994][12645] Avg episode reward: [(0, '0.767')] +[2024-06-18 11:37:05,762][12883] Updated weights for policy 0, policy_version 125003 (0.0034) +[2024-06-18 11:37:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.4, 300 sec: 42487.3). Total num frames: 2048098304. Throughput: 0: 42751.5. Samples: 2048239020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 11:37:06,994][12645] Avg episode reward: [(0, '0.692')] +[2024-06-18 11:37:08,488][12883] Updated weights for policy 0, policy_version 125013 (0.0044) +[2024-06-18 11:37:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2048327680. Throughput: 0: 42628.0. 
Samples: 2048493100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 11:37:11,994][12645] Avg episode reward: [(0, '0.640')] +[2024-06-18 11:37:13,458][12883] Updated weights for policy 0, policy_version 125023 (0.0036) +[2024-06-18 11:37:16,030][12883] Updated weights for policy 0, policy_version 125033 (0.0035) +[2024-06-18 11:37:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2048557056. Throughput: 0: 42825.0. Samples: 2048620340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 11:37:16,994][12645] Avg episode reward: [(0, '0.699')] +[2024-06-18 11:37:21,135][12883] Updated weights for policy 0, policy_version 125043 (0.0035) +[2024-06-18 11:37:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 2048753664. Throughput: 0: 42871.4. Samples: 2048887000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 11:37:21,994][12645] Avg episode reward: [(0, '0.629')] +[2024-06-18 11:37:23,579][12883] Updated weights for policy 0, policy_version 125053 (0.0039) +[2024-06-18 11:37:26,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2048983040. Throughput: 0: 42635.9. Samples: 2049135020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:37:26,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 11:37:28,620][12883] Updated weights for policy 0, policy_version 125063 (0.0031) +[2024-06-18 11:37:31,412][12883] Updated weights for policy 0, policy_version 125073 (0.0033) +[2024-06-18 11:37:31,996][12645] Fps is (10 sec: 47503.8, 60 sec: 43416.1, 300 sec: 42764.7). Total num frames: 2049228800. Throughput: 0: 42756.6. Samples: 2049266260. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:37:31,996][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 11:37:36,101][12883] Updated weights for policy 0, policy_version 125083 (0.0033) +[2024-06-18 11:37:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2049392640. Throughput: 0: 42893.1. Samples: 2049529000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:37:36,994][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 11:37:39,104][12883] Updated weights for policy 0, policy_version 125093 (0.0034) +[2024-06-18 11:37:41,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2049622016. Throughput: 0: 42882.7. Samples: 2049781460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:37:41,994][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 11:37:43,728][12883] Updated weights for policy 0, policy_version 125103 (0.0027) +[2024-06-18 11:37:46,738][12883] Updated weights for policy 0, policy_version 125113 (0.0043) +[2024-06-18 11:37:46,994][12645] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2049867776. Throughput: 0: 42968.9. Samples: 2049913360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:37:46,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 11:37:51,266][12883] Updated weights for policy 0, policy_version 125123 (0.0031) +[2024-06-18 11:37:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 2050048000. Throughput: 0: 42939.9. Samples: 2050171320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:37:51,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 11:37:54,226][12883] Updated weights for policy 0, policy_version 125133 (0.0043) +[2024-06-18 11:37:56,994][12645] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2050260992. Throughput: 0: 42992.9. 
Samples: 2050427780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:37:56,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 11:37:57,439][12862] Signal inference workers to stop experience collection... (30050 times) +[2024-06-18 11:37:57,440][12862] Signal inference workers to resume experience collection... (30050 times) +[2024-06-18 11:37:57,455][12883] InferenceWorker_p0-w0: stopping experience collection (30050 times) +[2024-06-18 11:37:57,455][12883] InferenceWorker_p0-w0: resuming experience collection (30050 times) +[2024-06-18 11:37:58,722][12883] Updated weights for policy 0, policy_version 125143 (0.0039) +[2024-06-18 11:38:01,781][12883] Updated weights for policy 0, policy_version 125153 (0.0039) +[2024-06-18 11:38:01,994][12645] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2050506752. Throughput: 0: 43011.6. Samples: 2050555860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:38:01,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 11:38:06,310][12883] Updated weights for policy 0, policy_version 125163 (0.0026) +[2024-06-18 11:38:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2050686976. Throughput: 0: 42859.7. Samples: 2050815680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:38:06,996][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 11:38:09,471][12883] Updated weights for policy 0, policy_version 125173 (0.0025) +[2024-06-18 11:38:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2050899968. Throughput: 0: 42924.1. Samples: 2051066600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:38:11,994][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 11:38:14,136][12883] Updated weights for policy 0, policy_version 125183 (0.0041) +[2024-06-18 11:38:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42654.0). 
Total num frames: 2051112960. Throughput: 0: 42897.3. Samples: 2051196540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:38:16,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 11:38:17,301][12883] Updated weights for policy 0, policy_version 125193 (0.0029) +[2024-06-18 11:38:21,661][12883] Updated weights for policy 0, policy_version 125203 (0.0039) +[2024-06-18 11:38:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2051325952. Throughput: 0: 42786.3. Samples: 2051454380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:38:21,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 11:38:24,967][12883] Updated weights for policy 0, policy_version 125213 (0.0030) +[2024-06-18 11:38:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2051555328. Throughput: 0: 42742.7. Samples: 2051704880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 11:38:26,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 11:38:29,368][12883] Updated weights for policy 0, policy_version 125223 (0.0037) +[2024-06-18 11:38:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42053.8, 300 sec: 42653.9). Total num frames: 2051751936. Throughput: 0: 42843.1. Samples: 2051841300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 11:38:31,994][12645] Avg episode reward: [(0, '0.438')] +[2024-06-18 11:38:32,763][12883] Updated weights for policy 0, policy_version 125233 (0.0040) +[2024-06-18 11:38:36,930][12883] Updated weights for policy 0, policy_version 125243 (0.0039) +[2024-06-18 11:38:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 2051981312. Throughput: 0: 42791.2. Samples: 2052096920. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 11:38:36,994][12645] Avg episode reward: [(0, '0.250')] +[2024-06-18 11:38:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000125243_2051981312.pth... +[2024-06-18 11:38:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000124618_2041741312.pth +[2024-06-18 11:38:40,474][12883] Updated weights for policy 0, policy_version 125253 (0.0037) +[2024-06-18 11:38:41,995][12645] Fps is (10 sec: 45867.7, 60 sec: 43143.4, 300 sec: 42820.3). Total num frames: 2052210688. Throughput: 0: 42627.8. Samples: 2052346100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 11:38:41,996][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 11:38:44,596][12883] Updated weights for policy 0, policy_version 125263 (0.0049) +[2024-06-18 11:38:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2052390912. Throughput: 0: 42639.5. Samples: 2052474640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 11:38:46,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 11:38:48,406][12883] Updated weights for policy 0, policy_version 125273 (0.0028) +[2024-06-18 11:38:51,994][12645] Fps is (10 sec: 40966.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2052620288. Throughput: 0: 42603.9. Samples: 2052732860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 11:38:51,994][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 11:38:52,194][12883] Updated weights for policy 0, policy_version 125283 (0.0028) +[2024-06-18 11:38:55,889][12883] Updated weights for policy 0, policy_version 125293 (0.0036) +[2024-06-18 11:38:56,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 2052833280. Throughput: 0: 42574.8. Samples: 2052982560. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 11:38:56,997][12645] Avg episode reward: [(0, '0.767')] +[2024-06-18 11:39:00,018][12883] Updated weights for policy 0, policy_version 125303 (0.0043) +[2024-06-18 11:39:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2053029888. Throughput: 0: 42596.8. Samples: 2053113400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 11:39:01,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 11:39:03,375][12883] Updated weights for policy 0, policy_version 125313 (0.0038) +[2024-06-18 11:39:06,996][12645] Fps is (10 sec: 42598.4, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 2053259264. Throughput: 0: 42460.5. Samples: 2053365200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 11:39:06,997][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 11:39:07,810][12883] Updated weights for policy 0, policy_version 125323 (0.0035) +[2024-06-18 11:39:11,347][12883] Updated weights for policy 0, policy_version 125333 (0.0033) +[2024-06-18 11:39:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2053455872. Throughput: 0: 42613.8. Samples: 2053622500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 11:39:11,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 11:39:15,081][12862] Signal inference workers to stop experience collection... (30100 times) +[2024-06-18 11:39:15,081][12862] Signal inference workers to resume experience collection... (30100 times) +[2024-06-18 11:39:15,097][12883] InferenceWorker_p0-w0: stopping experience collection (30100 times) +[2024-06-18 11:39:15,097][12883] InferenceWorker_p0-w0: resuming experience collection (30100 times) +[2024-06-18 11:39:15,225][12883] Updated weights for policy 0, policy_version 125343 (0.0035) +[2024-06-18 11:39:16,994][12645] Fps is (10 sec: 39330.8, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 2053652480. 
Throughput: 0: 42425.0. Samples: 2053750420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 11:39:16,994][12645] Avg episode reward: [(0, '0.662')] +[2024-06-18 11:39:18,998][12883] Updated weights for policy 0, policy_version 125353 (0.0035) +[2024-06-18 11:39:21,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2053914624. Throughput: 0: 42591.5. Samples: 2054013540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 11:39:21,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 11:39:23,233][12883] Updated weights for policy 0, policy_version 125363 (0.0031) +[2024-06-18 11:39:26,877][12883] Updated weights for policy 0, policy_version 125373 (0.0042) +[2024-06-18 11:39:26,994][12645] Fps is (10 sec: 45874.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2054111232. Throughput: 0: 42689.8. Samples: 2054267080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 11:39:26,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 11:39:30,932][12883] Updated weights for policy 0, policy_version 125383 (0.0032) +[2024-06-18 11:39:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2054324224. Throughput: 0: 42649.0. Samples: 2054393840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 11:39:31,994][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 11:39:34,386][12883] Updated weights for policy 0, policy_version 125393 (0.0030) +[2024-06-18 11:39:36,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 2054553600. Throughput: 0: 42724.1. Samples: 2054655440. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 11:39:36,994][12645] Avg episode reward: [(0, '0.176')] +[2024-06-18 11:39:38,618][12883] Updated weights for policy 0, policy_version 125403 (0.0029) +[2024-06-18 11:39:41,915][12883] Updated weights for policy 0, policy_version 125413 (0.0046) +[2024-06-18 11:39:41,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42599.5, 300 sec: 42765.0). Total num frames: 2054766592. Throughput: 0: 42832.7. Samples: 2054909940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 11:39:41,994][12645] Avg episode reward: [(0, '0.200')] +[2024-06-18 11:39:46,330][12883] Updated weights for policy 0, policy_version 125423 (0.0032) +[2024-06-18 11:39:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2054946816. Throughput: 0: 42737.8. Samples: 2055036600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 11:39:46,994][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 11:39:49,458][12883] Updated weights for policy 0, policy_version 125433 (0.0031) +[2024-06-18 11:39:51,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2055176192. Throughput: 0: 42797.8. Samples: 2055291000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 11:39:51,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 11:39:53,935][12883] Updated weights for policy 0, policy_version 125443 (0.0028) +[2024-06-18 11:39:56,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2055405568. Throughput: 0: 42818.3. Samples: 2055549320. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 11:39:56,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 11:39:57,346][12883] Updated weights for policy 0, policy_version 125453 (0.0045) +[2024-06-18 11:40:01,347][12883] Updated weights for policy 0, policy_version 125463 (0.0035) +[2024-06-18 11:40:01,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2055602176. Throughput: 0: 42800.3. Samples: 2055676440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 11:40:01,994][12645] Avg episode reward: [(0, '0.517')] +[2024-06-18 11:40:04,902][12883] Updated weights for policy 0, policy_version 125473 (0.0029) +[2024-06-18 11:40:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 2055831552. Throughput: 0: 42674.2. Samples: 2055933880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 11:40:06,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 11:40:09,037][12883] Updated weights for policy 0, policy_version 125483 (0.0032) +[2024-06-18 11:40:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2056044544. Throughput: 0: 42849.8. Samples: 2056195320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 11:40:11,994][12645] Avg episode reward: [(0, '0.605')] +[2024-06-18 11:40:12,495][12883] Updated weights for policy 0, policy_version 125493 (0.0033) +[2024-06-18 11:40:16,750][12883] Updated weights for policy 0, policy_version 125503 (0.0035) +[2024-06-18 11:40:16,999][12645] Fps is (10 sec: 40940.1, 60 sec: 43140.9, 300 sec: 42709.7). Total num frames: 2056241152. Throughput: 0: 42708.6. Samples: 2056315940. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 11:40:16,999][12645] Avg episode reward: [(0, '0.652')] +[2024-06-18 11:40:20,234][12883] Updated weights for policy 0, policy_version 125513 (0.0026) +[2024-06-18 11:40:21,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2056454144. Throughput: 0: 42484.5. Samples: 2056567240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 11:40:21,994][12645] Avg episode reward: [(0, '0.243')] +[2024-06-18 11:40:24,698][12883] Updated weights for policy 0, policy_version 125523 (0.0029) +[2024-06-18 11:40:26,994][12645] Fps is (10 sec: 42618.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2056667136. Throughput: 0: 42563.0. Samples: 2056825280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 11:40:26,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 11:40:27,895][12883] Updated weights for policy 0, policy_version 125533 (0.0033) +[2024-06-18 11:40:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2056880128. Throughput: 0: 42571.6. Samples: 2056952320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:40:31,994][12645] Avg episode reward: [(0, '0.211')] +[2024-06-18 11:40:32,311][12883] Updated weights for policy 0, policy_version 125543 (0.0049) +[2024-06-18 11:40:35,983][12883] Updated weights for policy 0, policy_version 125553 (0.0028) +[2024-06-18 11:40:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.2, 300 sec: 42654.2). Total num frames: 2057093120. Throughput: 0: 42609.1. Samples: 2057208420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:40:36,994][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 11:40:37,126][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000125556_2057109504.pth... 
+[2024-06-18 11:40:37,184][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000124930_2046853120.pth +[2024-06-18 11:40:40,238][12883] Updated weights for policy 0, policy_version 125563 (0.0037) +[2024-06-18 11:40:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2057306112. Throughput: 0: 42592.9. Samples: 2057466000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:40:41,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 11:40:43,728][12883] Updated weights for policy 0, policy_version 125573 (0.0030) +[2024-06-18 11:40:45,791][12862] Signal inference workers to stop experience collection... (30150 times) +[2024-06-18 11:40:45,791][12862] Signal inference workers to resume experience collection... (30150 times) +[2024-06-18 11:40:45,833][12883] InferenceWorker_p0-w0: stopping experience collection (30150 times) +[2024-06-18 11:40:45,833][12883] InferenceWorker_p0-w0: resuming experience collection (30150 times) +[2024-06-18 11:40:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2057519104. Throughput: 0: 42531.6. Samples: 2057590360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:40:46,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 11:40:47,827][12883] Updated weights for policy 0, policy_version 125583 (0.0027) +[2024-06-18 11:40:51,412][12883] Updated weights for policy 0, policy_version 125593 (0.0037) +[2024-06-18 11:40:51,995][12645] Fps is (10 sec: 42593.5, 60 sec: 42597.5, 300 sec: 42709.3). Total num frames: 2057732096. Throughput: 0: 42476.8. Samples: 2057845380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:40:51,995][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 11:40:55,432][12883] Updated weights for policy 0, policy_version 125603 (0.0038) +[2024-06-18 11:40:56,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42709.5). 
Total num frames: 2057928704. Throughput: 0: 42451.5. Samples: 2058105640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:40:56,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 11:40:59,335][12883] Updated weights for policy 0, policy_version 125613 (0.0034) +[2024-06-18 11:41:01,994][12645] Fps is (10 sec: 44241.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2058174464. Throughput: 0: 42517.9. Samples: 2058229040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:41:01,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 11:41:03,052][12883] Updated weights for policy 0, policy_version 125623 (0.0039) +[2024-06-18 11:41:06,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2058371072. Throughput: 0: 42591.0. Samples: 2058483840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:41:06,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 11:41:06,995][12883] Updated weights for policy 0, policy_version 125633 (0.0039) +[2024-06-18 11:41:10,619][12883] Updated weights for policy 0, policy_version 125643 (0.0030) +[2024-06-18 11:41:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2058584064. Throughput: 0: 42537.9. Samples: 2058739480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:41:11,995][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 11:41:14,641][12883] Updated weights for policy 0, policy_version 125653 (0.0043) +[2024-06-18 11:41:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42875.1, 300 sec: 42709.5). Total num frames: 2058813440. Throughput: 0: 42556.5. Samples: 2058867360. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:41:16,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 11:41:18,443][12883] Updated weights for policy 0, policy_version 125663 (0.0038) +[2024-06-18 11:41:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2058993664. Throughput: 0: 42437.5. Samples: 2059118100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:41:21,994][12645] Avg episode reward: [(0, '0.723')] +[2024-06-18 11:41:22,493][12883] Updated weights for policy 0, policy_version 125673 (0.0038) +[2024-06-18 11:41:26,093][12883] Updated weights for policy 0, policy_version 125683 (0.0034) +[2024-06-18 11:41:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2059223040. Throughput: 0: 42376.4. Samples: 2059372940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 11:41:26,994][12645] Avg episode reward: [(0, '0.864')] +[2024-06-18 11:41:30,058][12883] Updated weights for policy 0, policy_version 125693 (0.0038) +[2024-06-18 11:41:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2059419648. Throughput: 0: 42567.1. Samples: 2059505880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 11:41:31,994][12645] Avg episode reward: [(0, '0.620')] +[2024-06-18 11:41:33,682][12883] Updated weights for policy 0, policy_version 125703 (0.0035) +[2024-06-18 11:41:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2059632640. Throughput: 0: 42312.1. Samples: 2059749380. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 11:41:36,996][12645] Avg episode reward: [(0, '0.635')] +[2024-06-18 11:41:38,283][12883] Updated weights for policy 0, policy_version 125713 (0.0027) +[2024-06-18 11:41:41,348][12883] Updated weights for policy 0, policy_version 125723 (0.0032) +[2024-06-18 11:41:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2059845632. Throughput: 0: 42279.3. Samples: 2060008200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 11:41:41,995][12645] Avg episode reward: [(0, '0.604')] +[2024-06-18 11:41:45,963][12883] Updated weights for policy 0, policy_version 125733 (0.0033) +[2024-06-18 11:41:46,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2060058624. Throughput: 0: 42522.0. Samples: 2060142520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 11:41:46,994][12645] Avg episode reward: [(0, '0.574')] +[2024-06-18 11:41:48,912][12883] Updated weights for policy 0, policy_version 125743 (0.0036) +[2024-06-18 11:41:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42599.2, 300 sec: 42765.0). Total num frames: 2060288000. Throughput: 0: 42528.9. Samples: 2060397640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 11:41:51,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 11:41:53,503][12883] Updated weights for policy 0, policy_version 125753 (0.0030) +[2024-06-18 11:41:56,493][12883] Updated weights for policy 0, policy_version 125763 (0.0039) +[2024-06-18 11:41:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2060500992. Throughput: 0: 42553.4. Samples: 2060654380. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 11:41:56,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 11:42:01,043][12883] Updated weights for policy 0, policy_version 125773 (0.0033) +[2024-06-18 11:42:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2060713984. Throughput: 0: 42611.5. Samples: 2060784880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 11:42:01,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 11:42:04,369][12883] Updated weights for policy 0, policy_version 125783 (0.0033) +[2024-06-18 11:42:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2060943360. Throughput: 0: 42804.9. Samples: 2061044320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 11:42:06,994][12645] Avg episode reward: [(0, '0.558')] +[2024-06-18 11:42:08,477][12883] Updated weights for policy 0, policy_version 125793 (0.0033) +[2024-06-18 11:42:10,915][12862] Signal inference workers to stop experience collection... (30200 times) +[2024-06-18 11:42:10,915][12862] Signal inference workers to resume experience collection... (30200 times) +[2024-06-18 11:42:10,945][12883] InferenceWorker_p0-w0: stopping experience collection (30200 times) +[2024-06-18 11:42:10,945][12883] InferenceWorker_p0-w0: resuming experience collection (30200 times) +[2024-06-18 11:42:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2061139968. Throughput: 0: 42833.9. Samples: 2061300460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 11:42:11,994][12645] Avg episode reward: [(0, '0.619')] +[2024-06-18 11:42:12,013][12883] Updated weights for policy 0, policy_version 125803 (0.0039) +[2024-06-18 11:42:16,032][12883] Updated weights for policy 0, policy_version 125813 (0.0029) +[2024-06-18 11:42:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2061352960. 
Throughput: 0: 42696.5. Samples: 2061427220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 11:42:16,994][12645] Avg episode reward: [(0, '0.555')] +[2024-06-18 11:42:19,551][12883] Updated weights for policy 0, policy_version 125823 (0.0039) +[2024-06-18 11:42:21,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2061598720. Throughput: 0: 43116.5. Samples: 2061689620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 11:42:21,994][12645] Avg episode reward: [(0, '0.632')] +[2024-06-18 11:42:23,643][12883] Updated weights for policy 0, policy_version 125833 (0.0028) +[2024-06-18 11:42:26,929][12883] Updated weights for policy 0, policy_version 125843 (0.0040) +[2024-06-18 11:42:26,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42654.2). Total num frames: 2061811712. Throughput: 0: 43060.8. Samples: 2061945940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 11:42:26,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 11:42:31,269][12883] Updated weights for policy 0, policy_version 125853 (0.0042) +[2024-06-18 11:42:31,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2061975552. Throughput: 0: 42844.4. Samples: 2062070520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 11:42:31,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 11:42:34,618][12883] Updated weights for policy 0, policy_version 125863 (0.0033) +[2024-06-18 11:42:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2062221312. Throughput: 0: 42999.4. Samples: 2062332620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) +[2024-06-18 11:42:36,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 11:42:37,080][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000125869_2062237696.pth... 
+[2024-06-18 11:42:37,137][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000125243_2051981312.pth +[2024-06-18 11:42:39,135][12883] Updated weights for policy 0, policy_version 125873 (0.0030) +[2024-06-18 11:42:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2062434304. Throughput: 0: 42855.2. Samples: 2062582860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) +[2024-06-18 11:42:41,994][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 11:42:42,340][12883] Updated weights for policy 0, policy_version 125883 (0.0030) +[2024-06-18 11:42:46,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2062614528. Throughput: 0: 42754.7. Samples: 2062708840. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) +[2024-06-18 11:42:46,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 11:42:47,139][12883] Updated weights for policy 0, policy_version 125893 (0.0031) +[2024-06-18 11:42:50,291][12883] Updated weights for policy 0, policy_version 125903 (0.0027) +[2024-06-18 11:42:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2062843904. Throughput: 0: 42736.9. Samples: 2062967480. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) +[2024-06-18 11:42:51,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 11:42:54,724][12883] Updated weights for policy 0, policy_version 125913 (0.0038) +[2024-06-18 11:42:56,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2063073280. Throughput: 0: 42728.7. Samples: 2063223260. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) +[2024-06-18 11:42:56,994][12645] Avg episode reward: [(0, '0.413')] +[2024-06-18 11:42:58,316][12883] Updated weights for policy 0, policy_version 125923 (0.0036) +[2024-06-18 11:43:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2063269888. Throughput: 0: 42729.2. 
Samples: 2063350040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) +[2024-06-18 11:43:01,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 11:43:02,262][12883] Updated weights for policy 0, policy_version 125933 (0.0027) +[2024-06-18 11:43:06,020][12883] Updated weights for policy 0, policy_version 125943 (0.0031) +[2024-06-18 11:43:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2063499264. Throughput: 0: 42657.8. Samples: 2063609220. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) +[2024-06-18 11:43:06,994][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 11:43:09,837][12883] Updated weights for policy 0, policy_version 125953 (0.0031) +[2024-06-18 11:43:11,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2063728640. Throughput: 0: 42631.3. Samples: 2063864340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) +[2024-06-18 11:43:11,994][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 11:43:13,536][12883] Updated weights for policy 0, policy_version 125963 (0.0040) +[2024-06-18 11:43:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2063892480. Throughput: 0: 42715.1. Samples: 2063992700. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) +[2024-06-18 11:43:16,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 11:43:17,525][12883] Updated weights for policy 0, policy_version 125973 (0.0036) +[2024-06-18 11:43:20,967][12862] Signal inference workers to stop experience collection... (30250 times) +[2024-06-18 11:43:20,997][12883] InferenceWorker_p0-w0: stopping experience collection (30250 times) +[2024-06-18 11:43:21,019][12862] Signal inference workers to resume experience collection... 
(30250 times) +[2024-06-18 11:43:21,020][12883] InferenceWorker_p0-w0: resuming experience collection (30250 times) +[2024-06-18 11:43:21,154][12883] Updated weights for policy 0, policy_version 125983 (0.0030) +[2024-06-18 11:43:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2064138240. Throughput: 0: 42707.2. Samples: 2064254440. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) +[2024-06-18 11:43:22,006][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 11:43:25,540][12883] Updated weights for policy 0, policy_version 125993 (0.0031) +[2024-06-18 11:43:26,994][12645] Fps is (10 sec: 47513.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2064367616. Throughput: 0: 42631.9. Samples: 2064501300. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) +[2024-06-18 11:43:26,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 11:43:28,778][12883] Updated weights for policy 0, policy_version 126003 (0.0037) +[2024-06-18 11:43:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 2064547840. Throughput: 0: 42740.3. Samples: 2064632160. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) +[2024-06-18 11:43:31,995][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 11:43:33,121][12883] Updated weights for policy 0, policy_version 126013 (0.0048) +[2024-06-18 11:43:36,387][12883] Updated weights for policy 0, policy_version 126023 (0.0036) +[2024-06-18 11:43:36,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42654.2). Total num frames: 2064793600. Throughput: 0: 42800.0. Samples: 2064893480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 11:43:36,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 11:43:40,702][12883] Updated weights for policy 0, policy_version 126033 (0.0034) +[2024-06-18 11:43:41,996][12645] Fps is (10 sec: 44227.3, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 2064990208. Throughput: 0: 42844.1. 
Samples: 2065151340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 11:43:41,997][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 11:43:44,046][12883] Updated weights for policy 0, policy_version 126043 (0.0028) +[2024-06-18 11:43:46,994][12645] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2065203200. Throughput: 0: 42802.2. Samples: 2065276140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 11:43:46,994][12645] Avg episode reward: [(0, '0.518')] +[2024-06-18 11:43:48,178][12883] Updated weights for policy 0, policy_version 126053 (0.0033) +[2024-06-18 11:43:51,689][12883] Updated weights for policy 0, policy_version 126063 (0.0040) +[2024-06-18 11:43:52,000][12645] Fps is (10 sec: 42581.6, 60 sec: 42867.0, 300 sec: 42653.4). Total num frames: 2065416192. Throughput: 0: 42747.0. Samples: 2065533100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 11:43:52,001][12645] Avg episode reward: [(0, '0.631')] +[2024-06-18 11:43:55,906][12883] Updated weights for policy 0, policy_version 126073 (0.0040) +[2024-06-18 11:43:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2065629184. Throughput: 0: 42799.4. Samples: 2065790320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 11:43:56,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 11:43:59,130][12883] Updated weights for policy 0, policy_version 126083 (0.0028) +[2024-06-18 11:44:01,994][12645] Fps is (10 sec: 42625.0, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 2065842176. Throughput: 0: 42727.6. Samples: 2065915440. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 11:44:01,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 11:44:03,573][12883] Updated weights for policy 0, policy_version 126093 (0.0045) +[2024-06-18 11:44:06,766][12883] Updated weights for policy 0, policy_version 126103 (0.0027) +[2024-06-18 11:44:06,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2066071552. Throughput: 0: 42656.5. Samples: 2066173980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 11:44:06,994][12645] Avg episode reward: [(0, '0.637')] +[2024-06-18 11:44:11,125][12883] Updated weights for policy 0, policy_version 126113 (0.0038) +[2024-06-18 11:44:11,994][12645] Fps is (10 sec: 42595.9, 60 sec: 42324.9, 300 sec: 42764.9). Total num frames: 2066268160. Throughput: 0: 42792.4. Samples: 2066426980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 11:44:11,995][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 11:44:14,779][12883] Updated weights for policy 0, policy_version 126123 (0.0038) +[2024-06-18 11:44:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2066481152. Throughput: 0: 42795.2. Samples: 2066557940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 11:44:16,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 11:44:18,758][12883] Updated weights for policy 0, policy_version 126133 (0.0040) +[2024-06-18 11:44:21,994][12645] Fps is (10 sec: 42600.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2066694144. Throughput: 0: 42687.5. Samples: 2066814420. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 11:44:21,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 11:44:22,325][12883] Updated weights for policy 0, policy_version 126143 (0.0026) +[2024-06-18 11:44:26,170][12883] Updated weights for policy 0, policy_version 126153 (0.0037) +[2024-06-18 11:44:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2066907136. Throughput: 0: 42789.8. Samples: 2067076780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 11:44:26,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 11:44:29,678][12883] Updated weights for policy 0, policy_version 126163 (0.0031) +[2024-06-18 11:44:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 2067136512. Throughput: 0: 42911.7. Samples: 2067207160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 11:44:31,994][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 11:44:33,838][12883] Updated weights for policy 0, policy_version 126173 (0.0034) +[2024-06-18 11:44:36,996][12645] Fps is (10 sec: 45865.1, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 2067365888. Throughput: 0: 43074.6. Samples: 2067471280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 11:44:36,996][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 11:44:37,070][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000126183_2067382272.pth... +[2024-06-18 11:44:37,073][12883] Updated weights for policy 0, policy_version 126183 (0.0029) +[2024-06-18 11:44:37,121][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000125556_2057109504.pth +[2024-06-18 11:44:37,867][12862] Signal inference workers to stop experience collection... (30300 times) +[2024-06-18 11:44:37,867][12862] Signal inference workers to resume experience collection... 
(30300 times) +[2024-06-18 11:44:37,912][12883] InferenceWorker_p0-w0: stopping experience collection (30300 times) +[2024-06-18 11:44:37,912][12883] InferenceWorker_p0-w0: resuming experience collection (30300 times) +[2024-06-18 11:44:41,335][12883] Updated weights for policy 0, policy_version 126193 (0.0034) +[2024-06-18 11:44:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 2067546112. Throughput: 0: 43062.3. Samples: 2067728120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 11:44:41,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 11:44:44,717][12883] Updated weights for policy 0, policy_version 126203 (0.0023) +[2024-06-18 11:44:46,994][12645] Fps is (10 sec: 40968.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2067775488. Throughput: 0: 43143.4. Samples: 2067856900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 11:44:46,994][12645] Avg episode reward: [(0, '0.643')] +[2024-06-18 11:44:48,767][12883] Updated weights for policy 0, policy_version 126213 (0.0032) +[2024-06-18 11:44:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43149.0, 300 sec: 42709.5). Total num frames: 2068004864. Throughput: 0: 43084.0. Samples: 2068112760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 11:44:51,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 11:44:52,448][12883] Updated weights for policy 0, policy_version 126223 (0.0038) +[2024-06-18 11:44:56,253][12883] Updated weights for policy 0, policy_version 126233 (0.0024) +[2024-06-18 11:44:56,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2068201472. Throughput: 0: 43133.1. Samples: 2068367940. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 11:44:56,994][12645] Avg episode reward: [(0, '0.709')] +[2024-06-18 11:44:59,923][12883] Updated weights for policy 0, policy_version 126243 (0.0033) +[2024-06-18 11:45:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2068430848. Throughput: 0: 43163.9. Samples: 2068500320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 11:45:01,994][12645] Avg episode reward: [(0, '0.656')] +[2024-06-18 11:45:04,326][12883] Updated weights for policy 0, policy_version 126253 (0.0040) +[2024-06-18 11:45:06,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 2068643840. Throughput: 0: 43178.3. Samples: 2068757540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 11:45:06,997][12645] Avg episode reward: [(0, '0.669')] +[2024-06-18 11:45:07,562][12883] Updated weights for policy 0, policy_version 126263 (0.0040) +[2024-06-18 11:45:11,887][12883] Updated weights for policy 0, policy_version 126273 (0.0044) +[2024-06-18 11:45:11,995][12645] Fps is (10 sec: 42591.3, 60 sec: 43143.7, 300 sec: 42765.5). Total num frames: 2068856832. Throughput: 0: 43155.1. Samples: 2069018840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 11:45:11,996][12645] Avg episode reward: [(0, '0.558')] +[2024-06-18 11:45:15,448][12883] Updated weights for policy 0, policy_version 126283 (0.0033) +[2024-06-18 11:45:16,994][12645] Fps is (10 sec: 42608.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2069069824. Throughput: 0: 43062.2. Samples: 2069144960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 11:45:16,994][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 11:45:19,467][12883] Updated weights for policy 0, policy_version 126293 (0.0026) +[2024-06-18 11:45:21,994][12645] Fps is (10 sec: 44244.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2069299200. Throughput: 0: 42934.9. 
Samples: 2069403260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 11:45:21,996][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 11:45:23,011][12883] Updated weights for policy 0, policy_version 126303 (0.0023) +[2024-06-18 11:45:26,994][12645] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 2069479424. Throughput: 0: 42829.6. Samples: 2069655460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 11:45:26,995][12645] Avg episode reward: [(0, '0.278')] +[2024-06-18 11:45:27,413][12883] Updated weights for policy 0, policy_version 126313 (0.0039) +[2024-06-18 11:45:30,822][12883] Updated weights for policy 0, policy_version 126323 (0.0029) +[2024-06-18 11:45:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2069708800. Throughput: 0: 42730.3. Samples: 2069779760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 11:45:31,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 11:45:35,013][12883] Updated weights for policy 0, policy_version 126333 (0.0041) +[2024-06-18 11:45:36,994][12645] Fps is (10 sec: 45876.3, 60 sec: 42873.0, 300 sec: 42820.6). Total num frames: 2069938176. Throughput: 0: 42888.0. Samples: 2070042720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) +[2024-06-18 11:45:36,994][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 11:45:38,458][12883] Updated weights for policy 0, policy_version 126343 (0.0037) +[2024-06-18 11:45:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2070118400. Throughput: 0: 42906.5. Samples: 2070298740. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 11:45:41,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 11:45:42,762][12883] Updated weights for policy 0, policy_version 126353 (0.0025) +[2024-06-18 11:45:46,014][12883] Updated weights for policy 0, policy_version 126363 (0.0032) +[2024-06-18 11:45:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42765.2). Total num frames: 2070347776. Throughput: 0: 42697.5. Samples: 2070421700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 11:45:46,994][12645] Avg episode reward: [(0, '0.632')] +[2024-06-18 11:45:50,294][12883] Updated weights for policy 0, policy_version 126373 (0.0025) +[2024-06-18 11:45:51,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2070577152. Throughput: 0: 42730.6. Samples: 2070680320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 11:45:51,994][12645] Avg episode reward: [(0, '0.668')] +[2024-06-18 11:45:54,111][12883] Updated weights for policy 0, policy_version 126383 (0.0039) +[2024-06-18 11:45:56,999][12645] Fps is (10 sec: 40937.1, 60 sec: 42594.4, 300 sec: 42653.2). Total num frames: 2070757376. Throughput: 0: 42635.6. Samples: 2070937600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 11:45:57,000][12645] Avg episode reward: [(0, '0.596')] +[2024-06-18 11:45:58,252][12883] Updated weights for policy 0, policy_version 126393 (0.0039) +[2024-06-18 11:46:01,718][12883] Updated weights for policy 0, policy_version 126403 (0.0034) +[2024-06-18 11:46:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2071003136. Throughput: 0: 42540.4. Samples: 2071059280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 11:46:01,994][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 11:46:05,820][12883] Updated weights for policy 0, policy_version 126413 (0.0022) +[2024-06-18 11:46:06,994][12645] Fps is (10 sec: 42622.0, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 2071183360. Throughput: 0: 42644.5. Samples: 2071322260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 11:46:06,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 11:46:09,319][12883] Updated weights for policy 0, policy_version 126423 (0.0030) +[2024-06-18 11:46:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42326.7, 300 sec: 42653.9). Total num frames: 2071396352. Throughput: 0: 42602.5. Samples: 2071572560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 11:46:11,994][12645] Avg episode reward: [(0, '0.651')] +[2024-06-18 11:46:13,510][12883] Updated weights for policy 0, policy_version 126433 (0.0027) +[2024-06-18 11:46:16,849][12883] Updated weights for policy 0, policy_version 126443 (0.0031) +[2024-06-18 11:46:16,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 2071642112. Throughput: 0: 42751.5. Samples: 2071703580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 11:46:16,994][12645] Avg episode reward: [(0, '0.243')] +[2024-06-18 11:46:21,202][12883] Updated weights for policy 0, policy_version 126453 (0.0031) +[2024-06-18 11:46:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 2071805952. Throughput: 0: 42550.6. Samples: 2071957500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 11:46:21,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 11:46:22,551][12862] Signal inference workers to stop experience collection... 
(30350 times) +[2024-06-18 11:46:22,599][12883] InferenceWorker_p0-w0: stopping experience collection (30350 times) +[2024-06-18 11:46:22,611][12862] Signal inference workers to resume experience collection... (30350 times) +[2024-06-18 11:46:22,624][12883] InferenceWorker_p0-w0: resuming experience collection (30350 times) +[2024-06-18 11:46:24,440][12883] Updated weights for policy 0, policy_version 126463 (0.0027) +[2024-06-18 11:46:26,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 2072051712. Throughput: 0: 42608.1. Samples: 2072216200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 11:46:26,996][12645] Avg episode reward: [(0, '0.626')] +[2024-06-18 11:46:28,837][12883] Updated weights for policy 0, policy_version 126473 (0.0037) +[2024-06-18 11:46:31,990][12883] Updated weights for policy 0, policy_version 126483 (0.0038) +[2024-06-18 11:46:31,994][12645] Fps is (10 sec: 49152.3, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2072297472. Throughput: 0: 42745.8. Samples: 2072345260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 11:46:31,994][12645] Avg episode reward: [(0, '0.620')] +[2024-06-18 11:46:36,640][12883] Updated weights for policy 0, policy_version 126493 (0.0028) +[2024-06-18 11:46:36,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2072461312. Throughput: 0: 42560.9. Samples: 2072595560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 11:46:36,994][12645] Avg episode reward: [(0, '0.643')] +[2024-06-18 11:46:37,108][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000126494_2072477696.pth... 
+[2024-06-18 11:46:37,169][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000125869_2062237696.pth +[2024-06-18 11:46:40,331][12883] Updated weights for policy 0, policy_version 126503 (0.0038) +[2024-06-18 11:46:41,994][12645] Fps is (10 sec: 37682.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2072674304. Throughput: 0: 42431.8. Samples: 2072846800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 11:46:41,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 11:46:44,577][12883] Updated weights for policy 0, policy_version 126513 (0.0048) +[2024-06-18 11:46:46,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2072920064. Throughput: 0: 42638.1. Samples: 2072978000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 11:46:46,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 11:46:47,871][12883] Updated weights for policy 0, policy_version 126523 (0.0043) +[2024-06-18 11:46:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2073100288. Throughput: 0: 42475.9. Samples: 2073233680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 11:46:51,996][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 11:46:52,081][12883] Updated weights for policy 0, policy_version 126533 (0.0022) +[2024-06-18 11:46:55,372][12883] Updated weights for policy 0, policy_version 126543 (0.0039) +[2024-06-18 11:46:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42602.3, 300 sec: 42709.5). Total num frames: 2073313280. Throughput: 0: 42535.5. Samples: 2073486660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 11:46:56,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 11:46:59,573][12883] Updated weights for policy 0, policy_version 126553 (0.0037) +[2024-06-18 11:47:02,000][12645] Fps is (10 sec: 45847.0, 60 sec: 42594.0, 300 sec: 42764.1). Total num frames: 2073559040. Throughput: 0: 42467.1. 
Samples: 2073614860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 11:47:02,000][12645] Avg episode reward: [(0, '0.359')] +[2024-06-18 11:47:03,082][12883] Updated weights for policy 0, policy_version 126563 (0.0033) +[2024-06-18 11:47:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2073739264. Throughput: 0: 42471.6. Samples: 2073868720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 11:47:06,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 11:47:07,152][12883] Updated weights for policy 0, policy_version 126573 (0.0042) +[2024-06-18 11:47:11,148][12883] Updated weights for policy 0, policy_version 126583 (0.0033) +[2024-06-18 11:47:11,994][12645] Fps is (10 sec: 39345.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2073952256. Throughput: 0: 42371.9. Samples: 2074122840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 11:47:11,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 11:47:14,767][12883] Updated weights for policy 0, policy_version 126593 (0.0031) +[2024-06-18 11:47:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2074181632. Throughput: 0: 42391.0. Samples: 2074252860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 11:47:17,000][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 11:47:18,876][12883] Updated weights for policy 0, policy_version 126603 (0.0029) +[2024-06-18 11:47:21,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2074378240. Throughput: 0: 42527.5. Samples: 2074509300. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 11:47:21,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 11:47:22,412][12883] Updated weights for policy 0, policy_version 126613 (0.0036) +[2024-06-18 11:47:26,432][12883] Updated weights for policy 0, policy_version 126623 (0.0027) +[2024-06-18 11:47:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 2074591232. Throughput: 0: 42584.6. Samples: 2074763100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 11:47:26,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 11:47:30,537][12883] Updated weights for policy 0, policy_version 126633 (0.0028) +[2024-06-18 11:47:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2074820608. Throughput: 0: 42553.4. Samples: 2074892900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 11:47:31,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 11:47:33,990][12883] Updated weights for policy 0, policy_version 126643 (0.0036) +[2024-06-18 11:47:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2075017216. Throughput: 0: 42580.6. Samples: 2075149800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 11:47:36,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 11:47:38,071][12883] Updated weights for policy 0, policy_version 126653 (0.0032) +[2024-06-18 11:47:41,590][12883] Updated weights for policy 0, policy_version 126663 (0.0038) +[2024-06-18 11:47:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2075246592. Throughput: 0: 42566.7. Samples: 2075402160. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 11:47:41,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 11:47:45,801][12883] Updated weights for policy 0, policy_version 126673 (0.0038) +[2024-06-18 11:47:46,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2075475968. Throughput: 0: 42711.6. Samples: 2075536620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 11:47:46,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 11:47:49,158][12883] Updated weights for policy 0, policy_version 126683 (0.0036) +[2024-06-18 11:47:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2075656192. Throughput: 0: 42753.0. Samples: 2075792600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 11:47:51,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 11:47:53,719][12883] Updated weights for policy 0, policy_version 126693 (0.0035) +[2024-06-18 11:47:56,807][12883] Updated weights for policy 0, policy_version 126703 (0.0031) +[2024-06-18 11:47:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2075901952. Throughput: 0: 42777.4. Samples: 2076047820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 11:47:56,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 11:47:59,428][12862] Signal inference workers to stop experience collection... (30400 times) +[2024-06-18 11:47:59,432][12862] Signal inference workers to resume experience collection... (30400 times) +[2024-06-18 11:47:59,471][12883] InferenceWorker_p0-w0: stopping experience collection (30400 times) +[2024-06-18 11:47:59,471][12883] InferenceWorker_p0-w0: resuming experience collection (30400 times) +[2024-06-18 11:48:01,445][12883] Updated weights for policy 0, policy_version 126713 (0.0025) +[2024-06-18 11:48:02,000][12645] Fps is (10 sec: 44209.0, 60 sec: 42325.3, 300 sec: 42708.6). Total num frames: 2076098560. 
Throughput: 0: 42719.4. Samples: 2076175500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 11:48:02,001][12645] Avg episode reward: [(0, '0.568')] +[2024-06-18 11:48:04,583][12883] Updated weights for policy 0, policy_version 126723 (0.0035) +[2024-06-18 11:48:06,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2076278784. Throughput: 0: 42633.7. Samples: 2076427820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 11:48:06,994][12645] Avg episode reward: [(0, '0.841')] +[2024-06-18 11:48:08,976][12883] Updated weights for policy 0, policy_version 126733 (0.0026) +[2024-06-18 11:48:11,994][12645] Fps is (10 sec: 42625.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2076524544. Throughput: 0: 42726.6. Samples: 2076685800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 11:48:11,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 11:48:12,306][12883] Updated weights for policy 0, policy_version 126743 (0.0038) +[2024-06-18 11:48:16,410][12883] Updated weights for policy 0, policy_version 126753 (0.0033) +[2024-06-18 11:48:16,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2076737536. Throughput: 0: 42742.2. Samples: 2076816300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 11:48:16,994][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 11:48:20,390][12883] Updated weights for policy 0, policy_version 126763 (0.0032) +[2024-06-18 11:48:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2076950528. Throughput: 0: 42716.8. Samples: 2077072060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 11:48:21,995][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 11:48:24,211][12883] Updated weights for policy 0, policy_version 126773 (0.0028) +[2024-06-18 11:48:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2077163520. 
Throughput: 0: 42621.7. Samples: 2077320140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 11:48:26,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 11:48:28,250][12883] Updated weights for policy 0, policy_version 126783 (0.0048) +[2024-06-18 11:48:31,964][12883] Updated weights for policy 0, policy_version 126793 (0.0037) +[2024-06-18 11:48:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2077376512. Throughput: 0: 42528.4. Samples: 2077450400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 11:48:31,994][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 11:48:35,846][12883] Updated weights for policy 0, policy_version 126803 (0.0041) +[2024-06-18 11:48:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 2077605888. Throughput: 0: 42742.1. Samples: 2077716000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 11:48:36,994][12645] Avg episode reward: [(0, '0.638')] +[2024-06-18 11:48:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000126807_2077605888.pth... +[2024-06-18 11:48:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000126183_2067382272.pth +[2024-06-18 11:48:39,438][12883] Updated weights for policy 0, policy_version 126813 (0.0035) +[2024-06-18 11:48:41,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2077818880. Throughput: 0: 42601.8. Samples: 2077964900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 11:48:42,000][12645] Avg episode reward: [(0, '0.909')] +[2024-06-18 11:48:42,000][12862] Saving new best policy, reward=0.909! +[2024-06-18 11:48:43,424][12883] Updated weights for policy 0, policy_version 126823 (0.0037) +[2024-06-18 11:48:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 2078015488. Throughput: 0: 42596.2. Samples: 2078092060. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 11:48:46,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 11:48:47,046][12883] Updated weights for policy 0, policy_version 126833 (0.0027) +[2024-06-18 11:48:51,229][12883] Updated weights for policy 0, policy_version 126843 (0.0021) +[2024-06-18 11:48:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 2078228480. Throughput: 0: 42756.9. Samples: 2078351880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 11:48:51,994][12645] Avg episode reward: [(0, '0.570')] +[2024-06-18 11:48:54,500][12883] Updated weights for policy 0, policy_version 126853 (0.0038) +[2024-06-18 11:48:56,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2078441472. Throughput: 0: 42536.4. Samples: 2078599940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 11:48:56,994][12645] Avg episode reward: [(0, '0.611')] +[2024-06-18 11:48:58,788][12883] Updated weights for policy 0, policy_version 126863 (0.0040) +[2024-06-18 11:49:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42875.9, 300 sec: 42709.5). Total num frames: 2078670848. Throughput: 0: 42554.7. Samples: 2078731260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 11:49:01,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 11:49:02,555][12883] Updated weights for policy 0, policy_version 126873 (0.0030) +[2024-06-18 11:49:06,584][12883] Updated weights for policy 0, policy_version 126883 (0.0041) +[2024-06-18 11:49:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2078867456. Throughput: 0: 42668.4. Samples: 2078992140. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 11:49:06,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 11:49:10,318][12883] Updated weights for policy 0, policy_version 126893 (0.0034) +[2024-06-18 11:49:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2079096832. Throughput: 0: 42669.3. Samples: 2079240260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 11:49:11,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 11:49:14,237][12883] Updated weights for policy 0, policy_version 126903 (0.0033) +[2024-06-18 11:49:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2079293440. Throughput: 0: 42783.3. Samples: 2079375640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 11:49:16,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 11:49:17,765][12883] Updated weights for policy 0, policy_version 126913 (0.0038) +[2024-06-18 11:49:21,752][12883] Updated weights for policy 0, policy_version 126923 (0.0030) +[2024-06-18 11:49:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2079522816. Throughput: 0: 42532.4. Samples: 2079629960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 11:49:21,995][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 11:49:25,461][12883] Updated weights for policy 0, policy_version 126933 (0.0029) +[2024-06-18 11:49:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2079735808. Throughput: 0: 42649.8. Samples: 2079884140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 11:49:26,994][12645] Avg episode reward: [(0, '0.709')] +[2024-06-18 11:49:29,367][12883] Updated weights for policy 0, policy_version 126943 (0.0048) +[2024-06-18 11:49:30,977][12862] Signal inference workers to stop experience collection... 
(30450 times) +[2024-06-18 11:49:31,015][12883] InferenceWorker_p0-w0: stopping experience collection (30450 times) +[2024-06-18 11:49:31,023][12862] Signal inference workers to resume experience collection... (30450 times) +[2024-06-18 11:49:31,029][12883] InferenceWorker_p0-w0: resuming experience collection (30450 times) +[2024-06-18 11:49:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 2079948800. Throughput: 0: 42715.1. Samples: 2080014240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 11:49:31,994][12645] Avg episode reward: [(0, '0.568')] +[2024-06-18 11:49:33,442][12883] Updated weights for policy 0, policy_version 126953 (0.0036) +[2024-06-18 11:49:36,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42323.8, 300 sec: 42709.1). Total num frames: 2080145408. Throughput: 0: 42709.5. Samples: 2080273900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 11:49:36,996][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 11:49:37,200][12883] Updated weights for policy 0, policy_version 126963 (0.0040) +[2024-06-18 11:49:41,099][12883] Updated weights for policy 0, policy_version 126973 (0.0033) +[2024-06-18 11:49:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2080374784. Throughput: 0: 42889.0. Samples: 2080529940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 11:49:41,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 11:49:44,742][12883] Updated weights for policy 0, policy_version 126983 (0.0030) +[2024-06-18 11:49:46,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2080587776. Throughput: 0: 42912.9. Samples: 2080662340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 11:49:46,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 11:49:48,783][12883] Updated weights for policy 0, policy_version 126993 (0.0029) +[2024-06-18 11:49:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2080784384. Throughput: 0: 42800.6. Samples: 2080918160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:49:51,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 11:49:52,301][12883] Updated weights for policy 0, policy_version 127003 (0.0042) +[2024-06-18 11:49:56,418][12883] Updated weights for policy 0, policy_version 127013 (0.0028) +[2024-06-18 11:49:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2081013760. Throughput: 0: 42903.1. Samples: 2081170900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:49:56,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 11:49:59,748][12883] Updated weights for policy 0, policy_version 127023 (0.0034) +[2024-06-18 11:50:01,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 2081226752. Throughput: 0: 42760.3. Samples: 2081299860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:50:01,994][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 11:50:04,053][12883] Updated weights for policy 0, policy_version 127033 (0.0047) +[2024-06-18 11:50:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 2081439744. Throughput: 0: 42774.7. Samples: 2081554820. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:50:06,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 11:50:07,801][12883] Updated weights for policy 0, policy_version 127043 (0.0044) +[2024-06-18 11:50:11,514][12883] Updated weights for policy 0, policy_version 127053 (0.0040) +[2024-06-18 11:50:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2081636352. Throughput: 0: 42742.2. Samples: 2081807540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:50:11,994][12645] Avg episode reward: [(0, '0.514')] +[2024-06-18 11:50:15,490][12883] Updated weights for policy 0, policy_version 127063 (0.0029) +[2024-06-18 11:50:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2081849344. Throughput: 0: 42618.9. Samples: 2081932100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:50:16,995][12645] Avg episode reward: [(0, '0.645')] +[2024-06-18 11:50:19,085][12883] Updated weights for policy 0, policy_version 127073 (0.0026) +[2024-06-18 11:50:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2082078720. Throughput: 0: 42662.1. Samples: 2082193600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:50:21,994][12645] Avg episode reward: [(0, '0.480')] +[2024-06-18 11:50:22,985][12883] Updated weights for policy 0, policy_version 127083 (0.0042) +[2024-06-18 11:50:26,710][12883] Updated weights for policy 0, policy_version 127093 (0.0038) +[2024-06-18 11:50:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2082291712. Throughput: 0: 42400.0. Samples: 2082437940. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:50:26,994][12645] Avg episode reward: [(0, '0.548')] +[2024-06-18 11:50:30,683][12883] Updated weights for policy 0, policy_version 127103 (0.0028) +[2024-06-18 11:50:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2082504704. Throughput: 0: 42321.7. Samples: 2082566820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:50:31,994][12645] Avg episode reward: [(0, '0.614')] +[2024-06-18 11:50:34,508][12883] Updated weights for policy 0, policy_version 127113 (0.0032) +[2024-06-18 11:50:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2082717696. Throughput: 0: 42535.1. Samples: 2082832240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:50:36,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 11:50:37,129][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000127120_2082734080.pth... +[2024-06-18 11:50:37,181][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000126494_2072477696.pth +[2024-06-18 11:50:38,101][12883] Updated weights for policy 0, policy_version 127123 (0.0033) +[2024-06-18 11:50:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2082930688. Throughput: 0: 42558.3. Samples: 2083086020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:50:41,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 11:50:42,104][12883] Updated weights for policy 0, policy_version 127133 (0.0030) +[2024-06-18 11:50:45,999][12883] Updated weights for policy 0, policy_version 127143 (0.0047) +[2024-06-18 11:50:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2083160064. Throughput: 0: 42628.2. Samples: 2083218120. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 11:50:46,994][12645] Avg episode reward: [(0, '0.611')] +[2024-06-18 11:50:49,742][12883] Updated weights for policy 0, policy_version 127153 (0.0029) +[2024-06-18 11:50:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42654.7). Total num frames: 2083340288. Throughput: 0: 42595.5. Samples: 2083471620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 11:50:51,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 11:50:53,879][12883] Updated weights for policy 0, policy_version 127163 (0.0028) +[2024-06-18 11:50:56,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2083569664. Throughput: 0: 42695.9. Samples: 2083728860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 11:50:56,994][12645] Avg episode reward: [(0, '0.480')] +[2024-06-18 11:50:57,264][12883] Updated weights for policy 0, policy_version 127173 (0.0037) +[2024-06-18 11:50:59,078][12862] Signal inference workers to stop experience collection... (30500 times) +[2024-06-18 11:50:59,079][12862] Signal inference workers to resume experience collection... (30500 times) +[2024-06-18 11:50:59,101][12883] InferenceWorker_p0-w0: stopping experience collection (30500 times) +[2024-06-18 11:50:59,132][12883] InferenceWorker_p0-w0: resuming experience collection (30500 times) +[2024-06-18 11:51:01,389][12883] Updated weights for policy 0, policy_version 127183 (0.0031) +[2024-06-18 11:51:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2083799040. Throughput: 0: 42905.9. Samples: 2083862860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 11:51:01,994][12645] Avg episode reward: [(0, '0.359')] +[2024-06-18 11:51:04,867][12883] Updated weights for policy 0, policy_version 127193 (0.0030) +[2024-06-18 11:51:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2083979264. 
Throughput: 0: 42808.9. Samples: 2084120000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 11:51:06,994][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 11:51:09,028][12883] Updated weights for policy 0, policy_version 127203 (0.0034) +[2024-06-18 11:51:11,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2084225024. Throughput: 0: 42905.6. Samples: 2084368700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 11:51:11,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 11:51:13,006][12883] Updated weights for policy 0, policy_version 127213 (0.0034) +[2024-06-18 11:51:16,539][12883] Updated weights for policy 0, policy_version 127223 (0.0037) +[2024-06-18 11:51:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 2084438016. Throughput: 0: 43051.6. Samples: 2084504140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 11:51:16,994][12645] Avg episode reward: [(0, '0.397')] +[2024-06-18 11:51:20,991][12883] Updated weights for policy 0, policy_version 127233 (0.0041) +[2024-06-18 11:51:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 2084634624. Throughput: 0: 42785.2. Samples: 2084757580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 11:51:21,995][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 11:51:24,547][12883] Updated weights for policy 0, policy_version 127243 (0.0034) +[2024-06-18 11:51:26,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 2084864000. Throughput: 0: 42544.2. Samples: 2085000520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 11:51:26,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 11:51:28,562][12883] Updated weights for policy 0, policy_version 127253 (0.0028) +[2024-06-18 11:51:31,996][12645] Fps is (10 sec: 42589.4, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 2085060608. 
Throughput: 0: 42718.7. Samples: 2085140560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 11:51:31,996][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 11:51:32,228][12883] Updated weights for policy 0, policy_version 127263 (0.0033) +[2024-06-18 11:51:36,164][12883] Updated weights for policy 0, policy_version 127273 (0.0033) +[2024-06-18 11:51:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2085273600. Throughput: 0: 42768.0. Samples: 2085396180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 11:51:36,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 11:51:40,071][12883] Updated weights for policy 0, policy_version 127283 (0.0029) +[2024-06-18 11:51:41,994][12645] Fps is (10 sec: 45884.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2085519360. Throughput: 0: 42456.8. Samples: 2085639420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 11:51:41,994][12645] Avg episode reward: [(0, '0.598')] +[2024-06-18 11:51:43,754][12883] Updated weights for policy 0, policy_version 127293 (0.0033) +[2024-06-18 11:51:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 2085683200. Throughput: 0: 42584.8. Samples: 2085779180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 11:51:46,994][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 11:51:47,748][12883] Updated weights for policy 0, policy_version 127303 (0.0038) +[2024-06-18 11:51:51,298][12883] Updated weights for policy 0, policy_version 127313 (0.0028) +[2024-06-18 11:51:51,996][12645] Fps is (10 sec: 39313.1, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 2085912576. Throughput: 0: 42603.2. Samples: 2086037240. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) +[2024-06-18 11:51:51,997][12645] Avg episode reward: [(0, '0.503')] +[2024-06-18 11:51:55,281][12883] Updated weights for policy 0, policy_version 127323 (0.0028) +[2024-06-18 11:51:56,994][12645] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42710.4). Total num frames: 2086158336. Throughput: 0: 42684.0. Samples: 2086289480. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 11:51:56,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 11:51:58,957][12883] Updated weights for policy 0, policy_version 127333 (0.0025) +[2024-06-18 11:52:01,996][12645] Fps is (10 sec: 40960.1, 60 sec: 42050.7, 300 sec: 42653.6). Total num frames: 2086322176. Throughput: 0: 42776.0. Samples: 2086429160. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 11:52:01,996][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 11:52:02,907][12883] Updated weights for policy 0, policy_version 127343 (0.0050) +[2024-06-18 11:52:06,600][12883] Updated weights for policy 0, policy_version 127353 (0.0034) +[2024-06-18 11:52:06,996][12645] Fps is (10 sec: 39313.3, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 2086551552. Throughput: 0: 42602.4. Samples: 2086674780. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 11:52:06,996][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 11:52:10,538][12883] Updated weights for policy 0, policy_version 127363 (0.0036) +[2024-06-18 11:52:11,404][12862] Signal inference workers to stop experience collection... (30550 times) +[2024-06-18 11:52:11,405][12862] Signal inference workers to resume experience collection... (30550 times) +[2024-06-18 11:52:11,432][12883] InferenceWorker_p0-w0: stopping experience collection (30550 times) +[2024-06-18 11:52:11,432][12883] InferenceWorker_p0-w0: resuming experience collection (30550 times) +[2024-06-18 11:52:11,994][12645] Fps is (10 sec: 47524.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2086797312. 
Throughput: 0: 42948.6. Samples: 2086933200. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 11:52:11,994][12645] Avg episode reward: [(0, '0.216')] +[2024-06-18 11:52:14,250][12883] Updated weights for policy 0, policy_version 127373 (0.0039) +[2024-06-18 11:52:16,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2086961152. Throughput: 0: 42776.7. Samples: 2087065420. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 11:52:16,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 11:52:18,153][12883] Updated weights for policy 0, policy_version 127383 (0.0034) +[2024-06-18 11:52:21,965][12883] Updated weights for policy 0, policy_version 127393 (0.0041) +[2024-06-18 11:52:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2087206912. Throughput: 0: 42617.0. Samples: 2087313940. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 11:52:21,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 11:52:25,866][12883] Updated weights for policy 0, policy_version 127403 (0.0036) +[2024-06-18 11:52:26,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2087419904. Throughput: 0: 43006.7. Samples: 2087574720. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 11:52:26,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 11:52:29,687][12883] Updated weights for policy 0, policy_version 127413 (0.0031) +[2024-06-18 11:52:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 2087600128. Throughput: 0: 42870.3. Samples: 2087708340. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 11:52:31,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 11:52:33,698][12883] Updated weights for policy 0, policy_version 127423 (0.0042) +[2024-06-18 11:52:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2087845888. 
Throughput: 0: 42655.9. Samples: 2087956660. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 11:52:36,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 11:52:37,132][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000127433_2087862272.pth... +[2024-06-18 11:52:37,141][12883] Updated weights for policy 0, policy_version 127433 (0.0034) +[2024-06-18 11:52:37,184][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000126807_2077605888.pth +[2024-06-18 11:52:41,344][12883] Updated weights for policy 0, policy_version 127443 (0.0047) +[2024-06-18 11:52:41,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 2088058880. Throughput: 0: 42801.5. Samples: 2088215540. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 11:52:41,994][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 11:52:44,758][12883] Updated weights for policy 0, policy_version 127453 (0.0033) +[2024-06-18 11:52:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2088255488. Throughput: 0: 42537.2. Samples: 2088343240. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 11:52:46,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 11:52:48,854][12883] Updated weights for policy 0, policy_version 127463 (0.0038) +[2024-06-18 11:52:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 2088484864. Throughput: 0: 42729.2. Samples: 2088597500. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 11:52:51,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 11:52:52,450][12883] Updated weights for policy 0, policy_version 127473 (0.0043) +[2024-06-18 11:52:56,509][12883] Updated weights for policy 0, policy_version 127483 (0.0031) +[2024-06-18 11:52:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42654.8). Total num frames: 2088681472. Throughput: 0: 42671.1. 
Samples: 2088853400. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) +[2024-06-18 11:52:56,994][12645] Avg episode reward: [(0, '0.535')] +[2024-06-18 11:53:00,051][12883] Updated weights for policy 0, policy_version 127493 (0.0028) +[2024-06-18 11:53:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43146.1, 300 sec: 42820.6). Total num frames: 2088910848. Throughput: 0: 42582.3. Samples: 2088981620. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) +[2024-06-18 11:53:01,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 11:53:04,127][12883] Updated weights for policy 0, policy_version 127503 (0.0045) +[2024-06-18 11:53:06,995][12645] Fps is (10 sec: 45870.5, 60 sec: 43145.4, 300 sec: 42764.9). Total num frames: 2089140224. Throughput: 0: 42711.0. Samples: 2089235980. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) +[2024-06-18 11:53:06,995][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 11:53:07,653][12883] Updated weights for policy 0, policy_version 127513 (0.0037) +[2024-06-18 11:53:11,869][12883] Updated weights for policy 0, policy_version 127523 (0.0034) +[2024-06-18 11:53:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2089336832. Throughput: 0: 42680.9. Samples: 2089495360. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) +[2024-06-18 11:53:11,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 11:53:15,687][12883] Updated weights for policy 0, policy_version 127533 (0.0029) +[2024-06-18 11:53:16,996][12645] Fps is (10 sec: 42593.2, 60 sec: 43416.0, 300 sec: 42764.7). Total num frames: 2089566208. Throughput: 0: 42516.6. Samples: 2089621680. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) +[2024-06-18 11:53:16,997][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 11:53:19,537][12883] Updated weights for policy 0, policy_version 127543 (0.0037) +[2024-06-18 11:53:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2089762816. Throughput: 0: 42658.5. 
Samples: 2089876300. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) +[2024-06-18 11:53:21,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 11:53:23,324][12883] Updated weights for policy 0, policy_version 127553 (0.0044) +[2024-06-18 11:53:26,994][12645] Fps is (10 sec: 39330.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2089959424. Throughput: 0: 42503.1. Samples: 2090128180. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) +[2024-06-18 11:53:26,994][12645] Avg episode reward: [(0, '0.233')] +[2024-06-18 11:53:27,492][12883] Updated weights for policy 0, policy_version 127563 (0.0041) +[2024-06-18 11:53:31,095][12883] Updated weights for policy 0, policy_version 127573 (0.0037) +[2024-06-18 11:53:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2090188800. Throughput: 0: 42603.6. Samples: 2090260400. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) +[2024-06-18 11:53:31,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 11:53:35,045][12883] Updated weights for policy 0, policy_version 127583 (0.0028) +[2024-06-18 11:53:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2090401792. Throughput: 0: 42659.6. Samples: 2090517180. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) +[2024-06-18 11:53:36,994][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 11:53:38,839][12883] Updated weights for policy 0, policy_version 127593 (0.0040) +[2024-06-18 11:53:40,861][12862] Signal inference workers to stop experience collection... (30600 times) +[2024-06-18 11:53:40,917][12883] InferenceWorker_p0-w0: stopping experience collection (30600 times) +[2024-06-18 11:53:40,981][12862] Signal inference workers to resume experience collection... (30600 times) +[2024-06-18 11:53:40,982][12883] InferenceWorker_p0-w0: resuming experience collection (30600 times) +[2024-06-18 11:53:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). 
Total num frames: 2090614784. Throughput: 0: 42573.8. Samples: 2090769220. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) +[2024-06-18 11:53:41,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 11:53:42,875][12883] Updated weights for policy 0, policy_version 127603 (0.0031) +[2024-06-18 11:53:46,377][12883] Updated weights for policy 0, policy_version 127613 (0.0040) +[2024-06-18 11:53:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2090827776. Throughput: 0: 42535.2. Samples: 2090895700. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) +[2024-06-18 11:53:46,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 11:53:50,579][12883] Updated weights for policy 0, policy_version 127623 (0.0021) +[2024-06-18 11:53:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2091057152. Throughput: 0: 42770.3. Samples: 2091160600. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) +[2024-06-18 11:53:51,994][12645] Avg episode reward: [(0, '0.677')] +[2024-06-18 11:53:53,936][12883] Updated weights for policy 0, policy_version 127633 (0.0033) +[2024-06-18 11:53:56,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2091253760. Throughput: 0: 42596.0. Samples: 2091412180. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) +[2024-06-18 11:53:56,994][12645] Avg episode reward: [(0, '0.673')] +[2024-06-18 11:53:58,203][12883] Updated weights for policy 0, policy_version 127643 (0.0039) +[2024-06-18 11:54:01,599][12883] Updated weights for policy 0, policy_version 127653 (0.0039) +[2024-06-18 11:54:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2091466752. Throughput: 0: 42615.9. Samples: 2091539300. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:54:01,994][12645] Avg episode reward: [(0, '0.678')] +[2024-06-18 11:54:05,579][12883] Updated weights for policy 0, policy_version 127663 (0.0039) +[2024-06-18 11:54:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42326.1, 300 sec: 42653.9). Total num frames: 2091679744. Throughput: 0: 42734.4. Samples: 2091799340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:54:06,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 11:54:09,334][12883] Updated weights for policy 0, policy_version 127673 (0.0048) +[2024-06-18 11:54:11,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 2091892736. Throughput: 0: 42688.1. Samples: 2092049240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:54:12,005][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 11:54:13,493][12883] Updated weights for policy 0, policy_version 127683 (0.0036) +[2024-06-18 11:54:16,922][12883] Updated weights for policy 0, policy_version 127693 (0.0040) +[2024-06-18 11:54:16,994][12645] Fps is (10 sec: 44233.6, 60 sec: 42599.5, 300 sec: 42709.4). Total num frames: 2092122112. Throughput: 0: 42565.1. Samples: 2092175860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:54:16,995][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 11:54:21,478][12883] Updated weights for policy 0, policy_version 127703 (0.0029) +[2024-06-18 11:54:21,994][12645] Fps is (10 sec: 42608.4, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 2092318720. Throughput: 0: 42744.1. Samples: 2092440660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:54:21,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 11:54:24,755][12883] Updated weights for policy 0, policy_version 127713 (0.0046) +[2024-06-18 11:54:26,994][12645] Fps is (10 sec: 40963.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2092531712. Throughput: 0: 42922.7. 
Samples: 2092700740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:54:26,994][12645] Avg episode reward: [(0, '0.633')] +[2024-06-18 11:54:28,923][12883] Updated weights for policy 0, policy_version 127723 (0.0034) +[2024-06-18 11:54:31,996][12645] Fps is (10 sec: 42588.3, 60 sec: 42596.8, 300 sec: 42709.5). Total num frames: 2092744704. Throughput: 0: 42800.4. Samples: 2092821820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:54:31,997][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 11:54:32,245][12883] Updated weights for policy 0, policy_version 127733 (0.0028) +[2024-06-18 11:54:36,621][12883] Updated weights for policy 0, policy_version 127743 (0.0033) +[2024-06-18 11:54:36,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2092974080. Throughput: 0: 42760.1. Samples: 2093084800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:54:36,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 11:54:37,136][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000127746_2092990464.pth... +[2024-06-18 11:54:37,201][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000127120_2082734080.pth +[2024-06-18 11:54:39,977][12883] Updated weights for policy 0, policy_version 127753 (0.0039) +[2024-06-18 11:54:41,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2093154304. Throughput: 0: 42959.6. Samples: 2093345360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:54:41,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 11:54:44,199][12883] Updated weights for policy 0, policy_version 127763 (0.0037) +[2024-06-18 11:54:46,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2093400064. Throughput: 0: 42841.7. Samples: 2093467180. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:54:46,994][12645] Avg episode reward: [(0, '0.128')] +[2024-06-18 11:54:47,497][12883] Updated weights for policy 0, policy_version 127773 (0.0031) +[2024-06-18 11:54:51,717][12883] Updated weights for policy 0, policy_version 127783 (0.0037) +[2024-06-18 11:54:51,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2093613056. Throughput: 0: 42855.5. Samples: 2093727840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:54:51,994][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 11:54:55,238][12883] Updated weights for policy 0, policy_version 127793 (0.0045) +[2024-06-18 11:54:57,000][12645] Fps is (10 sec: 40934.9, 60 sec: 42594.0, 300 sec: 42653.1). Total num frames: 2093809664. Throughput: 0: 42980.2. Samples: 2093983520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 11:54:57,000][12645] Avg episode reward: [(0, '0.592')] +[2024-06-18 11:54:59,286][12883] Updated weights for policy 0, policy_version 127803 (0.0033) +[2024-06-18 11:55:01,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 2094039040. Throughput: 0: 42991.4. Samples: 2094110540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 11:55:01,996][12645] Avg episode reward: [(0, '0.642')] +[2024-06-18 11:55:02,838][12883] Updated weights for policy 0, policy_version 127813 (0.0049) +[2024-06-18 11:55:06,037][12862] Signal inference workers to stop experience collection... (30650 times) +[2024-06-18 11:55:06,037][12862] Signal inference workers to resume experience collection... (30650 times) +[2024-06-18 11:55:06,104][12883] InferenceWorker_p0-w0: stopping experience collection (30650 times) +[2024-06-18 11:55:06,104][12883] InferenceWorker_p0-w0: resuming experience collection (30650 times) +[2024-06-18 11:55:06,994][12645] Fps is (10 sec: 42625.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2094235648. 
Throughput: 0: 42884.4. Samples: 2094370460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 11:55:06,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 11:55:07,112][12883] Updated weights for policy 0, policy_version 127823 (0.0044) +[2024-06-18 11:55:10,531][12883] Updated weights for policy 0, policy_version 127833 (0.0032) +[2024-06-18 11:55:12,000][12645] Fps is (10 sec: 40943.5, 60 sec: 42595.5, 300 sec: 42708.6). Total num frames: 2094448640. Throughput: 0: 42555.8. Samples: 2094616020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 11:55:12,001][12645] Avg episode reward: [(0, '0.283')] +[2024-06-18 11:55:14,875][12883] Updated weights for policy 0, policy_version 127843 (0.0033) +[2024-06-18 11:55:16,994][12645] Fps is (10 sec: 45874.3, 60 sec: 42871.9, 300 sec: 42765.0). Total num frames: 2094694400. Throughput: 0: 42842.0. Samples: 2094749620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 11:55:16,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 11:55:18,171][12883] Updated weights for policy 0, policy_version 127853 (0.0030) +[2024-06-18 11:55:21,994][12645] Fps is (10 sec: 40985.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2094858240. Throughput: 0: 42739.5. Samples: 2095008080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 11:55:21,994][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 11:55:22,443][12883] Updated weights for policy 0, policy_version 127863 (0.0046) +[2024-06-18 11:55:26,088][12883] Updated weights for policy 0, policy_version 127873 (0.0042) +[2024-06-18 11:55:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2095104000. Throughput: 0: 42349.8. Samples: 2095251100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 11:55:26,994][12645] Avg episode reward: [(0, '0.636')] +[2024-06-18 11:55:30,262][12883] Updated weights for policy 0, policy_version 127883 (0.0051) +[2024-06-18 11:55:31,994][12645] Fps is (10 sec: 49151.4, 60 sec: 43419.1, 300 sec: 42820.5). Total num frames: 2095349760. Throughput: 0: 42769.8. Samples: 2095391820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 11:55:32,000][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 11:55:33,556][12883] Updated weights for policy 0, policy_version 127893 (0.0029) +[2024-06-18 11:55:36,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 2095480832. Throughput: 0: 42714.3. Samples: 2095649980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 11:55:36,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 11:55:37,976][12883] Updated weights for policy 0, policy_version 127903 (0.0045) +[2024-06-18 11:55:40,994][12883] Updated weights for policy 0, policy_version 127913 (0.0032) +[2024-06-18 11:55:42,000][12645] Fps is (10 sec: 40935.0, 60 sec: 43413.1, 300 sec: 42708.6). Total num frames: 2095759360. Throughput: 0: 42629.8. Samples: 2095901860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 11:55:42,000][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 11:55:45,689][12883] Updated weights for policy 0, policy_version 127923 (0.0042) +[2024-06-18 11:55:46,994][12645] Fps is (10 sec: 52428.9, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 2096005120. Throughput: 0: 42965.3. Samples: 2096043880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 11:55:46,994][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 11:55:48,657][12883] Updated weights for policy 0, policy_version 127933 (0.0033) +[2024-06-18 11:55:51,994][12645] Fps is (10 sec: 37706.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2096136192. Throughput: 0: 42673.7. Samples: 2096290780. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 11:55:51,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 11:55:53,361][12883] Updated weights for policy 0, policy_version 127943 (0.0036) +[2024-06-18 11:55:56,207][12883] Updated weights for policy 0, policy_version 127953 (0.0037) +[2024-06-18 11:55:56,994][12645] Fps is (10 sec: 40959.3, 60 sec: 43422.0, 300 sec: 42765.0). Total num frames: 2096414720. Throughput: 0: 42937.0. Samples: 2096547920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 11:55:56,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 11:56:00,852][12883] Updated weights for policy 0, policy_version 127963 (0.0040) +[2024-06-18 11:56:01,994][12645] Fps is (10 sec: 49151.8, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 2096627712. Throughput: 0: 43034.7. Samples: 2096686180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 11:56:01,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 11:56:03,816][12883] Updated weights for policy 0, policy_version 127973 (0.0038) +[2024-06-18 11:56:06,994][12645] Fps is (10 sec: 37683.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2096791552. Throughput: 0: 42824.0. Samples: 2096935160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 11:56:06,994][12645] Avg episode reward: [(0, '0.600')] +[2024-06-18 11:56:07,660][12862] Signal inference workers to stop experience collection... (30700 times) +[2024-06-18 11:56:07,661][12862] Signal inference workers to resume experience collection... 
(30700 times) +[2024-06-18 11:56:07,679][12883] InferenceWorker_p0-w0: stopping experience collection (30700 times) +[2024-06-18 11:56:07,682][12883] InferenceWorker_p0-w0: resuming experience collection (30700 times) +[2024-06-18 11:56:08,396][12883] Updated weights for policy 0, policy_version 127983 (0.0041) +[2024-06-18 11:56:11,494][12883] Updated weights for policy 0, policy_version 127993 (0.0031) +[2024-06-18 11:56:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43422.1, 300 sec: 42765.0). Total num frames: 2097053696. Throughput: 0: 43090.7. Samples: 2097190180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 11:56:11,994][12645] Avg episode reward: [(0, '0.526')] +[2024-06-18 11:56:16,031][12883] Updated weights for policy 0, policy_version 128003 (0.0039) +[2024-06-18 11:56:16,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2097250304. Throughput: 0: 43137.9. Samples: 2097333020. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 11:56:16,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 11:56:19,031][12883] Updated weights for policy 0, policy_version 128013 (0.0038) +[2024-06-18 11:56:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2097446912. Throughput: 0: 42860.5. Samples: 2097578700. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 11:56:21,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 11:56:23,617][12883] Updated weights for policy 0, policy_version 128023 (0.0041) +[2024-06-18 11:56:26,505][12883] Updated weights for policy 0, policy_version 128033 (0.0032) +[2024-06-18 11:56:26,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43417.5, 300 sec: 42876.4). Total num frames: 2097709056. Throughput: 0: 42932.9. Samples: 2097833580. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 11:56:27,000][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 11:56:31,133][12883] Updated weights for policy 0, policy_version 128043 (0.0029) +[2024-06-18 11:56:31,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2097905664. Throughput: 0: 42930.2. Samples: 2097975740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 11:56:31,994][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 11:56:34,171][12883] Updated weights for policy 0, policy_version 128053 (0.0031) +[2024-06-18 11:56:36,994][12645] Fps is (10 sec: 39322.3, 60 sec: 43690.7, 300 sec: 42654.0). Total num frames: 2098102272. Throughput: 0: 43016.9. Samples: 2098226540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 11:56:36,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 11:56:37,025][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000128058_2098102272.pth... +[2024-06-18 11:56:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000127433_2087862272.pth +[2024-06-18 11:56:38,958][12883] Updated weights for policy 0, policy_version 128063 (0.0029) +[2024-06-18 11:56:41,934][12883] Updated weights for policy 0, policy_version 128073 (0.0031) +[2024-06-18 11:56:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43149.0, 300 sec: 42931.6). Total num frames: 2098348032. Throughput: 0: 43036.1. Samples: 2098484540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 11:56:42,000][12645] Avg episode reward: [(0, '0.636')] +[2024-06-18 11:56:46,656][12883] Updated weights for policy 0, policy_version 128083 (0.0031) +[2024-06-18 11:56:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42709.8). Total num frames: 2098511872. Throughput: 0: 42969.9. Samples: 2098619820. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 11:56:46,994][12645] Avg episode reward: [(0, '0.276')] +[2024-06-18 11:56:49,493][12883] Updated weights for policy 0, policy_version 128093 (0.0037) +[2024-06-18 11:56:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 43417.5, 300 sec: 42654.0). Total num frames: 2098741248. Throughput: 0: 42843.5. Samples: 2098863120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 11:56:51,994][12645] Avg episode reward: [(0, '0.463')] +[2024-06-18 11:56:54,536][12883] Updated weights for policy 0, policy_version 128103 (0.0033) +[2024-06-18 11:56:56,994][12645] Fps is (10 sec: 47512.8, 60 sec: 42871.5, 300 sec: 42931.9). Total num frames: 2098987008. Throughput: 0: 43045.2. Samples: 2099127220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 11:56:56,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 11:56:57,020][12883] Updated weights for policy 0, policy_version 128113 (0.0033) +[2024-06-18 11:57:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 2099150848. Throughput: 0: 42759.2. Samples: 2099257180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) +[2024-06-18 11:57:01,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 11:57:02,014][12883] Updated weights for policy 0, policy_version 128123 (0.0038) +[2024-06-18 11:57:04,552][12883] Updated weights for policy 0, policy_version 128133 (0.0044) +[2024-06-18 11:57:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2099396608. Throughput: 0: 42809.7. Samples: 2099505140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 11:57:06,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 11:57:09,777][12883] Updated weights for policy 0, policy_version 128143 (0.0027) +[2024-06-18 11:57:11,323][12862] Signal inference workers to stop experience collection... 
(30750 times) +[2024-06-18 11:57:11,324][12862] Signal inference workers to resume experience collection... (30750 times) +[2024-06-18 11:57:11,343][12883] InferenceWorker_p0-w0: stopping experience collection (30750 times) +[2024-06-18 11:57:11,344][12883] InferenceWorker_p0-w0: resuming experience collection (30750 times) +[2024-06-18 11:57:11,994][12645] Fps is (10 sec: 49152.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2099642368. Throughput: 0: 42796.6. Samples: 2099759420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 11:57:11,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 11:57:12,752][12883] Updated weights for policy 0, policy_version 128153 (0.0038) +[2024-06-18 11:57:16,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2099789824. Throughput: 0: 42615.2. Samples: 2099893420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 11:57:16,994][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 11:57:17,456][12883] Updated weights for policy 0, policy_version 128163 (0.0033) +[2024-06-18 11:57:20,193][12883] Updated weights for policy 0, policy_version 128173 (0.0032) +[2024-06-18 11:57:21,994][12645] Fps is (10 sec: 39321.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2100035584. Throughput: 0: 42707.9. Samples: 2100148400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 11:57:21,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 11:57:25,134][12883] Updated weights for policy 0, policy_version 128183 (0.0040) +[2024-06-18 11:57:26,994][12645] Fps is (10 sec: 47513.0, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 2100264960. Throughput: 0: 42588.4. Samples: 2100401020. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 11:57:26,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 11:57:27,895][12883] Updated weights for policy 0, policy_version 128193 (0.0041) +[2024-06-18 11:57:31,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2100428800. Throughput: 0: 42453.8. Samples: 2100530240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 11:57:31,994][12645] Avg episode reward: [(0, '0.605')] +[2024-06-18 11:57:32,810][12883] Updated weights for policy 0, policy_version 128203 (0.0029) +[2024-06-18 11:57:35,628][12883] Updated weights for policy 0, policy_version 128213 (0.0036) +[2024-06-18 11:57:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2100674560. Throughput: 0: 42704.2. Samples: 2100784800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 11:57:36,994][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 11:57:40,298][12883] Updated weights for policy 0, policy_version 128223 (0.0029) +[2024-06-18 11:57:41,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2100887552. Throughput: 0: 42655.2. Samples: 2101046700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 11:57:41,994][12645] Avg episode reward: [(0, '0.535')] +[2024-06-18 11:57:43,276][12883] Updated weights for policy 0, policy_version 128233 (0.0035) +[2024-06-18 11:57:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2101084160. Throughput: 0: 42572.4. Samples: 2101172940. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 11:57:46,994][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 11:57:47,769][12883] Updated weights for policy 0, policy_version 128243 (0.0040) +[2024-06-18 11:57:51,082][12883] Updated weights for policy 0, policy_version 128253 (0.0028) +[2024-06-18 11:57:51,994][12645] Fps is (10 sec: 42596.7, 60 sec: 42871.2, 300 sec: 42820.5). Total num frames: 2101313536. Throughput: 0: 42703.2. Samples: 2101426800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 11:57:51,995][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 11:57:55,197][12883] Updated weights for policy 0, policy_version 128263 (0.0027) +[2024-06-18 11:57:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 2101526528. Throughput: 0: 42938.2. Samples: 2101691640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 11:57:56,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 11:57:58,851][12883] Updated weights for policy 0, policy_version 128273 (0.0032) +[2024-06-18 11:58:01,994][12645] Fps is (10 sec: 42600.0, 60 sec: 43144.5, 300 sec: 42709.6). Total num frames: 2101739520. Throughput: 0: 42681.7. Samples: 2101814100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) +[2024-06-18 11:58:01,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 11:58:03,064][12883] Updated weights for policy 0, policy_version 128283 (0.0031) +[2024-06-18 11:58:06,402][12883] Updated weights for policy 0, policy_version 128293 (0.0037) +[2024-06-18 11:58:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2101968896. Throughput: 0: 42777.5. Samples: 2102073380. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 11:58:06,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 11:58:10,656][12883] Updated weights for policy 0, policy_version 128303 (0.0031) +[2024-06-18 11:58:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42709.8). Total num frames: 2102165504. Throughput: 0: 42943.1. Samples: 2102333460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 11:58:11,994][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 11:58:14,098][12883] Updated weights for policy 0, policy_version 128313 (0.0038) +[2024-06-18 11:58:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 2102394880. Throughput: 0: 42806.6. Samples: 2102456540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 11:58:16,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 11:58:18,083][12883] Updated weights for policy 0, policy_version 128323 (0.0029) +[2024-06-18 11:58:21,751][12883] Updated weights for policy 0, policy_version 128333 (0.0025) +[2024-06-18 11:58:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2102624256. Throughput: 0: 42982.1. Samples: 2102719000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 11:58:21,994][12645] Avg episode reward: [(0, '0.248')] +[2024-06-18 11:58:25,583][12883] Updated weights for policy 0, policy_version 128343 (0.0038) +[2024-06-18 11:58:26,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 2102804480. Throughput: 0: 42854.7. Samples: 2102975260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 11:58:26,997][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 11:58:29,324][12883] Updated weights for policy 0, policy_version 128353 (0.0033) +[2024-06-18 11:58:31,346][12862] Signal inference workers to stop experience collection... 
(30800 times) +[2024-06-18 11:58:31,352][12862] Signal inference workers to resume experience collection... (30800 times) +[2024-06-18 11:58:31,395][12883] InferenceWorker_p0-w0: stopping experience collection (30800 times) +[2024-06-18 11:58:31,396][12883] InferenceWorker_p0-w0: resuming experience collection (30800 times) +[2024-06-18 11:58:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2103033856. Throughput: 0: 42869.0. Samples: 2103102040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 11:58:31,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 11:58:33,076][12883] Updated weights for policy 0, policy_version 128363 (0.0041) +[2024-06-18 11:58:36,978][12883] Updated weights for policy 0, policy_version 128373 (0.0037) +[2024-06-18 11:58:36,994][12645] Fps is (10 sec: 45885.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2103263232. Throughput: 0: 43063.0. Samples: 2103364620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 11:58:36,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 11:58:37,001][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000128373_2103263232.pth... +[2024-06-18 11:58:37,060][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000127746_2092990464.pth +[2024-06-18 11:58:40,648][12883] Updated weights for policy 0, policy_version 128383 (0.0026) +[2024-06-18 11:58:41,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2103459840. Throughput: 0: 42792.7. Samples: 2103617320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 11:58:41,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 11:58:44,581][12883] Updated weights for policy 0, policy_version 128393 (0.0027) +[2024-06-18 11:58:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2103672832. Throughput: 0: 42755.0. Samples: 2103738080. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 11:58:46,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 11:58:49,160][12883] Updated weights for policy 0, policy_version 128403 (0.0034) +[2024-06-18 11:58:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.8, 300 sec: 42876.1). Total num frames: 2103902208. Throughput: 0: 42759.0. Samples: 2103997540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 11:58:51,996][12645] Avg episode reward: [(0, '0.600')] +[2024-06-18 11:58:52,525][12883] Updated weights for policy 0, policy_version 128413 (0.0024) +[2024-06-18 11:58:56,690][12883] Updated weights for policy 0, policy_version 128423 (0.0032) +[2024-06-18 11:58:56,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2104082432. Throughput: 0: 42587.2. Samples: 2104249880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 11:58:56,994][12645] Avg episode reward: [(0, '0.545')] +[2024-06-18 11:59:00,338][12883] Updated weights for policy 0, policy_version 128433 (0.0027) +[2024-06-18 11:59:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2104311808. Throughput: 0: 42593.3. Samples: 2104373240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 11:59:01,994][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 11:59:04,323][12883] Updated weights for policy 0, policy_version 128443 (0.0041) +[2024-06-18 11:59:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.3). Total num frames: 2104508416. Throughput: 0: 42526.3. Samples: 2104632680. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 11:59:06,994][12645] Avg episode reward: [(0, '0.520')] +[2024-06-18 11:59:07,869][12883] Updated weights for policy 0, policy_version 128453 (0.0028) +[2024-06-18 11:59:11,887][12883] Updated weights for policy 0, policy_version 128463 (0.0028) +[2024-06-18 11:59:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 2104737792. Throughput: 0: 42456.4. Samples: 2104885700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 11:59:11,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 11:59:16,029][12883] Updated weights for policy 0, policy_version 128473 (0.0030) +[2024-06-18 11:59:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2104950784. Throughput: 0: 42561.7. Samples: 2105017320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 11:59:16,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 11:59:19,442][12883] Updated weights for policy 0, policy_version 128483 (0.0044) +[2024-06-18 11:59:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2105147392. Throughput: 0: 42367.2. Samples: 2105271140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 11:59:21,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 11:59:23,652][12883] Updated weights for policy 0, policy_version 128493 (0.0032) +[2024-06-18 11:59:26,991][12883] Updated weights for policy 0, policy_version 128503 (0.0027) +[2024-06-18 11:59:26,996][12645] Fps is (10 sec: 44227.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 2105393152. Throughput: 0: 42490.5. Samples: 2105529480. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 11:59:26,996][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 11:59:31,208][12883] Updated weights for policy 0, policy_version 128513 (0.0034) +[2024-06-18 11:59:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2105589760. Throughput: 0: 42689.5. Samples: 2105659100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 11:59:31,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 11:59:34,980][12883] Updated weights for policy 0, policy_version 128523 (0.0030) +[2024-06-18 11:59:36,994][12645] Fps is (10 sec: 40968.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2105802752. Throughput: 0: 42628.0. Samples: 2105915800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 11:59:36,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 11:59:38,703][12883] Updated weights for policy 0, policy_version 128533 (0.0033) +[2024-06-18 11:59:41,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2106032128. Throughput: 0: 42643.9. Samples: 2106168860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 11:59:41,994][12645] Avg episode reward: [(0, '0.689')] +[2024-06-18 11:59:42,491][12883] Updated weights for policy 0, policy_version 128543 (0.0037) +[2024-06-18 11:59:46,608][12883] Updated weights for policy 0, policy_version 128553 (0.0034) +[2024-06-18 11:59:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2106228736. Throughput: 0: 42828.0. Samples: 2106300500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 11:59:46,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 11:59:49,869][12883] Updated weights for policy 0, policy_version 128563 (0.0032) +[2024-06-18 11:59:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42877.0). Total num frames: 2106458112. Throughput: 0: 42852.4. 
Samples: 2106561040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 11:59:51,994][12645] Avg episode reward: [(0, '0.812')] +[2024-06-18 11:59:54,143][12883] Updated weights for policy 0, policy_version 128573 (0.0033) +[2024-06-18 11:59:56,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 2106671104. Throughput: 0: 42940.1. Samples: 2106818000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 11:59:56,994][12645] Avg episode reward: [(0, '0.694')] +[2024-06-18 11:59:58,077][12883] Updated weights for policy 0, policy_version 128583 (0.0035) +[2024-06-18 12:00:01,753][12883] Updated weights for policy 0, policy_version 128593 (0.0045) +[2024-06-18 12:00:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2106884096. Throughput: 0: 42940.5. Samples: 2106949640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 12:00:01,994][12645] Avg episode reward: [(0, '0.626')] +[2024-06-18 12:00:03,656][12862] Signal inference workers to stop experience collection... (30850 times) +[2024-06-18 12:00:03,657][12862] Signal inference workers to resume experience collection... (30850 times) +[2024-06-18 12:00:03,702][12883] InferenceWorker_p0-w0: stopping experience collection (30850 times) +[2024-06-18 12:00:03,702][12883] InferenceWorker_p0-w0: resuming experience collection (30850 times) +[2024-06-18 12:00:05,578][12883] Updated weights for policy 0, policy_version 128603 (0.0037) +[2024-06-18 12:00:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 2107097088. Throughput: 0: 43018.2. Samples: 2107206960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 12:00:06,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 12:00:09,328][12883] Updated weights for policy 0, policy_version 128613 (0.0041) +[2024-06-18 12:00:11,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42869.9, 300 sec: 42764.7). 
Total num frames: 2107310080. Throughput: 0: 42819.9. Samples: 2107456380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 12:00:11,996][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 12:00:13,233][12883] Updated weights for policy 0, policy_version 128623 (0.0043) +[2024-06-18 12:00:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2107506688. Throughput: 0: 42806.7. Samples: 2107585400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 12:00:16,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 12:00:17,041][12883] Updated weights for policy 0, policy_version 128633 (0.0027) +[2024-06-18 12:00:21,022][12883] Updated weights for policy 0, policy_version 128643 (0.0030) +[2024-06-18 12:00:21,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2107719680. Throughput: 0: 42795.6. Samples: 2107841600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 12:00:21,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 12:00:24,666][12883] Updated weights for policy 0, policy_version 128653 (0.0046) +[2024-06-18 12:00:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 2107965440. Throughput: 0: 42720.1. Samples: 2108091260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 12:00:26,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 12:00:28,909][12883] Updated weights for policy 0, policy_version 128663 (0.0033) +[2024-06-18 12:00:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2108145664. Throughput: 0: 42628.1. Samples: 2108218760. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 12:00:31,994][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 12:00:32,579][12883] Updated weights for policy 0, policy_version 128673 (0.0035) +[2024-06-18 12:00:36,475][12883] Updated weights for policy 0, policy_version 128683 (0.0030) +[2024-06-18 12:00:36,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 2108358656. Throughput: 0: 42663.0. Samples: 2108480880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 12:00:36,994][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 12:00:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000128684_2108358656.pth... +[2024-06-18 12:00:37,092][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000128058_2098102272.pth +[2024-06-18 12:00:40,243][12883] Updated weights for policy 0, policy_version 128693 (0.0030) +[2024-06-18 12:00:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2108588032. Throughput: 0: 42478.1. Samples: 2108729520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 12:00:41,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 12:00:44,048][12883] Updated weights for policy 0, policy_version 128703 (0.0041) +[2024-06-18 12:00:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2108784640. Throughput: 0: 42671.5. Samples: 2108869860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 12:00:46,994][12645] Avg episode reward: [(0, '0.718')] +[2024-06-18 12:00:47,747][12883] Updated weights for policy 0, policy_version 128713 (0.0026) +[2024-06-18 12:00:51,603][12883] Updated weights for policy 0, policy_version 128723 (0.0033) +[2024-06-18 12:00:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2108997632. Throughput: 0: 42569.0. Samples: 2109122560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 12:00:51,994][12645] Avg episode reward: [(0, '0.732')] +[2024-06-18 12:00:55,327][12883] Updated weights for policy 0, policy_version 128733 (0.0040) +[2024-06-18 12:00:56,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2109243392. Throughput: 0: 42575.9. Samples: 2109372200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 12:00:56,994][12645] Avg episode reward: [(0, '0.451')] +[2024-06-18 12:00:59,088][12883] Updated weights for policy 0, policy_version 128743 (0.0032) +[2024-06-18 12:01:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2109423616. Throughput: 0: 42749.3. Samples: 2109509120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 12:01:01,994][12645] Avg episode reward: [(0, '0.337')] +[2024-06-18 12:01:02,901][12883] Updated weights for policy 0, policy_version 128753 (0.0034) +[2024-06-18 12:01:06,600][12883] Updated weights for policy 0, policy_version 128763 (0.0041) +[2024-06-18 12:01:06,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 2109652992. Throughput: 0: 42769.9. Samples: 2109766340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 12:01:07,005][12645] Avg episode reward: [(0, '0.551')] +[2024-06-18 12:01:10,530][12883] Updated weights for policy 0, policy_version 128773 (0.0049) +[2024-06-18 12:01:11,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 2109882368. Throughput: 0: 42834.3. Samples: 2110018800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) +[2024-06-18 12:01:11,994][12645] Avg episode reward: [(0, '0.729')] +[2024-06-18 12:01:14,473][12883] Updated weights for policy 0, policy_version 128783 (0.0041) +[2024-06-18 12:01:16,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2110062592. Throughput: 0: 42972.4. 
Samples: 2110152520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:01:16,994][12645] Avg episode reward: [(0, '0.552')] +[2024-06-18 12:01:18,095][12883] Updated weights for policy 0, policy_version 128793 (0.0038) +[2024-06-18 12:01:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2110291968. Throughput: 0: 42844.5. Samples: 2110408880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:01:21,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 12:01:22,007][12883] Updated weights for policy 0, policy_version 128803 (0.0031) +[2024-06-18 12:01:26,082][12883] Updated weights for policy 0, policy_version 128813 (0.0032) +[2024-06-18 12:01:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2110521344. Throughput: 0: 42884.8. Samples: 2110659340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:01:26,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 12:01:29,554][12883] Updated weights for policy 0, policy_version 128823 (0.0026) +[2024-06-18 12:01:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2110701568. Throughput: 0: 42614.6. Samples: 2110787520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:01:31,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 12:01:33,592][12883] Updated weights for policy 0, policy_version 128833 (0.0040) +[2024-06-18 12:01:36,982][12883] Updated weights for policy 0, policy_version 128843 (0.0051) +[2024-06-18 12:01:36,996][12645] Fps is (10 sec: 44227.0, 60 sec: 43416.0, 300 sec: 42764.7). Total num frames: 2110963712. Throughput: 0: 42862.1. Samples: 2111051460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:01:36,997][12645] Avg episode reward: [(0, '0.572')] +[2024-06-18 12:01:41,415][12883] Updated weights for policy 0, policy_version 128853 (0.0049) +[2024-06-18 12:01:41,435][12862] Signal inference workers to stop experience collection... (30900 times) +[2024-06-18 12:01:41,436][12862] Signal inference workers to resume experience collection... (30900 times) +[2024-06-18 12:01:41,454][12883] InferenceWorker_p0-w0: stopping experience collection (30900 times) +[2024-06-18 12:01:41,454][12883] InferenceWorker_p0-w0: resuming experience collection (30900 times) +[2024-06-18 12:01:42,000][12645] Fps is (10 sec: 47484.5, 60 sec: 43140.1, 300 sec: 42930.7). Total num frames: 2111176704. Throughput: 0: 42982.9. Samples: 2111306700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:01:42,000][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 12:01:44,538][12883] Updated weights for policy 0, policy_version 128863 (0.0034) +[2024-06-18 12:01:46,994][12645] Fps is (10 sec: 37691.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2111340544. Throughput: 0: 42647.0. Samples: 2111428240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:01:46,994][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 12:01:49,066][12883] Updated weights for policy 0, policy_version 128873 (0.0036) +[2024-06-18 12:01:51,994][12645] Fps is (10 sec: 40985.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2111586304. Throughput: 0: 42668.4. Samples: 2111686320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:01:52,003][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 12:01:52,447][12883] Updated weights for policy 0, policy_version 128883 (0.0028) +[2024-06-18 12:01:56,703][12883] Updated weights for policy 0, policy_version 128893 (0.0027) +[2024-06-18 12:01:56,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2111799296. 
Throughput: 0: 42987.5. Samples: 2111953240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:01:56,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 12:02:00,161][12883] Updated weights for policy 0, policy_version 128903 (0.0031) +[2024-06-18 12:02:01,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2111979520. Throughput: 0: 42708.9. Samples: 2112074420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:02:01,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 12:02:04,634][12883] Updated weights for policy 0, policy_version 128913 (0.0044) +[2024-06-18 12:02:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 2112241664. Throughput: 0: 42692.5. Samples: 2112330040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:02:06,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 12:02:07,615][12883] Updated weights for policy 0, policy_version 128923 (0.0031) +[2024-06-18 12:02:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 2112405504. Throughput: 0: 43012.6. Samples: 2112594900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:02:11,994][12645] Avg episode reward: [(0, '0.645')] +[2024-06-18 12:02:12,478][12883] Updated weights for policy 0, policy_version 128933 (0.0030) +[2024-06-18 12:02:15,154][12883] Updated weights for policy 0, policy_version 128943 (0.0039) +[2024-06-18 12:02:16,994][12645] Fps is (10 sec: 39320.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2112634880. Throughput: 0: 42695.5. Samples: 2112708820. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 12:02:16,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 12:02:20,362][12883] Updated weights for policy 0, policy_version 128953 (0.0034) +[2024-06-18 12:02:21,994][12645] Fps is (10 sec: 47512.9, 60 sec: 43144.5, 300 sec: 42765.0). 
Total num frames: 2112880640. Throughput: 0: 42620.3. Samples: 2112969280. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 12:02:21,994][12645] Avg episode reward: [(0, '0.494')] +[2024-06-18 12:02:22,913][12883] Updated weights for policy 0, policy_version 128963 (0.0030) +[2024-06-18 12:02:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2113044480. Throughput: 0: 42714.3. Samples: 2113228580. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 12:02:26,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 12:02:27,944][12883] Updated weights for policy 0, policy_version 128973 (0.0046) +[2024-06-18 12:02:30,445][12883] Updated weights for policy 0, policy_version 128983 (0.0040) +[2024-06-18 12:02:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 2113273856. Throughput: 0: 42563.5. Samples: 2113343600. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 12:02:31,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 12:02:35,583][12883] Updated weights for policy 0, policy_version 128993 (0.0030) +[2024-06-18 12:02:36,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 2113503232. Throughput: 0: 42695.6. Samples: 2113607620. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 12:02:36,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 12:02:37,160][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129000_2113536000.pth... +[2024-06-18 12:02:37,220][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000128373_2103263232.pth +[2024-06-18 12:02:38,500][12883] Updated weights for policy 0, policy_version 129003 (0.0021) +[2024-06-18 12:02:41,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41510.4, 300 sec: 42653.9). Total num frames: 2113667072. Throughput: 0: 42437.6. Samples: 2113862940. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 12:02:41,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 12:02:43,260][12883] Updated weights for policy 0, policy_version 129013 (0.0036) +[2024-06-18 12:02:46,013][12883] Updated weights for policy 0, policy_version 129023 (0.0034) +[2024-06-18 12:02:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42765.1). Total num frames: 2113929216. Throughput: 0: 42368.5. Samples: 2113981000. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 12:02:46,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 12:02:50,879][12883] Updated weights for policy 0, policy_version 129033 (0.0040) +[2024-06-18 12:02:51,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2114125824. Throughput: 0: 42515.0. Samples: 2114243220. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 12:02:51,996][12645] Avg episode reward: [(0, '0.609')] +[2024-06-18 12:02:53,931][12883] Updated weights for policy 0, policy_version 129043 (0.0036) +[2024-06-18 12:02:56,994][12645] Fps is (10 sec: 37682.1, 60 sec: 41778.9, 300 sec: 42598.4). Total num frames: 2114306048. Throughput: 0: 42265.0. Samples: 2114496840. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 12:02:56,995][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 12:02:58,745][12883] Updated weights for policy 0, policy_version 129053 (0.0049) +[2024-06-18 12:03:01,675][12883] Updated weights for policy 0, policy_version 129063 (0.0027) +[2024-06-18 12:03:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2114568192. Throughput: 0: 42432.7. Samples: 2114618280. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 12:03:01,994][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 12:03:06,430][12883] Updated weights for policy 0, policy_version 129073 (0.0037) +[2024-06-18 12:03:06,998][12645] Fps is (10 sec: 45858.3, 60 sec: 42049.4, 300 sec: 42708.9). Total num frames: 2114764800. Throughput: 0: 42394.1. Samples: 2114877180. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 12:03:06,998][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 12:03:08,063][12862] Signal inference workers to stop experience collection... (30950 times) +[2024-06-18 12:03:08,063][12862] Signal inference workers to resume experience collection... (30950 times) +[2024-06-18 12:03:08,081][12883] InferenceWorker_p0-w0: stopping experience collection (30950 times) +[2024-06-18 12:03:08,081][12883] InferenceWorker_p0-w0: resuming experience collection (30950 times) +[2024-06-18 12:03:09,363][12883] Updated weights for policy 0, policy_version 129083 (0.0039) +[2024-06-18 12:03:11,994][12645] Fps is (10 sec: 37682.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2114945024. Throughput: 0: 42125.8. Samples: 2115124240. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 12:03:11,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 12:03:14,226][12883] Updated weights for policy 0, policy_version 129093 (0.0038) +[2024-06-18 12:03:16,993][12645] Fps is (10 sec: 44255.2, 60 sec: 42871.7, 300 sec: 42654.0). Total num frames: 2115207168. Throughput: 0: 42303.4. Samples: 2115247240. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) +[2024-06-18 12:03:16,994][12645] Avg episode reward: [(0, '0.681')] +[2024-06-18 12:03:17,106][12883] Updated weights for policy 0, policy_version 129103 (0.0035) +[2024-06-18 12:03:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 42598.7). Total num frames: 2115371008. Throughput: 0: 42137.3. Samples: 2115503800. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) +[2024-06-18 12:03:21,994][12645] Avg episode reward: [(0, '0.747')] +[2024-06-18 12:03:22,095][12883] Updated weights for policy 0, policy_version 129113 (0.0037) +[2024-06-18 12:03:25,284][12883] Updated weights for policy 0, policy_version 129123 (0.0038) +[2024-06-18 12:03:26,995][12645] Fps is (10 sec: 37676.9, 60 sec: 42324.3, 300 sec: 42542.6). Total num frames: 2115584000. Throughput: 0: 41992.4. Samples: 2115752660. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) +[2024-06-18 12:03:26,996][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 12:03:29,814][12883] Updated weights for policy 0, policy_version 129133 (0.0047) +[2024-06-18 12:03:31,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2115813376. Throughput: 0: 42321.7. Samples: 2115885480. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) +[2024-06-18 12:03:31,994][12645] Avg episode reward: [(0, '0.596')] +[2024-06-18 12:03:33,042][12883] Updated weights for policy 0, policy_version 129143 (0.0030) +[2024-06-18 12:03:36,995][12645] Fps is (10 sec: 42598.7, 60 sec: 41778.1, 300 sec: 42542.7). Total num frames: 2116009984. Throughput: 0: 42070.7. Samples: 2116136460. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) +[2024-06-18 12:03:36,996][12645] Avg episode reward: [(0, '0.590')] +[2024-06-18 12:03:37,396][12883] Updated weights for policy 0, policy_version 129153 (0.0042) +[2024-06-18 12:03:40,725][12883] Updated weights for policy 0, policy_version 129163 (0.0036) +[2024-06-18 12:03:41,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2116239360. Throughput: 0: 42009.3. Samples: 2116387240. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) +[2024-06-18 12:03:41,994][12645] Avg episode reward: [(0, '0.516')] +[2024-06-18 12:03:45,312][12883] Updated weights for policy 0, policy_version 129173 (0.0029) +[2024-06-18 12:03:46,994][12645] Fps is (10 sec: 44243.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2116452352. Throughput: 0: 42224.8. Samples: 2116518400. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) +[2024-06-18 12:03:46,994][12645] Avg episode reward: [(0, '0.599')] +[2024-06-18 12:03:48,650][12883] Updated weights for policy 0, policy_version 129183 (0.0024) +[2024-06-18 12:03:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2116648960. Throughput: 0: 42090.9. Samples: 2116771100. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) +[2024-06-18 12:03:51,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 12:03:52,875][12883] Updated weights for policy 0, policy_version 129193 (0.0039) +[2024-06-18 12:03:56,426][12883] Updated weights for policy 0, policy_version 129203 (0.0031) +[2024-06-18 12:03:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.8, 300 sec: 42598.4). Total num frames: 2116878336. Throughput: 0: 42221.1. Samples: 2117024180. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) +[2024-06-18 12:03:56,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 12:04:00,634][12883] Updated weights for policy 0, policy_version 129213 (0.0037) +[2024-06-18 12:04:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2117074944. Throughput: 0: 42311.0. Samples: 2117151240. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) +[2024-06-18 12:04:01,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 12:04:04,342][12883] Updated weights for policy 0, policy_version 129223 (0.0028) +[2024-06-18 12:04:06,994][12645] Fps is (10 sec: 39320.8, 60 sec: 41782.0, 300 sec: 42487.3). Total num frames: 2117271552. Throughput: 0: 42245.7. 
Samples: 2117404860. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) +[2024-06-18 12:04:06,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 12:04:08,386][12883] Updated weights for policy 0, policy_version 129233 (0.0029) +[2024-06-18 12:04:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2117500928. Throughput: 0: 42525.9. Samples: 2117666260. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) +[2024-06-18 12:04:11,994][12645] Avg episode reward: [(0, '0.453')] +[2024-06-18 12:04:12,219][12883] Updated weights for policy 0, policy_version 129243 (0.0044) +[2024-06-18 12:04:16,023][12883] Updated weights for policy 0, policy_version 129253 (0.0041) +[2024-06-18 12:04:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 2117730304. Throughput: 0: 42403.6. Samples: 2117793640. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) +[2024-06-18 12:04:17,007][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 12:04:19,817][12883] Updated weights for policy 0, policy_version 129263 (0.0049) +[2024-06-18 12:04:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 2117926912. Throughput: 0: 42461.4. Samples: 2118047160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 12:04:21,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 12:04:23,559][12883] Updated weights for policy 0, policy_version 129273 (0.0025) +[2024-06-18 12:04:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42599.5, 300 sec: 42542.8). Total num frames: 2118139904. Throughput: 0: 42679.5. Samples: 2118307820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 12:04:26,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 12:04:27,342][12883] Updated weights for policy 0, policy_version 129283 (0.0031) +[2024-06-18 12:04:31,151][12862] Signal inference workers to stop experience collection... 
(31000 times) +[2024-06-18 12:04:31,151][12862] Signal inference workers to resume experience collection... (31000 times) +[2024-06-18 12:04:31,153][12883] Updated weights for policy 0, policy_version 129293 (0.0021) +[2024-06-18 12:04:31,167][12883] InferenceWorker_p0-w0: stopping experience collection (31000 times) +[2024-06-18 12:04:31,182][12883] InferenceWorker_p0-w0: resuming experience collection (31000 times) +[2024-06-18 12:04:31,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42596.9, 300 sec: 42598.1). Total num frames: 2118369280. Throughput: 0: 42663.7. Samples: 2118438360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 12:04:31,997][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 12:04:34,969][12883] Updated weights for policy 0, policy_version 129303 (0.0050) +[2024-06-18 12:04:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42872.4, 300 sec: 42542.9). Total num frames: 2118582272. Throughput: 0: 42663.0. Samples: 2118690940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 12:04:36,994][12645] Avg episode reward: [(0, '0.551')] +[2024-06-18 12:04:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129308_2118582272.pth... +[2024-06-18 12:04:37,072][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000128684_2108358656.pth +[2024-06-18 12:04:38,879][12883] Updated weights for policy 0, policy_version 129313 (0.0033) +[2024-06-18 12:04:41,994][12645] Fps is (10 sec: 42607.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2118795264. Throughput: 0: 42716.2. Samples: 2118946420. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 12:04:41,994][12645] Avg episode reward: [(0, '0.683')] +[2024-06-18 12:04:42,743][12883] Updated weights for policy 0, policy_version 129323 (0.0035) +[2024-06-18 12:04:46,575][12883] Updated weights for policy 0, policy_version 129333 (0.0037) +[2024-06-18 12:04:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2118991872. Throughput: 0: 42763.8. Samples: 2119075620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 12:04:46,994][12645] Avg episode reward: [(0, '0.643')] +[2024-06-18 12:04:50,489][12883] Updated weights for policy 0, policy_version 129343 (0.0033) +[2024-06-18 12:04:51,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2119221248. Throughput: 0: 42893.9. Samples: 2119335080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 12:04:51,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 12:04:54,410][12883] Updated weights for policy 0, policy_version 129353 (0.0037) +[2024-06-18 12:04:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2119434240. Throughput: 0: 42779.9. Samples: 2119591360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 12:04:56,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 12:04:57,990][12883] Updated weights for policy 0, policy_version 129363 (0.0034) +[2024-06-18 12:05:01,812][12883] Updated weights for policy 0, policy_version 129373 (0.0033) +[2024-06-18 12:05:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2119647232. Throughput: 0: 42845.1. Samples: 2119721660. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 12:05:01,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 12:05:05,946][12883] Updated weights for policy 0, policy_version 129383 (0.0046) +[2024-06-18 12:05:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42487.6). Total num frames: 2119843840. Throughput: 0: 42951.8. Samples: 2119980000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 12:05:06,994][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 12:05:09,382][12883] Updated weights for policy 0, policy_version 129393 (0.0027) +[2024-06-18 12:05:11,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2120073216. Throughput: 0: 42712.4. Samples: 2120229880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 12:05:11,994][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 12:05:13,666][12883] Updated weights for policy 0, policy_version 129403 (0.0033) +[2024-06-18 12:05:16,935][12883] Updated weights for policy 0, policy_version 129413 (0.0029) +[2024-06-18 12:05:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2120302592. Throughput: 0: 42769.2. Samples: 2120362880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) +[2024-06-18 12:05:16,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 12:05:21,321][12883] Updated weights for policy 0, policy_version 129423 (0.0044) +[2024-06-18 12:05:21,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2120499200. Throughput: 0: 42855.7. Samples: 2120619440. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 12:05:21,994][12645] Avg episode reward: [(0, '0.555')] +[2024-06-18 12:05:24,497][12883] Updated weights for policy 0, policy_version 129433 (0.0030) +[2024-06-18 12:05:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2120712192. Throughput: 0: 42793.5. Samples: 2120872120. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 12:05:26,994][12645] Avg episode reward: [(0, '0.613')] +[2024-06-18 12:05:28,847][12883] Updated weights for policy 0, policy_version 129443 (0.0024) +[2024-06-18 12:05:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42599.9, 300 sec: 42598.4). Total num frames: 2120925184. Throughput: 0: 42687.2. Samples: 2120996540. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 12:05:31,994][12645] Avg episode reward: [(0, '0.218')] +[2024-06-18 12:05:32,314][12883] Updated weights for policy 0, policy_version 129453 (0.0034) +[2024-06-18 12:05:36,391][12883] Updated weights for policy 0, policy_version 129463 (0.0034) +[2024-06-18 12:05:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2121154560. Throughput: 0: 42714.2. Samples: 2121257220. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 12:05:36,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 12:05:39,840][12862] Signal inference workers to stop experience collection... (31050 times) +[2024-06-18 12:05:39,880][12883] InferenceWorker_p0-w0: stopping experience collection (31050 times) +[2024-06-18 12:05:39,897][12862] Signal inference workers to resume experience collection... (31050 times) +[2024-06-18 12:05:39,908][12883] InferenceWorker_p0-w0: resuming experience collection (31050 times) +[2024-06-18 12:05:39,911][12883] Updated weights for policy 0, policy_version 129473 (0.0031) +[2024-06-18 12:05:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2121334784. Throughput: 0: 42704.0. Samples: 2121513040. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 12:05:41,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 12:05:44,068][12883] Updated weights for policy 0, policy_version 129483 (0.0036) +[2024-06-18 12:05:47,000][12645] Fps is (10 sec: 42571.6, 60 sec: 43140.1, 300 sec: 42653.0). Total num frames: 2121580544. 
Throughput: 0: 42626.5. Samples: 2121640120. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 12:05:47,000][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 12:05:47,526][12883] Updated weights for policy 0, policy_version 129493 (0.0033) +[2024-06-18 12:05:51,766][12883] Updated weights for policy 0, policy_version 129503 (0.0040) +[2024-06-18 12:05:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2121777152. Throughput: 0: 42613.0. Samples: 2121897580. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 12:05:51,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 12:05:55,247][12883] Updated weights for policy 0, policy_version 129513 (0.0028) +[2024-06-18 12:05:56,994][12645] Fps is (10 sec: 40985.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2121990144. Throughput: 0: 42612.9. Samples: 2122147460. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 12:05:56,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 12:05:59,545][12883] Updated weights for policy 0, policy_version 129523 (0.0036) +[2024-06-18 12:06:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 2122186752. Throughput: 0: 42489.4. Samples: 2122274900. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 12:06:01,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 12:06:03,011][12883] Updated weights for policy 0, policy_version 129533 (0.0031) +[2024-06-18 12:06:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 2122399744. Throughput: 0: 42394.6. Samples: 2122527200. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 12:06:06,994][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 12:06:07,393][12883] Updated weights for policy 0, policy_version 129543 (0.0037) +[2024-06-18 12:06:10,821][12883] Updated weights for policy 0, policy_version 129553 (0.0043) +[2024-06-18 12:06:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2122629120. Throughput: 0: 42288.0. Samples: 2122775080. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 12:06:11,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 12:06:15,116][12883] Updated weights for policy 0, policy_version 129563 (0.0040) +[2024-06-18 12:06:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 2122825728. Throughput: 0: 42489.9. Samples: 2122908580. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 12:06:16,994][12645] Avg episode reward: [(0, '0.640')] +[2024-06-18 12:06:18,578][12883] Updated weights for policy 0, policy_version 129573 (0.0048) +[2024-06-18 12:06:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2123055104. Throughput: 0: 42410.2. Samples: 2123165680. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) +[2024-06-18 12:06:21,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 12:06:22,683][12883] Updated weights for policy 0, policy_version 129583 (0.0026) +[2024-06-18 12:06:26,282][12883] Updated weights for policy 0, policy_version 129593 (0.0042) +[2024-06-18 12:06:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2123268096. Throughput: 0: 42332.1. Samples: 2123417980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:06:26,994][12645] Avg episode reward: [(0, '0.518')] +[2024-06-18 12:06:30,263][12883] Updated weights for policy 0, policy_version 129603 (0.0027) +[2024-06-18 12:06:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42376.6). Total num frames: 2123464704. Throughput: 0: 42489.4. Samples: 2123551880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:06:31,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 12:06:33,908][12883] Updated weights for policy 0, policy_version 129613 (0.0028) +[2024-06-18 12:06:36,994][12645] Fps is (10 sec: 42597.1, 60 sec: 42325.2, 300 sec: 42432.7). Total num frames: 2123694080. Throughput: 0: 42389.2. Samples: 2123805100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:06:36,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 12:06:37,029][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129620_2123694080.pth... +[2024-06-18 12:06:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129000_2113536000.pth +[2024-06-18 12:06:37,982][12883] Updated weights for policy 0, policy_version 129623 (0.0022) +[2024-06-18 12:06:41,594][12883] Updated weights for policy 0, policy_version 129633 (0.0041) +[2024-06-18 12:06:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2123907072. Throughput: 0: 42384.5. Samples: 2124054760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:06:42,000][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 12:06:45,771][12883] Updated weights for policy 0, policy_version 129643 (0.0048) +[2024-06-18 12:06:46,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42056.6, 300 sec: 42431.8). Total num frames: 2124103680. Throughput: 0: 42459.5. Samples: 2124185580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:06:46,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 12:06:49,117][12883] Updated weights for policy 0, policy_version 129653 (0.0044) +[2024-06-18 12:06:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2124333056. Throughput: 0: 42594.8. Samples: 2124443960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:06:51,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 12:06:53,203][12883] Updated weights for policy 0, policy_version 129663 (0.0032) +[2024-06-18 12:06:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2124546048. Throughput: 0: 42830.5. Samples: 2124702460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:06:56,994][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 12:06:57,267][12883] Updated weights for policy 0, policy_version 129673 (0.0038) +[2024-06-18 12:07:01,203][12883] Updated weights for policy 0, policy_version 129683 (0.0029) +[2024-06-18 12:07:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2124759040. Throughput: 0: 42673.3. Samples: 2124828880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:07:01,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 12:07:03,494][12862] Signal inference workers to stop experience collection... (31100 times) +[2024-06-18 12:07:03,495][12862] Signal inference workers to resume experience collection... (31100 times) +[2024-06-18 12:07:03,536][12883] InferenceWorker_p0-w0: stopping experience collection (31100 times) +[2024-06-18 12:07:03,536][12883] InferenceWorker_p0-w0: resuming experience collection (31100 times) +[2024-06-18 12:07:04,847][12883] Updated weights for policy 0, policy_version 129693 (0.0039) +[2024-06-18 12:07:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2124988416. 
Throughput: 0: 42685.1. Samples: 2125086520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:07:07,003][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 12:07:08,802][12883] Updated weights for policy 0, policy_version 129703 (0.0037) +[2024-06-18 12:07:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2125185024. Throughput: 0: 42781.2. Samples: 2125343140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:07:11,994][12645] Avg episode reward: [(0, '0.473')] +[2024-06-18 12:07:12,423][12883] Updated weights for policy 0, policy_version 129713 (0.0030) +[2024-06-18 12:07:16,442][12883] Updated weights for policy 0, policy_version 129723 (0.0040) +[2024-06-18 12:07:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 2125398016. Throughput: 0: 42601.3. Samples: 2125468940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:07:17,003][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 12:07:20,047][12883] Updated weights for policy 0, policy_version 129733 (0.0040) +[2024-06-18 12:07:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2125627392. Throughput: 0: 42679.3. Samples: 2125725660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:07:22,008][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 12:07:23,930][12883] Updated weights for policy 0, policy_version 129743 (0.0034) +[2024-06-18 12:07:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2125840384. Throughput: 0: 42998.2. Samples: 2125989680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 12:07:26,994][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 12:07:27,547][12883] Updated weights for policy 0, policy_version 129753 (0.0039) +[2024-06-18 12:07:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42431.8). 
Total num frames: 2126020608. Throughput: 0: 42844.9. Samples: 2126113600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 12:07:31,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 12:07:32,078][12883] Updated weights for policy 0, policy_version 129763 (0.0030) +[2024-06-18 12:07:35,432][12883] Updated weights for policy 0, policy_version 129773 (0.0024) +[2024-06-18 12:07:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2126249984. Throughput: 0: 42841.1. Samples: 2126371820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 12:07:36,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 12:07:39,763][12883] Updated weights for policy 0, policy_version 129783 (0.0043) +[2024-06-18 12:07:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2126462976. Throughput: 0: 42854.0. Samples: 2126630880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 12:07:41,994][12645] Avg episode reward: [(0, '0.528')] +[2024-06-18 12:07:43,031][12883] Updated weights for policy 0, policy_version 129793 (0.0034) +[2024-06-18 12:07:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2126675968. Throughput: 0: 42967.9. Samples: 2126762440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 12:07:46,994][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 12:07:47,346][12883] Updated weights for policy 0, policy_version 129803 (0.0043) +[2024-06-18 12:07:50,654][12883] Updated weights for policy 0, policy_version 129813 (0.0035) +[2024-06-18 12:07:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2126905344. Throughput: 0: 42952.1. Samples: 2127019360. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 12:07:51,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 12:07:54,924][12883] Updated weights for policy 0, policy_version 129823 (0.0030) +[2024-06-18 12:07:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 2127101952. Throughput: 0: 42878.8. Samples: 2127272680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 12:07:56,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 12:07:58,294][12883] Updated weights for policy 0, policy_version 129833 (0.0035) +[2024-06-18 12:08:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42543.4). Total num frames: 2127314944. Throughput: 0: 42844.9. Samples: 2127396960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 12:08:01,994][12645] Avg episode reward: [(0, '0.806')] +[2024-06-18 12:08:02,608][12883] Updated weights for policy 0, policy_version 129843 (0.0046) +[2024-06-18 12:08:06,049][12883] Updated weights for policy 0, policy_version 129853 (0.0042) +[2024-06-18 12:08:06,994][12645] Fps is (10 sec: 45874.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2127560704. Throughput: 0: 42966.1. Samples: 2127659140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 12:08:06,994][12645] Avg episode reward: [(0, '0.817')] +[2024-06-18 12:08:10,240][12883] Updated weights for policy 0, policy_version 129863 (0.0042) +[2024-06-18 12:08:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2127757312. Throughput: 0: 42669.3. Samples: 2127909800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 12:08:11,994][12645] Avg episode reward: [(0, '0.673')] +[2024-06-18 12:08:13,736][12883] Updated weights for policy 0, policy_version 129873 (0.0039) +[2024-06-18 12:08:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2127970304. Throughput: 0: 42791.1. Samples: 2128039200. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 12:08:16,994][12645] Avg episode reward: [(0, '0.619')] +[2024-06-18 12:08:17,697][12883] Updated weights for policy 0, policy_version 129883 (0.0031) +[2024-06-18 12:08:21,314][12883] Updated weights for policy 0, policy_version 129893 (0.0045) +[2024-06-18 12:08:21,995][12645] Fps is (10 sec: 44232.6, 60 sec: 42870.8, 300 sec: 42765.1). Total num frames: 2128199680. Throughput: 0: 42820.6. Samples: 2128298780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 12:08:21,995][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 12:08:25,155][12883] Updated weights for policy 0, policy_version 129903 (0.0032) +[2024-06-18 12:08:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2128396288. Throughput: 0: 42647.9. Samples: 2128550040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 12:08:26,994][12645] Avg episode reward: [(0, '0.639')] +[2024-06-18 12:08:29,164][12883] Updated weights for policy 0, policy_version 129913 (0.0033) +[2024-06-18 12:08:31,994][12645] Fps is (10 sec: 39325.3, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 2128592896. Throughput: 0: 42627.6. Samples: 2128680680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:08:31,994][12645] Avg episode reward: [(0, '0.639')] +[2024-06-18 12:08:32,853][12883] Updated weights for policy 0, policy_version 129923 (0.0038) +[2024-06-18 12:08:36,725][12883] Updated weights for policy 0, policy_version 129933 (0.0027) +[2024-06-18 12:08:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2128822272. Throughput: 0: 42705.4. Samples: 2128941100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:08:36,994][12645] Avg episode reward: [(0, '0.657')] +[2024-06-18 12:08:37,228][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129935_2128855040.pth... 
+[2024-06-18 12:08:37,284][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129308_2118582272.pth +[2024-06-18 12:08:40,829][12883] Updated weights for policy 0, policy_version 129943 (0.0035) +[2024-06-18 12:08:41,968][12862] Signal inference workers to stop experience collection... (31150 times) +[2024-06-18 12:08:41,968][12862] Signal inference workers to resume experience collection... (31150 times) +[2024-06-18 12:08:41,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2129051648. Throughput: 0: 42698.2. Samples: 2129194100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:08:41,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 12:08:42,012][12883] InferenceWorker_p0-w0: stopping experience collection (31150 times) +[2024-06-18 12:08:42,012][12883] InferenceWorker_p0-w0: resuming experience collection (31150 times) +[2024-06-18 12:08:44,371][12883] Updated weights for policy 0, policy_version 129953 (0.0030) +[2024-06-18 12:08:46,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 2129248256. Throughput: 0: 42820.3. Samples: 2129323880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:08:46,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 12:08:48,524][12883] Updated weights for policy 0, policy_version 129963 (0.0040) +[2024-06-18 12:08:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2129461248. Throughput: 0: 42805.9. Samples: 2129585400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:08:51,994][12645] Avg episode reward: [(0, '0.602')] +[2024-06-18 12:08:52,013][12883] Updated weights for policy 0, policy_version 129973 (0.0038) +[2024-06-18 12:08:56,097][12883] Updated weights for policy 0, policy_version 129983 (0.0043) +[2024-06-18 12:08:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42709.5). 
Total num frames: 2129674240. Throughput: 0: 42990.7. Samples: 2129844380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:08:56,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 12:08:59,563][12883] Updated weights for policy 0, policy_version 129993 (0.0033) +[2024-06-18 12:09:01,998][12645] Fps is (10 sec: 44216.1, 60 sec: 43141.2, 300 sec: 42819.9). Total num frames: 2129903616. Throughput: 0: 42926.2. Samples: 2129971080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:09:01,999][12645] Avg episode reward: [(0, '0.670')] +[2024-06-18 12:09:03,538][12883] Updated weights for policy 0, policy_version 130003 (0.0038) +[2024-06-18 12:09:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 2130116608. Throughput: 0: 42864.6. Samples: 2130227640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:09:06,994][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 12:09:07,132][12883] Updated weights for policy 0, policy_version 130013 (0.0047) +[2024-06-18 12:09:11,762][12883] Updated weights for policy 0, policy_version 130023 (0.0034) +[2024-06-18 12:09:11,996][12645] Fps is (10 sec: 39331.4, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 2130296832. Throughput: 0: 43084.1. Samples: 2130488920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:09:11,996][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 12:09:14,694][12883] Updated weights for policy 0, policy_version 130033 (0.0040) +[2024-06-18 12:09:16,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2130542592. Throughput: 0: 42798.1. Samples: 2130606600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:09:16,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 12:09:19,261][12883] Updated weights for policy 0, policy_version 130043 (0.0047) +[2024-06-18 12:09:21,994][12645] Fps is (10 sec: 47524.4, 60 sec: 42872.2, 300 sec: 42820.6). Total num frames: 2130771968. Throughput: 0: 42853.8. Samples: 2130869520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:09:21,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 12:09:22,182][12883] Updated weights for policy 0, policy_version 130053 (0.0031) +[2024-06-18 12:09:26,777][12883] Updated weights for policy 0, policy_version 130063 (0.0044) +[2024-06-18 12:09:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42654.2). Total num frames: 2130952192. Throughput: 0: 43051.4. Samples: 2131131420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:09:26,994][12645] Avg episode reward: [(0, '0.727')] +[2024-06-18 12:09:29,784][12883] Updated weights for policy 0, policy_version 130073 (0.0041) +[2024-06-18 12:09:31,994][12645] Fps is (10 sec: 40959.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2131181568. Throughput: 0: 42856.4. Samples: 2131252420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 12:09:31,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 12:09:34,251][12883] Updated weights for policy 0, policy_version 130083 (0.0031) +[2024-06-18 12:09:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2131394560. Throughput: 0: 42899.5. Samples: 2131515880. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 12:09:36,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 12:09:37,535][12883] Updated weights for policy 0, policy_version 130093 (0.0036) +[2024-06-18 12:09:41,699][12883] Updated weights for policy 0, policy_version 130103 (0.0033) +[2024-06-18 12:09:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2131607552. Throughput: 0: 42857.2. Samples: 2131772960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 12:09:41,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 12:09:45,355][12883] Updated weights for policy 0, policy_version 130113 (0.0032) +[2024-06-18 12:09:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2131820544. Throughput: 0: 42732.1. Samples: 2131893820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 12:09:46,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 12:09:49,817][12883] Updated weights for policy 0, policy_version 130123 (0.0029) +[2024-06-18 12:09:51,994][12645] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2132049920. Throughput: 0: 42892.0. Samples: 2132157780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 12:09:51,994][12645] Avg episode reward: [(0, '0.713')] +[2024-06-18 12:09:53,188][12883] Updated weights for policy 0, policy_version 130133 (0.0031) +[2024-06-18 12:09:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2132246528. Throughput: 0: 42800.8. Samples: 2132414860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 12:09:56,994][12645] Avg episode reward: [(0, '0.715')] +[2024-06-18 12:09:57,396][12883] Updated weights for policy 0, policy_version 130143 (0.0033) +[2024-06-18 12:09:58,600][12862] Signal inference workers to stop experience collection... 
(31200 times) +[2024-06-18 12:09:58,632][12883] InferenceWorker_p0-w0: stopping experience collection (31200 times) +[2024-06-18 12:09:58,656][12862] Signal inference workers to resume experience collection... (31200 times) +[2024-06-18 12:09:58,664][12883] InferenceWorker_p0-w0: resuming experience collection (31200 times) +[2024-06-18 12:10:00,688][12883] Updated weights for policy 0, policy_version 130153 (0.0041) +[2024-06-18 12:10:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42601.7, 300 sec: 42765.0). Total num frames: 2132459520. Throughput: 0: 42834.7. Samples: 2132534160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 12:10:01,994][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 12:10:04,893][12883] Updated weights for policy 0, policy_version 130163 (0.0031) +[2024-06-18 12:10:06,996][12645] Fps is (10 sec: 45865.1, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 2132705280. Throughput: 0: 42888.1. Samples: 2132799580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 12:10:06,996][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 12:10:08,256][12883] Updated weights for policy 0, policy_version 130173 (0.0028) +[2024-06-18 12:10:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 2132869120. Throughput: 0: 42713.0. Samples: 2133053500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 12:10:11,994][12645] Avg episode reward: [(0, '0.397')] +[2024-06-18 12:10:12,611][12883] Updated weights for policy 0, policy_version 130183 (0.0037) +[2024-06-18 12:10:15,832][12883] Updated weights for policy 0, policy_version 130193 (0.0051) +[2024-06-18 12:10:16,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2133114880. Throughput: 0: 42864.1. Samples: 2133181300. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 12:10:16,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 12:10:20,172][12883] Updated weights for policy 0, policy_version 130203 (0.0046) +[2024-06-18 12:10:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2133311488. Throughput: 0: 42745.9. Samples: 2133439440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 12:10:21,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 12:10:23,639][12883] Updated weights for policy 0, policy_version 130213 (0.0019) +[2024-06-18 12:10:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2133524480. Throughput: 0: 42554.2. Samples: 2133687900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 12:10:26,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 12:10:27,892][12883] Updated weights for policy 0, policy_version 130223 (0.0030) +[2024-06-18 12:10:31,249][12883] Updated weights for policy 0, policy_version 130233 (0.0036) +[2024-06-18 12:10:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2133753856. Throughput: 0: 42756.4. Samples: 2133817860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 12:10:31,994][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 12:10:35,531][12883] Updated weights for policy 0, policy_version 130243 (0.0027) +[2024-06-18 12:10:36,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2133966848. Throughput: 0: 42542.2. Samples: 2134072180. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 12:10:36,994][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 12:10:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000130247_2133966848.pth... 
+[2024-06-18 12:10:37,060][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129620_2123694080.pth +[2024-06-18 12:10:38,846][12883] Updated weights for policy 0, policy_version 130253 (0.0033) +[2024-06-18 12:10:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 2134179840. Throughput: 0: 42550.6. Samples: 2134329640. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 12:10:41,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 12:10:43,198][12883] Updated weights for policy 0, policy_version 130263 (0.0047) +[2024-06-18 12:10:46,801][12883] Updated weights for policy 0, policy_version 130273 (0.0041) +[2024-06-18 12:10:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2134409216. Throughput: 0: 42710.4. Samples: 2134456120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 12:10:46,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 12:10:50,827][12883] Updated weights for policy 0, policy_version 130283 (0.0027) +[2024-06-18 12:10:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2134605824. Throughput: 0: 42525.1. Samples: 2134713120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 12:10:51,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 12:10:54,365][12883] Updated weights for policy 0, policy_version 130293 (0.0035) +[2024-06-18 12:10:56,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2134802432. Throughput: 0: 42515.9. Samples: 2134966720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 12:10:56,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 12:10:58,479][12883] Updated weights for policy 0, policy_version 130303 (0.0029) +[2024-06-18 12:10:59,099][12862] Signal inference workers to stop experience collection... 
(31250 times) +[2024-06-18 12:10:59,100][12862] Signal inference workers to resume experience collection... (31250 times) +[2024-06-18 12:10:59,119][12883] InferenceWorker_p0-w0: stopping experience collection (31250 times) +[2024-06-18 12:10:59,119][12883] InferenceWorker_p0-w0: resuming experience collection (31250 times) +[2024-06-18 12:11:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2135031808. Throughput: 0: 42562.2. Samples: 2135096600. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 12:11:01,994][12645] Avg episode reward: [(0, '0.258')] +[2024-06-18 12:11:02,123][12883] Updated weights for policy 0, policy_version 130313 (0.0035) +[2024-06-18 12:11:06,163][12883] Updated weights for policy 0, policy_version 130323 (0.0042) +[2024-06-18 12:11:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 2135244800. Throughput: 0: 42527.9. Samples: 2135353200. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 12:11:07,003][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 12:11:09,772][12883] Updated weights for policy 0, policy_version 130333 (0.0039) +[2024-06-18 12:11:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2135441408. Throughput: 0: 42633.7. Samples: 2135606420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 12:11:11,994][12645] Avg episode reward: [(0, '0.674')] +[2024-06-18 12:11:14,214][12883] Updated weights for policy 0, policy_version 130343 (0.0039) +[2024-06-18 12:11:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2135670784. Throughput: 0: 42534.2. Samples: 2135731900. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 12:11:16,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 12:11:17,459][12883] Updated weights for policy 0, policy_version 130353 (0.0036) +[2024-06-18 12:11:22,000][12645] Fps is (10 sec: 40934.7, 60 sec: 42320.9, 300 sec: 42653.0). Total num frames: 2135851008. Throughput: 0: 42478.9. Samples: 2135984000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 12:11:22,001][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 12:11:22,213][12883] Updated weights for policy 0, policy_version 130363 (0.0040) +[2024-06-18 12:11:25,118][12883] Updated weights for policy 0, policy_version 130373 (0.0037) +[2024-06-18 12:11:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2136080384. Throughput: 0: 42346.2. Samples: 2136235220. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 12:11:26,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 12:11:29,804][12883] Updated weights for policy 0, policy_version 130383 (0.0039) +[2024-06-18 12:11:31,994][12645] Fps is (10 sec: 44264.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2136293376. Throughput: 0: 42491.4. Samples: 2136368240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) +[2024-06-18 12:11:31,994][12645] Avg episode reward: [(0, '0.558')] +[2024-06-18 12:11:32,725][12883] Updated weights for policy 0, policy_version 130393 (0.0038) +[2024-06-18 12:11:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 2136489984. Throughput: 0: 42336.4. Samples: 2136618260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 12:11:36,994][12645] Avg episode reward: [(0, '0.219')] +[2024-06-18 12:11:37,313][12883] Updated weights for policy 0, policy_version 130403 (0.0030) +[2024-06-18 12:11:41,449][12883] Updated weights for policy 0, policy_version 130413 (0.0055) +[2024-06-18 12:11:42,000][12645] Fps is (10 sec: 42572.2, 60 sec: 42321.0, 300 sec: 42764.1). Total num frames: 2136719360. Throughput: 0: 42507.0. Samples: 2136879800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 12:11:42,001][12645] Avg episode reward: [(0, '0.219')] +[2024-06-18 12:11:45,010][12883] Updated weights for policy 0, policy_version 130423 (0.0030) +[2024-06-18 12:11:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2136932352. Throughput: 0: 42465.3. Samples: 2137007540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 12:11:46,996][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 12:11:49,006][12883] Updated weights for policy 0, policy_version 130433 (0.0029) +[2024-06-18 12:11:51,994][12645] Fps is (10 sec: 42624.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2137145344. Throughput: 0: 42373.7. Samples: 2137260020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 12:11:51,994][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 12:11:52,775][12883] Updated weights for policy 0, policy_version 130443 (0.0023) +[2024-06-18 12:11:56,572][12883] Updated weights for policy 0, policy_version 130453 (0.0030) +[2024-06-18 12:11:56,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2137374720. Throughput: 0: 42477.4. Samples: 2137517900. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 12:11:56,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 12:12:00,479][12883] Updated weights for policy 0, policy_version 130463 (0.0034) +[2024-06-18 12:12:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2137571328. Throughput: 0: 42522.3. Samples: 2137645400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 12:12:01,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 12:12:04,206][12883] Updated weights for policy 0, policy_version 130473 (0.0034) +[2024-06-18 12:12:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2137784320. Throughput: 0: 42659.7. Samples: 2137903420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 12:12:06,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 12:12:07,954][12862] Signal inference workers to stop experience collection... (31300 times) +[2024-06-18 12:12:08,005][12862] Signal inference workers to resume experience collection... (31300 times) +[2024-06-18 12:12:08,006][12883] InferenceWorker_p0-w0: stopping experience collection (31300 times) +[2024-06-18 12:12:08,013][12883] Updated weights for policy 0, policy_version 130483 (0.0031) +[2024-06-18 12:12:08,021][12883] InferenceWorker_p0-w0: resuming experience collection (31300 times) +[2024-06-18 12:12:11,758][12883] Updated weights for policy 0, policy_version 130493 (0.0024) +[2024-06-18 12:12:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2138013696. Throughput: 0: 42795.7. Samples: 2138161020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 12:12:11,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 12:12:15,418][12883] Updated weights for policy 0, policy_version 130503 (0.0033) +[2024-06-18 12:12:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2138210304. 
Throughput: 0: 42742.6. Samples: 2138291660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 12:12:16,994][12645] Avg episode reward: [(0, '0.621')] +[2024-06-18 12:12:19,251][12883] Updated weights for policy 0, policy_version 130513 (0.0029) +[2024-06-18 12:12:21,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42874.4, 300 sec: 42653.6). Total num frames: 2138423296. Throughput: 0: 42821.1. Samples: 2138545300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 12:12:21,997][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 12:12:23,026][12883] Updated weights for policy 0, policy_version 130523 (0.0029) +[2024-06-18 12:12:26,799][12883] Updated weights for policy 0, policy_version 130533 (0.0019) +[2024-06-18 12:12:26,994][12645] Fps is (10 sec: 44237.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2138652672. Throughput: 0: 42841.6. Samples: 2138807400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 12:12:26,994][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 12:12:30,567][12883] Updated weights for policy 0, policy_version 130543 (0.0042) +[2024-06-18 12:12:31,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2138849280. Throughput: 0: 42866.7. Samples: 2138936540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 12:12:31,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 12:12:34,343][12883] Updated weights for policy 0, policy_version 130553 (0.0031) +[2024-06-18 12:12:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2139078656. Throughput: 0: 42933.9. Samples: 2139192040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 12:12:36,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 12:12:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000130559_2139078656.pth... 
+[2024-06-18 12:12:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129935_2128855040.pth +[2024-06-18 12:12:38,623][12883] Updated weights for policy 0, policy_version 130563 (0.0031) +[2024-06-18 12:12:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42875.9, 300 sec: 42765.0). Total num frames: 2139291648. Throughput: 0: 42858.3. Samples: 2139446520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 12:12:41,994][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 12:12:42,352][12883] Updated weights for policy 0, policy_version 130573 (0.0036) +[2024-06-18 12:12:46,245][12883] Updated weights for policy 0, policy_version 130583 (0.0042) +[2024-06-18 12:12:47,000][12645] Fps is (10 sec: 40934.3, 60 sec: 42594.0, 300 sec: 42653.0). Total num frames: 2139488256. Throughput: 0: 42855.3. Samples: 2139574160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 12:12:47,001][12645] Avg episode reward: [(0, '0.181')] +[2024-06-18 12:12:50,060][12883] Updated weights for policy 0, policy_version 130593 (0.0041) +[2024-06-18 12:12:51,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2139701248. Throughput: 0: 42649.0. Samples: 2139822620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 12:12:51,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 12:12:53,918][12883] Updated weights for policy 0, policy_version 130603 (0.0038) +[2024-06-18 12:12:56,996][12645] Fps is (10 sec: 42615.6, 60 sec: 42323.8, 300 sec: 42709.2). Total num frames: 2139914240. Throughput: 0: 42773.5. Samples: 2140085920. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 12:12:56,996][12645] Avg episode reward: [(0, '0.740')] +[2024-06-18 12:12:57,705][12883] Updated weights for policy 0, policy_version 130613 (0.0032) +[2024-06-18 12:13:01,558][12883] Updated weights for policy 0, policy_version 130623 (0.0033) +[2024-06-18 12:13:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2140143616. Throughput: 0: 42630.3. Samples: 2140210020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 12:13:01,994][12645] Avg episode reward: [(0, '0.660')] +[2024-06-18 12:13:05,626][12883] Updated weights for policy 0, policy_version 130633 (0.0027) +[2024-06-18 12:13:06,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2140340224. Throughput: 0: 42628.3. Samples: 2140463480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 12:13:06,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 12:13:09,127][12883] Updated weights for policy 0, policy_version 130643 (0.0037) +[2024-06-18 12:13:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2140553216. Throughput: 0: 42545.2. Samples: 2140721940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 12:13:11,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 12:13:13,513][12883] Updated weights for policy 0, policy_version 130653 (0.0038) +[2024-06-18 12:13:16,649][12883] Updated weights for policy 0, policy_version 130663 (0.0034) +[2024-06-18 12:13:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42709.6). Total num frames: 2140798976. Throughput: 0: 42415.0. Samples: 2140845220. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 12:13:16,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 12:13:21,053][12883] Updated weights for policy 0, policy_version 130673 (0.0048) +[2024-06-18 12:13:21,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2140995584. Throughput: 0: 42588.1. Samples: 2141108500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 12:13:21,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 12:13:24,166][12883] Updated weights for policy 0, policy_version 130683 (0.0040) +[2024-06-18 12:13:26,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2141192192. Throughput: 0: 42588.9. Samples: 2141363020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 12:13:26,994][12645] Avg episode reward: [(0, '0.694')] +[2024-06-18 12:13:28,598][12883] Updated weights for policy 0, policy_version 130693 (0.0033) +[2024-06-18 12:13:31,819][12883] Updated weights for policy 0, policy_version 130703 (0.0052) +[2024-06-18 12:13:31,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2141437952. Throughput: 0: 42508.5. Samples: 2141486780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 12:13:31,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 12:13:34,967][12862] Signal inference workers to stop experience collection... (31350 times) +[2024-06-18 12:13:34,968][12862] Signal inference workers to resume experience collection... (31350 times) +[2024-06-18 12:13:35,015][12883] InferenceWorker_p0-w0: stopping experience collection (31350 times) +[2024-06-18 12:13:35,016][12883] InferenceWorker_p0-w0: resuming experience collection (31350 times) +[2024-06-18 12:13:36,368][12883] Updated weights for policy 0, policy_version 130713 (0.0038) +[2024-06-18 12:13:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2141634560. 
Throughput: 0: 42681.2. Samples: 2141743280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 12:13:36,994][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 12:13:39,491][12883] Updated weights for policy 0, policy_version 130723 (0.0024) +[2024-06-18 12:13:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2141831168. Throughput: 0: 42564.4. Samples: 2142001220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 12:13:41,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 12:13:44,090][12883] Updated weights for policy 0, policy_version 130733 (0.0028) +[2024-06-18 12:13:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43149.0, 300 sec: 42765.0). Total num frames: 2142076928. Throughput: 0: 42508.4. Samples: 2142122900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 12:13:47,000][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 12:13:47,749][12883] Updated weights for policy 0, policy_version 130743 (0.0038) +[2024-06-18 12:13:51,581][12883] Updated weights for policy 0, policy_version 130753 (0.0033) +[2024-06-18 12:13:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2142273536. Throughput: 0: 42641.0. Samples: 2142382320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 12:13:51,994][12645] Avg episode reward: [(0, '0.572')] +[2024-06-18 12:13:55,290][12883] Updated weights for policy 0, policy_version 130763 (0.0034) +[2024-06-18 12:13:56,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42326.9, 300 sec: 42543.5). Total num frames: 2142453760. Throughput: 0: 42565.8. Samples: 2142637400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 12:13:56,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 12:13:59,190][12883] Updated weights for policy 0, policy_version 130773 (0.0039) +[2024-06-18 12:14:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). 
Total num frames: 2142699520. Throughput: 0: 42549.4. Samples: 2142759940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 12:14:01,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 12:14:03,048][12883] Updated weights for policy 0, policy_version 130783 (0.0034) +[2024-06-18 12:14:06,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 2142896128. Throughput: 0: 42442.3. Samples: 2143018400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 12:14:06,994][12645] Avg episode reward: [(0, '0.173')] +[2024-06-18 12:14:07,010][12883] Updated weights for policy 0, policy_version 130793 (0.0044) +[2024-06-18 12:14:10,775][12883] Updated weights for policy 0, policy_version 130803 (0.0039) +[2024-06-18 12:14:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2143092736. Throughput: 0: 42471.2. Samples: 2143274220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 12:14:11,994][12645] Avg episode reward: [(0, '0.447')] +[2024-06-18 12:14:14,563][12883] Updated weights for policy 0, policy_version 130813 (0.0033) +[2024-06-18 12:14:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 2143322112. Throughput: 0: 42470.3. Samples: 2143397940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 12:14:16,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 12:14:18,428][12883] Updated weights for policy 0, policy_version 130823 (0.0033) +[2024-06-18 12:14:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2143535104. Throughput: 0: 42581.8. Samples: 2143659460. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 12:14:21,994][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 12:14:22,389][12883] Updated weights for policy 0, policy_version 130833 (0.0032) +[2024-06-18 12:14:26,004][12883] Updated weights for policy 0, policy_version 130843 (0.0036) +[2024-06-18 12:14:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2143748096. Throughput: 0: 42481.7. Samples: 2143912900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 12:14:26,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 12:14:29,947][12883] Updated weights for policy 0, policy_version 130853 (0.0037) +[2024-06-18 12:14:31,995][12645] Fps is (10 sec: 44230.4, 60 sec: 42324.3, 300 sec: 42653.7). Total num frames: 2143977472. Throughput: 0: 42611.1. Samples: 2144040460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 12:14:31,996][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 12:14:33,855][12883] Updated weights for policy 0, policy_version 130863 (0.0038) +[2024-06-18 12:14:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2144174080. Throughput: 0: 42563.1. Samples: 2144297660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 12:14:36,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 12:14:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000130870_2144174080.pth... +[2024-06-18 12:14:37,092][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000130247_2133966848.pth +[2024-06-18 12:14:37,675][12883] Updated weights for policy 0, policy_version 130873 (0.0027) +[2024-06-18 12:14:41,690][12883] Updated weights for policy 0, policy_version 130883 (0.0034) +[2024-06-18 12:14:41,994][12645] Fps is (10 sec: 40966.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2144387072. Throughput: 0: 42560.5. Samples: 2144552620. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) +[2024-06-18 12:14:41,994][12645] Avg episode reward: [(0, '0.140')] +[2024-06-18 12:14:45,299][12883] Updated weights for policy 0, policy_version 130893 (0.0024) +[2024-06-18 12:14:46,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2144616448. Throughput: 0: 42658.1. Samples: 2144679560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 12:14:46,994][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 12:14:49,422][12883] Updated weights for policy 0, policy_version 130903 (0.0033) +[2024-06-18 12:14:51,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2144813056. Throughput: 0: 42668.7. Samples: 2144938500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 12:14:51,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 12:14:53,005][12883] Updated weights for policy 0, policy_version 130913 (0.0049) +[2024-06-18 12:14:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2145026048. Throughput: 0: 42385.3. Samples: 2145181560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 12:14:56,994][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 12:14:57,594][12883] Updated weights for policy 0, policy_version 130923 (0.0042) +[2024-06-18 12:15:00,806][12883] Updated weights for policy 0, policy_version 130933 (0.0035) +[2024-06-18 12:15:01,442][12862] Signal inference workers to stop experience collection... (31400 times) +[2024-06-18 12:15:01,480][12883] InferenceWorker_p0-w0: stopping experience collection (31400 times) +[2024-06-18 12:15:01,501][12862] Signal inference workers to resume experience collection... (31400 times) +[2024-06-18 12:15:01,504][12883] InferenceWorker_p0-w0: resuming experience collection (31400 times) +[2024-06-18 12:15:01,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 2145255424. 
Throughput: 0: 42609.8. Samples: 2145315380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 12:15:01,994][12645] Avg episode reward: [(0, '0.643')] +[2024-06-18 12:15:05,136][12883] Updated weights for policy 0, policy_version 130943 (0.0038) +[2024-06-18 12:15:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.1, 300 sec: 42598.4). Total num frames: 2145435648. Throughput: 0: 42509.7. Samples: 2145572400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 12:15:06,994][12645] Avg episode reward: [(0, '0.716')] +[2024-06-18 12:15:08,605][12883] Updated weights for policy 0, policy_version 130953 (0.0036) +[2024-06-18 12:15:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2145665024. Throughput: 0: 42315.3. Samples: 2145817080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 12:15:11,994][12645] Avg episode reward: [(0, '0.738')] +[2024-06-18 12:15:12,699][12883] Updated weights for policy 0, policy_version 130963 (0.0036) +[2024-06-18 12:15:16,124][12883] Updated weights for policy 0, policy_version 130973 (0.0033) +[2024-06-18 12:15:16,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2145894400. Throughput: 0: 42554.3. Samples: 2145955340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 12:15:16,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 12:15:20,218][12883] Updated weights for policy 0, policy_version 130983 (0.0034) +[2024-06-18 12:15:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2146074624. Throughput: 0: 42467.1. Samples: 2146208680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 12:15:21,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 12:15:23,895][12883] Updated weights for policy 0, policy_version 130993 (0.0044) +[2024-06-18 12:15:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2146320384. 
Throughput: 0: 42336.8. Samples: 2146457780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 12:15:26,996][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 12:15:28,074][12883] Updated weights for policy 0, policy_version 131003 (0.0042) +[2024-06-18 12:15:31,587][12883] Updated weights for policy 0, policy_version 131013 (0.0036) +[2024-06-18 12:15:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42326.4, 300 sec: 42542.8). Total num frames: 2146516992. Throughput: 0: 42559.2. Samples: 2146594720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 12:15:31,994][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 12:15:35,759][12883] Updated weights for policy 0, policy_version 131023 (0.0039) +[2024-06-18 12:15:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 2146713600. Throughput: 0: 42453.5. Samples: 2146848900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 12:15:36,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 12:15:39,173][12883] Updated weights for policy 0, policy_version 131033 (0.0026) +[2024-06-18 12:15:41,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2146959360. Throughput: 0: 42610.1. Samples: 2147099020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) +[2024-06-18 12:15:41,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 12:15:43,540][12883] Updated weights for policy 0, policy_version 131043 (0.0031) +[2024-06-18 12:15:46,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2147155968. Throughput: 0: 42583.0. Samples: 2147231620. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) +[2024-06-18 12:15:46,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 12:15:47,434][12883] Updated weights for policy 0, policy_version 131053 (0.0033) +[2024-06-18 12:15:51,057][12883] Updated weights for policy 0, policy_version 131063 (0.0055) +[2024-06-18 12:15:51,996][12645] Fps is (10 sec: 39313.1, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 2147352576. Throughput: 0: 42363.8. Samples: 2147478860. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) +[2024-06-18 12:15:51,997][12645] Avg episode reward: [(0, '0.691')] +[2024-06-18 12:15:54,946][12883] Updated weights for policy 0, policy_version 131073 (0.0033) +[2024-06-18 12:15:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2147581952. Throughput: 0: 42648.0. Samples: 2147736240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) +[2024-06-18 12:15:56,994][12645] Avg episode reward: [(0, '0.649')] +[2024-06-18 12:15:58,949][12883] Updated weights for policy 0, policy_version 131083 (0.0037) +[2024-06-18 12:16:01,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2147794944. Throughput: 0: 42396.0. Samples: 2147863160. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) +[2024-06-18 12:16:01,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 12:16:02,802][12883] Updated weights for policy 0, policy_version 131093 (0.0050) +[2024-06-18 12:16:06,501][12883] Updated weights for policy 0, policy_version 131103 (0.0038) +[2024-06-18 12:16:07,000][12645] Fps is (10 sec: 42571.7, 60 sec: 42867.2, 300 sec: 42597.5). Total num frames: 2148007936. Throughput: 0: 42443.5. Samples: 2148118900. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) +[2024-06-18 12:16:07,000][12645] Avg episode reward: [(0, '0.564')] +[2024-06-18 12:16:10,366][12883] Updated weights for policy 0, policy_version 131113 (0.0032) +[2024-06-18 12:16:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2148220928. Throughput: 0: 42785.8. Samples: 2148383140. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) +[2024-06-18 12:16:11,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 12:16:14,261][12883] Updated weights for policy 0, policy_version 131123 (0.0045) +[2024-06-18 12:16:16,994][12645] Fps is (10 sec: 42624.7, 60 sec: 42325.3, 300 sec: 42654.8). Total num frames: 2148433920. Throughput: 0: 42533.8. Samples: 2148508740. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) +[2024-06-18 12:16:16,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 12:16:17,929][12883] Updated weights for policy 0, policy_version 131133 (0.0035) +[2024-06-18 12:16:21,781][12883] Updated weights for policy 0, policy_version 131143 (0.0034) +[2024-06-18 12:16:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2148646912. Throughput: 0: 42529.2. Samples: 2148762720. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) +[2024-06-18 12:16:21,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 12:16:25,453][12883] Updated weights for policy 0, policy_version 131153 (0.0033) +[2024-06-18 12:16:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2148859904. Throughput: 0: 42745.3. Samples: 2149022560. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) +[2024-06-18 12:16:26,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 12:16:29,235][12883] Updated weights for policy 0, policy_version 131163 (0.0042) +[2024-06-18 12:16:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2149056512. Throughput: 0: 42650.7. 
Samples: 2149150900. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) +[2024-06-18 12:16:31,994][12645] Avg episode reward: [(0, '0.587')] +[2024-06-18 12:16:32,612][12862] Signal inference workers to stop experience collection... (31450 times) +[2024-06-18 12:16:32,662][12862] Signal inference workers to resume experience collection... (31450 times) +[2024-06-18 12:16:32,663][12883] InferenceWorker_p0-w0: stopping experience collection (31450 times) +[2024-06-18 12:16:32,688][12883] InferenceWorker_p0-w0: resuming experience collection (31450 times) +[2024-06-18 12:16:33,179][12883] Updated weights for policy 0, policy_version 131173 (0.0033) +[2024-06-18 12:16:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42599.3). Total num frames: 2149285888. Throughput: 0: 42891.0. Samples: 2149408860. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) +[2024-06-18 12:16:36,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 12:16:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000131182_2149285888.pth... +[2024-06-18 12:16:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000130559_2139078656.pth +[2024-06-18 12:16:37,328][12883] Updated weights for policy 0, policy_version 131183 (0.0044) +[2024-06-18 12:16:40,784][12883] Updated weights for policy 0, policy_version 131193 (0.0033) +[2024-06-18 12:16:41,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2149515264. Throughput: 0: 42766.9. Samples: 2149660760. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) +[2024-06-18 12:16:41,994][12645] Avg episode reward: [(0, '0.516')] +[2024-06-18 12:16:44,809][12883] Updated weights for policy 0, policy_version 131203 (0.0039) +[2024-06-18 12:16:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2149695488. Throughput: 0: 42804.0. Samples: 2149789340. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) +[2024-06-18 12:16:46,994][12645] Avg episode reward: [(0, '0.630')] +[2024-06-18 12:16:48,572][12883] Updated weights for policy 0, policy_version 131213 (0.0030) +[2024-06-18 12:16:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42873.0, 300 sec: 42542.9). Total num frames: 2149924864. Throughput: 0: 42899.1. Samples: 2150049100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 12:16:51,995][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 12:16:52,272][12883] Updated weights for policy 0, policy_version 131223 (0.0032) +[2024-06-18 12:16:56,203][12883] Updated weights for policy 0, policy_version 131233 (0.0037) +[2024-06-18 12:16:56,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2150154240. Throughput: 0: 42766.8. Samples: 2150307640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 12:16:56,994][12645] Avg episode reward: [(0, '0.486')] +[2024-06-18 12:16:59,856][12883] Updated weights for policy 0, policy_version 131243 (0.0044) +[2024-06-18 12:17:01,996][12645] Fps is (10 sec: 44227.6, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 2150367232. Throughput: 0: 42917.0. Samples: 2150440100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 12:17:01,996][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 12:17:03,675][12883] Updated weights for policy 0, policy_version 131253 (0.0033) +[2024-06-18 12:17:06,994][12645] Fps is (10 sec: 40958.9, 60 sec: 42602.7, 300 sec: 42542.8). Total num frames: 2150563840. Throughput: 0: 42944.3. Samples: 2150695220. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 12:17:06,994][12645] Avg episode reward: [(0, '0.570')] +[2024-06-18 12:17:07,550][12883] Updated weights for policy 0, policy_version 131263 (0.0034) +[2024-06-18 12:17:11,200][12883] Updated weights for policy 0, policy_version 131273 (0.0036) +[2024-06-18 12:17:11,994][12645] Fps is (10 sec: 44246.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2150809600. Throughput: 0: 42726.3. Samples: 2150945240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 12:17:11,994][12645] Avg episode reward: [(0, '0.591')] +[2024-06-18 12:17:15,157][12883] Updated weights for policy 0, policy_version 131283 (0.0034) +[2024-06-18 12:17:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 2150989824. Throughput: 0: 42924.7. Samples: 2151082520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 12:17:16,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 12:17:18,873][12883] Updated weights for policy 0, policy_version 131293 (0.0041) +[2024-06-18 12:17:21,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2151202816. Throughput: 0: 42795.2. Samples: 2151334640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 12:17:21,994][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 12:17:22,735][12883] Updated weights for policy 0, policy_version 131303 (0.0028) +[2024-06-18 12:17:26,579][12883] Updated weights for policy 0, policy_version 131313 (0.0027) +[2024-06-18 12:17:26,994][12645] Fps is (10 sec: 47514.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2151464960. Throughput: 0: 42884.1. Samples: 2151590540. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 12:17:26,994][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 12:17:30,659][12883] Updated weights for policy 0, policy_version 131323 (0.0034) +[2024-06-18 12:17:31,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2151628800. Throughput: 0: 42963.4. Samples: 2151722700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 12:17:31,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 12:17:34,448][12883] Updated weights for policy 0, policy_version 131333 (0.0026) +[2024-06-18 12:17:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2151841792. Throughput: 0: 42605.4. Samples: 2151966340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 12:17:36,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 12:17:38,345][12883] Updated weights for policy 0, policy_version 131343 (0.0032) +[2024-06-18 12:17:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42654.8). Total num frames: 2152071168. Throughput: 0: 42612.8. Samples: 2152225220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 12:17:41,996][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 12:17:42,006][12883] Updated weights for policy 0, policy_version 131353 (0.0031) +[2024-06-18 12:17:46,163][12883] Updated weights for policy 0, policy_version 131363 (0.0023) +[2024-06-18 12:17:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2152284160. Throughput: 0: 42642.5. Samples: 2152358920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) +[2024-06-18 12:17:46,994][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 12:17:49,598][12883] Updated weights for policy 0, policy_version 131373 (0.0027) +[2024-06-18 12:17:51,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42870.0, 300 sec: 42653.9). Total num frames: 2152497152. Throughput: 0: 42451.8. Samples: 2152605640. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:17:51,997][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 12:17:53,738][12883] Updated weights for policy 0, policy_version 131383 (0.0025) +[2024-06-18 12:17:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2152710144. Throughput: 0: 42701.0. Samples: 2152866780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:17:56,994][12645] Avg episode reward: [(0, '0.533')] +[2024-06-18 12:17:57,214][12883] Updated weights for policy 0, policy_version 131393 (0.0036) +[2024-06-18 12:18:01,542][12883] Updated weights for policy 0, policy_version 131403 (0.0036) +[2024-06-18 12:18:01,994][12645] Fps is (10 sec: 40969.6, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 2152906752. Throughput: 0: 42488.7. Samples: 2152994500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:18:01,994][12645] Avg episode reward: [(0, '0.397')] +[2024-06-18 12:18:04,648][12862] Signal inference workers to stop experience collection... (31500 times) +[2024-06-18 12:18:04,682][12883] InferenceWorker_p0-w0: stopping experience collection (31500 times) +[2024-06-18 12:18:04,696][12862] Signal inference workers to resume experience collection... (31500 times) +[2024-06-18 12:18:04,706][12883] InferenceWorker_p0-w0: resuming experience collection (31500 times) +[2024-06-18 12:18:04,831][12883] Updated weights for policy 0, policy_version 131413 (0.0043) +[2024-06-18 12:18:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2153136128. Throughput: 0: 42480.0. Samples: 2153246240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:18:06,994][12645] Avg episode reward: [(0, '0.397')] +[2024-06-18 12:18:09,167][12883] Updated weights for policy 0, policy_version 131423 (0.0032) +[2024-06-18 12:18:11,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2153349120. 
Throughput: 0: 42636.0. Samples: 2153509160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:18:11,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 12:18:12,430][12883] Updated weights for policy 0, policy_version 131433 (0.0023) +[2024-06-18 12:18:16,789][12883] Updated weights for policy 0, policy_version 131443 (0.0028) +[2024-06-18 12:18:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2153562112. Throughput: 0: 42511.2. Samples: 2153635700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:18:16,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 12:18:20,351][12883] Updated weights for policy 0, policy_version 131453 (0.0035) +[2024-06-18 12:18:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2153775104. Throughput: 0: 42605.7. Samples: 2153883600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:18:21,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 12:18:24,452][12883] Updated weights for policy 0, policy_version 131463 (0.0028) +[2024-06-18 12:18:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 2153971712. Throughput: 0: 42636.4. Samples: 2154143860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:18:26,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 12:18:28,010][12883] Updated weights for policy 0, policy_version 131473 (0.0045) +[2024-06-18 12:18:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2154201088. Throughput: 0: 42472.9. Samples: 2154270200. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:18:31,994][12645] Avg episode reward: [(0, '0.243')] +[2024-06-18 12:18:32,553][12883] Updated weights for policy 0, policy_version 131483 (0.0041) +[2024-06-18 12:18:36,037][12883] Updated weights for policy 0, policy_version 131493 (0.0039) +[2024-06-18 12:18:36,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2154414080. Throughput: 0: 42679.1. Samples: 2154526100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:18:36,994][12645] Avg episode reward: [(0, '0.149')] +[2024-06-18 12:18:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000131496_2154430464.pth... +[2024-06-18 12:18:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000130870_2144174080.pth +[2024-06-18 12:18:40,260][12883] Updated weights for policy 0, policy_version 131503 (0.0033) +[2024-06-18 12:18:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2154627072. Throughput: 0: 42541.7. Samples: 2154781160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:18:41,994][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 12:18:43,718][12883] Updated weights for policy 0, policy_version 131513 (0.0053) +[2024-06-18 12:18:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2154840064. Throughput: 0: 42522.2. Samples: 2154908000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:18:46,994][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 12:18:47,886][12883] Updated weights for policy 0, policy_version 131523 (0.0030) +[2024-06-18 12:18:51,283][12883] Updated weights for policy 0, policy_version 131533 (0.0045) +[2024-06-18 12:18:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 2155069440. Throughput: 0: 42784.0. Samples: 2155171520. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:18:51,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 12:18:55,426][12883] Updated weights for policy 0, policy_version 131543 (0.0024) +[2024-06-18 12:18:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2155249664. Throughput: 0: 42649.3. Samples: 2155428380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:18:56,994][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 12:18:58,862][12883] Updated weights for policy 0, policy_version 131553 (0.0029) +[2024-06-18 12:19:01,994][12645] Fps is (10 sec: 42597.5, 60 sec: 43144.3, 300 sec: 42709.4). Total num frames: 2155495424. Throughput: 0: 42489.2. Samples: 2155547720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:19:01,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 12:19:02,946][12883] Updated weights for policy 0, policy_version 131563 (0.0032) +[2024-06-18 12:19:06,771][12883] Updated weights for policy 0, policy_version 131573 (0.0034) +[2024-06-18 12:19:06,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2155708416. Throughput: 0: 42709.0. Samples: 2155805500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:19:06,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 12:19:10,408][12862] Signal inference workers to stop experience collection... (31550 times) +[2024-06-18 12:19:10,408][12862] Signal inference workers to resume experience collection... (31550 times) +[2024-06-18 12:19:10,434][12883] InferenceWorker_p0-w0: stopping experience collection (31550 times) +[2024-06-18 12:19:10,434][12883] InferenceWorker_p0-w0: resuming experience collection (31550 times) +[2024-06-18 12:19:10,555][12883] Updated weights for policy 0, policy_version 131583 (0.0023) +[2024-06-18 12:19:11,994][12645] Fps is (10 sec: 40961.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2155905024. 
Throughput: 0: 42685.0. Samples: 2156064680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:19:11,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 12:19:14,298][12883] Updated weights for policy 0, policy_version 131593 (0.0025) +[2024-06-18 12:19:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2156134400. Throughput: 0: 42764.9. Samples: 2156194620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:19:16,994][12645] Avg episode reward: [(0, '0.412')] +[2024-06-18 12:19:18,098][12883] Updated weights for policy 0, policy_version 131603 (0.0035) +[2024-06-18 12:19:21,955][12883] Updated weights for policy 0, policy_version 131613 (0.0031) +[2024-06-18 12:19:21,996][12645] Fps is (10 sec: 44226.5, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 2156347392. Throughput: 0: 42805.8. Samples: 2156452460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:19:21,997][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 12:19:26,152][12883] Updated weights for policy 0, policy_version 131623 (0.0050) +[2024-06-18 12:19:26,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42869.9, 300 sec: 42598.3). Total num frames: 2156544000. Throughput: 0: 42867.6. Samples: 2156710300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:19:26,997][12645] Avg episode reward: [(0, '0.313')] +[2024-06-18 12:19:29,572][12883] Updated weights for policy 0, policy_version 131633 (0.0037) +[2024-06-18 12:19:31,994][12645] Fps is (10 sec: 40968.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2156756992. Throughput: 0: 42765.7. Samples: 2156832460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:19:31,994][12645] Avg episode reward: [(0, '0.598')] +[2024-06-18 12:19:33,774][12883] Updated weights for policy 0, policy_version 131643 (0.0037) +[2024-06-18 12:19:36,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42598.3, 300 sec: 42653.9). 
Total num frames: 2156969984. Throughput: 0: 42507.1. Samples: 2157084340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:19:36,994][12645] Avg episode reward: [(0, '0.697')] +[2024-06-18 12:19:37,337][12883] Updated weights for policy 0, policy_version 131653 (0.0033) +[2024-06-18 12:19:41,389][12883] Updated weights for policy 0, policy_version 131663 (0.0039) +[2024-06-18 12:19:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2157166592. Throughput: 0: 42431.1. Samples: 2157337780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:19:41,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 12:19:45,238][12883] Updated weights for policy 0, policy_version 131673 (0.0030) +[2024-06-18 12:19:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2157395968. Throughput: 0: 42693.1. Samples: 2157468900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:19:46,994][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 12:19:49,341][12883] Updated weights for policy 0, policy_version 131683 (0.0037) +[2024-06-18 12:19:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2157608960. Throughput: 0: 42579.0. Samples: 2157721560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:19:51,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 12:19:52,974][12883] Updated weights for policy 0, policy_version 131693 (0.0041) +[2024-06-18 12:19:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2157805568. Throughput: 0: 42553.3. Samples: 2157979580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:19:56,994][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 12:19:57,355][12883] Updated weights for policy 0, policy_version 131703 (0.0037) +[2024-06-18 12:20:00,577][12883] Updated weights for policy 0, policy_version 131713 (0.0039) +[2024-06-18 12:20:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2158034944. Throughput: 0: 42401.2. Samples: 2158102680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:20:01,994][12645] Avg episode reward: [(0, '0.715')] +[2024-06-18 12:20:05,012][12883] Updated weights for policy 0, policy_version 131723 (0.0045) +[2024-06-18 12:20:06,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2158264320. Throughput: 0: 42339.8. Samples: 2158357660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:20:06,996][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 12:20:08,399][12883] Updated weights for policy 0, policy_version 131733 (0.0027) +[2024-06-18 12:20:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2158444544. Throughput: 0: 42465.3. Samples: 2158621140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:20:11,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 12:20:12,514][12883] Updated weights for policy 0, policy_version 131743 (0.0040) +[2024-06-18 12:20:16,189][12883] Updated weights for policy 0, policy_version 131753 (0.0026) +[2024-06-18 12:20:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 2158673920. Throughput: 0: 42388.4. Samples: 2158739940. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:20:16,994][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 12:20:20,076][12883] Updated weights for policy 0, policy_version 131763 (0.0044) +[2024-06-18 12:20:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 2158886912. Throughput: 0: 42506.3. Samples: 2158997120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:20:21,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 12:20:24,209][12883] Updated weights for policy 0, policy_version 131773 (0.0032) +[2024-06-18 12:20:26,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 2159083520. Throughput: 0: 42753.4. Samples: 2159261680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:20:26,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 12:20:27,637][12883] Updated weights for policy 0, policy_version 131783 (0.0040) +[2024-06-18 12:20:31,719][12883] Updated weights for policy 0, policy_version 131793 (0.0037) +[2024-06-18 12:20:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2159296512. Throughput: 0: 42566.2. Samples: 2159384380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:20:31,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 12:20:35,224][12883] Updated weights for policy 0, policy_version 131803 (0.0031) +[2024-06-18 12:20:36,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2159542272. Throughput: 0: 42736.5. Samples: 2159644700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:20:36,994][12645] Avg episode reward: [(0, '0.195')] +[2024-06-18 12:20:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000131808_2159542272.pth... 
+[2024-06-18 12:20:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000131182_2149285888.pth +[2024-06-18 12:20:39,191][12883] Updated weights for policy 0, policy_version 131813 (0.0034) +[2024-06-18 12:20:41,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2159738880. Throughput: 0: 42706.3. Samples: 2159901360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:20:41,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 12:20:42,868][12883] Updated weights for policy 0, policy_version 131823 (0.0027) +[2024-06-18 12:20:46,578][12883] Updated weights for policy 0, policy_version 131833 (0.0026) +[2024-06-18 12:20:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 2159951872. Throughput: 0: 42788.1. Samples: 2160028140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:20:46,994][12645] Avg episode reward: [(0, '0.545')] +[2024-06-18 12:20:49,779][12862] Signal inference workers to stop experience collection... (31600 times) +[2024-06-18 12:20:49,779][12862] Signal inference workers to resume experience collection... (31600 times) +[2024-06-18 12:20:49,798][12883] InferenceWorker_p0-w0: stopping experience collection (31600 times) +[2024-06-18 12:20:49,798][12883] InferenceWorker_p0-w0: resuming experience collection (31600 times) +[2024-06-18 12:20:50,674][12883] Updated weights for policy 0, policy_version 131843 (0.0037) +[2024-06-18 12:20:51,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2160164864. Throughput: 0: 42828.0. Samples: 2160284920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:20:51,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 12:20:54,039][12883] Updated weights for policy 0, policy_version 131853 (0.0034) +[2024-06-18 12:20:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). 
Total num frames: 2160361472. Throughput: 0: 42648.0. Samples: 2160540300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:20:56,994][12645] Avg episode reward: [(0, '0.177')] +[2024-06-18 12:20:58,212][12883] Updated weights for policy 0, policy_version 131863 (0.0044) +[2024-06-18 12:21:01,636][12883] Updated weights for policy 0, policy_version 131873 (0.0024) +[2024-06-18 12:21:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42710.4). Total num frames: 2160607232. Throughput: 0: 42905.0. Samples: 2160670660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 12:21:01,994][12645] Avg episode reward: [(0, '0.520')] +[2024-06-18 12:21:05,781][12883] Updated weights for policy 0, policy_version 131883 (0.0031) +[2024-06-18 12:21:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2160803840. Throughput: 0: 42815.2. Samples: 2160923800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 12:21:06,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 12:21:09,225][12883] Updated weights for policy 0, policy_version 131893 (0.0038) +[2024-06-18 12:21:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2161016832. Throughput: 0: 42576.9. Samples: 2161177640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 12:21:12,003][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 12:21:13,487][12883] Updated weights for policy 0, policy_version 131903 (0.0028) +[2024-06-18 12:21:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2161229824. Throughput: 0: 42715.0. Samples: 2161306560. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 12:21:17,004][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 12:21:17,375][12883] Updated weights for policy 0, policy_version 131913 (0.0030) +[2024-06-18 12:21:21,517][12883] Updated weights for policy 0, policy_version 131923 (0.0033) +[2024-06-18 12:21:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2161426432. Throughput: 0: 42482.2. Samples: 2161556400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 12:21:21,994][12645] Avg episode reward: [(0, '0.699')] +[2024-06-18 12:21:24,891][12883] Updated weights for policy 0, policy_version 131933 (0.0037) +[2024-06-18 12:21:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2161639424. Throughput: 0: 42518.2. Samples: 2161814680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 12:21:26,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 12:21:29,072][12883] Updated weights for policy 0, policy_version 131943 (0.0030) +[2024-06-18 12:21:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2161868800. Throughput: 0: 42518.7. Samples: 2161941480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 12:21:31,994][12645] Avg episode reward: [(0, '0.587')] +[2024-06-18 12:21:32,519][12883] Updated weights for policy 0, policy_version 131953 (0.0031) +[2024-06-18 12:21:36,715][12883] Updated weights for policy 0, policy_version 131963 (0.0037) +[2024-06-18 12:21:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2162081792. Throughput: 0: 42606.2. Samples: 2162202200. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 12:21:36,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 12:21:40,447][12883] Updated weights for policy 0, policy_version 131973 (0.0045) +[2024-06-18 12:21:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2162294784. Throughput: 0: 42580.5. Samples: 2162456420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 12:21:41,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 12:21:44,397][12883] Updated weights for policy 0, policy_version 131983 (0.0029) +[2024-06-18 12:21:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2162507776. Throughput: 0: 42443.0. Samples: 2162580600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 12:21:46,995][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 12:21:48,025][12883] Updated weights for policy 0, policy_version 131993 (0.0042) +[2024-06-18 12:21:51,997][12645] Fps is (10 sec: 40954.5, 60 sec: 42324.4, 300 sec: 42542.7). Total num frames: 2162704384. Throughput: 0: 42576.0. Samples: 2162839780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 12:21:51,998][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 12:21:52,376][12883] Updated weights for policy 0, policy_version 132003 (0.0044) +[2024-06-18 12:21:55,631][12883] Updated weights for policy 0, policy_version 132013 (0.0038) +[2024-06-18 12:21:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 2162933760. Throughput: 0: 42392.9. Samples: 2163085320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) +[2024-06-18 12:21:56,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 12:22:00,081][12883] Updated weights for policy 0, policy_version 132023 (0.0034) +[2024-06-18 12:22:01,994][12645] Fps is (10 sec: 42604.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2163130368. Throughput: 0: 42555.2. Samples: 2163221540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 12:22:01,994][12645] Avg episode reward: [(0, '0.166')] +[2024-06-18 12:22:03,310][12883] Updated weights for policy 0, policy_version 132033 (0.0025) +[2024-06-18 12:22:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2163359744. Throughput: 0: 42536.0. Samples: 2163470520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 12:22:06,994][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 12:22:07,721][12883] Updated weights for policy 0, policy_version 132043 (0.0037) +[2024-06-18 12:22:11,367][12883] Updated weights for policy 0, policy_version 132053 (0.0036) +[2024-06-18 12:22:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2163589120. Throughput: 0: 42499.6. Samples: 2163727160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 12:22:11,994][12645] Avg episode reward: [(0, '0.587')] +[2024-06-18 12:22:15,327][12883] Updated weights for policy 0, policy_version 132063 (0.0034) +[2024-06-18 12:22:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2163769344. Throughput: 0: 42527.6. Samples: 2163855220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 12:22:16,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 12:22:18,999][12883] Updated weights for policy 0, policy_version 132073 (0.0029) +[2024-06-18 12:22:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2163998720. Throughput: 0: 42462.3. Samples: 2164113000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 12:22:21,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 12:22:22,842][12883] Updated weights for policy 0, policy_version 132083 (0.0029) +[2024-06-18 12:22:23,836][12862] Signal inference workers to stop experience collection... 
(31650 times) +[2024-06-18 12:22:23,845][12862] Signal inference workers to resume experience collection... (31650 times) +[2024-06-18 12:22:23,877][12883] InferenceWorker_p0-w0: stopping experience collection (31650 times) +[2024-06-18 12:22:23,877][12883] InferenceWorker_p0-w0: resuming experience collection (31650 times) +[2024-06-18 12:22:26,892][12883] Updated weights for policy 0, policy_version 132093 (0.0038) +[2024-06-18 12:22:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2164211712. Throughput: 0: 42552.0. Samples: 2164371260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 12:22:26,994][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 12:22:30,817][12883] Updated weights for policy 0, policy_version 132103 (0.0040) +[2024-06-18 12:22:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2164424704. Throughput: 0: 42598.8. Samples: 2164497540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 12:22:31,994][12645] Avg episode reward: [(0, '0.812')] +[2024-06-18 12:22:34,412][12883] Updated weights for policy 0, policy_version 132113 (0.0040) +[2024-06-18 12:22:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2164654080. Throughput: 0: 42547.8. Samples: 2164754380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 12:22:36,994][12645] Avg episode reward: [(0, '0.732')] +[2024-06-18 12:22:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000132120_2164654080.pth... +[2024-06-18 12:22:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000131496_2154430464.pth +[2024-06-18 12:22:38,310][12883] Updated weights for policy 0, policy_version 132123 (0.0031) +[2024-06-18 12:22:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2164834304. Throughput: 0: 42906.3. Samples: 2165016100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 12:22:41,994][12645] Avg episode reward: [(0, '0.738')] +[2024-06-18 12:22:42,139][12883] Updated weights for policy 0, policy_version 132133 (0.0032) +[2024-06-18 12:22:45,836][12883] Updated weights for policy 0, policy_version 132143 (0.0040) +[2024-06-18 12:22:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 2165063680. Throughput: 0: 42672.8. Samples: 2165141820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 12:22:46,994][12645] Avg episode reward: [(0, '0.740')] +[2024-06-18 12:22:49,938][12883] Updated weights for policy 0, policy_version 132153 (0.0041) +[2024-06-18 12:22:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42872.5, 300 sec: 42598.4). Total num frames: 2165276672. Throughput: 0: 42673.8. Samples: 2165390840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 12:22:51,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 12:22:53,360][12883] Updated weights for policy 0, policy_version 132163 (0.0025) +[2024-06-18 12:22:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2165473280. Throughput: 0: 42837.6. Samples: 2165654860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 12:22:56,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 12:22:57,581][12883] Updated weights for policy 0, policy_version 132173 (0.0047) +[2024-06-18 12:23:00,903][12883] Updated weights for policy 0, policy_version 132183 (0.0031) +[2024-06-18 12:23:01,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2165702656. Throughput: 0: 42797.6. Samples: 2165781120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 12:23:01,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 12:23:05,458][12883] Updated weights for policy 0, policy_version 132193 (0.0037) +[2024-06-18 12:23:06,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2165932032. Throughput: 0: 42651.2. Samples: 2166032300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:23:06,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 12:23:08,620][12883] Updated weights for policy 0, policy_version 132203 (0.0046) +[2024-06-18 12:23:11,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 2166095872. Throughput: 0: 42586.6. Samples: 2166287660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:23:11,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 12:23:13,197][12883] Updated weights for policy 0, policy_version 132213 (0.0038) +[2024-06-18 12:23:16,671][12883] Updated weights for policy 0, policy_version 132223 (0.0033) +[2024-06-18 12:23:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2166341632. Throughput: 0: 42470.1. Samples: 2166408700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:23:16,994][12645] Avg episode reward: [(0, '0.686')] +[2024-06-18 12:23:20,958][12883] Updated weights for policy 0, policy_version 132233 (0.0036) +[2024-06-18 12:23:21,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2166554624. Throughput: 0: 42605.8. Samples: 2166671640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:23:21,994][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 12:23:24,227][12883] Updated weights for policy 0, policy_version 132243 (0.0043) +[2024-06-18 12:23:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2166751232. Throughput: 0: 42475.9. 
Samples: 2166927520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:23:26,994][12645] Avg episode reward: [(0, '0.486')] +[2024-06-18 12:23:28,507][12883] Updated weights for policy 0, policy_version 132253 (0.0028) +[2024-06-18 12:23:31,783][12883] Updated weights for policy 0, policy_version 132263 (0.0032) +[2024-06-18 12:23:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2166996992. Throughput: 0: 42539.5. Samples: 2167056100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:23:31,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 12:23:36,129][12883] Updated weights for policy 0, policy_version 132273 (0.0043) +[2024-06-18 12:23:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 2167193600. Throughput: 0: 42660.0. Samples: 2167310540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:23:36,994][12645] Avg episode reward: [(0, '0.533')] +[2024-06-18 12:23:39,558][12883] Updated weights for policy 0, policy_version 132283 (0.0027) +[2024-06-18 12:23:41,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2167390208. Throughput: 0: 42536.6. Samples: 2167569000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:23:41,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 12:23:43,730][12883] Updated weights for policy 0, policy_version 132293 (0.0029) +[2024-06-18 12:23:46,994][12645] Fps is (10 sec: 44235.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2167635968. Throughput: 0: 42467.0. Samples: 2167692140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:23:47,000][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 12:23:47,151][12883] Updated weights for policy 0, policy_version 132303 (0.0029) +[2024-06-18 12:23:51,438][12883] Updated weights for policy 0, policy_version 132313 (0.0056) +[2024-06-18 12:23:51,996][12645] Fps is (10 sec: 44226.5, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 2167832576. Throughput: 0: 42593.4. Samples: 2167949100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:23:51,997][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 12:23:54,886][12883] Updated weights for policy 0, policy_version 132323 (0.0033) +[2024-06-18 12:23:55,806][12862] Signal inference workers to stop experience collection... (31700 times) +[2024-06-18 12:23:55,806][12862] Signal inference workers to resume experience collection... (31700 times) +[2024-06-18 12:23:55,819][12883] InferenceWorker_p0-w0: stopping experience collection (31700 times) +[2024-06-18 12:23:55,844][12883] InferenceWorker_p0-w0: resuming experience collection (31700 times) +[2024-06-18 12:23:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2168029184. Throughput: 0: 42553.2. Samples: 2168202560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:23:56,994][12645] Avg episode reward: [(0, '0.600')] +[2024-06-18 12:23:59,401][12883] Updated weights for policy 0, policy_version 132333 (0.0031) +[2024-06-18 12:24:01,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2168258560. Throughput: 0: 42662.4. Samples: 2168328500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:24:01,994][12645] Avg episode reward: [(0, '0.665')] +[2024-06-18 12:24:02,715][12883] Updated weights for policy 0, policy_version 132343 (0.0032) +[2024-06-18 12:24:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2168471552. 
Throughput: 0: 42482.3. Samples: 2168583340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 12:24:06,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 12:24:07,006][12883] Updated weights for policy 0, policy_version 132353 (0.0031) +[2024-06-18 12:24:10,511][12883] Updated weights for policy 0, policy_version 132363 (0.0038) +[2024-06-18 12:24:11,996][12645] Fps is (10 sec: 42588.7, 60 sec: 43142.9, 300 sec: 42542.5). Total num frames: 2168684544. Throughput: 0: 42273.1. Samples: 2168829900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 12:24:11,996][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 12:24:14,790][12883] Updated weights for policy 0, policy_version 132373 (0.0031) +[2024-06-18 12:24:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 2168881152. Throughput: 0: 42415.6. Samples: 2168964800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 12:24:16,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 12:24:18,173][12883] Updated weights for policy 0, policy_version 132383 (0.0030) +[2024-06-18 12:24:21,994][12645] Fps is (10 sec: 39330.2, 60 sec: 42052.3, 300 sec: 42487.6). Total num frames: 2169077760. Throughput: 0: 42502.1. Samples: 2169223140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 12:24:21,994][12645] Avg episode reward: [(0, '0.650')] +[2024-06-18 12:24:22,445][12883] Updated weights for policy 0, policy_version 132393 (0.0034) +[2024-06-18 12:24:26,058][12883] Updated weights for policy 0, policy_version 132403 (0.0035) +[2024-06-18 12:24:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2169323520. Throughput: 0: 42205.7. Samples: 2169468260. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 12:24:26,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 12:24:30,559][12883] Updated weights for policy 0, policy_version 132413 (0.0032) +[2024-06-18 12:24:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2169520128. Throughput: 0: 42467.3. Samples: 2169603160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 12:24:31,994][12645] Avg episode reward: [(0, '0.706')] +[2024-06-18 12:24:33,970][12883] Updated weights for policy 0, policy_version 132423 (0.0027) +[2024-06-18 12:24:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2169733120. Throughput: 0: 42490.0. Samples: 2169861060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 12:24:36,994][12645] Avg episode reward: [(0, '0.572')] +[2024-06-18 12:24:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000132430_2169733120.pth... +[2024-06-18 12:24:37,045][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000131808_2159542272.pth +[2024-06-18 12:24:37,971][12883] Updated weights for policy 0, policy_version 132433 (0.0033) +[2024-06-18 12:24:41,562][12883] Updated weights for policy 0, policy_version 132443 (0.0026) +[2024-06-18 12:24:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2169962496. Throughput: 0: 42441.1. Samples: 2170112400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 12:24:41,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 12:24:45,818][12883] Updated weights for policy 0, policy_version 132453 (0.0034) +[2024-06-18 12:24:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 2170175488. Throughput: 0: 42488.4. Samples: 2170240480. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 12:24:46,994][12645] Avg episode reward: [(0, '0.796')] +[2024-06-18 12:24:49,160][12883] Updated weights for policy 0, policy_version 132463 (0.0030) +[2024-06-18 12:24:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 2170355712. Throughput: 0: 42463.1. Samples: 2170494180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 12:24:51,994][12645] Avg episode reward: [(0, '0.710')] +[2024-06-18 12:24:53,348][12883] Updated weights for policy 0, policy_version 132473 (0.0044) +[2024-06-18 12:24:56,882][12883] Updated weights for policy 0, policy_version 132483 (0.0028) +[2024-06-18 12:24:56,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2170601472. Throughput: 0: 42591.3. Samples: 2170746420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 12:24:56,995][12645] Avg episode reward: [(0, '0.705')] +[2024-06-18 12:25:01,034][12883] Updated weights for policy 0, policy_version 132493 (0.0031) +[2024-06-18 12:25:01,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2170814464. Throughput: 0: 42466.7. Samples: 2170875800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 12:25:01,994][12645] Avg episode reward: [(0, '0.714')] +[2024-06-18 12:25:04,402][12883] Updated weights for policy 0, policy_version 132503 (0.0033) +[2024-06-18 12:25:06,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2171011072. Throughput: 0: 42367.7. Samples: 2171129680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) +[2024-06-18 12:25:06,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 12:25:08,610][12883] Updated weights for policy 0, policy_version 132513 (0.0051) +[2024-06-18 12:25:09,935][12862] Signal inference workers to stop experience collection... 
(31750 times) +[2024-06-18 12:25:09,970][12883] InferenceWorker_p0-w0: stopping experience collection (31750 times) +[2024-06-18 12:25:09,982][12862] Signal inference workers to resume experience collection... (31750 times) +[2024-06-18 12:25:09,988][12883] InferenceWorker_p0-w0: resuming experience collection (31750 times) +[2024-06-18 12:25:11,998][12645] Fps is (10 sec: 42581.0, 60 sec: 42597.1, 300 sec: 42597.8). Total num frames: 2171240448. Throughput: 0: 42650.4. Samples: 2171387700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 12:25:11,998][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 12:25:12,757][12883] Updated weights for policy 0, policy_version 132523 (0.0038) +[2024-06-18 12:25:16,669][12883] Updated weights for policy 0, policy_version 132533 (0.0037) +[2024-06-18 12:25:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2171453440. Throughput: 0: 42514.3. Samples: 2171516300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 12:25:16,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 12:25:20,388][12883] Updated weights for policy 0, policy_version 132543 (0.0046) +[2024-06-18 12:25:21,994][12645] Fps is (10 sec: 39338.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2171633664. Throughput: 0: 42303.7. Samples: 2171764720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 12:25:21,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 12:25:24,367][12883] Updated weights for policy 0, policy_version 132553 (0.0042) +[2024-06-18 12:25:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2171879424. Throughput: 0: 42364.4. Samples: 2172018800. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 12:25:26,994][12645] Avg episode reward: [(0, '0.632')] +[2024-06-18 12:25:28,075][12883] Updated weights for policy 0, policy_version 132563 (0.0029) +[2024-06-18 12:25:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2172059648. Throughput: 0: 42470.7. Samples: 2172151660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 12:25:31,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 12:25:32,204][12883] Updated weights for policy 0, policy_version 132573 (0.0029) +[2024-06-18 12:25:35,832][12883] Updated weights for policy 0, policy_version 132583 (0.0034) +[2024-06-18 12:25:36,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2172272640. Throughput: 0: 42393.2. Samples: 2172401880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 12:25:36,994][12645] Avg episode reward: [(0, '0.676')] +[2024-06-18 12:25:39,859][12883] Updated weights for policy 0, policy_version 132593 (0.0034) +[2024-06-18 12:25:41,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 2172485632. Throughput: 0: 42341.3. Samples: 2172651780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 12:25:41,995][12645] Avg episode reward: [(0, '0.789')] +[2024-06-18 12:25:43,600][12883] Updated weights for policy 0, policy_version 132603 (0.0028) +[2024-06-18 12:25:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2172698624. Throughput: 0: 42297.8. Samples: 2172779200. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 12:25:46,996][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 12:25:47,579][12883] Updated weights for policy 0, policy_version 132613 (0.0037) +[2024-06-18 12:25:51,204][12883] Updated weights for policy 0, policy_version 132623 (0.0030) +[2024-06-18 12:25:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2172911616. Throughput: 0: 42299.8. Samples: 2173033180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 12:25:51,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 12:25:55,194][12883] Updated weights for policy 0, policy_version 132633 (0.0040) +[2024-06-18 12:25:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 2173124608. Throughput: 0: 42214.6. Samples: 2173287180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 12:25:56,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 12:25:58,922][12883] Updated weights for policy 0, policy_version 132643 (0.0038) +[2024-06-18 12:26:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2173337600. Throughput: 0: 42181.2. Samples: 2173414460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 12:26:01,994][12645] Avg episode reward: [(0, '0.571')] +[2024-06-18 12:26:02,686][12883] Updated weights for policy 0, policy_version 132653 (0.0025) +[2024-06-18 12:26:06,654][12883] Updated weights for policy 0, policy_version 132663 (0.0036) +[2024-06-18 12:26:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2173550592. Throughput: 0: 42309.8. Samples: 2173668660. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) +[2024-06-18 12:26:06,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 12:26:10,349][12883] Updated weights for policy 0, policy_version 132673 (0.0036) +[2024-06-18 12:26:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42055.2, 300 sec: 42487.3). Total num frames: 2173763584. Throughput: 0: 42292.9. Samples: 2173921980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 12:26:11,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 12:26:14,372][12883] Updated weights for policy 0, policy_version 132683 (0.0034) +[2024-06-18 12:26:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 2173960192. Throughput: 0: 42092.9. Samples: 2174045840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 12:26:16,994][12645] Avg episode reward: [(0, '0.547')] +[2024-06-18 12:26:18,477][12883] Updated weights for policy 0, policy_version 132693 (0.0040) +[2024-06-18 12:26:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2174189568. Throughput: 0: 42249.4. Samples: 2174303100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 12:26:21,994][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 12:26:22,358][12883] Updated weights for policy 0, policy_version 132703 (0.0030) +[2024-06-18 12:26:26,178][12883] Updated weights for policy 0, policy_version 132713 (0.0033) +[2024-06-18 12:26:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2174402560. Throughput: 0: 42205.5. Samples: 2174551020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 12:26:26,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 12:26:30,059][12883] Updated weights for policy 0, policy_version 132723 (0.0029) +[2024-06-18 12:26:31,996][12645] Fps is (10 sec: 39312.7, 60 sec: 42050.7, 300 sec: 42375.9). Total num frames: 2174582784. Throughput: 0: 42156.6. 
Samples: 2174676340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 12:26:31,996][12645] Avg episode reward: [(0, '0.401')] +[2024-06-18 12:26:34,106][12883] Updated weights for policy 0, policy_version 132733 (0.0027) +[2024-06-18 12:26:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2174812160. Throughput: 0: 42166.0. Samples: 2174930640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 12:26:36,994][12645] Avg episode reward: [(0, '0.681')] +[2024-06-18 12:26:37,090][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000132741_2174828544.pth... +[2024-06-18 12:26:37,135][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000132120_2164654080.pth +[2024-06-18 12:26:37,754][12883] Updated weights for policy 0, policy_version 132743 (0.0023) +[2024-06-18 12:26:41,727][12883] Updated weights for policy 0, policy_version 132753 (0.0040) +[2024-06-18 12:26:41,996][12645] Fps is (10 sec: 45875.3, 60 sec: 42596.9, 300 sec: 42487.0). Total num frames: 2175041536. Throughput: 0: 42295.2. Samples: 2175190560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 12:26:41,997][12645] Avg episode reward: [(0, '0.633')] +[2024-06-18 12:26:45,135][12862] Signal inference workers to stop experience collection... (31800 times) +[2024-06-18 12:26:45,136][12862] Signal inference workers to resume experience collection... (31800 times) +[2024-06-18 12:26:45,155][12883] InferenceWorker_p0-w0: stopping experience collection (31800 times) +[2024-06-18 12:26:45,155][12883] InferenceWorker_p0-w0: resuming experience collection (31800 times) +[2024-06-18 12:26:45,288][12883] Updated weights for policy 0, policy_version 132763 (0.0037) +[2024-06-18 12:26:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42487.5). Total num frames: 2175238144. Throughput: 0: 42276.5. Samples: 2175316900. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 12:26:46,994][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 12:26:49,203][12883] Updated weights for policy 0, policy_version 132773 (0.0028) +[2024-06-18 12:26:51,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2175467520. Throughput: 0: 42456.7. Samples: 2175579220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 12:26:51,994][12645] Avg episode reward: [(0, '0.296')] +[2024-06-18 12:26:52,830][12883] Updated weights for policy 0, policy_version 132783 (0.0030) +[2024-06-18 12:26:56,766][12883] Updated weights for policy 0, policy_version 132793 (0.0044) +[2024-06-18 12:26:56,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2175680512. Throughput: 0: 42507.8. Samples: 2175834840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 12:26:56,994][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 12:27:00,605][12883] Updated weights for policy 0, policy_version 132803 (0.0033) +[2024-06-18 12:27:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2175877120. Throughput: 0: 42580.3. Samples: 2175961960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 12:27:01,994][12645] Avg episode reward: [(0, '0.585')] +[2024-06-18 12:27:04,527][12883] Updated weights for policy 0, policy_version 132813 (0.0043) +[2024-06-18 12:27:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 2176106496. Throughput: 0: 42634.1. Samples: 2176221640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 12:27:06,994][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 12:27:08,542][12883] Updated weights for policy 0, policy_version 132823 (0.0025) +[2024-06-18 12:27:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2176319488. Throughput: 0: 42807.5. 
Samples: 2176477360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 12:27:11,998][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 12:27:12,103][12883] Updated weights for policy 0, policy_version 132833 (0.0036) +[2024-06-18 12:27:16,044][12883] Updated weights for policy 0, policy_version 132843 (0.0034) +[2024-06-18 12:27:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 2176548864. Throughput: 0: 42941.7. Samples: 2176608620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 12:27:16,994][12645] Avg episode reward: [(0, '0.570')] +[2024-06-18 12:27:19,531][12883] Updated weights for policy 0, policy_version 132853 (0.0044) +[2024-06-18 12:27:21,994][12645] Fps is (10 sec: 42595.6, 60 sec: 42597.9, 300 sec: 42487.2). Total num frames: 2176745472. Throughput: 0: 42937.9. Samples: 2176862880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 12:27:21,995][12645] Avg episode reward: [(0, '0.579')] +[2024-06-18 12:27:23,681][12883] Updated weights for policy 0, policy_version 132863 (0.0030) +[2024-06-18 12:27:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2176974848. Throughput: 0: 42830.0. Samples: 2177117820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 12:27:26,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 12:27:27,287][12883] Updated weights for policy 0, policy_version 132873 (0.0040) +[2024-06-18 12:27:31,318][12883] Updated weights for policy 0, policy_version 132883 (0.0030) +[2024-06-18 12:27:31,994][12645] Fps is (10 sec: 42601.4, 60 sec: 43146.1, 300 sec: 42431.8). Total num frames: 2177171456. Throughput: 0: 42941.7. Samples: 2177249280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 12:27:31,994][12645] Avg episode reward: [(0, '0.106')] +[2024-06-18 12:27:35,391][12883] Updated weights for policy 0, policy_version 132893 (0.0036) +[2024-06-18 12:27:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2177400832. Throughput: 0: 42810.7. Samples: 2177505700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 12:27:36,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 12:27:39,233][12883] Updated weights for policy 0, policy_version 132903 (0.0055) +[2024-06-18 12:27:42,000][12645] Fps is (10 sec: 42571.9, 60 sec: 42595.6, 300 sec: 42486.4). Total num frames: 2177597440. Throughput: 0: 42670.2. Samples: 2177755260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 12:27:42,001][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 12:27:42,865][12883] Updated weights for policy 0, policy_version 132913 (0.0041) +[2024-06-18 12:27:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2177810432. Throughput: 0: 42838.7. Samples: 2177889700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 12:27:46,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 12:27:46,998][12883] Updated weights for policy 0, policy_version 132923 (0.0037) +[2024-06-18 12:27:50,996][12883] Updated weights for policy 0, policy_version 132933 (0.0028) +[2024-06-18 12:27:51,995][12645] Fps is (10 sec: 42620.1, 60 sec: 42597.6, 300 sec: 42542.7). Total num frames: 2178023424. Throughput: 0: 42766.5. Samples: 2178146180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 12:27:51,995][12645] Avg episode reward: [(0, '0.380')] +[2024-06-18 12:27:54,676][12883] Updated weights for policy 0, policy_version 132943 (0.0026) +[2024-06-18 12:27:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2178236416. Throughput: 0: 42553.8. 
Samples: 2178392280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 12:27:56,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 12:27:58,611][12883] Updated weights for policy 0, policy_version 132953 (0.0025) +[2024-06-18 12:28:01,994][12645] Fps is (10 sec: 40964.9, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 2178433024. Throughput: 0: 42560.1. Samples: 2178523820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 12:28:02,000][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 12:28:02,434][12883] Updated weights for policy 0, policy_version 132963 (0.0032) +[2024-06-18 12:28:06,151][12883] Updated weights for policy 0, policy_version 132973 (0.0045) +[2024-06-18 12:28:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2178678784. Throughput: 0: 42599.4. Samples: 2178779820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 12:28:06,994][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 12:28:09,931][12883] Updated weights for policy 0, policy_version 132983 (0.0044) +[2024-06-18 12:28:10,611][12862] Signal inference workers to stop experience collection... (31850 times) +[2024-06-18 12:28:10,665][12883] InferenceWorker_p0-w0: stopping experience collection (31850 times) +[2024-06-18 12:28:10,668][12862] Signal inference workers to resume experience collection... (31850 times) +[2024-06-18 12:28:10,678][12883] InferenceWorker_p0-w0: resuming experience collection (31850 times) +[2024-06-18 12:28:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2178875392. Throughput: 0: 42565.9. Samples: 2179033280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 12:28:11,994][12645] Avg episode reward: [(0, '0.581')] +[2024-06-18 12:28:13,710][12883] Updated weights for policy 0, policy_version 132993 (0.0028) +[2024-06-18 12:28:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). 
Total num frames: 2179088384. Throughput: 0: 42554.7. Samples: 2179164240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:28:16,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 12:28:17,432][12883] Updated weights for policy 0, policy_version 133003 (0.0040) +[2024-06-18 12:28:21,211][12883] Updated weights for policy 0, policy_version 133013 (0.0028) +[2024-06-18 12:28:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.9, 300 sec: 42542.9). Total num frames: 2179301376. Throughput: 0: 42546.3. Samples: 2179420280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:28:21,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 12:28:25,223][12883] Updated weights for policy 0, policy_version 133023 (0.0027) +[2024-06-18 12:28:26,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2179530752. Throughput: 0: 42756.1. Samples: 2179679020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:28:26,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 12:28:28,784][12883] Updated weights for policy 0, policy_version 133033 (0.0035) +[2024-06-18 12:28:31,996][12645] Fps is (10 sec: 42587.3, 60 sec: 42596.6, 300 sec: 42486.9). Total num frames: 2179727360. Throughput: 0: 42571.3. Samples: 2179805520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:28:31,997][12645] Avg episode reward: [(0, '0.699')] +[2024-06-18 12:28:32,755][12883] Updated weights for policy 0, policy_version 133043 (0.0029) +[2024-06-18 12:28:36,578][12883] Updated weights for policy 0, policy_version 133053 (0.0042) +[2024-06-18 12:28:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2179940352. Throughput: 0: 42521.1. Samples: 2180059580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:28:36,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 12:28:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133054_2179956736.pth... +[2024-06-18 12:28:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000132430_2169733120.pth +[2024-06-18 12:28:40,449][12883] Updated weights for policy 0, policy_version 133063 (0.0026) +[2024-06-18 12:28:41,994][12645] Fps is (10 sec: 42609.5, 60 sec: 42602.8, 300 sec: 42431.8). Total num frames: 2180153344. Throughput: 0: 42829.3. Samples: 2180319600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:28:41,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 12:28:44,115][12883] Updated weights for policy 0, policy_version 133073 (0.0033) +[2024-06-18 12:28:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42487.7). Total num frames: 2180366336. Throughput: 0: 42709.4. Samples: 2180445740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:28:46,994][12645] Avg episode reward: [(0, '0.714')] +[2024-06-18 12:28:48,117][12883] Updated weights for policy 0, policy_version 133083 (0.0037) +[2024-06-18 12:28:51,941][12883] Updated weights for policy 0, policy_version 133093 (0.0026) +[2024-06-18 12:28:51,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42870.7, 300 sec: 42598.1). Total num frames: 2180595712. Throughput: 0: 42761.0. Samples: 2180704160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:28:51,996][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 12:28:55,625][12883] Updated weights for policy 0, policy_version 133103 (0.0039) +[2024-06-18 12:28:56,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2180792320. Throughput: 0: 42872.8. Samples: 2180962560. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:28:56,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 12:28:59,427][12883] Updated weights for policy 0, policy_version 133113 (0.0033) +[2024-06-18 12:29:01,994][12645] Fps is (10 sec: 42607.9, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 2181021696. Throughput: 0: 42636.9. Samples: 2181082900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:29:01,994][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 12:29:03,550][12883] Updated weights for policy 0, policy_version 133123 (0.0039) +[2024-06-18 12:29:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 2181234688. Throughput: 0: 42778.7. Samples: 2181345320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:29:06,994][12645] Avg episode reward: [(0, '0.590')] +[2024-06-18 12:29:07,030][12883] Updated weights for policy 0, policy_version 133133 (0.0031) +[2024-06-18 12:29:11,089][12883] Updated weights for policy 0, policy_version 133143 (0.0031) +[2024-06-18 12:29:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2181447680. Throughput: 0: 42577.8. Samples: 2181595020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:29:11,994][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 12:29:15,239][12883] Updated weights for policy 0, policy_version 133153 (0.0039) +[2024-06-18 12:29:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2181644288. Throughput: 0: 42724.8. Samples: 2181728020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:29:16,994][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 12:29:18,562][12883] Updated weights for policy 0, policy_version 133163 (0.0027) +[2024-06-18 12:29:21,999][12645] Fps is (10 sec: 42577.7, 60 sec: 42867.9, 300 sec: 42542.2). Total num frames: 2181873664. Throughput: 0: 42770.4. 
Samples: 2181984460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 12:29:21,999][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 12:29:22,813][12883] Updated weights for policy 0, policy_version 133173 (0.0036) +[2024-06-18 12:29:26,148][12883] Updated weights for policy 0, policy_version 133183 (0.0038) +[2024-06-18 12:29:26,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2182086656. Throughput: 0: 42612.8. Samples: 2182237180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 12:29:26,994][12645] Avg episode reward: [(0, '0.572')] +[2024-06-18 12:29:30,685][12883] Updated weights for policy 0, policy_version 133193 (0.0022) +[2024-06-18 12:29:31,994][12645] Fps is (10 sec: 40979.8, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 2182283264. Throughput: 0: 42751.8. Samples: 2182369580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 12:29:31,994][12645] Avg episode reward: [(0, '0.707')] +[2024-06-18 12:29:34,066][12883] Updated weights for policy 0, policy_version 133203 (0.0031) +[2024-06-18 12:29:37,000][12645] Fps is (10 sec: 42572.3, 60 sec: 42867.0, 300 sec: 42542.0). Total num frames: 2182512640. Throughput: 0: 42562.9. Samples: 2182619660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 12:29:37,000][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 12:29:38,357][12883] Updated weights for policy 0, policy_version 133213 (0.0038) +[2024-06-18 12:29:41,915][12883] Updated weights for policy 0, policy_version 133223 (0.0032) +[2024-06-18 12:29:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2182725632. Throughput: 0: 42474.3. Samples: 2182873900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 12:29:41,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 12:29:46,357][12883] Updated weights for policy 0, policy_version 133233 (0.0035) +[2024-06-18 12:29:46,994][12645] Fps is (10 sec: 40985.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2182922240. Throughput: 0: 42618.6. Samples: 2183000740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 12:29:46,994][12645] Avg episode reward: [(0, '0.240')] +[2024-06-18 12:29:49,707][12883] Updated weights for policy 0, policy_version 133243 (0.0032) +[2024-06-18 12:29:49,748][12862] Signal inference workers to stop experience collection... (31900 times) +[2024-06-18 12:29:49,748][12862] Signal inference workers to resume experience collection... (31900 times) +[2024-06-18 12:29:49,775][12883] InferenceWorker_p0-w0: stopping experience collection (31900 times) +[2024-06-18 12:29:49,775][12883] InferenceWorker_p0-w0: resuming experience collection (31900 times) +[2024-06-18 12:29:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 2183151616. Throughput: 0: 42399.6. Samples: 2183253300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 12:29:51,994][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 12:29:54,206][12883] Updated weights for policy 0, policy_version 133253 (0.0036) +[2024-06-18 12:29:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2183364608. Throughput: 0: 42555.1. Samples: 2183510000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 12:29:56,994][12645] Avg episode reward: [(0, '0.663')] +[2024-06-18 12:29:57,242][12883] Updated weights for policy 0, policy_version 133263 (0.0030) +[2024-06-18 12:30:01,938][12883] Updated weights for policy 0, policy_version 133273 (0.0030) +[2024-06-18 12:30:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2183544832. 
Throughput: 0: 42445.2. Samples: 2183638060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 12:30:01,995][12645] Avg episode reward: [(0, '0.337')] +[2024-06-18 12:30:04,826][12883] Updated weights for policy 0, policy_version 133283 (0.0028) +[2024-06-18 12:30:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42543.5). Total num frames: 2183790592. Throughput: 0: 42292.6. Samples: 2183887420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 12:30:06,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 12:30:09,656][12883] Updated weights for policy 0, policy_version 133293 (0.0038) +[2024-06-18 12:30:11,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2184003584. Throughput: 0: 42393.4. Samples: 2184144880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 12:30:11,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 12:30:12,833][12883] Updated weights for policy 0, policy_version 133303 (0.0043) +[2024-06-18 12:30:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2184183808. Throughput: 0: 42295.6. Samples: 2184272880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 12:30:16,994][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 12:30:17,282][12883] Updated weights for policy 0, policy_version 133313 (0.0035) +[2024-06-18 12:30:20,359][12883] Updated weights for policy 0, policy_version 133323 (0.0024) +[2024-06-18 12:30:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42328.8, 300 sec: 42487.3). Total num frames: 2184413184. Throughput: 0: 42363.1. Samples: 2184525740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:30:21,994][12645] Avg episode reward: [(0, '0.265')] +[2024-06-18 12:30:24,995][12883] Updated weights for policy 0, policy_version 133333 (0.0030) +[2024-06-18 12:30:26,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42653.9). 
Total num frames: 2184642560. Throughput: 0: 42294.3. Samples: 2184777140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:30:26,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 12:30:28,021][12883] Updated weights for policy 0, policy_version 133343 (0.0039) +[2024-06-18 12:30:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2184822784. Throughput: 0: 42471.2. Samples: 2184911940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:30:31,994][12645] Avg episode reward: [(0, '0.571')] +[2024-06-18 12:30:32,571][12883] Updated weights for policy 0, policy_version 133353 (0.0033) +[2024-06-18 12:30:35,916][12883] Updated weights for policy 0, policy_version 133363 (0.0033) +[2024-06-18 12:30:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42329.7, 300 sec: 42598.4). Total num frames: 2185052160. Throughput: 0: 42642.2. Samples: 2185172200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:30:36,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 12:30:37,050][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133366_2185068544.pth... +[2024-06-18 12:30:37,111][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000132741_2174828544.pth +[2024-06-18 12:30:40,409][12883] Updated weights for policy 0, policy_version 133373 (0.0043) +[2024-06-18 12:30:41,997][12645] Fps is (10 sec: 45860.1, 60 sec: 42596.1, 300 sec: 42653.5). Total num frames: 2185281536. Throughput: 0: 42465.9. Samples: 2185421100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:30:41,998][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 12:30:43,526][12883] Updated weights for policy 0, policy_version 133383 (0.0037) +[2024-06-18 12:30:46,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42323.8, 300 sec: 42542.6). Total num frames: 2185461760. Throughput: 0: 42562.8. Samples: 2185553480. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:30:46,996][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 12:30:47,900][12883] Updated weights for policy 0, policy_version 133393 (0.0028) +[2024-06-18 12:30:51,037][12883] Updated weights for policy 0, policy_version 133403 (0.0040) +[2024-06-18 12:30:51,994][12645] Fps is (10 sec: 40973.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2185691136. Throughput: 0: 42741.3. Samples: 2185810780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:30:51,994][12645] Avg episode reward: [(0, '0.486')] +[2024-06-18 12:30:55,433][12883] Updated weights for policy 0, policy_version 133413 (0.0033) +[2024-06-18 12:30:55,891][12862] Signal inference workers to stop experience collection... (31950 times) +[2024-06-18 12:30:55,891][12862] Signal inference workers to resume experience collection... (31950 times) +[2024-06-18 12:30:55,932][12883] InferenceWorker_p0-w0: stopping experience collection (31950 times) +[2024-06-18 12:30:55,932][12883] InferenceWorker_p0-w0: resuming experience collection (31950 times) +[2024-06-18 12:30:56,994][12645] Fps is (10 sec: 45885.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2185920512. Throughput: 0: 42748.9. Samples: 2186068580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:30:56,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 12:30:58,486][12883] Updated weights for policy 0, policy_version 133423 (0.0037) +[2024-06-18 12:31:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2186117120. Throughput: 0: 42766.7. Samples: 2186197380. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:31:01,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 12:31:02,960][12883] Updated weights for policy 0, policy_version 133433 (0.0026) +[2024-06-18 12:31:06,203][12883] Updated weights for policy 0, policy_version 133443 (0.0027) +[2024-06-18 12:31:06,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2186346496. Throughput: 0: 42700.8. Samples: 2186447280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:31:06,994][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 12:31:10,594][12883] Updated weights for policy 0, policy_version 133453 (0.0022) +[2024-06-18 12:31:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2186559488. Throughput: 0: 42896.9. Samples: 2186707500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:31:11,994][12645] Avg episode reward: [(0, '0.665')] +[2024-06-18 12:31:14,419][12883] Updated weights for policy 0, policy_version 133463 (0.0048) +[2024-06-18 12:31:16,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2186756096. Throughput: 0: 42669.8. Samples: 2186832080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:31:16,994][12645] Avg episode reward: [(0, '0.674')] +[2024-06-18 12:31:18,189][12883] Updated weights for policy 0, policy_version 133473 (0.0033) +[2024-06-18 12:31:21,874][12883] Updated weights for policy 0, policy_version 133483 (0.0035) +[2024-06-18 12:31:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2186985472. Throughput: 0: 42628.0. Samples: 2187090460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 12:31:21,994][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 12:31:26,295][12883] Updated weights for policy 0, policy_version 133493 (0.0031) +[2024-06-18 12:31:26,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42709.8). Total num frames: 2187182080. Throughput: 0: 42912.9. Samples: 2187352040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:31:26,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 12:31:29,461][12883] Updated weights for policy 0, policy_version 133503 (0.0041) +[2024-06-18 12:31:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2187395072. Throughput: 0: 42599.4. Samples: 2187470360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:31:31,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 12:31:34,014][12883] Updated weights for policy 0, policy_version 133513 (0.0028) +[2024-06-18 12:31:36,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 2187624448. Throughput: 0: 42560.0. Samples: 2187725980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:31:36,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 12:31:37,158][12883] Updated weights for policy 0, policy_version 133523 (0.0041) +[2024-06-18 12:31:41,621][12883] Updated weights for policy 0, policy_version 133533 (0.0036) +[2024-06-18 12:31:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42327.7, 300 sec: 42653.9). Total num frames: 2187821056. Throughput: 0: 42620.9. Samples: 2187986520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:31:41,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 12:31:44,783][12883] Updated weights for policy 0, policy_version 133543 (0.0033) +[2024-06-18 12:31:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42873.0, 300 sec: 42598.4). Total num frames: 2188034048. Throughput: 0: 42480.8. 
Samples: 2188109020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:31:46,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 12:31:49,285][12883] Updated weights for policy 0, policy_version 133553 (0.0027) +[2024-06-18 12:31:51,996][12645] Fps is (10 sec: 44226.6, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 2188263424. Throughput: 0: 42625.5. Samples: 2188365520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:31:51,996][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 12:31:52,581][12883] Updated weights for policy 0, policy_version 133563 (0.0034) +[2024-06-18 12:31:56,797][12883] Updated weights for policy 0, policy_version 133573 (0.0034) +[2024-06-18 12:31:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 2188460032. Throughput: 0: 42718.6. Samples: 2188629840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:31:56,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 12:32:00,247][12883] Updated weights for policy 0, policy_version 133583 (0.0040) +[2024-06-18 12:32:02,000][12645] Fps is (10 sec: 40943.6, 60 sec: 42594.0, 300 sec: 42597.5). Total num frames: 2188673024. Throughput: 0: 42624.3. Samples: 2188750440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:32:02,001][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 12:32:04,690][12883] Updated weights for policy 0, policy_version 133593 (0.0032) +[2024-06-18 12:32:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2188902400. Throughput: 0: 42674.2. Samples: 2189010800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:32:06,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 12:32:07,631][12883] Updated weights for policy 0, policy_version 133603 (0.0039) +[2024-06-18 12:32:11,996][12645] Fps is (10 sec: 40976.3, 60 sec: 42050.6, 300 sec: 42487.0). Total num frames: 2189082624. 
Throughput: 0: 42605.0. Samples: 2189269360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:32:11,997][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 12:32:12,488][12883] Updated weights for policy 0, policy_version 133613 (0.0039) +[2024-06-18 12:32:15,487][12883] Updated weights for policy 0, policy_version 133623 (0.0034) +[2024-06-18 12:32:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.5). Total num frames: 2189312000. Throughput: 0: 42589.8. Samples: 2189386900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:32:16,994][12645] Avg episode reward: [(0, '0.637')] +[2024-06-18 12:32:20,129][12883] Updated weights for policy 0, policy_version 133633 (0.0042) +[2024-06-18 12:32:21,994][12645] Fps is (10 sec: 47524.4, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2189557760. Throughput: 0: 42651.6. Samples: 2189645300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 12:32:21,994][12645] Avg episode reward: [(0, '0.800')] +[2024-06-18 12:32:22,875][12883] Updated weights for policy 0, policy_version 133643 (0.0034) +[2024-06-18 12:32:26,995][12645] Fps is (10 sec: 42594.9, 60 sec: 42597.8, 300 sec: 42598.3). Total num frames: 2189737984. Throughput: 0: 42736.0. Samples: 2189909680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 12:32:26,995][12645] Avg episode reward: [(0, '0.834')] +[2024-06-18 12:32:27,947][12883] Updated weights for policy 0, policy_version 133653 (0.0025) +[2024-06-18 12:32:30,309][12883] Updated weights for policy 0, policy_version 133663 (0.0026) +[2024-06-18 12:32:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2189950976. Throughput: 0: 42579.6. Samples: 2190025100. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 12:32:31,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 12:32:35,653][12883] Updated weights for policy 0, policy_version 133673 (0.0045) +[2024-06-18 12:32:36,382][12862] Signal inference workers to stop experience collection... (32000 times) +[2024-06-18 12:32:36,383][12862] Signal inference workers to resume experience collection... (32000 times) +[2024-06-18 12:32:36,403][12883] InferenceWorker_p0-w0: stopping experience collection (32000 times) +[2024-06-18 12:32:36,403][12883] InferenceWorker_p0-w0: resuming experience collection (32000 times) +[2024-06-18 12:32:36,994][12645] Fps is (10 sec: 44240.7, 60 sec: 42598.4, 300 sec: 42654.8). Total num frames: 2190180352. Throughput: 0: 42926.1. Samples: 2190297100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 12:32:36,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 12:32:37,090][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133679_2190196736.pth... +[2024-06-18 12:32:37,160][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133054_2179956736.pth +[2024-06-18 12:32:37,888][12883] Updated weights for policy 0, policy_version 133683 (0.0038) +[2024-06-18 12:32:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2190376960. Throughput: 0: 42564.3. Samples: 2190545240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 12:32:41,994][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 12:32:43,291][12883] Updated weights for policy 0, policy_version 133693 (0.0027) +[2024-06-18 12:32:45,613][12883] Updated weights for policy 0, policy_version 133703 (0.0029) +[2024-06-18 12:32:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 2190589952. Throughput: 0: 42705.0. Samples: 2190671900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 12:32:46,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 12:32:50,920][12883] Updated weights for policy 0, policy_version 133713 (0.0027) +[2024-06-18 12:32:51,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 2190802944. Throughput: 0: 42759.6. Samples: 2190934980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 12:32:51,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 12:32:53,298][12883] Updated weights for policy 0, policy_version 133723 (0.0032) +[2024-06-18 12:32:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2191015936. Throughput: 0: 42517.7. Samples: 2191182560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 12:32:56,994][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 12:32:58,770][12883] Updated weights for policy 0, policy_version 133733 (0.0044) +[2024-06-18 12:33:01,660][12883] Updated weights for policy 0, policy_version 133743 (0.0038) +[2024-06-18 12:33:01,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42875.8, 300 sec: 42598.4). Total num frames: 2191245312. Throughput: 0: 42677.3. Samples: 2191307380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 12:33:01,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 12:33:06,303][12883] Updated weights for policy 0, policy_version 133753 (0.0032) +[2024-06-18 12:33:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2191441920. Throughput: 0: 42725.8. Samples: 2191567960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 12:33:06,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 12:33:09,464][12883] Updated weights for policy 0, policy_version 133763 (0.0042) +[2024-06-18 12:33:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 2191654912. Throughput: 0: 42414.2. Samples: 2191818280. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 12:33:11,994][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 12:33:13,839][12883] Updated weights for policy 0, policy_version 133773 (0.0027) +[2024-06-18 12:33:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2191884288. Throughput: 0: 42709.4. Samples: 2191947020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 12:33:16,994][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 12:33:17,022][12883] Updated weights for policy 0, policy_version 133783 (0.0033) +[2024-06-18 12:33:21,715][12883] Updated weights for policy 0, policy_version 133793 (0.0031) +[2024-06-18 12:33:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2192080896. Throughput: 0: 42489.0. Samples: 2192209100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 12:33:21,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 12:33:24,675][12883] Updated weights for policy 0, policy_version 133803 (0.0041) +[2024-06-18 12:33:26,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42326.0, 300 sec: 42543.2). Total num frames: 2192277504. Throughput: 0: 42534.8. Samples: 2192459300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 12:33:26,994][12645] Avg episode reward: [(0, '0.514')] +[2024-06-18 12:33:29,464][12883] Updated weights for policy 0, policy_version 133813 (0.0032) +[2024-06-18 12:33:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2192523264. Throughput: 0: 42634.8. Samples: 2192590460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 12:33:31,994][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 12:33:32,684][12883] Updated weights for policy 0, policy_version 133823 (0.0036) +[2024-06-18 12:33:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2192703488. Throughput: 0: 42513.2. Samples: 2192848080. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 12:33:36,994][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 12:33:37,121][12883] Updated weights for policy 0, policy_version 133833 (0.0033) +[2024-06-18 12:33:40,525][12883] Updated weights for policy 0, policy_version 133843 (0.0036) +[2024-06-18 12:33:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2192932864. Throughput: 0: 42441.3. Samples: 2193092420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 12:33:41,994][12645] Avg episode reward: [(0, '0.759')] +[2024-06-18 12:33:44,783][12883] Updated weights for policy 0, policy_version 133853 (0.0036) +[2024-06-18 12:33:46,994][12645] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 2193178624. Throughput: 0: 42701.4. Samples: 2193228940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 12:33:46,994][12645] Avg episode reward: [(0, '0.796')] +[2024-06-18 12:33:48,142][12883] Updated weights for policy 0, policy_version 133863 (0.0034) +[2024-06-18 12:33:48,370][12862] Signal inference workers to stop experience collection... (32050 times) +[2024-06-18 12:33:48,371][12862] Signal inference workers to resume experience collection... (32050 times) +[2024-06-18 12:33:48,411][12883] InferenceWorker_p0-w0: stopping experience collection (32050 times) +[2024-06-18 12:33:48,411][12883] InferenceWorker_p0-w0: resuming experience collection (32050 times) +[2024-06-18 12:33:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 2193342464. Throughput: 0: 42690.1. Samples: 2193489020. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 12:33:51,994][12645] Avg episode reward: [(0, '0.660')] +[2024-06-18 12:33:52,296][12883] Updated weights for policy 0, policy_version 133873 (0.0041) +[2024-06-18 12:33:55,615][12883] Updated weights for policy 0, policy_version 133883 (0.0037) +[2024-06-18 12:33:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2193588224. Throughput: 0: 42547.6. Samples: 2193732920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 12:33:56,994][12645] Avg episode reward: [(0, '0.644')] +[2024-06-18 12:33:59,957][12883] Updated weights for policy 0, policy_version 133893 (0.0037) +[2024-06-18 12:34:01,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2193801216. Throughput: 0: 42651.7. Samples: 2193866340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 12:34:01,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 12:34:03,182][12883] Updated weights for policy 0, policy_version 133903 (0.0027) +[2024-06-18 12:34:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2193981440. Throughput: 0: 42647.4. Samples: 2194128240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 12:34:06,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 12:34:07,865][12883] Updated weights for policy 0, policy_version 133913 (0.0044) +[2024-06-18 12:34:10,920][12883] Updated weights for policy 0, policy_version 133923 (0.0030) +[2024-06-18 12:34:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2194210816. Throughput: 0: 42428.1. Samples: 2194368560. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 12:34:11,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 12:34:15,534][12883] Updated weights for policy 0, policy_version 133933 (0.0028) +[2024-06-18 12:34:16,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42654.6). Total num frames: 2194456576. Throughput: 0: 42509.6. Samples: 2194503400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 12:34:16,994][12645] Avg episode reward: [(0, '0.739')] +[2024-06-18 12:34:18,654][12883] Updated weights for policy 0, policy_version 133943 (0.0038) +[2024-06-18 12:34:21,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 2194604032. Throughput: 0: 42347.9. Samples: 2194753740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 12:34:21,994][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 12:34:23,386][12883] Updated weights for policy 0, policy_version 133953 (0.0041) +[2024-06-18 12:34:26,390][12883] Updated weights for policy 0, policy_version 133963 (0.0033) +[2024-06-18 12:34:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2194849792. Throughput: 0: 42410.7. Samples: 2195000900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) +[2024-06-18 12:34:26,994][12645] Avg episode reward: [(0, '0.337')] +[2024-06-18 12:34:31,029][12883] Updated weights for policy 0, policy_version 133973 (0.0035) +[2024-06-18 12:34:31,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42598.3, 300 sec: 42599.3). Total num frames: 2195079168. Throughput: 0: 42481.7. Samples: 2195140620. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) +[2024-06-18 12:34:31,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 12:34:34,566][12883] Updated weights for policy 0, policy_version 133983 (0.0023) +[2024-06-18 12:34:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2195243008. Throughput: 0: 42154.8. Samples: 2195385980. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) +[2024-06-18 12:34:36,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 12:34:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133987_2195243008.pth... +[2024-06-18 12:34:37,056][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133366_2185068544.pth +[2024-06-18 12:34:38,651][12883] Updated weights for policy 0, policy_version 133993 (0.0042) +[2024-06-18 12:34:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2195488768. Throughput: 0: 42388.0. Samples: 2195640380. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) +[2024-06-18 12:34:41,994][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 12:34:42,310][12883] Updated weights for policy 0, policy_version 134003 (0.0032) +[2024-06-18 12:34:46,213][12883] Updated weights for policy 0, policy_version 134013 (0.0030) +[2024-06-18 12:34:46,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2195718144. Throughput: 0: 42487.9. Samples: 2195778300. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) +[2024-06-18 12:34:46,994][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 12:34:49,805][12883] Updated weights for policy 0, policy_version 134023 (0.0037) +[2024-06-18 12:34:51,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2195881984. Throughput: 0: 42034.2. Samples: 2196019780. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) +[2024-06-18 12:34:51,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 12:34:53,919][12883] Updated weights for policy 0, policy_version 134033 (0.0036) +[2024-06-18 12:34:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2196127744. Throughput: 0: 42468.2. Samples: 2196279640. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) +[2024-06-18 12:34:56,995][12645] Avg episode reward: [(0, '0.569')] +[2024-06-18 12:34:57,281][12883] Updated weights for policy 0, policy_version 134043 (0.0037) +[2024-06-18 12:35:00,516][12862] Signal inference workers to stop experience collection... (32100 times) +[2024-06-18 12:35:00,517][12862] Signal inference workers to resume experience collection... (32100 times) +[2024-06-18 12:35:00,527][12883] InferenceWorker_p0-w0: stopping experience collection (32100 times) +[2024-06-18 12:35:00,528][12883] InferenceWorker_p0-w0: resuming experience collection (32100 times) +[2024-06-18 12:35:01,459][12883] Updated weights for policy 0, policy_version 134053 (0.0037) +[2024-06-18 12:35:02,000][12645] Fps is (10 sec: 45846.3, 60 sec: 42320.8, 300 sec: 42541.9). Total num frames: 2196340736. Throughput: 0: 42399.0. Samples: 2196411620. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) +[2024-06-18 12:35:02,001][12645] Avg episode reward: [(0, '0.740')] +[2024-06-18 12:35:05,211][12883] Updated weights for policy 0, policy_version 134063 (0.0031) +[2024-06-18 12:35:06,996][12645] Fps is (10 sec: 39313.3, 60 sec: 42323.8, 300 sec: 42431.5). Total num frames: 2196520960. Throughput: 0: 42360.6. Samples: 2196660060. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) +[2024-06-18 12:35:06,996][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 12:35:09,100][12883] Updated weights for policy 0, policy_version 134073 (0.0030) +[2024-06-18 12:35:11,994][12645] Fps is (10 sec: 42625.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2196766720. Throughput: 0: 42540.4. Samples: 2196915220. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) +[2024-06-18 12:35:11,994][12645] Avg episode reward: [(0, '0.556')] +[2024-06-18 12:35:12,924][12883] Updated weights for policy 0, policy_version 134083 (0.0033) +[2024-06-18 12:35:16,851][12883] Updated weights for policy 0, policy_version 134093 (0.0034) +[2024-06-18 12:35:16,994][12645] Fps is (10 sec: 45884.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2196979712. Throughput: 0: 42451.1. Samples: 2197050920. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) +[2024-06-18 12:35:16,994][12645] Avg episode reward: [(0, '0.656')] +[2024-06-18 12:35:20,716][12883] Updated weights for policy 0, policy_version 134103 (0.0043) +[2024-06-18 12:35:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2197176320. Throughput: 0: 42576.8. Samples: 2197301940. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) +[2024-06-18 12:35:21,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 12:35:24,473][12883] Updated weights for policy 0, policy_version 134113 (0.0048) +[2024-06-18 12:35:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2197422080. Throughput: 0: 42509.6. Samples: 2197553320. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) +[2024-06-18 12:35:26,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 12:35:28,399][12883] Updated weights for policy 0, policy_version 134123 (0.0041) +[2024-06-18 12:35:31,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 2197602304. Throughput: 0: 42423.2. Samples: 2197687340. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) +[2024-06-18 12:35:31,994][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 12:35:32,264][12883] Updated weights for policy 0, policy_version 134133 (0.0031) +[2024-06-18 12:35:35,875][12883] Updated weights for policy 0, policy_version 134143 (0.0024) +[2024-06-18 12:35:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42543.3). Total num frames: 2197831680. Throughput: 0: 42742.2. Samples: 2197943180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 12:35:36,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 12:35:39,830][12883] Updated weights for policy 0, policy_version 134153 (0.0032) +[2024-06-18 12:35:41,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2198061056. Throughput: 0: 42574.4. Samples: 2198195480. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 12:35:41,994][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 12:35:43,826][12883] Updated weights for policy 0, policy_version 134163 (0.0041) +[2024-06-18 12:35:46,996][12645] Fps is (10 sec: 40951.1, 60 sec: 42050.7, 300 sec: 42542.5). Total num frames: 2198241280. Throughput: 0: 42482.5. Samples: 2198323160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 12:35:46,997][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 12:35:47,561][12883] Updated weights for policy 0, policy_version 134173 (0.0028) +[2024-06-18 12:35:51,811][12883] Updated weights for policy 0, policy_version 134183 (0.0031) +[2024-06-18 12:35:51,994][12645] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 2198470656. Throughput: 0: 42648.3. Samples: 2198579140. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 12:35:51,994][12645] Avg episode reward: [(0, '0.314')] +[2024-06-18 12:35:55,174][12883] Updated weights for policy 0, policy_version 134193 (0.0047) +[2024-06-18 12:35:56,994][12645] Fps is (10 sec: 44246.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2198683648. Throughput: 0: 42621.4. Samples: 2198833180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 12:35:56,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 12:35:59,497][12883] Updated weights for policy 0, policy_version 134203 (0.0036) +[2024-06-18 12:36:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42329.8, 300 sec: 42487.3). Total num frames: 2198880256. Throughput: 0: 42509.5. Samples: 2198963840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 12:36:01,994][12645] Avg episode reward: [(0, '0.127')] +[2024-06-18 12:36:03,293][12883] Updated weights for policy 0, policy_version 134213 (0.0043) +[2024-06-18 12:36:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42873.0, 300 sec: 42487.3). Total num frames: 2199093248. Throughput: 0: 42561.7. Samples: 2199217220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 12:36:06,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 12:36:07,075][12883] Updated weights for policy 0, policy_version 134223 (0.0028) +[2024-06-18 12:36:10,945][12883] Updated weights for policy 0, policy_version 134233 (0.0044) +[2024-06-18 12:36:11,097][12862] Signal inference workers to stop experience collection... (32150 times) +[2024-06-18 12:36:11,097][12862] Signal inference workers to resume experience collection... (32150 times) +[2024-06-18 12:36:11,107][12883] InferenceWorker_p0-w0: stopping experience collection (32150 times) +[2024-06-18 12:36:11,108][12883] InferenceWorker_p0-w0: resuming experience collection (32150 times) +[2024-06-18 12:36:11,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2199322624. 
Throughput: 0: 42521.4. Samples: 2199466780. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 12:36:11,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 12:36:14,831][12883] Updated weights for policy 0, policy_version 134243 (0.0023) +[2024-06-18 12:36:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2199519232. Throughput: 0: 42508.8. Samples: 2199600240. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 12:36:16,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 12:36:18,489][12883] Updated weights for policy 0, policy_version 134253 (0.0037) +[2024-06-18 12:36:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2199732224. Throughput: 0: 42403.6. Samples: 2199851340. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 12:36:21,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 12:36:22,482][12883] Updated weights for policy 0, policy_version 134263 (0.0039) +[2024-06-18 12:36:26,120][12883] Updated weights for policy 0, policy_version 134273 (0.0031) +[2024-06-18 12:36:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2199961600. Throughput: 0: 42546.6. Samples: 2200110080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 12:36:26,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 12:36:30,202][12883] Updated weights for policy 0, policy_version 134283 (0.0030) +[2024-06-18 12:36:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2200158208. Throughput: 0: 42543.9. Samples: 2200237540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) +[2024-06-18 12:36:31,994][12645] Avg episode reward: [(0, '0.287')] +[2024-06-18 12:36:33,983][12883] Updated weights for policy 0, policy_version 134293 (0.0039) +[2024-06-18 12:36:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42542.8). 
Total num frames: 2200371200. Throughput: 0: 42476.5. Samples: 2200490580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 12:36:37,003][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 12:36:37,042][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000134301_2200387584.pth... +[2024-06-18 12:36:37,091][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133679_2190196736.pth +[2024-06-18 12:36:38,474][12883] Updated weights for policy 0, policy_version 134303 (0.0029) +[2024-06-18 12:36:41,822][12883] Updated weights for policy 0, policy_version 134313 (0.0030) +[2024-06-18 12:36:41,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 42542.9). Total num frames: 2200584192. Throughput: 0: 42513.6. Samples: 2200746300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 12:36:41,994][12645] Avg episode reward: [(0, '0.618')] +[2024-06-18 12:36:46,357][12883] Updated weights for policy 0, policy_version 134323 (0.0021) +[2024-06-18 12:36:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42873.0, 300 sec: 42543.2). Total num frames: 2200813568. Throughput: 0: 42387.5. Samples: 2200871280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 12:36:46,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 12:36:49,473][12883] Updated weights for policy 0, policy_version 134333 (0.0031) +[2024-06-18 12:36:51,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2201026560. Throughput: 0: 42439.2. Samples: 2201126980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 12:36:51,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 12:36:53,882][12883] Updated weights for policy 0, policy_version 134343 (0.0028) +[2024-06-18 12:36:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42543.8). Total num frames: 2201223168. Throughput: 0: 42640.5. Samples: 2201385600. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 12:36:56,994][12645] Avg episode reward: [(0, '0.527')] +[2024-06-18 12:36:57,061][12883] Updated weights for policy 0, policy_version 134353 (0.0047) +[2024-06-18 12:37:01,443][12883] Updated weights for policy 0, policy_version 134363 (0.0028) +[2024-06-18 12:37:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2201436160. Throughput: 0: 42520.5. Samples: 2201513660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 12:37:01,994][12645] Avg episode reward: [(0, '0.632')] +[2024-06-18 12:37:04,779][12883] Updated weights for policy 0, policy_version 134373 (0.0033) +[2024-06-18 12:37:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 2201649152. Throughput: 0: 42395.5. Samples: 2201759140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 12:37:06,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 12:37:09,083][12883] Updated weights for policy 0, policy_version 134383 (0.0037) +[2024-06-18 12:37:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2201862144. Throughput: 0: 42496.5. Samples: 2202022420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 12:37:11,994][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 12:37:12,320][12883] Updated weights for policy 0, policy_version 134393 (0.0026) +[2024-06-18 12:37:16,658][12883] Updated weights for policy 0, policy_version 134403 (0.0039) +[2024-06-18 12:37:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 2202058752. Throughput: 0: 42367.5. Samples: 2202144080. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 12:37:16,994][12645] Avg episode reward: [(0, '0.533')] +[2024-06-18 12:37:19,864][12883] Updated weights for policy 0, policy_version 134413 (0.0041) +[2024-06-18 12:37:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42598.5). Total num frames: 2202304512. Throughput: 0: 42457.4. Samples: 2202401160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 12:37:21,994][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 12:37:23,053][12862] Signal inference workers to stop experience collection... (32200 times) +[2024-06-18 12:37:23,081][12883] InferenceWorker_p0-w0: stopping experience collection (32200 times) +[2024-06-18 12:37:23,100][12862] Signal inference workers to resume experience collection... (32200 times) +[2024-06-18 12:37:23,109][12883] InferenceWorker_p0-w0: resuming experience collection (32200 times) +[2024-06-18 12:37:24,238][12883] Updated weights for policy 0, policy_version 134423 (0.0036) +[2024-06-18 12:37:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2202501120. Throughput: 0: 42632.2. Samples: 2202664740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 12:37:26,994][12645] Avg episode reward: [(0, '0.230')] +[2024-06-18 12:37:27,681][12883] Updated weights for policy 0, policy_version 134433 (0.0022) +[2024-06-18 12:37:31,832][12883] Updated weights for policy 0, policy_version 134443 (0.0046) +[2024-06-18 12:37:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2202714112. Throughput: 0: 42496.9. Samples: 2202783640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 12:37:31,994][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 12:37:35,248][12883] Updated weights for policy 0, policy_version 134453 (0.0032) +[2024-06-18 12:37:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2202927104. 
Throughput: 0: 42578.2. Samples: 2203043000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 12:37:36,994][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 12:37:39,839][12883] Updated weights for policy 0, policy_version 134463 (0.0037) +[2024-06-18 12:37:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2203140096. Throughput: 0: 42550.7. Samples: 2203300380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:37:41,994][12645] Avg episode reward: [(0, '0.746')] +[2024-06-18 12:37:42,812][12883] Updated weights for policy 0, policy_version 134473 (0.0037) +[2024-06-18 12:37:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2203336704. Throughput: 0: 42427.0. Samples: 2203422880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:37:46,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 12:37:47,475][12883] Updated weights for policy 0, policy_version 134483 (0.0028) +[2024-06-18 12:37:50,824][12883] Updated weights for policy 0, policy_version 134493 (0.0029) +[2024-06-18 12:37:51,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 2203582464. Throughput: 0: 42635.3. Samples: 2203677820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:37:51,996][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 12:37:55,120][12883] Updated weights for policy 0, policy_version 134503 (0.0030) +[2024-06-18 12:37:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2203762688. Throughput: 0: 42573.2. Samples: 2203938220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:37:56,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 12:37:58,523][12883] Updated weights for policy 0, policy_version 134513 (0.0036) +[2024-06-18 12:38:01,994][12645] Fps is (10 sec: 39330.4, 60 sec: 42325.3, 300 sec: 42487.3). 
Total num frames: 2203975680. Throughput: 0: 42550.7. Samples: 2204058860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:38:01,994][12645] Avg episode reward: [(0, '0.494')] +[2024-06-18 12:38:02,693][12883] Updated weights for policy 0, policy_version 134523 (0.0032) +[2024-06-18 12:38:05,946][12883] Updated weights for policy 0, policy_version 134533 (0.0055) +[2024-06-18 12:38:06,998][12645] Fps is (10 sec: 45856.1, 60 sec: 42868.5, 300 sec: 42597.8). Total num frames: 2204221440. Throughput: 0: 42599.5. Samples: 2204318320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:38:06,998][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 12:38:10,461][12883] Updated weights for policy 0, policy_version 134543 (0.0051) +[2024-06-18 12:38:12,000][12645] Fps is (10 sec: 42571.8, 60 sec: 42320.9, 300 sec: 42430.9). Total num frames: 2204401664. Throughput: 0: 42541.1. Samples: 2204579360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:38:12,000][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 12:38:13,800][12883] Updated weights for policy 0, policy_version 134553 (0.0042) +[2024-06-18 12:38:16,994][12645] Fps is (10 sec: 40977.3, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2204631040. Throughput: 0: 42541.8. Samples: 2204698020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:38:16,994][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 12:38:18,033][12883] Updated weights for policy 0, policy_version 134563 (0.0031) +[2024-06-18 12:38:21,434][12883] Updated weights for policy 0, policy_version 134573 (0.0036) +[2024-06-18 12:38:21,994][12645] Fps is (10 sec: 47543.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2204876800. Throughput: 0: 42638.3. Samples: 2204961720. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:38:21,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 12:38:25,922][12883] Updated weights for policy 0, policy_version 134583 (0.0026) +[2024-06-18 12:38:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2205057024. Throughput: 0: 42524.8. Samples: 2205214000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:38:26,994][12645] Avg episode reward: [(0, '0.508')] +[2024-06-18 12:38:29,086][12883] Updated weights for policy 0, policy_version 134593 (0.0047) +[2024-06-18 12:38:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2205270016. Throughput: 0: 42573.9. Samples: 2205338700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:38:31,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 12:38:33,558][12883] Updated weights for policy 0, policy_version 134603 (0.0033) +[2024-06-18 12:38:35,551][12862] Signal inference workers to stop experience collection... (32250 times) +[2024-06-18 12:38:35,552][12862] Signal inference workers to resume experience collection... (32250 times) +[2024-06-18 12:38:35,567][12883] InferenceWorker_p0-w0: stopping experience collection (32250 times) +[2024-06-18 12:38:35,567][12883] InferenceWorker_p0-w0: resuming experience collection (32250 times) +[2024-06-18 12:38:36,732][12883] Updated weights for policy 0, policy_version 134613 (0.0029) +[2024-06-18 12:38:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2205499392. Throughput: 0: 42795.4. Samples: 2205603520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:38:36,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 12:38:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000134613_2205499392.pth... 
+[2024-06-18 12:38:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133987_2195243008.pth +[2024-06-18 12:38:41,204][12883] Updated weights for policy 0, policy_version 134623 (0.0033) +[2024-06-18 12:38:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2205712384. Throughput: 0: 42670.8. Samples: 2205858400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:38:41,994][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 12:38:44,890][12883] Updated weights for policy 0, policy_version 134633 (0.0041) +[2024-06-18 12:38:47,008][12645] Fps is (10 sec: 40902.6, 60 sec: 42861.5, 300 sec: 42596.4). Total num frames: 2205908992. Throughput: 0: 42757.7. Samples: 2205983560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:38:47,008][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 12:38:48,884][12883] Updated weights for policy 0, policy_version 134643 (0.0042) +[2024-06-18 12:38:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 2206121984. Throughput: 0: 42731.1. Samples: 2206241040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:38:51,995][12645] Avg episode reward: [(0, '0.613')] +[2024-06-18 12:38:52,410][12883] Updated weights for policy 0, policy_version 134653 (0.0038) +[2024-06-18 12:38:56,670][12883] Updated weights for policy 0, policy_version 134663 (0.0031) +[2024-06-18 12:38:56,994][12645] Fps is (10 sec: 42658.9, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 2206334976. Throughput: 0: 42709.1. Samples: 2206501000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:38:56,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 12:39:00,065][12883] Updated weights for policy 0, policy_version 134673 (0.0049) +[2024-06-18 12:39:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2206564352. Throughput: 0: 42814.2. 
Samples: 2206624660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:39:01,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 12:39:04,351][12883] Updated weights for policy 0, policy_version 134683 (0.0039) +[2024-06-18 12:39:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42328.4, 300 sec: 42542.9). Total num frames: 2206760960. Throughput: 0: 42616.0. Samples: 2206879440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:39:06,994][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 12:39:07,774][12883] Updated weights for policy 0, policy_version 134693 (0.0031) +[2024-06-18 12:39:11,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42602.9, 300 sec: 42376.3). Total num frames: 2206957568. Throughput: 0: 42713.5. Samples: 2207136100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:39:11,994][12645] Avg episode reward: [(0, '0.412')] +[2024-06-18 12:39:12,023][12883] Updated weights for policy 0, policy_version 134703 (0.0027) +[2024-06-18 12:39:15,666][12883] Updated weights for policy 0, policy_version 134713 (0.0044) +[2024-06-18 12:39:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2207203328. Throughput: 0: 42678.7. Samples: 2207259240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:39:16,994][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 12:39:19,650][12883] Updated weights for policy 0, policy_version 134723 (0.0035) +[2024-06-18 12:39:21,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 2207399936. Throughput: 0: 42595.9. Samples: 2207520340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:39:21,995][12645] Avg episode reward: [(0, '0.612')] +[2024-06-18 12:39:23,415][12883] Updated weights for policy 0, policy_version 134733 (0.0028) +[2024-06-18 12:39:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2207596544. Throughput: 0: 42558.7. 
Samples: 2207773540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:39:26,994][12645] Avg episode reward: [(0, '0.725')] +[2024-06-18 12:39:27,498][12883] Updated weights for policy 0, policy_version 134743 (0.0022) +[2024-06-18 12:39:31,030][12883] Updated weights for policy 0, policy_version 134753 (0.0043) +[2024-06-18 12:39:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2207842304. Throughput: 0: 42532.8. Samples: 2207896940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:39:31,994][12645] Avg episode reward: [(0, '0.243')] +[2024-06-18 12:39:35,329][12883] Updated weights for policy 0, policy_version 134763 (0.0037) +[2024-06-18 12:39:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2208038912. Throughput: 0: 42586.3. Samples: 2208157420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:39:36,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 12:39:38,439][12883] Updated weights for policy 0, policy_version 134773 (0.0033) +[2024-06-18 12:39:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 2208235520. Throughput: 0: 42429.7. Samples: 2208410340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:39:41,994][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 12:39:42,888][12883] Updated weights for policy 0, policy_version 134783 (0.0045) +[2024-06-18 12:39:46,461][12883] Updated weights for policy 0, policy_version 134793 (0.0029) +[2024-06-18 12:39:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42881.5, 300 sec: 42709.5). Total num frames: 2208481280. Throughput: 0: 42426.7. Samples: 2208533860. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 12:39:46,994][12645] Avg episode reward: [(0, '0.702')] +[2024-06-18 12:39:50,413][12883] Updated weights for policy 0, policy_version 134803 (0.0030) +[2024-06-18 12:39:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2208677888. Throughput: 0: 42487.0. Samples: 2208791360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 12:39:51,995][12645] Avg episode reward: [(0, '0.548')] +[2024-06-18 12:39:54,385][12883] Updated weights for policy 0, policy_version 134813 (0.0032) +[2024-06-18 12:39:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42543.8). Total num frames: 2208890880. Throughput: 0: 42641.7. Samples: 2209054980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 12:39:56,994][12645] Avg episode reward: [(0, '0.548')] +[2024-06-18 12:39:57,929][12883] Updated weights for policy 0, policy_version 134823 (0.0028) +[2024-06-18 12:40:01,959][12883] Updated weights for policy 0, policy_version 134833 (0.0030) +[2024-06-18 12:40:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42654.3). Total num frames: 2209103872. Throughput: 0: 42710.7. Samples: 2209181220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 12:40:01,994][12645] Avg episode reward: [(0, '0.611')] +[2024-06-18 12:40:05,504][12883] Updated weights for policy 0, policy_version 134843 (0.0050) +[2024-06-18 12:40:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 2209316864. Throughput: 0: 42648.4. Samples: 2209439520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 12:40:06,995][12645] Avg episode reward: [(0, '0.611')] +[2024-06-18 12:40:09,501][12883] Updated weights for policy 0, policy_version 134853 (0.0029) +[2024-06-18 12:40:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2209546240. Throughput: 0: 42816.9. 
Samples: 2209700300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 12:40:11,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 12:40:13,194][12883] Updated weights for policy 0, policy_version 134863 (0.0029) +[2024-06-18 12:40:15,260][12862] Signal inference workers to stop experience collection... (32300 times) +[2024-06-18 12:40:15,261][12862] Signal inference workers to resume experience collection... (32300 times) +[2024-06-18 12:40:15,277][12883] InferenceWorker_p0-w0: stopping experience collection (32300 times) +[2024-06-18 12:40:15,277][12883] InferenceWorker_p0-w0: resuming experience collection (32300 times) +[2024-06-18 12:40:16,973][12883] Updated weights for policy 0, policy_version 134873 (0.0033) +[2024-06-18 12:40:16,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2209759232. Throughput: 0: 42857.4. Samples: 2209825520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 12:40:16,994][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 12:40:20,951][12883] Updated weights for policy 0, policy_version 134883 (0.0028) +[2024-06-18 12:40:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2209972224. Throughput: 0: 42851.9. Samples: 2210085760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 12:40:21,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 12:40:24,673][12883] Updated weights for policy 0, policy_version 134893 (0.0033) +[2024-06-18 12:40:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2210201600. Throughput: 0: 42912.9. Samples: 2210341420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 12:40:26,994][12645] Avg episode reward: [(0, '0.653')] +[2024-06-18 12:40:28,413][12883] Updated weights for policy 0, policy_version 134903 (0.0039) +[2024-06-18 12:40:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). 
Total num frames: 2210398208. Throughput: 0: 43064.9. Samples: 2210471780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 12:40:31,994][12645] Avg episode reward: [(0, '0.653')] +[2024-06-18 12:40:32,176][12883] Updated weights for policy 0, policy_version 134913 (0.0037) +[2024-06-18 12:40:35,870][12883] Updated weights for policy 0, policy_version 134923 (0.0037) +[2024-06-18 12:40:36,997][12645] Fps is (10 sec: 40945.5, 60 sec: 42868.9, 300 sec: 42542.3). Total num frames: 2210611200. Throughput: 0: 43182.8. Samples: 2210734740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 12:40:37,006][12645] Avg episode reward: [(0, '0.508')] +[2024-06-18 12:40:37,085][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000134926_2210627584.pth... +[2024-06-18 12:40:37,154][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000134301_2200387584.pth +[2024-06-18 12:40:39,942][12883] Updated weights for policy 0, policy_version 134933 (0.0028) +[2024-06-18 12:40:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 2210824192. Throughput: 0: 42918.7. Samples: 2210986320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 12:40:41,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 12:40:43,439][12883] Updated weights for policy 0, policy_version 134943 (0.0035) +[2024-06-18 12:40:46,994][12645] Fps is (10 sec: 40975.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2211020800. Throughput: 0: 42872.8. Samples: 2211110500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 12:40:46,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 12:40:47,714][12883] Updated weights for policy 0, policy_version 134953 (0.0031) +[2024-06-18 12:40:51,061][12883] Updated weights for policy 0, policy_version 134963 (0.0031) +[2024-06-18 12:40:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42653.9). 
Total num frames: 2211266560. Throughput: 0: 42759.7. Samples: 2211363700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 12:40:51,996][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 12:40:55,453][12883] Updated weights for policy 0, policy_version 134973 (0.0025) +[2024-06-18 12:40:56,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2211479552. Throughput: 0: 42710.5. Samples: 2211622280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 12:40:56,997][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 12:40:58,773][12883] Updated weights for policy 0, policy_version 134983 (0.0038) +[2024-06-18 12:41:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2211659776. Throughput: 0: 42825.3. Samples: 2211752660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 12:41:01,994][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 12:41:03,310][12883] Updated weights for policy 0, policy_version 134993 (0.0038) +[2024-06-18 12:41:06,371][12883] Updated weights for policy 0, policy_version 135003 (0.0030) +[2024-06-18 12:41:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 2211905536. Throughput: 0: 42701.8. Samples: 2212007340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 12:41:06,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 12:41:10,797][12883] Updated weights for policy 0, policy_version 135013 (0.0031) +[2024-06-18 12:41:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2212118528. Throughput: 0: 42777.4. Samples: 2212266400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 12:41:11,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 12:41:13,906][12883] Updated weights for policy 0, policy_version 135023 (0.0034) +[2024-06-18 12:41:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.4). 
Total num frames: 2212298752. Throughput: 0: 42728.9. Samples: 2212394580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 12:41:16,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 12:41:18,490][12883] Updated weights for policy 0, policy_version 135033 (0.0024) +[2024-06-18 12:41:21,318][12883] Updated weights for policy 0, policy_version 135043 (0.0023) +[2024-06-18 12:41:21,996][12645] Fps is (10 sec: 44226.7, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 2212560896. Throughput: 0: 42677.7. Samples: 2212655180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 12:41:21,997][12645] Avg episode reward: [(0, '0.700')] +[2024-06-18 12:41:26,194][12883] Updated weights for policy 0, policy_version 135053 (0.0047) +[2024-06-18 12:41:26,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2212757504. Throughput: 0: 42811.1. Samples: 2212912820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 12:41:26,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 12:41:29,272][12883] Updated weights for policy 0, policy_version 135063 (0.0028) +[2024-06-18 12:41:31,994][12645] Fps is (10 sec: 39330.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2212954112. Throughput: 0: 42711.6. Samples: 2213032520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 12:41:31,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 12:41:33,935][12883] Updated weights for policy 0, policy_version 135073 (0.0029) +[2024-06-18 12:41:36,895][12883] Updated weights for policy 0, policy_version 135083 (0.0036) +[2024-06-18 12:41:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43147.1, 300 sec: 42765.0). Total num frames: 2213199872. Throughput: 0: 42868.9. Samples: 2213292800. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 12:41:36,994][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 12:41:41,512][12883] Updated weights for policy 0, policy_version 135093 (0.0036) +[2024-06-18 12:41:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2213380096. Throughput: 0: 42926.7. Samples: 2213553980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 12:41:41,999][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 12:41:44,445][12883] Updated weights for policy 0, policy_version 135103 (0.0036) +[2024-06-18 12:41:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2213609472. Throughput: 0: 42720.1. Samples: 2213675060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 12:41:46,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 12:41:49,026][12883] Updated weights for policy 0, policy_version 135113 (0.0037) +[2024-06-18 12:41:51,820][12862] Signal inference workers to stop experience collection... (32350 times) +[2024-06-18 12:41:51,820][12862] Signal inference workers to resume experience collection... (32350 times) +[2024-06-18 12:41:51,863][12883] InferenceWorker_p0-w0: stopping experience collection (32350 times) +[2024-06-18 12:41:51,864][12883] InferenceWorker_p0-w0: resuming experience collection (32350 times) +[2024-06-18 12:41:51,952][12883] Updated weights for policy 0, policy_version 135123 (0.0027) +[2024-06-18 12:41:51,994][12645] Fps is (10 sec: 47514.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2213855232. Throughput: 0: 43004.1. Samples: 2213942520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 12:41:51,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 12:41:56,994][12645] Fps is (10 sec: 39320.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2214002688. Throughput: 0: 42884.2. Samples: 2214196200. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 12:41:56,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 12:41:57,161][12883] Updated weights for policy 0, policy_version 135133 (0.0037) +[2024-06-18 12:41:59,590][12883] Updated weights for policy 0, policy_version 135143 (0.0034) +[2024-06-18 12:42:01,994][12645] Fps is (10 sec: 39321.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2214248448. Throughput: 0: 42673.3. Samples: 2214314880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 12:42:01,994][12645] Avg episode reward: [(0, '0.135')] +[2024-06-18 12:42:04,600][12883] Updated weights for policy 0, policy_version 135153 (0.0036) +[2024-06-18 12:42:06,994][12645] Fps is (10 sec: 45876.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2214461440. Throughput: 0: 42796.0. Samples: 2214580900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 12:42:06,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 12:42:07,423][12883] Updated weights for policy 0, policy_version 135163 (0.0028) +[2024-06-18 12:42:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2214658048. Throughput: 0: 42854.7. Samples: 2214841280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 12:42:11,994][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 12:42:12,170][12883] Updated weights for policy 0, policy_version 135173 (0.0033) +[2024-06-18 12:42:15,123][12883] Updated weights for policy 0, policy_version 135183 (0.0037) +[2024-06-18 12:42:16,996][12645] Fps is (10 sec: 42588.4, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 2214887424. Throughput: 0: 42956.9. Samples: 2214965680. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 12:42:16,997][12645] Avg episode reward: [(0, '0.591')] +[2024-06-18 12:42:19,762][12883] Updated weights for policy 0, policy_version 135193 (0.0031) +[2024-06-18 12:42:21,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42599.9, 300 sec: 42765.0). Total num frames: 2215116800. Throughput: 0: 42952.4. Samples: 2215225660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 12:42:21,994][12645] Avg episode reward: [(0, '0.614')] +[2024-06-18 12:42:22,648][12883] Updated weights for policy 0, policy_version 135203 (0.0026) +[2024-06-18 12:42:26,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2215297024. Throughput: 0: 42981.3. Samples: 2215488140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 12:42:26,994][12645] Avg episode reward: [(0, '0.843')] +[2024-06-18 12:42:27,315][12883] Updated weights for policy 0, policy_version 135213 (0.0035) +[2024-06-18 12:42:30,489][12883] Updated weights for policy 0, policy_version 135223 (0.0043) +[2024-06-18 12:42:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2215542784. Throughput: 0: 42918.5. Samples: 2215606400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 12:42:31,994][12645] Avg episode reward: [(0, '0.651')] +[2024-06-18 12:42:35,225][12883] Updated weights for policy 0, policy_version 135233 (0.0028) +[2024-06-18 12:42:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2215739392. Throughput: 0: 42779.9. Samples: 2215867620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 12:42:36,994][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 12:42:37,138][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000135239_2215755776.pth... 
+[2024-06-18 12:42:37,183][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000134613_2205499392.pth +[2024-06-18 12:42:38,281][12883] Updated weights for policy 0, policy_version 135243 (0.0030) +[2024-06-18 12:42:41,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2215936000. Throughput: 0: 42784.2. Samples: 2216121480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 12:42:41,994][12645] Avg episode reward: [(0, '0.719')] +[2024-06-18 12:42:42,785][12883] Updated weights for policy 0, policy_version 135253 (0.0045) +[2024-06-18 12:42:46,026][12883] Updated weights for policy 0, policy_version 135263 (0.0028) +[2024-06-18 12:42:46,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 2216198144. Throughput: 0: 42922.3. Samples: 2216246380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 12:42:46,994][12645] Avg episode reward: [(0, '0.616')] +[2024-06-18 12:42:50,459][12883] Updated weights for policy 0, policy_version 135273 (0.0037) +[2024-06-18 12:42:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2216378368. Throughput: 0: 42756.5. Samples: 2216504940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 12:42:52,002][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 12:42:53,690][12883] Updated weights for policy 0, policy_version 135283 (0.0037) +[2024-06-18 12:42:56,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 2216574976. Throughput: 0: 42640.5. Samples: 2216760100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:42:56,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 12:42:58,134][12883] Updated weights for policy 0, policy_version 135293 (0.0034) +[2024-06-18 12:43:01,120][12862] Signal inference workers to stop experience collection... 
(32400 times) +[2024-06-18 12:43:01,120][12862] Signal inference workers to resume experience collection... (32400 times) +[2024-06-18 12:43:01,137][12883] InferenceWorker_p0-w0: stopping experience collection (32400 times) +[2024-06-18 12:43:01,167][12883] InferenceWorker_p0-w0: resuming experience collection (32400 times) +[2024-06-18 12:43:01,258][12883] Updated weights for policy 0, policy_version 135303 (0.0028) +[2024-06-18 12:43:01,994][12645] Fps is (10 sec: 45874.3, 60 sec: 43144.5, 300 sec: 42765.6). Total num frames: 2216837120. Throughput: 0: 42758.5. Samples: 2216889720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:43:01,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 12:43:05,755][12883] Updated weights for policy 0, policy_version 135313 (0.0032) +[2024-06-18 12:43:06,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42598.2, 300 sec: 42765.9). Total num frames: 2217017344. Throughput: 0: 42686.6. Samples: 2217146560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:43:06,995][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 12:43:08,910][12883] Updated weights for policy 0, policy_version 135323 (0.0040) +[2024-06-18 12:43:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2217230336. Throughput: 0: 42441.4. Samples: 2217398000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:43:11,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 12:43:13,575][12883] Updated weights for policy 0, policy_version 135333 (0.0034) +[2024-06-18 12:43:16,602][12883] Updated weights for policy 0, policy_version 135343 (0.0030) +[2024-06-18 12:43:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43146.1, 300 sec: 42709.5). Total num frames: 2217476096. Throughput: 0: 42697.4. Samples: 2217527780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:43:16,994][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 12:43:21,059][12883] Updated weights for policy 0, policy_version 135353 (0.0033) +[2024-06-18 12:43:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2217656320. Throughput: 0: 42662.2. Samples: 2217787420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:43:21,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 12:43:24,263][12883] Updated weights for policy 0, policy_version 135363 (0.0042) +[2024-06-18 12:43:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2217885696. Throughput: 0: 42451.4. Samples: 2218031800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:43:26,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 12:43:29,421][12883] Updated weights for policy 0, policy_version 135373 (0.0033) +[2024-06-18 12:43:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 2218082304. Throughput: 0: 42713.4. Samples: 2218168480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:43:31,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 12:43:32,151][12883] Updated weights for policy 0, policy_version 135383 (0.0045) +[2024-06-18 12:43:36,994][12645] Fps is (10 sec: 37681.4, 60 sec: 42051.8, 300 sec: 42542.8). Total num frames: 2218262528. Throughput: 0: 42478.9. Samples: 2218416520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:43:36,995][12645] Avg episode reward: [(0, '0.313')] +[2024-06-18 12:43:37,071][12883] Updated weights for policy 0, policy_version 135393 (0.0038) +[2024-06-18 12:43:39,957][12883] Updated weights for policy 0, policy_version 135403 (0.0032) +[2024-06-18 12:43:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42767.1). Total num frames: 2218524672. Throughput: 0: 42362.6. 
Samples: 2218666420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:43:41,994][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 12:43:44,777][12883] Updated weights for policy 0, policy_version 135413 (0.0036) +[2024-06-18 12:43:46,994][12645] Fps is (10 sec: 45877.8, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2218721280. Throughput: 0: 42505.3. Samples: 2218802460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:43:46,994][12645] Avg episode reward: [(0, '0.612')] +[2024-06-18 12:43:47,685][12883] Updated weights for policy 0, policy_version 135423 (0.0040) +[2024-06-18 12:43:51,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 2218901504. Throughput: 0: 42286.7. Samples: 2219049460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:43:51,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 12:43:52,471][12883] Updated weights for policy 0, policy_version 135433 (0.0043) +[2024-06-18 12:43:55,381][12883] Updated weights for policy 0, policy_version 135443 (0.0036) +[2024-06-18 12:43:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2219163648. Throughput: 0: 42234.2. Samples: 2219298540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 12:43:56,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 12:44:00,142][12883] Updated weights for policy 0, policy_version 135453 (0.0034) +[2024-06-18 12:44:01,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2219360256. Throughput: 0: 42406.3. Samples: 2219436060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 12:44:01,994][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 12:44:03,014][12883] Updated weights for policy 0, policy_version 135463 (0.0041) +[2024-06-18 12:44:06,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2219540480. 
Throughput: 0: 42147.4. Samples: 2219684060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 12:44:06,994][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 12:44:08,131][12883] Updated weights for policy 0, policy_version 135473 (0.0035) +[2024-06-18 12:44:10,790][12883] Updated weights for policy 0, policy_version 135483 (0.0027) +[2024-06-18 12:44:11,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2219802624. Throughput: 0: 42216.2. Samples: 2219931520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 12:44:11,994][12645] Avg episode reward: [(0, '0.214')] +[2024-06-18 12:44:15,813][12883] Updated weights for policy 0, policy_version 135493 (0.0051) +[2024-06-18 12:44:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 2219982848. Throughput: 0: 42143.0. Samples: 2220064920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 12:44:16,994][12645] Avg episode reward: [(0, '0.287')] +[2024-06-18 12:44:18,388][12883] Updated weights for policy 0, policy_version 135503 (0.0032) +[2024-06-18 12:44:21,996][12645] Fps is (10 sec: 39312.8, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 2220195840. Throughput: 0: 42138.0. Samples: 2220312800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 12:44:21,996][12645] Avg episode reward: [(0, '0.585')] +[2024-06-18 12:44:23,486][12883] Updated weights for policy 0, policy_version 135513 (0.0028) +[2024-06-18 12:44:24,445][12862] Signal inference workers to stop experience collection... (32450 times) +[2024-06-18 12:44:24,497][12862] Signal inference workers to resume experience collection... 
(32450 times) +[2024-06-18 12:44:24,498][12883] InferenceWorker_p0-w0: stopping experience collection (32450 times) +[2024-06-18 12:44:24,513][12883] InferenceWorker_p0-w0: resuming experience collection (32450 times) +[2024-06-18 12:44:26,056][12883] Updated weights for policy 0, policy_version 135523 (0.0043) +[2024-06-18 12:44:26,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2220441600. Throughput: 0: 42248.0. Samples: 2220567580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 12:44:26,994][12645] Avg episode reward: [(0, '0.638')] +[2024-06-18 12:44:31,094][12883] Updated weights for policy 0, policy_version 135533 (0.0036) +[2024-06-18 12:44:31,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2220621824. Throughput: 0: 42240.5. Samples: 2220703280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 12:44:31,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 12:44:34,156][12883] Updated weights for policy 0, policy_version 135543 (0.0033) +[2024-06-18 12:44:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42598.8, 300 sec: 42653.9). Total num frames: 2220818432. Throughput: 0: 42274.7. Samples: 2220951820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 12:44:36,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 12:44:37,026][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000135549_2220834816.pth... +[2024-06-18 12:44:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000134926_2210627584.pth +[2024-06-18 12:44:38,797][12883] Updated weights for policy 0, policy_version 135553 (0.0038) +[2024-06-18 12:44:41,743][12883] Updated weights for policy 0, policy_version 135563 (0.0033) +[2024-06-18 12:44:41,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2221080576. Throughput: 0: 42384.5. Samples: 2221205840. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 12:44:41,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 12:44:46,413][12883] Updated weights for policy 0, policy_version 135573 (0.0043) +[2024-06-18 12:44:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2221260800. Throughput: 0: 42371.4. Samples: 2221342780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 12:44:46,994][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 12:44:49,294][12883] Updated weights for policy 0, policy_version 135583 (0.0034) +[2024-06-18 12:44:51,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2221473792. Throughput: 0: 42405.7. Samples: 2221592320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 12:44:51,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 12:44:54,003][12883] Updated weights for policy 0, policy_version 135593 (0.0043) +[2024-06-18 12:44:56,996][12645] Fps is (10 sec: 44227.5, 60 sec: 42323.8, 300 sec: 42709.1). Total num frames: 2221703168. Throughput: 0: 42568.1. Samples: 2221847180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 12:44:56,996][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 12:44:57,128][12883] Updated weights for policy 0, policy_version 135603 (0.0028) +[2024-06-18 12:45:01,537][12883] Updated weights for policy 0, policy_version 135613 (0.0042) +[2024-06-18 12:45:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 2221899776. Throughput: 0: 42528.0. Samples: 2221978680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 12:45:01,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 12:45:04,564][12883] Updated weights for policy 0, policy_version 135623 (0.0038) +[2024-06-18 12:45:07,000][12645] Fps is (10 sec: 39305.6, 60 sec: 42594.0, 300 sec: 42541.9). Total num frames: 2222096384. Throughput: 0: 42556.6. 
Samples: 2222228020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 12:45:07,001][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 12:45:09,542][12883] Updated weights for policy 0, policy_version 135633 (0.0029) +[2024-06-18 12:45:11,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2222358528. Throughput: 0: 42639.7. Samples: 2222486360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 12:45:11,994][12645] Avg episode reward: [(0, '0.665')] +[2024-06-18 12:45:12,135][12883] Updated weights for policy 0, policy_version 135643 (0.0027) +[2024-06-18 12:45:16,994][12645] Fps is (10 sec: 42625.7, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2222522368. Throughput: 0: 42636.5. Samples: 2222621920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 12:45:16,994][12645] Avg episode reward: [(0, '0.661')] +[2024-06-18 12:45:17,009][12883] Updated weights for policy 0, policy_version 135653 (0.0028) +[2024-06-18 12:45:19,737][12883] Updated weights for policy 0, policy_version 135663 (0.0026) +[2024-06-18 12:45:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 2222751744. Throughput: 0: 42642.2. Samples: 2222870720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 12:45:21,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 12:45:24,737][12883] Updated weights for policy 0, policy_version 135673 (0.0036) +[2024-06-18 12:45:26,994][12645] Fps is (10 sec: 47512.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2222997504. Throughput: 0: 42729.7. Samples: 2223128680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 12:45:27,000][12645] Avg episode reward: [(0, '0.674')] +[2024-06-18 12:45:27,384][12883] Updated weights for policy 0, policy_version 135683 (0.0035) +[2024-06-18 12:45:27,986][12862] Signal inference workers to stop experience collection... 
(32500 times) +[2024-06-18 12:45:27,986][12862] Signal inference workers to resume experience collection... (32500 times) +[2024-06-18 12:45:27,997][12883] InferenceWorker_p0-w0: stopping experience collection (32500 times) +[2024-06-18 12:45:28,013][12883] InferenceWorker_p0-w0: resuming experience collection (32500 times) +[2024-06-18 12:45:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42598.9). Total num frames: 2223177728. Throughput: 0: 42697.4. Samples: 2223264160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 12:45:31,994][12645] Avg episode reward: [(0, '0.614')] +[2024-06-18 12:45:32,177][12883] Updated weights for policy 0, policy_version 135693 (0.0035) +[2024-06-18 12:45:35,067][12883] Updated weights for policy 0, policy_version 135703 (0.0029) +[2024-06-18 12:45:36,996][12645] Fps is (10 sec: 40951.1, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 2223407104. Throughput: 0: 42618.9. Samples: 2223510260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 12:45:36,996][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 12:45:39,665][12883] Updated weights for policy 0, policy_version 135713 (0.0034) +[2024-06-18 12:45:41,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2223620096. Throughput: 0: 42888.4. Samples: 2223777060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 12:45:41,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 12:45:42,893][12883] Updated weights for policy 0, policy_version 135723 (0.0033) +[2024-06-18 12:45:46,994][12645] Fps is (10 sec: 40966.5, 60 sec: 42598.0, 300 sec: 42542.8). Total num frames: 2223816704. Throughput: 0: 42679.9. Samples: 2223899300. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 12:45:46,995][12645] Avg episode reward: [(0, '0.656')] +[2024-06-18 12:45:47,418][12883] Updated weights for policy 0, policy_version 135733 (0.0027) +[2024-06-18 12:45:50,483][12883] Updated weights for policy 0, policy_version 135743 (0.0033) +[2024-06-18 12:45:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2224046080. Throughput: 0: 42711.3. Samples: 2224149760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 12:45:51,994][12645] Avg episode reward: [(0, '0.620')] +[2024-06-18 12:45:55,021][12883] Updated weights for policy 0, policy_version 135753 (0.0040) +[2024-06-18 12:45:56,994][12645] Fps is (10 sec: 44239.2, 60 sec: 42599.9, 300 sec: 42709.5). Total num frames: 2224259072. Throughput: 0: 42902.0. Samples: 2224416960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) +[2024-06-18 12:45:56,994][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 12:45:58,115][12883] Updated weights for policy 0, policy_version 135763 (0.0040) +[2024-06-18 12:46:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2224455680. Throughput: 0: 42658.6. Samples: 2224541560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:46:01,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 12:46:02,846][12883] Updated weights for policy 0, policy_version 135773 (0.0032) +[2024-06-18 12:46:05,665][12883] Updated weights for policy 0, policy_version 135783 (0.0027) +[2024-06-18 12:46:06,996][12645] Fps is (10 sec: 44227.6, 60 sec: 43420.6, 300 sec: 42653.6). Total num frames: 2224701440. Throughput: 0: 42792.1. Samples: 2224796460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:46:06,997][12645] Avg episode reward: [(0, '0.217')] +[2024-06-18 12:46:10,472][12883] Updated weights for policy 0, policy_version 135793 (0.0035) +[2024-06-18 12:46:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2224881664. Throughput: 0: 42945.0. Samples: 2225061200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:46:11,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 12:46:13,234][12883] Updated weights for policy 0, policy_version 135803 (0.0032) +[2024-06-18 12:46:16,994][12645] Fps is (10 sec: 37691.5, 60 sec: 42598.3, 300 sec: 42432.1). Total num frames: 2225078272. Throughput: 0: 42714.3. Samples: 2225186300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:46:16,994][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 12:46:17,890][12883] Updated weights for policy 0, policy_version 135813 (0.0028) +[2024-06-18 12:46:20,879][12883] Updated weights for policy 0, policy_version 135823 (0.0030) +[2024-06-18 12:46:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2225340416. Throughput: 0: 42929.8. Samples: 2225442000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:46:21,994][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 12:46:25,744][12883] Updated weights for policy 0, policy_version 135833 (0.0046) +[2024-06-18 12:46:26,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2225537024. Throughput: 0: 42766.6. Samples: 2225701560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:46:26,994][12645] Avg episode reward: [(0, '0.624')] +[2024-06-18 12:46:29,121][12883] Updated weights for policy 0, policy_version 135843 (0.0036) +[2024-06-18 12:46:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2225733632. Throughput: 0: 42837.1. 
Samples: 2225826940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:46:31,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 12:46:33,181][12883] Updated weights for policy 0, policy_version 135853 (0.0025) +[2024-06-18 12:46:36,625][12883] Updated weights for policy 0, policy_version 135863 (0.0034) +[2024-06-18 12:46:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 2225995776. Throughput: 0: 43030.2. Samples: 2226086120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:46:36,995][12645] Avg episode reward: [(0, '0.686')] +[2024-06-18 12:46:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000135864_2225995776.pth... +[2024-06-18 12:46:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000135239_2215755776.pth +[2024-06-18 12:46:40,959][12883] Updated weights for policy 0, policy_version 135873 (0.0037) +[2024-06-18 12:46:41,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2226192384. Throughput: 0: 42807.6. Samples: 2226343300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:46:41,994][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 12:46:44,208][12883] Updated weights for policy 0, policy_version 135883 (0.0036) +[2024-06-18 12:46:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.9, 300 sec: 42487.3). Total num frames: 2226388992. Throughput: 0: 42834.6. Samples: 2226469120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:46:46,994][12645] Avg episode reward: [(0, '0.630')] +[2024-06-18 12:46:48,783][12883] Updated weights for policy 0, policy_version 135893 (0.0039) +[2024-06-18 12:46:51,948][12883] Updated weights for policy 0, policy_version 135903 (0.0037) +[2024-06-18 12:46:51,994][12645] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2226634752. Throughput: 0: 42892.0. Samples: 2226726500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:46:51,994][12645] Avg episode reward: [(0, '0.680')] +[2024-06-18 12:46:56,349][12883] Updated weights for policy 0, policy_version 135913 (0.0024) +[2024-06-18 12:46:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2226831360. Throughput: 0: 42758.7. Samples: 2226985340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:46:56,994][12645] Avg episode reward: [(0, '0.627')] +[2024-06-18 12:46:58,201][12862] Signal inference workers to stop experience collection... (32550 times) +[2024-06-18 12:46:58,202][12862] Signal inference workers to resume experience collection... (32550 times) +[2024-06-18 12:46:58,249][12883] InferenceWorker_p0-w0: stopping experience collection (32550 times) +[2024-06-18 12:46:58,249][12883] InferenceWorker_p0-w0: resuming experience collection (32550 times) +[2024-06-18 12:46:59,498][12883] Updated weights for policy 0, policy_version 135923 (0.0031) +[2024-06-18 12:47:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2227044352. Throughput: 0: 42698.3. Samples: 2227107720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 12:47:01,994][12645] Avg episode reward: [(0, '0.661')] +[2024-06-18 12:47:04,162][12883] Updated weights for policy 0, policy_version 135933 (0.0038) +[2024-06-18 12:47:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 2227273728. Throughput: 0: 42784.0. Samples: 2227367280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 12:47:06,994][12645] Avg episode reward: [(0, '0.597')] +[2024-06-18 12:47:07,565][12883] Updated weights for policy 0, policy_version 135943 (0.0026) +[2024-06-18 12:47:11,721][12883] Updated weights for policy 0, policy_version 135953 (0.0044) +[2024-06-18 12:47:11,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42654.3). Total num frames: 2227470336. 
Throughput: 0: 42811.9. Samples: 2227628100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 12:47:11,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 12:47:15,329][12883] Updated weights for policy 0, policy_version 135963 (0.0032) +[2024-06-18 12:47:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 2227683328. Throughput: 0: 42820.9. Samples: 2227753880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 12:47:16,994][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 12:47:19,281][12883] Updated weights for policy 0, policy_version 135973 (0.0042) +[2024-06-18 12:47:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2227912704. Throughput: 0: 42730.2. Samples: 2228008980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 12:47:21,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 12:47:22,941][12883] Updated weights for policy 0, policy_version 135983 (0.0045) +[2024-06-18 12:47:26,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42596.8, 300 sec: 42542.6). Total num frames: 2228092928. Throughput: 0: 42729.9. Samples: 2228266240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 12:47:26,997][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 12:47:27,287][12883] Updated weights for policy 0, policy_version 135993 (0.0028) +[2024-06-18 12:47:30,631][12883] Updated weights for policy 0, policy_version 136003 (0.0037) +[2024-06-18 12:47:31,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42869.9, 300 sec: 42598.1). Total num frames: 2228305920. Throughput: 0: 42513.9. Samples: 2228382340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 12:47:31,997][12645] Avg episode reward: [(0, '0.653')] +[2024-06-18 12:47:34,874][12883] Updated weights for policy 0, policy_version 136013 (0.0035) +[2024-06-18 12:47:36,994][12645] Fps is (10 sec: 47524.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2228568064. 
Throughput: 0: 42588.3. Samples: 2228642980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 12:47:36,994][12645] Avg episode reward: [(0, '0.535')] +[2024-06-18 12:47:38,127][12883] Updated weights for policy 0, policy_version 136023 (0.0039) +[2024-06-18 12:47:41,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2228731904. Throughput: 0: 42744.1. Samples: 2228908820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 12:47:41,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 12:47:42,431][12883] Updated weights for policy 0, policy_version 136033 (0.0032) +[2024-06-18 12:47:46,081][12883] Updated weights for policy 0, policy_version 136043 (0.0032) +[2024-06-18 12:47:46,994][12645] Fps is (10 sec: 37683.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2228944896. Throughput: 0: 42578.2. Samples: 2229023740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 12:47:46,994][12645] Avg episode reward: [(0, '0.380')] +[2024-06-18 12:47:50,002][12883] Updated weights for policy 0, policy_version 136053 (0.0032) +[2024-06-18 12:47:51,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2229190656. Throughput: 0: 42566.1. Samples: 2229282760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 12:47:51,994][12645] Avg episode reward: [(0, '0.514')] +[2024-06-18 12:47:53,695][12883] Updated weights for policy 0, policy_version 136063 (0.0024) +[2024-06-18 12:47:56,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42323.7, 300 sec: 42487.0). Total num frames: 2229370880. Throughput: 0: 42663.3. Samples: 2229548040. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 12:47:56,997][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 12:47:57,621][12883] Updated weights for policy 0, policy_version 136073 (0.0055) +[2024-06-18 12:48:01,507][12883] Updated weights for policy 0, policy_version 136083 (0.0028) +[2024-06-18 12:48:01,993][12645] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2229583872. Throughput: 0: 42497.0. Samples: 2229666240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 12:48:01,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 12:48:05,555][12883] Updated weights for policy 0, policy_version 136093 (0.0033) +[2024-06-18 12:48:06,994][12645] Fps is (10 sec: 45885.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2229829632. Throughput: 0: 42605.4. Samples: 2229926220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 12:48:06,994][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 12:48:09,376][12883] Updated weights for policy 0, policy_version 136103 (0.0027) +[2024-06-18 12:48:11,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2230009856. Throughput: 0: 42572.8. Samples: 2230181920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 12:48:11,994][12645] Avg episode reward: [(0, '0.514')] +[2024-06-18 12:48:13,229][12883] Updated weights for policy 0, policy_version 136113 (0.0049) +[2024-06-18 12:48:16,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2230222848. Throughput: 0: 42690.1. Samples: 2230303300. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 12:48:16,994][12645] Avg episode reward: [(0, '0.619')] +[2024-06-18 12:48:17,006][12883] Updated weights for policy 0, policy_version 136123 (0.0043) +[2024-06-18 12:48:20,788][12883] Updated weights for policy 0, policy_version 136133 (0.0028) +[2024-06-18 12:48:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2230452224. Throughput: 0: 42635.1. Samples: 2230561560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 12:48:21,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 12:48:25,144][12883] Updated weights for policy 0, policy_version 136143 (0.0033) +[2024-06-18 12:48:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42599.9, 300 sec: 42598.4). Total num frames: 2230648832. Throughput: 0: 42559.8. Samples: 2230824020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 12:48:26,994][12645] Avg episode reward: [(0, '0.653')] +[2024-06-18 12:48:28,533][12862] Signal inference workers to stop experience collection... (32600 times) +[2024-06-18 12:48:28,533][12862] Signal inference workers to resume experience collection... (32600 times) +[2024-06-18 12:48:28,534][12883] Updated weights for policy 0, policy_version 136153 (0.0030) +[2024-06-18 12:48:28,588][12883] InferenceWorker_p0-w0: stopping experience collection (32600 times) +[2024-06-18 12:48:28,589][12883] InferenceWorker_p0-w0: resuming experience collection (32600 times) +[2024-06-18 12:48:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42599.9, 300 sec: 42709.6). Total num frames: 2230861824. Throughput: 0: 42668.7. Samples: 2230943840. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 12:48:31,994][12645] Avg episode reward: [(0, '0.653')] +[2024-06-18 12:48:32,635][12883] Updated weights for policy 0, policy_version 136163 (0.0032) +[2024-06-18 12:48:36,069][12883] Updated weights for policy 0, policy_version 136173 (0.0030) +[2024-06-18 12:48:36,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2231107584. Throughput: 0: 42754.7. Samples: 2231206720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 12:48:36,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 12:48:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000136176_2231107584.pth... +[2024-06-18 12:48:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000135549_2220834816.pth +[2024-06-18 12:48:40,176][12883] Updated weights for policy 0, policy_version 136183 (0.0023) +[2024-06-18 12:48:41,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2231271424. Throughput: 0: 42670.2. Samples: 2231468100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 12:48:41,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 12:48:43,640][12883] Updated weights for policy 0, policy_version 136193 (0.0041) +[2024-06-18 12:48:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2231500800. Throughput: 0: 42552.7. Samples: 2231581120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 12:48:46,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 12:48:48,022][12883] Updated weights for policy 0, policy_version 136203 (0.0026) +[2024-06-18 12:48:51,377][12883] Updated weights for policy 0, policy_version 136213 (0.0038) +[2024-06-18 12:48:51,994][12645] Fps is (10 sec: 47512.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2231746560. Throughput: 0: 42542.6. Samples: 2231840640. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 12:48:51,996][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 12:48:56,100][12883] Updated weights for policy 0, policy_version 136223 (0.0029) +[2024-06-18 12:48:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42326.9, 300 sec: 42542.8). Total num frames: 2231910400. Throughput: 0: 42507.9. Samples: 2232094780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 12:48:56,994][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 12:48:59,002][12883] Updated weights for policy 0, policy_version 136233 (0.0031) +[2024-06-18 12:49:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2232156160. Throughput: 0: 42439.2. Samples: 2232213060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 12:49:01,994][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 12:49:03,823][12883] Updated weights for policy 0, policy_version 136243 (0.0027) +[2024-06-18 12:49:06,586][12883] Updated weights for policy 0, policy_version 136253 (0.0045) +[2024-06-18 12:49:06,996][12645] Fps is (10 sec: 45865.3, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 2232369152. Throughput: 0: 42569.5. Samples: 2232477280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 12:49:06,997][12645] Avg episode reward: [(0, '0.633')] +[2024-06-18 12:49:11,519][12883] Updated weights for policy 0, policy_version 136263 (0.0041) +[2024-06-18 12:49:11,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2232532992. Throughput: 0: 42375.3. Samples: 2232730900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 12:49:11,994][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 12:49:14,192][12883] Updated weights for policy 0, policy_version 136273 (0.0031) +[2024-06-18 12:49:16,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2232795136. Throughput: 0: 42394.8. 
Samples: 2232851600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 12:49:16,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 12:49:19,326][12883] Updated weights for policy 0, policy_version 136283 (0.0032) +[2024-06-18 12:49:21,830][12883] Updated weights for policy 0, policy_version 136293 (0.0031) +[2024-06-18 12:49:21,994][12645] Fps is (10 sec: 49151.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2233024512. Throughput: 0: 42441.3. Samples: 2233116580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 12:49:21,994][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 12:49:26,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 2233171968. Throughput: 0: 42403.1. Samples: 2233376240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 12:49:26,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 12:49:27,019][12883] Updated weights for policy 0, policy_version 136303 (0.0037) +[2024-06-18 12:49:29,050][12862] Signal inference workers to stop experience collection... (32650 times) +[2024-06-18 12:49:29,096][12883] InferenceWorker_p0-w0: stopping experience collection (32650 times) +[2024-06-18 12:49:29,097][12862] Signal inference workers to resume experience collection... (32650 times) +[2024-06-18 12:49:29,115][12883] InferenceWorker_p0-w0: resuming experience collection (32650 times) +[2024-06-18 12:49:29,703][12883] Updated weights for policy 0, policy_version 136313 (0.0039) +[2024-06-18 12:49:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2233450496. Throughput: 0: 42526.7. Samples: 2233494820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 12:49:31,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 12:49:34,761][12883] Updated weights for policy 0, policy_version 136323 (0.0039) +[2024-06-18 12:49:36,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42325.4, 300 sec: 42598.4). 
Total num frames: 2233647104. Throughput: 0: 42627.7. Samples: 2233758880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 12:49:36,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 12:49:37,256][12883] Updated weights for policy 0, policy_version 136333 (0.0043) +[2024-06-18 12:49:41,994][12645] Fps is (10 sec: 37682.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2233827328. Throughput: 0: 42613.7. Samples: 2234012400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 12:49:41,994][12645] Avg episode reward: [(0, '0.503')] +[2024-06-18 12:49:42,414][12883] Updated weights for policy 0, policy_version 136343 (0.0046) +[2024-06-18 12:49:45,090][12883] Updated weights for policy 0, policy_version 136353 (0.0028) +[2024-06-18 12:49:46,996][12645] Fps is (10 sec: 44226.7, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 2234089472. Throughput: 0: 42643.2. Samples: 2234132100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 12:49:46,997][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 12:49:49,940][12883] Updated weights for policy 0, policy_version 136363 (0.0022) +[2024-06-18 12:49:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 42543.2). Total num frames: 2234253312. Throughput: 0: 42509.7. Samples: 2234390120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 12:49:51,994][12645] Avg episode reward: [(0, '0.603')] +[2024-06-18 12:49:52,841][12883] Updated weights for policy 0, policy_version 136373 (0.0049) +[2024-06-18 12:49:56,994][12645] Fps is (10 sec: 37691.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2234466304. Throughput: 0: 42552.5. Samples: 2234645760. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 12:49:56,994][12645] Avg episode reward: [(0, '0.552')] +[2024-06-18 12:49:57,487][12883] Updated weights for policy 0, policy_version 136383 (0.0036) +[2024-06-18 12:50:00,798][12883] Updated weights for policy 0, policy_version 136393 (0.0034) +[2024-06-18 12:50:01,994][12645] Fps is (10 sec: 47512.8, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 2234728448. Throughput: 0: 42750.5. Samples: 2234775380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 12:50:01,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 12:50:04,965][12883] Updated weights for policy 0, policy_version 136403 (0.0026) +[2024-06-18 12:50:06,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42326.8, 300 sec: 42542.8). Total num frames: 2234908672. Throughput: 0: 42545.7. Samples: 2235031140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 12:50:06,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 12:50:08,505][12883] Updated weights for policy 0, policy_version 136413 (0.0031) +[2024-06-18 12:50:11,994][12645] Fps is (10 sec: 37683.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2235105280. Throughput: 0: 42550.2. Samples: 2235291000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 12:50:11,994][12645] Avg episode reward: [(0, '0.447')] +[2024-06-18 12:50:12,599][12883] Updated weights for policy 0, policy_version 136423 (0.0030) +[2024-06-18 12:50:16,099][12883] Updated weights for policy 0, policy_version 136433 (0.0030) +[2024-06-18 12:50:16,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2235367424. Throughput: 0: 42746.6. Samples: 2235418420. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 12:50:16,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 12:50:20,520][12883] Updated weights for policy 0, policy_version 136443 (0.0028) +[2024-06-18 12:50:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 2235531264. Throughput: 0: 42499.1. Samples: 2235671340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 12:50:21,994][12645] Avg episode reward: [(0, '0.530')] +[2024-06-18 12:50:23,698][12883] Updated weights for policy 0, policy_version 136453 (0.0037) +[2024-06-18 12:50:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 2235760640. Throughput: 0: 42494.4. Samples: 2235924640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 12:50:26,994][12645] Avg episode reward: [(0, '0.514')] +[2024-06-18 12:50:28,141][12883] Updated weights for policy 0, policy_version 136463 (0.0035) +[2024-06-18 12:50:31,372][12883] Updated weights for policy 0, policy_version 136473 (0.0035) +[2024-06-18 12:50:31,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 2236006400. Throughput: 0: 42801.7. Samples: 2236058080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 12:50:31,994][12645] Avg episode reward: [(0, '0.586')] +[2024-06-18 12:50:35,881][12883] Updated weights for policy 0, policy_version 136483 (0.0039) +[2024-06-18 12:50:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2236186624. Throughput: 0: 42884.9. Samples: 2236319940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 12:50:36,994][12645] Avg episode reward: [(0, '0.609')] +[2024-06-18 12:50:37,064][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000136487_2236203008.pth... 
+[2024-06-18 12:50:37,122][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000135864_2225995776.pth +[2024-06-18 12:50:37,866][12862] Signal inference workers to stop experience collection... (32700 times) +[2024-06-18 12:50:37,916][12883] InferenceWorker_p0-w0: stopping experience collection (32700 times) +[2024-06-18 12:50:37,924][12862] Signal inference workers to resume experience collection... (32700 times) +[2024-06-18 12:50:37,928][12883] InferenceWorker_p0-w0: resuming experience collection (32700 times) +[2024-06-18 12:50:39,088][12883] Updated weights for policy 0, policy_version 136493 (0.0029) +[2024-06-18 12:50:41,995][12645] Fps is (10 sec: 40953.5, 60 sec: 43143.5, 300 sec: 42709.4). Total num frames: 2236416000. Throughput: 0: 42688.3. Samples: 2236566800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 12:50:41,996][12645] Avg episode reward: [(0, '0.232')] +[2024-06-18 12:50:43,516][12883] Updated weights for policy 0, policy_version 136503 (0.0037) +[2024-06-18 12:50:46,872][12883] Updated weights for policy 0, policy_version 136513 (0.0030) +[2024-06-18 12:50:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 2236628992. Throughput: 0: 42677.9. Samples: 2236695880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 12:50:46,994][12645] Avg episode reward: [(0, '0.235')] +[2024-06-18 12:50:51,092][12883] Updated weights for policy 0, policy_version 136523 (0.0028) +[2024-06-18 12:50:51,994][12645] Fps is (10 sec: 40966.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2236825600. Throughput: 0: 42767.2. Samples: 2236955660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 12:50:51,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 12:50:54,500][12883] Updated weights for policy 0, policy_version 136533 (0.0032) +[2024-06-18 12:50:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42709.5). 
Total num frames: 2237054976. Throughput: 0: 42599.1. Samples: 2237207960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 12:50:56,994][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 12:50:58,547][12883] Updated weights for policy 0, policy_version 136543 (0.0029) +[2024-06-18 12:51:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42598.7). Total num frames: 2237267968. Throughput: 0: 42775.7. Samples: 2237343320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 12:51:01,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 12:51:02,099][12883] Updated weights for policy 0, policy_version 136553 (0.0030) +[2024-06-18 12:51:05,963][12883] Updated weights for policy 0, policy_version 136563 (0.0041) +[2024-06-18 12:51:06,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 2237480960. Throughput: 0: 42890.2. Samples: 2237601500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 12:51:06,996][12645] Avg episode reward: [(0, '0.237')] +[2024-06-18 12:51:09,691][12883] Updated weights for policy 0, policy_version 136573 (0.0031) +[2024-06-18 12:51:12,000][12645] Fps is (10 sec: 42571.6, 60 sec: 43140.0, 300 sec: 42764.1). Total num frames: 2237693952. Throughput: 0: 42851.8. Samples: 2237853240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 12:51:12,000][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 12:51:13,463][12883] Updated weights for policy 0, policy_version 136583 (0.0034) +[2024-06-18 12:51:16,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2237906944. Throughput: 0: 42733.7. Samples: 2237981100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:51:16,994][12645] Avg episode reward: [(0, '0.658')] +[2024-06-18 12:51:17,355][12883] Updated weights for policy 0, policy_version 136593 (0.0034) +[2024-06-18 12:51:21,097][12883] Updated weights for policy 0, policy_version 136603 (0.0034) +[2024-06-18 12:51:21,994][12645] Fps is (10 sec: 42625.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2238119936. Throughput: 0: 42736.5. Samples: 2238243080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:51:21,994][12645] Avg episode reward: [(0, '0.724')] +[2024-06-18 12:51:24,896][12883] Updated weights for policy 0, policy_version 136613 (0.0047) +[2024-06-18 12:51:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2238316544. Throughput: 0: 42873.9. Samples: 2238496060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:51:26,999][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 12:51:28,737][12883] Updated weights for policy 0, policy_version 136623 (0.0042) +[2024-06-18 12:51:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2238562304. Throughput: 0: 42828.5. Samples: 2238623160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:51:31,994][12645] Avg episode reward: [(0, '0.570')] +[2024-06-18 12:51:32,699][12883] Updated weights for policy 0, policy_version 136633 (0.0028) +[2024-06-18 12:51:36,734][12883] Updated weights for policy 0, policy_version 136643 (0.0035) +[2024-06-18 12:51:36,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 2238775296. Throughput: 0: 42779.1. Samples: 2238880720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:51:36,994][12645] Avg episode reward: [(0, '0.688')] +[2024-06-18 12:51:40,243][12883] Updated weights for policy 0, policy_version 136653 (0.0035) +[2024-06-18 12:51:41,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42599.4, 300 sec: 42653.9). Total num frames: 2238971904. Throughput: 0: 42745.2. Samples: 2239131500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:51:41,994][12645] Avg episode reward: [(0, '0.633')] +[2024-06-18 12:51:44,421][12883] Updated weights for policy 0, policy_version 136663 (0.0034) +[2024-06-18 12:51:47,000][12645] Fps is (10 sec: 42571.6, 60 sec: 42867.0, 300 sec: 42597.5). Total num frames: 2239201280. Throughput: 0: 42673.5. Samples: 2239263900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:51:47,000][12645] Avg episode reward: [(0, '0.700')] +[2024-06-18 12:51:47,890][12883] Updated weights for policy 0, policy_version 136673 (0.0030) +[2024-06-18 12:51:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2239381504. Throughput: 0: 42691.0. Samples: 2239522500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:51:51,994][12645] Avg episode reward: [(0, '0.769')] +[2024-06-18 12:51:52,346][12883] Updated weights for policy 0, policy_version 136683 (0.0035) +[2024-06-18 12:51:55,514][12883] Updated weights for policy 0, policy_version 136693 (0.0039) +[2024-06-18 12:51:56,994][12645] Fps is (10 sec: 40985.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2239610880. Throughput: 0: 42742.8. Samples: 2239776400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:51:56,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 12:51:59,785][12883] Updated weights for policy 0, policy_version 136703 (0.0039) +[2024-06-18 12:52:01,994][12645] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2239856640. Throughput: 0: 42934.8. 
Samples: 2239913160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:52:01,994][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 12:52:03,212][12883] Updated weights for policy 0, policy_version 136713 (0.0053) +[2024-06-18 12:52:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2240036864. Throughput: 0: 42690.2. Samples: 2240164140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:52:06,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 12:52:07,322][12883] Updated weights for policy 0, policy_version 136723 (0.0032) +[2024-06-18 12:52:10,715][12883] Updated weights for policy 0, policy_version 136733 (0.0042) +[2024-06-18 12:52:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42876.0, 300 sec: 42654.0). Total num frames: 2240266240. Throughput: 0: 42739.7. Samples: 2240419340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:52:11,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 12:52:14,853][12883] Updated weights for policy 0, policy_version 136743 (0.0029) +[2024-06-18 12:52:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2240479232. Throughput: 0: 42868.8. Samples: 2240552260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 12:52:16,994][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 12:52:18,550][12883] Updated weights for policy 0, policy_version 136753 (0.0031) +[2024-06-18 12:52:21,042][12862] Signal inference workers to stop experience collection... (32750 times) +[2024-06-18 12:52:21,087][12883] InferenceWorker_p0-w0: stopping experience collection (32750 times) +[2024-06-18 12:52:21,097][12862] Signal inference workers to resume experience collection... (32750 times) +[2024-06-18 12:52:21,111][12883] InferenceWorker_p0-w0: resuming experience collection (32750 times) +[2024-06-18 12:52:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42654.3). 
Total num frames: 2240675840. Throughput: 0: 42936.4. Samples: 2240812860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 12:52:21,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 12:52:22,910][12883] Updated weights for policy 0, policy_version 136763 (0.0030) +[2024-06-18 12:52:26,067][12883] Updated weights for policy 0, policy_version 136773 (0.0036) +[2024-06-18 12:52:26,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 42820.9). Total num frames: 2240937984. Throughput: 0: 42963.2. Samples: 2241064840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 12:52:26,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 12:52:30,458][12883] Updated weights for policy 0, policy_version 136783 (0.0030) +[2024-06-18 12:52:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2241118208. Throughput: 0: 42982.4. Samples: 2241197840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 12:52:31,994][12645] Avg episode reward: [(0, '0.681')] +[2024-06-18 12:52:33,916][12883] Updated weights for policy 0, policy_version 136793 (0.0037) +[2024-06-18 12:52:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2241331200. Throughput: 0: 42861.3. Samples: 2241451260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 12:52:36,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 12:52:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000136800_2241331200.pth... +[2024-06-18 12:52:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000136176_2231107584.pth +[2024-06-18 12:52:37,998][12883] Updated weights for policy 0, policy_version 136803 (0.0046) +[2024-06-18 12:52:41,466][12883] Updated weights for policy 0, policy_version 136813 (0.0031) +[2024-06-18 12:52:41,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43417.7, 300 sec: 42820.5). 
Total num frames: 2241576960. Throughput: 0: 42803.1. Samples: 2241702540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 12:52:41,994][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 12:52:45,884][12883] Updated weights for policy 0, policy_version 136823 (0.0031) +[2024-06-18 12:52:46,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42601.3, 300 sec: 42598.1). Total num frames: 2241757184. Throughput: 0: 42812.1. Samples: 2241839800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 12:52:46,996][12645] Avg episode reward: [(0, '0.283')] +[2024-06-18 12:52:48,993][12883] Updated weights for policy 0, policy_version 136833 (0.0041) +[2024-06-18 12:52:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 2241953792. Throughput: 0: 42781.7. Samples: 2242089320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 12:52:51,994][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 12:52:53,502][12883] Updated weights for policy 0, policy_version 136843 (0.0025) +[2024-06-18 12:52:56,526][12883] Updated weights for policy 0, policy_version 136853 (0.0033) +[2024-06-18 12:52:56,994][12645] Fps is (10 sec: 45885.2, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 2242215936. Throughput: 0: 42769.2. Samples: 2242343960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 12:52:56,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 12:53:01,085][12883] Updated weights for policy 0, policy_version 136863 (0.0036) +[2024-06-18 12:53:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2242396160. Throughput: 0: 42920.5. Samples: 2242483680. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 12:53:01,994][12645] Avg episode reward: [(0, '0.453')] +[2024-06-18 12:53:04,187][12883] Updated weights for policy 0, policy_version 136873 (0.0040) +[2024-06-18 12:53:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2242609152. Throughput: 0: 42695.4. Samples: 2242734160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 12:53:06,994][12645] Avg episode reward: [(0, '0.648')] +[2024-06-18 12:53:08,890][12883] Updated weights for policy 0, policy_version 136883 (0.0033) +[2024-06-18 12:53:11,815][12883] Updated weights for policy 0, policy_version 136893 (0.0036) +[2024-06-18 12:53:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2242854912. Throughput: 0: 42679.2. Samples: 2242985400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 12:53:11,994][12645] Avg episode reward: [(0, '0.648')] +[2024-06-18 12:53:16,582][12883] Updated weights for policy 0, policy_version 136903 (0.0035) +[2024-06-18 12:53:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2243051520. Throughput: 0: 42660.5. Samples: 2243117560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 12:53:16,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 12:53:19,450][12883] Updated weights for policy 0, policy_version 136913 (0.0033) +[2024-06-18 12:53:21,994][12645] Fps is (10 sec: 40959.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2243264512. Throughput: 0: 42618.1. Samples: 2243369080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 12:53:21,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 12:53:24,444][12883] Updated weights for policy 0, policy_version 136923 (0.0023) +[2024-06-18 12:53:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2243493888. Throughput: 0: 42704.4. 
Samples: 2243624240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 12:53:26,994][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 12:53:27,105][12883] Updated weights for policy 0, policy_version 136933 (0.0041) +[2024-06-18 12:53:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2243657728. Throughput: 0: 42459.3. Samples: 2243750380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 12:53:31,994][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 12:53:32,392][12883] Updated weights for policy 0, policy_version 136943 (0.0026) +[2024-06-18 12:53:34,931][12883] Updated weights for policy 0, policy_version 136953 (0.0033) +[2024-06-18 12:53:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2243903488. Throughput: 0: 42547.5. Samples: 2244003960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 12:53:36,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 12:53:39,875][12883] Updated weights for policy 0, policy_version 136963 (0.0037) +[2024-06-18 12:53:41,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 2244100096. Throughput: 0: 42614.8. Samples: 2244261620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 12:53:41,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 12:53:43,021][12883] Updated weights for policy 0, policy_version 136973 (0.0044) +[2024-06-18 12:53:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42326.8, 300 sec: 42542.9). Total num frames: 2244296704. Throughput: 0: 42274.6. Samples: 2244386040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 12:53:46,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 12:53:47,442][12883] Updated weights for policy 0, policy_version 136983 (0.0034) +[2024-06-18 12:53:47,828][12862] Signal inference workers to stop experience collection... 
(32800 times) +[2024-06-18 12:53:47,828][12862] Signal inference workers to resume experience collection... (32800 times) +[2024-06-18 12:53:47,863][12883] InferenceWorker_p0-w0: stopping experience collection (32800 times) +[2024-06-18 12:53:47,868][12883] InferenceWorker_p0-w0: resuming experience collection (32800 times) +[2024-06-18 12:53:50,738][12883] Updated weights for policy 0, policy_version 136993 (0.0038) +[2024-06-18 12:53:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2244526080. Throughput: 0: 42420.6. Samples: 2244643080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 12:53:51,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 12:53:55,142][12883] Updated weights for policy 0, policy_version 137003 (0.0021) +[2024-06-18 12:53:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2244722688. Throughput: 0: 42631.0. Samples: 2244903800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 12:53:56,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 12:53:58,317][12883] Updated weights for policy 0, policy_version 137013 (0.0040) +[2024-06-18 12:54:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 2244952064. Throughput: 0: 42457.8. Samples: 2245028160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 12:54:01,999][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 12:54:02,641][12883] Updated weights for policy 0, policy_version 137023 (0.0029) +[2024-06-18 12:54:06,350][12883] Updated weights for policy 0, policy_version 137033 (0.0028) +[2024-06-18 12:54:06,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2245165056. Throughput: 0: 42636.2. Samples: 2245287700. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 12:54:06,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 12:54:10,218][12883] Updated weights for policy 0, policy_version 137043 (0.0036) +[2024-06-18 12:54:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2245378048. Throughput: 0: 42674.2. Samples: 2245544580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 12:54:11,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 12:54:13,806][12883] Updated weights for policy 0, policy_version 137053 (0.0031) +[2024-06-18 12:54:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2245591040. Throughput: 0: 42612.1. Samples: 2245667920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 12:54:16,994][12645] Avg episode reward: [(0, '0.435')] +[2024-06-18 12:54:17,692][12883] Updated weights for policy 0, policy_version 137063 (0.0022) +[2024-06-18 12:54:21,370][12883] Updated weights for policy 0, policy_version 137073 (0.0025) +[2024-06-18 12:54:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2245820416. Throughput: 0: 42897.4. Samples: 2245934340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) +[2024-06-18 12:54:21,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 12:54:25,447][12883] Updated weights for policy 0, policy_version 137083 (0.0027) +[2024-06-18 12:54:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 2246017024. Throughput: 0: 42828.4. Samples: 2246188900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 12:54:26,994][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 12:54:28,991][12883] Updated weights for policy 0, policy_version 137093 (0.0032) +[2024-06-18 12:54:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2246246400. Throughput: 0: 42793.5. 
Samples: 2246311740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 12:54:31,994][12645] Avg episode reward: [(0, '0.172')] +[2024-06-18 12:54:33,116][12883] Updated weights for policy 0, policy_version 137103 (0.0034) +[2024-06-18 12:54:36,709][12883] Updated weights for policy 0, policy_version 137113 (0.0032) +[2024-06-18 12:54:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2246459392. Throughput: 0: 42944.4. Samples: 2246575580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 12:54:36,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 12:54:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000137113_2246459392.pth... +[2024-06-18 12:54:37,098][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000136487_2236203008.pth +[2024-06-18 12:54:40,626][12883] Updated weights for policy 0, policy_version 137123 (0.0037) +[2024-06-18 12:54:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 2246656000. Throughput: 0: 42772.9. Samples: 2246828580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 12:54:41,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 12:54:44,306][12883] Updated weights for policy 0, policy_version 137133 (0.0031) +[2024-06-18 12:54:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2246901760. Throughput: 0: 42865.7. Samples: 2246957120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 12:54:46,996][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 12:54:48,109][12883] Updated weights for policy 0, policy_version 137143 (0.0039) +[2024-06-18 12:54:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2247098368. Throughput: 0: 42945.2. Samples: 2247220240. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 12:54:51,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 12:54:52,325][12883] Updated weights for policy 0, policy_version 137153 (0.0033) +[2024-06-18 12:54:56,044][12883] Updated weights for policy 0, policy_version 137163 (0.0040) +[2024-06-18 12:54:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2247311360. Throughput: 0: 42761.3. Samples: 2247468840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 12:54:56,998][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 12:55:00,038][12883] Updated weights for policy 0, policy_version 137173 (0.0044) +[2024-06-18 12:55:01,995][12645] Fps is (10 sec: 44231.7, 60 sec: 43143.7, 300 sec: 42820.4). Total num frames: 2247540736. Throughput: 0: 42883.3. Samples: 2247597720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 12:55:01,996][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 12:55:03,624][12883] Updated weights for policy 0, policy_version 137183 (0.0038) +[2024-06-18 12:55:06,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2247720960. Throughput: 0: 42665.8. Samples: 2247854300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 12:55:06,996][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 12:55:07,656][12883] Updated weights for policy 0, policy_version 137193 (0.0036) +[2024-06-18 12:55:11,100][12862] Signal inference workers to stop experience collection... (32850 times) +[2024-06-18 12:55:11,100][12862] Signal inference workers to resume experience collection... 
(32850 times) +[2024-06-18 12:55:11,127][12883] InferenceWorker_p0-w0: stopping experience collection (32850 times) +[2024-06-18 12:55:11,127][12883] InferenceWorker_p0-w0: resuming experience collection (32850 times) +[2024-06-18 12:55:11,233][12883] Updated weights for policy 0, policy_version 137203 (0.0040) +[2024-06-18 12:55:11,994][12645] Fps is (10 sec: 40964.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2247950336. Throughput: 0: 42429.7. Samples: 2248098240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 12:55:11,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 12:55:15,403][12883] Updated weights for policy 0, policy_version 137213 (0.0037) +[2024-06-18 12:55:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2248163328. Throughput: 0: 42703.5. Samples: 2248233400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 12:55:16,994][12645] Avg episode reward: [(0, '0.638')] +[2024-06-18 12:55:18,986][12883] Updated weights for policy 0, policy_version 137223 (0.0038) +[2024-06-18 12:55:21,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 2248376320. Throughput: 0: 42598.3. Samples: 2248492600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 12:55:21,996][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 12:55:22,977][12883] Updated weights for policy 0, policy_version 137233 (0.0032) +[2024-06-18 12:55:26,741][12883] Updated weights for policy 0, policy_version 137243 (0.0035) +[2024-06-18 12:55:26,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2248589312. Throughput: 0: 42606.1. Samples: 2248745860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:55:26,994][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 12:55:30,911][12883] Updated weights for policy 0, policy_version 137253 (0.0032) +[2024-06-18 12:55:31,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2248802304. Throughput: 0: 42577.4. Samples: 2248873100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:55:31,994][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 12:55:34,530][12883] Updated weights for policy 0, policy_version 137263 (0.0045) +[2024-06-18 12:55:36,996][12645] Fps is (10 sec: 44227.5, 60 sec: 42869.9, 300 sec: 42764.9). Total num frames: 2249031680. Throughput: 0: 42468.6. Samples: 2249131420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:55:36,996][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 12:55:38,563][12883] Updated weights for policy 0, policy_version 137273 (0.0026) +[2024-06-18 12:55:41,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 2249228288. Throughput: 0: 42500.2. Samples: 2249381440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:55:41,997][12645] Avg episode reward: [(0, '0.581')] +[2024-06-18 12:55:42,230][12883] Updated weights for policy 0, policy_version 137283 (0.0029) +[2024-06-18 12:55:46,179][12883] Updated weights for policy 0, policy_version 137293 (0.0025) +[2024-06-18 12:55:46,996][12645] Fps is (10 sec: 40960.0, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 2249441280. Throughput: 0: 42498.6. Samples: 2249510200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:55:46,997][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 12:55:50,009][12883] Updated weights for policy 0, policy_version 137303 (0.0037) +[2024-06-18 12:55:51,994][12645] Fps is (10 sec: 39330.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2249621504. Throughput: 0: 42491.5. 
Samples: 2249766420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:55:51,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 12:55:53,861][12883] Updated weights for policy 0, policy_version 137313 (0.0038) +[2024-06-18 12:55:56,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2249867264. Throughput: 0: 42651.5. Samples: 2250017560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:55:56,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 12:55:57,587][12883] Updated weights for policy 0, policy_version 137323 (0.0041) +[2024-06-18 12:56:01,491][12883] Updated weights for policy 0, policy_version 137333 (0.0024) +[2024-06-18 12:56:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42326.2, 300 sec: 42709.8). Total num frames: 2250080256. Throughput: 0: 42610.6. Samples: 2250150880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:56:01,994][12645] Avg episode reward: [(0, '0.599')] +[2024-06-18 12:56:05,119][12883] Updated weights for policy 0, policy_version 137343 (0.0034) +[2024-06-18 12:56:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42654.8). Total num frames: 2250276864. Throughput: 0: 42460.2. Samples: 2250403220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:56:06,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 12:56:09,207][12883] Updated weights for policy 0, policy_version 137353 (0.0034) +[2024-06-18 12:56:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2250506240. Throughput: 0: 42473.1. Samples: 2250657140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:56:11,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 12:56:13,019][12883] Updated weights for policy 0, policy_version 137363 (0.0026) +[2024-06-18 12:56:16,786][12883] Updated weights for policy 0, policy_version 137373 (0.0031) +[2024-06-18 12:56:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2250719232. Throughput: 0: 42592.3. Samples: 2250789760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:56:16,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 12:56:20,664][12883] Updated weights for policy 0, policy_version 137383 (0.0022) +[2024-06-18 12:56:21,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42599.9, 300 sec: 42765.0). Total num frames: 2250932224. Throughput: 0: 42670.9. Samples: 2251051520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:56:21,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 12:56:24,263][12883] Updated weights for policy 0, policy_version 137393 (0.0035) +[2024-06-18 12:56:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2251161600. Throughput: 0: 42835.0. Samples: 2251308920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 12:56:26,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 12:56:28,311][12883] Updated weights for policy 0, policy_version 137403 (0.0032) +[2024-06-18 12:56:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2251358208. Throughput: 0: 42901.7. Samples: 2251440680. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:56:31,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 12:56:32,035][12883] Updated weights for policy 0, policy_version 137413 (0.0027) +[2024-06-18 12:56:36,036][12883] Updated weights for policy 0, policy_version 137423 (0.0028) +[2024-06-18 12:56:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 2251571200. Throughput: 0: 42853.8. Samples: 2251694840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:56:36,994][12645] Avg episode reward: [(0, '0.572')] +[2024-06-18 12:56:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000137425_2251571200.pth... +[2024-06-18 12:56:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000136800_2241331200.pth +[2024-06-18 12:56:39,594][12883] Updated weights for policy 0, policy_version 137433 (0.0032) +[2024-06-18 12:56:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42873.0, 300 sec: 42710.4). Total num frames: 2251800576. Throughput: 0: 42876.0. Samples: 2251946980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:56:41,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 12:56:43,910][12883] Updated weights for policy 0, policy_version 137443 (0.0026) +[2024-06-18 12:56:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42873.0, 300 sec: 42820.6). Total num frames: 2252013568. Throughput: 0: 42827.5. Samples: 2252078120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:56:46,996][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 12:56:47,115][12883] Updated weights for policy 0, policy_version 137453 (0.0041) +[2024-06-18 12:56:51,438][12883] Updated weights for policy 0, policy_version 137463 (0.0040) +[2024-06-18 12:56:52,000][12645] Fps is (10 sec: 40934.7, 60 sec: 43140.0, 300 sec: 42708.6). Total num frames: 2252210176. Throughput: 0: 42872.8. Samples: 2252332760. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:56:52,001][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 12:56:54,552][12862] Signal inference workers to stop experience collection... (32900 times) +[2024-06-18 12:56:54,584][12883] InferenceWorker_p0-w0: stopping experience collection (32900 times) +[2024-06-18 12:56:54,611][12862] Signal inference workers to resume experience collection... (32900 times) +[2024-06-18 12:56:54,612][12883] InferenceWorker_p0-w0: resuming experience collection (32900 times) +[2024-06-18 12:56:54,750][12883] Updated weights for policy 0, policy_version 137473 (0.0039) +[2024-06-18 12:56:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2252439552. Throughput: 0: 42976.8. Samples: 2252591100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:56:56,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 12:56:59,028][12883] Updated weights for policy 0, policy_version 137483 (0.0051) +[2024-06-18 12:57:01,994][12645] Fps is (10 sec: 42625.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2252636160. Throughput: 0: 42881.9. Samples: 2252719440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:57:01,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 12:57:02,828][12883] Updated weights for policy 0, policy_version 137493 (0.0027) +[2024-06-18 12:57:06,464][12883] Updated weights for policy 0, policy_version 137503 (0.0041) +[2024-06-18 12:57:06,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2252849152. Throughput: 0: 42622.1. Samples: 2252969520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:57:06,995][12645] Avg episode reward: [(0, '0.503')] +[2024-06-18 12:57:10,441][12883] Updated weights for policy 0, policy_version 137513 (0.0035) +[2024-06-18 12:57:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2253062144. 
Throughput: 0: 42613.7. Samples: 2253226540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:57:11,994][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 12:57:14,098][12883] Updated weights for policy 0, policy_version 137523 (0.0034) +[2024-06-18 12:57:16,994][12645] Fps is (10 sec: 42599.8, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 2253275136. Throughput: 0: 42473.4. Samples: 2253351980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:57:16,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 12:57:18,125][12883] Updated weights for policy 0, policy_version 137533 (0.0031) +[2024-06-18 12:57:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2253488128. Throughput: 0: 42438.6. Samples: 2253604580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:57:21,998][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 12:57:22,429][12883] Updated weights for policy 0, policy_version 137543 (0.0032) +[2024-06-18 12:57:25,842][12883] Updated weights for policy 0, policy_version 137553 (0.0033) +[2024-06-18 12:57:26,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2253717504. Throughput: 0: 42429.8. Samples: 2253856320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 12:57:26,994][12645] Avg episode reward: [(0, '0.638')] +[2024-06-18 12:57:30,280][12883] Updated weights for policy 0, policy_version 137563 (0.0037) +[2024-06-18 12:57:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2253897728. Throughput: 0: 42396.5. Samples: 2253985960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:57:31,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 12:57:33,518][12883] Updated weights for policy 0, policy_version 137573 (0.0042) +[2024-06-18 12:57:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42542.9). 
Total num frames: 2254127104. Throughput: 0: 42316.7. Samples: 2254236740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:57:36,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 12:57:38,038][12883] Updated weights for policy 0, policy_version 137583 (0.0043) +[2024-06-18 12:57:41,772][12883] Updated weights for policy 0, policy_version 137593 (0.0027) +[2024-06-18 12:57:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 2254340096. Throughput: 0: 42295.2. Samples: 2254494380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:57:41,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 12:57:45,729][12883] Updated weights for policy 0, policy_version 137603 (0.0061) +[2024-06-18 12:57:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2254536704. Throughput: 0: 42203.5. Samples: 2254618600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:57:46,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 12:57:49,361][12883] Updated weights for policy 0, policy_version 137613 (0.0029) +[2024-06-18 12:57:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42602.9, 300 sec: 42542.9). Total num frames: 2254766080. Throughput: 0: 42367.8. Samples: 2254876060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:57:51,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 12:57:53,479][12883] Updated weights for policy 0, policy_version 137623 (0.0035) +[2024-06-18 12:57:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2254962688. Throughput: 0: 42401.8. Samples: 2255134620. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:57:56,994][12645] Avg episode reward: [(0, '0.674')] +[2024-06-18 12:57:57,033][12883] Updated weights for policy 0, policy_version 137633 (0.0034) +[2024-06-18 12:58:01,039][12883] Updated weights for policy 0, policy_version 137643 (0.0019) +[2024-06-18 12:58:01,997][12645] Fps is (10 sec: 42582.8, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 2255192064. Throughput: 0: 42283.6. Samples: 2255254900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:58:01,998][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 12:58:04,489][12883] Updated weights for policy 0, policy_version 137653 (0.0038) +[2024-06-18 12:58:06,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2255421440. Throughput: 0: 42447.9. Samples: 2255514740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:58:06,994][12645] Avg episode reward: [(0, '0.659')] +[2024-06-18 12:58:08,812][12883] Updated weights for policy 0, policy_version 137663 (0.0029) +[2024-06-18 12:58:11,994][12645] Fps is (10 sec: 40975.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2255601664. Throughput: 0: 42616.6. Samples: 2255774060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:58:11,994][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 12:58:12,185][12883] Updated weights for policy 0, policy_version 137673 (0.0043) +[2024-06-18 12:58:14,038][12862] Signal inference workers to stop experience collection... (32950 times) +[2024-06-18 12:58:14,087][12862] Signal inference workers to resume experience collection... 
(32950 times) +[2024-06-18 12:58:14,088][12883] InferenceWorker_p0-w0: stopping experience collection (32950 times) +[2024-06-18 12:58:14,101][12883] InferenceWorker_p0-w0: resuming experience collection (32950 times) +[2024-06-18 12:58:16,633][12883] Updated weights for policy 0, policy_version 137683 (0.0028) +[2024-06-18 12:58:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 2255831040. Throughput: 0: 42325.7. Samples: 2255890620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:58:16,994][12645] Avg episode reward: [(0, '0.714')] +[2024-06-18 12:58:19,882][12883] Updated weights for policy 0, policy_version 137693 (0.0046) +[2024-06-18 12:58:21,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2256060416. Throughput: 0: 42593.7. Samples: 2256153460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:58:21,994][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 12:58:24,300][12883] Updated weights for policy 0, policy_version 137703 (0.0028) +[2024-06-18 12:58:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2256240640. Throughput: 0: 42634.1. Samples: 2256412920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:58:26,995][12645] Avg episode reward: [(0, '0.635')] +[2024-06-18 12:58:27,911][12883] Updated weights for policy 0, policy_version 137713 (0.0035) +[2024-06-18 12:58:31,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2256437248. Throughput: 0: 42479.2. Samples: 2256530160. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 12:58:31,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 12:58:32,116][12883] Updated weights for policy 0, policy_version 137723 (0.0028) +[2024-06-18 12:58:35,725][12883] Updated weights for policy 0, policy_version 137733 (0.0041) +[2024-06-18 12:58:36,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2256683008. Throughput: 0: 42559.9. Samples: 2256791260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:58:36,994][12645] Avg episode reward: [(0, '0.594')] +[2024-06-18 12:58:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000137738_2256699392.pth... +[2024-06-18 12:58:37,084][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000137113_2246459392.pth +[2024-06-18 12:58:39,900][12883] Updated weights for policy 0, policy_version 137743 (0.0028) +[2024-06-18 12:58:41,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2256879616. Throughput: 0: 42461.7. Samples: 2257045400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:58:41,994][12645] Avg episode reward: [(0, '0.886')] +[2024-06-18 12:58:43,473][12883] Updated weights for policy 0, policy_version 137753 (0.0032) +[2024-06-18 12:58:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2257076224. Throughput: 0: 42458.0. Samples: 2257165360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:58:46,994][12645] Avg episode reward: [(0, '0.599')] +[2024-06-18 12:58:47,440][12883] Updated weights for policy 0, policy_version 137763 (0.0035) +[2024-06-18 12:58:51,004][12883] Updated weights for policy 0, policy_version 137773 (0.0037) +[2024-06-18 12:58:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2257321984. Throughput: 0: 42541.3. Samples: 2257429100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:58:51,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 12:58:54,995][12883] Updated weights for policy 0, policy_version 137783 (0.0036) +[2024-06-18 12:58:56,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2257534976. Throughput: 0: 42576.4. Samples: 2257690000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:58:56,994][12645] Avg episode reward: [(0, '0.548')] +[2024-06-18 12:58:58,428][12883] Updated weights for policy 0, policy_version 137793 (0.0035) +[2024-06-18 12:59:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42327.8, 300 sec: 42598.4). Total num frames: 2257731584. Throughput: 0: 42632.4. Samples: 2257809080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:59:01,994][12645] Avg episode reward: [(0, '0.687')] +[2024-06-18 12:59:02,421][12883] Updated weights for policy 0, policy_version 137803 (0.0032) +[2024-06-18 12:59:06,273][12883] Updated weights for policy 0, policy_version 137813 (0.0037) +[2024-06-18 12:59:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2257960960. Throughput: 0: 42640.8. Samples: 2258072300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:59:06,994][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 12:59:09,894][12883] Updated weights for policy 0, policy_version 137823 (0.0038) +[2024-06-18 12:59:11,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2258157568. Throughput: 0: 42579.3. Samples: 2258328980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:59:11,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 12:59:13,884][12883] Updated weights for policy 0, policy_version 137833 (0.0036) +[2024-06-18 12:59:16,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 2258386944. Throughput: 0: 42751.6. Samples: 2258454080. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:59:16,997][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 12:59:17,505][12883] Updated weights for policy 0, policy_version 137843 (0.0029) +[2024-06-18 12:59:21,492][12883] Updated weights for policy 0, policy_version 137853 (0.0032) +[2024-06-18 12:59:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2258583552. Throughput: 0: 42633.4. Samples: 2258709760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:59:21,994][12645] Avg episode reward: [(0, '0.213')] +[2024-06-18 12:59:22,835][12862] Signal inference workers to stop experience collection... (33000 times) +[2024-06-18 12:59:22,836][12862] Signal inference workers to resume experience collection... (33000 times) +[2024-06-18 12:59:22,848][12883] InferenceWorker_p0-w0: stopping experience collection (33000 times) +[2024-06-18 12:59:22,848][12883] InferenceWorker_p0-w0: resuming experience collection (33000 times) +[2024-06-18 12:59:25,007][12883] Updated weights for policy 0, policy_version 137863 (0.0030) +[2024-06-18 12:59:26,994][12645] Fps is (10 sec: 40968.7, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2258796544. Throughput: 0: 42667.5. Samples: 2258965440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:59:26,994][12645] Avg episode reward: [(0, '0.639')] +[2024-06-18 12:59:29,410][12883] Updated weights for policy 0, policy_version 137873 (0.0029) +[2024-06-18 12:59:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2259009536. Throughput: 0: 42747.1. Samples: 2259088980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:59:31,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 12:59:32,926][12883] Updated weights for policy 0, policy_version 137883 (0.0024) +[2024-06-18 12:59:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2259222528. 
Throughput: 0: 42565.0. Samples: 2259344520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 12:59:36,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 12:59:37,049][12883] Updated weights for policy 0, policy_version 137893 (0.0038) +[2024-06-18 12:59:40,885][12883] Updated weights for policy 0, policy_version 137903 (0.0043) +[2024-06-18 12:59:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2259435520. Throughput: 0: 42349.7. Samples: 2259595740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:59:41,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 12:59:44,685][12883] Updated weights for policy 0, policy_version 137913 (0.0033) +[2024-06-18 12:59:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2259648512. Throughput: 0: 42585.4. Samples: 2259725420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:59:46,994][12645] Avg episode reward: [(0, '0.302')] +[2024-06-18 12:59:48,808][12883] Updated weights for policy 0, policy_version 137923 (0.0036) +[2024-06-18 12:59:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2259861504. Throughput: 0: 42340.0. Samples: 2259977600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:59:51,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 12:59:52,434][12883] Updated weights for policy 0, policy_version 137933 (0.0027) +[2024-06-18 12:59:56,706][12883] Updated weights for policy 0, policy_version 137943 (0.0034) +[2024-06-18 12:59:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.5). Total num frames: 2260074496. Throughput: 0: 42326.6. Samples: 2260233680. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 12:59:56,994][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 13:00:00,253][12883] Updated weights for policy 0, policy_version 137953 (0.0041) +[2024-06-18 13:00:01,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2260271104. Throughput: 0: 42315.6. Samples: 2260358180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 13:00:01,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 13:00:04,352][12883] Updated weights for policy 0, policy_version 137963 (0.0031) +[2024-06-18 13:00:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2260500480. Throughput: 0: 42300.9. Samples: 2260613300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 13:00:06,994][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 13:00:07,852][12883] Updated weights for policy 0, policy_version 137973 (0.0034) +[2024-06-18 13:00:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2260697088. Throughput: 0: 42366.9. Samples: 2260871940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 13:00:11,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 13:00:12,013][12883] Updated weights for policy 0, policy_version 137983 (0.0030) +[2024-06-18 13:00:15,520][12883] Updated weights for policy 0, policy_version 137993 (0.0047) +[2024-06-18 13:00:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42053.8, 300 sec: 42487.6). Total num frames: 2260910080. Throughput: 0: 42253.3. Samples: 2260990380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 13:00:16,994][12645] Avg episode reward: [(0, '0.564')] +[2024-06-18 13:00:19,900][12883] Updated weights for policy 0, policy_version 138003 (0.0049) +[2024-06-18 13:00:21,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42596.8, 300 sec: 42542.6). Total num frames: 2261139456. Throughput: 0: 42272.2. 
Samples: 2261246860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 13:00:21,996][12645] Avg episode reward: [(0, '0.571')] +[2024-06-18 13:00:23,398][12883] Updated weights for policy 0, policy_version 138013 (0.0027) +[2024-06-18 13:00:26,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42050.8, 300 sec: 42431.5). Total num frames: 2261319680. Throughput: 0: 42402.0. Samples: 2261503920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 13:00:26,996][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 13:00:27,590][12883] Updated weights for policy 0, policy_version 138023 (0.0038) +[2024-06-18 13:00:31,061][12883] Updated weights for policy 0, policy_version 138033 (0.0043) +[2024-06-18 13:00:31,994][12645] Fps is (10 sec: 42607.3, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 2261565440. Throughput: 0: 42234.5. Samples: 2261625980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 13:00:31,994][12645] Avg episode reward: [(0, '0.604')] +[2024-06-18 13:00:35,240][12883] Updated weights for policy 0, policy_version 138043 (0.0043) +[2024-06-18 13:00:36,994][12645] Fps is (10 sec: 45885.5, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 2261778432. Throughput: 0: 42434.8. Samples: 2261887160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 13:00:36,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 13:00:37,029][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138049_2261794816.pth... +[2024-06-18 13:00:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000137425_2251571200.pth +[2024-06-18 13:00:38,910][12883] Updated weights for policy 0, policy_version 138053 (0.0028) +[2024-06-18 13:00:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 2261975040. Throughput: 0: 42236.7. Samples: 2262134340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:00:41,994][12645] Avg episode reward: [(0, '0.742')] +[2024-06-18 13:00:42,883][12883] Updated weights for policy 0, policy_version 138063 (0.0034) +[2024-06-18 13:00:46,586][12883] Updated weights for policy 0, policy_version 138073 (0.0045) +[2024-06-18 13:00:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2262188032. Throughput: 0: 42327.0. Samples: 2262262900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:00:46,994][12645] Avg episode reward: [(0, '0.548')] +[2024-06-18 13:00:50,442][12883] Updated weights for policy 0, policy_version 138083 (0.0025) +[2024-06-18 13:00:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2262401024. Throughput: 0: 42345.7. Samples: 2262518860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:00:51,994][12645] Avg episode reward: [(0, '0.538')] +[2024-06-18 13:00:52,854][12862] Signal inference workers to stop experience collection... (33050 times) +[2024-06-18 13:00:52,855][12862] Signal inference workers to resume experience collection... (33050 times) +[2024-06-18 13:00:52,884][12883] InferenceWorker_p0-w0: stopping experience collection (33050 times) +[2024-06-18 13:00:52,885][12883] InferenceWorker_p0-w0: resuming experience collection (33050 times) +[2024-06-18 13:00:54,480][12883] Updated weights for policy 0, policy_version 138093 (0.0037) +[2024-06-18 13:00:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2262614016. Throughput: 0: 42340.8. Samples: 2262777280. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:00:56,994][12645] Avg episode reward: [(0, '0.597')] +[2024-06-18 13:00:58,642][12883] Updated weights for policy 0, policy_version 138103 (0.0035) +[2024-06-18 13:01:01,952][12883] Updated weights for policy 0, policy_version 138113 (0.0025) +[2024-06-18 13:01:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2262843392. Throughput: 0: 42476.5. Samples: 2262901820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:01:01,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 13:01:06,168][12883] Updated weights for policy 0, policy_version 138123 (0.0034) +[2024-06-18 13:01:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2263056384. Throughput: 0: 42526.5. Samples: 2263160460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:01:06,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 13:01:09,796][12883] Updated weights for policy 0, policy_version 138133 (0.0043) +[2024-06-18 13:01:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2263236608. Throughput: 0: 42487.5. Samples: 2263415760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:01:11,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 13:01:13,903][12883] Updated weights for policy 0, policy_version 138143 (0.0027) +[2024-06-18 13:01:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2263482368. Throughput: 0: 42522.3. Samples: 2263539480. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:01:16,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 13:01:17,569][12883] Updated weights for policy 0, policy_version 138153 (0.0036) +[2024-06-18 13:01:21,424][12883] Updated weights for policy 0, policy_version 138163 (0.0034) +[2024-06-18 13:01:21,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42326.8, 300 sec: 42431.8). Total num frames: 2263678976. Throughput: 0: 42451.4. Samples: 2263797480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:01:21,994][12645] Avg episode reward: [(0, '0.453')] +[2024-06-18 13:01:25,500][12883] Updated weights for policy 0, policy_version 138173 (0.0038) +[2024-06-18 13:01:27,000][12645] Fps is (10 sec: 39297.2, 60 sec: 42595.6, 300 sec: 42430.9). Total num frames: 2263875584. Throughput: 0: 42607.5. Samples: 2264051940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:01:27,000][12645] Avg episode reward: [(0, '0.597')] +[2024-06-18 13:01:29,080][12883] Updated weights for policy 0, policy_version 138183 (0.0036) +[2024-06-18 13:01:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2264121344. Throughput: 0: 42622.2. Samples: 2264180900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:01:31,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 13:01:32,948][12883] Updated weights for policy 0, policy_version 138193 (0.0035) +[2024-06-18 13:01:36,909][12883] Updated weights for policy 0, policy_version 138203 (0.0044) +[2024-06-18 13:01:36,994][12645] Fps is (10 sec: 44264.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2264317952. Throughput: 0: 42648.9. Samples: 2264438060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:01:36,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 13:01:40,510][12883] Updated weights for policy 0, policy_version 138213 (0.0030) +[2024-06-18 13:01:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 2264530944. Throughput: 0: 42552.5. Samples: 2264692140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:01:41,994][12645] Avg episode reward: [(0, '0.659')] +[2024-06-18 13:01:44,462][12883] Updated weights for policy 0, policy_version 138223 (0.0033) +[2024-06-18 13:01:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42543.8). Total num frames: 2264760320. Throughput: 0: 42629.3. Samples: 2264820140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 13:01:46,994][12645] Avg episode reward: [(0, '0.574')] +[2024-06-18 13:01:48,256][12883] Updated weights for policy 0, policy_version 138233 (0.0027) +[2024-06-18 13:01:51,997][12645] Fps is (10 sec: 40944.8, 60 sec: 42322.7, 300 sec: 42375.7). Total num frames: 2264940544. Throughput: 0: 42603.2. Samples: 2265077760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 13:01:51,998][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 13:01:52,303][12883] Updated weights for policy 0, policy_version 138243 (0.0025) +[2024-06-18 13:01:55,685][12883] Updated weights for policy 0, policy_version 138253 (0.0030) +[2024-06-18 13:01:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2265169920. Throughput: 0: 42520.3. Samples: 2265329180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 13:01:56,994][12645] Avg episode reward: [(0, '0.649')] +[2024-06-18 13:01:59,974][12883] Updated weights for policy 0, policy_version 138263 (0.0043) +[2024-06-18 13:02:01,994][12645] Fps is (10 sec: 45892.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2265399296. Throughput: 0: 42837.4. 
Samples: 2265467160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 13:02:01,994][12645] Avg episode reward: [(0, '0.628')] +[2024-06-18 13:02:03,795][12883] Updated weights for policy 0, policy_version 138273 (0.0034) +[2024-06-18 13:02:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 2265579520. Throughput: 0: 42685.4. Samples: 2265718320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 13:02:06,994][12645] Avg episode reward: [(0, '0.741')] +[2024-06-18 13:02:07,739][12883] Updated weights for policy 0, policy_version 138283 (0.0033) +[2024-06-18 13:02:11,161][12883] Updated weights for policy 0, policy_version 138293 (0.0051) +[2024-06-18 13:02:11,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 2265825280. Throughput: 0: 42630.7. Samples: 2265970060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 13:02:11,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 13:02:15,225][12862] Signal inference workers to stop experience collection... (33100 times) +[2024-06-18 13:02:15,226][12862] Signal inference workers to resume experience collection... (33100 times) +[2024-06-18 13:02:15,252][12883] InferenceWorker_p0-w0: stopping experience collection (33100 times) +[2024-06-18 13:02:15,252][12883] InferenceWorker_p0-w0: resuming experience collection (33100 times) +[2024-06-18 13:02:15,391][12883] Updated weights for policy 0, policy_version 138303 (0.0043) +[2024-06-18 13:02:16,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2266054656. Throughput: 0: 42732.0. Samples: 2266103840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 13:02:16,996][12645] Avg episode reward: [(0, '0.278')] +[2024-06-18 13:02:18,615][12883] Updated weights for policy 0, policy_version 138313 (0.0035) +[2024-06-18 13:02:21,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42052.3, 300 sec: 42320.7). 
Total num frames: 2266202112. Throughput: 0: 42684.5. Samples: 2266358860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 13:02:21,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 13:02:22,899][12883] Updated weights for policy 0, policy_version 138323 (0.0022) +[2024-06-18 13:02:26,050][12883] Updated weights for policy 0, policy_version 138333 (0.0025) +[2024-06-18 13:02:26,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42875.8, 300 sec: 42542.8). Total num frames: 2266447872. Throughput: 0: 42640.3. Samples: 2266610960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 13:02:26,994][12645] Avg episode reward: [(0, '0.291')] +[2024-06-18 13:02:30,526][12883] Updated weights for policy 0, policy_version 138343 (0.0027) +[2024-06-18 13:02:31,994][12645] Fps is (10 sec: 47514.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2266677248. Throughput: 0: 42861.0. Samples: 2266748880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 13:02:31,994][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 13:02:34,313][12883] Updated weights for policy 0, policy_version 138353 (0.0030) +[2024-06-18 13:02:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2266857472. Throughput: 0: 42696.4. Samples: 2266998940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 13:02:36,994][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 13:02:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138358_2266857472.pth... +[2024-06-18 13:02:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000137738_2256699392.pth +[2024-06-18 13:02:38,054][12883] Updated weights for policy 0, policy_version 138363 (0.0022) +[2024-06-18 13:02:41,943][12883] Updated weights for policy 0, policy_version 138373 (0.0038) +[2024-06-18 13:02:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2267103232. 
Throughput: 0: 42815.2. Samples: 2267255860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 13:02:41,994][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 13:02:46,201][12883] Updated weights for policy 0, policy_version 138383 (0.0035) +[2024-06-18 13:02:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2267299840. Throughput: 0: 42694.7. Samples: 2267388420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:02:46,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 13:02:49,580][12883] Updated weights for policy 0, policy_version 138393 (0.0030) +[2024-06-18 13:02:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42874.1, 300 sec: 42542.9). Total num frames: 2267512832. Throughput: 0: 42733.9. Samples: 2267641340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:02:51,994][12645] Avg episode reward: [(0, '0.650')] +[2024-06-18 13:02:53,755][12883] Updated weights for policy 0, policy_version 138403 (0.0032) +[2024-06-18 13:02:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42487.8). Total num frames: 2267725824. Throughput: 0: 42726.8. Samples: 2267892760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:02:56,994][12645] Avg episode reward: [(0, '0.668')] +[2024-06-18 13:02:57,255][12883] Updated weights for policy 0, policy_version 138413 (0.0039) +[2024-06-18 13:03:01,221][12883] Updated weights for policy 0, policy_version 138423 (0.0027) +[2024-06-18 13:03:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2267938816. Throughput: 0: 42666.8. Samples: 2268023840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:03:01,994][12645] Avg episode reward: [(0, '0.552')] +[2024-06-18 13:03:04,940][12883] Updated weights for policy 0, policy_version 138433 (0.0032) +[2024-06-18 13:03:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42542.8). Total num frames: 2268151808. 
Throughput: 0: 42665.8. Samples: 2268278820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:03:06,994][12645] Avg episode reward: [(0, '0.620')] +[2024-06-18 13:03:09,010][12883] Updated weights for policy 0, policy_version 138443 (0.0023) +[2024-06-18 13:03:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2268381184. Throughput: 0: 42496.6. Samples: 2268523300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:03:11,994][12645] Avg episode reward: [(0, '0.666')] +[2024-06-18 13:03:12,799][12883] Updated weights for policy 0, policy_version 138453 (0.0029) +[2024-06-18 13:03:16,880][12883] Updated weights for policy 0, policy_version 138463 (0.0031) +[2024-06-18 13:03:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 2268577792. Throughput: 0: 42387.1. Samples: 2268656300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:03:16,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 13:03:20,415][12883] Updated weights for policy 0, policy_version 138473 (0.0035) +[2024-06-18 13:03:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2268774400. Throughput: 0: 42387.6. Samples: 2268906380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:03:21,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 13:03:24,566][12883] Updated weights for policy 0, policy_version 138483 (0.0031) +[2024-06-18 13:03:26,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2269020160. Throughput: 0: 42218.2. Samples: 2269155680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:03:26,994][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 13:03:27,933][12883] Updated weights for policy 0, policy_version 138493 (0.0035) +[2024-06-18 13:03:31,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2269216768. 
Throughput: 0: 42377.0. Samples: 2269295380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:03:31,994][12645] Avg episode reward: [(0, '0.629')] +[2024-06-18 13:03:32,099][12883] Updated weights for policy 0, policy_version 138503 (0.0033) +[2024-06-18 13:03:35,553][12883] Updated weights for policy 0, policy_version 138513 (0.0040) +[2024-06-18 13:03:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2269413376. Throughput: 0: 42257.7. Samples: 2269542940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:03:36,994][12645] Avg episode reward: [(0, '0.652')] +[2024-06-18 13:03:39,642][12883] Updated weights for policy 0, policy_version 138523 (0.0032) +[2024-06-18 13:03:41,454][12862] Signal inference workers to stop experience collection... (33150 times) +[2024-06-18 13:03:41,459][12862] Signal inference workers to resume experience collection... (33150 times) +[2024-06-18 13:03:41,495][12883] InferenceWorker_p0-w0: stopping experience collection (33150 times) +[2024-06-18 13:03:41,495][12883] InferenceWorker_p0-w0: resuming experience collection (33150 times) +[2024-06-18 13:03:41,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2269675520. Throughput: 0: 42343.5. Samples: 2269798220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:03:41,994][12645] Avg episode reward: [(0, '0.695')] +[2024-06-18 13:03:43,323][12883] Updated weights for policy 0, policy_version 138533 (0.0022) +[2024-06-18 13:03:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2269855744. Throughput: 0: 42500.4. Samples: 2269936360. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:03:46,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 13:03:47,222][12883] Updated weights for policy 0, policy_version 138543 (0.0028) +[2024-06-18 13:03:51,413][12883] Updated weights for policy 0, policy_version 138553 (0.0029) +[2024-06-18 13:03:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2270068736. Throughput: 0: 42389.4. Samples: 2270186340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:03:51,994][12645] Avg episode reward: [(0, '0.638')] +[2024-06-18 13:03:55,578][12883] Updated weights for policy 0, policy_version 138563 (0.0035) +[2024-06-18 13:03:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2270314496. Throughput: 0: 42618.6. Samples: 2270441140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:03:56,994][12645] Avg episode reward: [(0, '0.435')] +[2024-06-18 13:03:59,188][12883] Updated weights for policy 0, policy_version 138573 (0.0042) +[2024-06-18 13:04:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42376.3). Total num frames: 2270461952. Throughput: 0: 42481.7. Samples: 2270567980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:04:01,994][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 13:04:03,177][12883] Updated weights for policy 0, policy_version 138583 (0.0029) +[2024-06-18 13:04:06,685][12883] Updated weights for policy 0, policy_version 138593 (0.0025) +[2024-06-18 13:04:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 2270707712. Throughput: 0: 42487.1. Samples: 2270818300. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:04:06,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 13:04:10,727][12883] Updated weights for policy 0, policy_version 138603 (0.0045) +[2024-06-18 13:04:11,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42325.2, 300 sec: 42487.6). Total num frames: 2270920704. Throughput: 0: 42780.0. Samples: 2271080780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:04:11,994][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 13:04:14,353][12883] Updated weights for policy 0, policy_version 138613 (0.0036) +[2024-06-18 13:04:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 2271100928. Throughput: 0: 42494.1. Samples: 2271207620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:04:16,994][12645] Avg episode reward: [(0, '0.701')] +[2024-06-18 13:04:18,241][12883] Updated weights for policy 0, policy_version 138623 (0.0037) +[2024-06-18 13:04:21,934][12883] Updated weights for policy 0, policy_version 138633 (0.0042) +[2024-06-18 13:04:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2271363072. Throughput: 0: 42608.4. Samples: 2271460320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:04:21,994][12645] Avg episode reward: [(0, '0.644')] +[2024-06-18 13:04:25,854][12883] Updated weights for policy 0, policy_version 138643 (0.0030) +[2024-06-18 13:04:26,994][12645] Fps is (10 sec: 47513.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2271576064. Throughput: 0: 42799.9. Samples: 2271724220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:04:26,994][12645] Avg episode reward: [(0, '0.579')] +[2024-06-18 13:04:29,612][12883] Updated weights for policy 0, policy_version 138653 (0.0035) +[2024-06-18 13:04:31,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2271756288. Throughput: 0: 42525.0. 
Samples: 2271849980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:04:31,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 13:04:33,637][12883] Updated weights for policy 0, policy_version 138663 (0.0035) +[2024-06-18 13:04:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2271985664. Throughput: 0: 42581.2. Samples: 2272102500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:04:36,994][12645] Avg episode reward: [(0, '0.556')] +[2024-06-18 13:04:37,191][12883] Updated weights for policy 0, policy_version 138673 (0.0039) +[2024-06-18 13:04:37,193][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138673_2272018432.pth... +[2024-06-18 13:04:37,275][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138049_2261794816.pth +[2024-06-18 13:04:41,640][12883] Updated weights for policy 0, policy_version 138683 (0.0028) +[2024-06-18 13:04:41,637][12862] Signal inference workers to stop experience collection... (33200 times) +[2024-06-18 13:04:41,646][12862] Signal inference workers to resume experience collection... (33200 times) +[2024-06-18 13:04:41,660][12883] InferenceWorker_p0-w0: stopping experience collection (33200 times) +[2024-06-18 13:04:41,660][12883] InferenceWorker_p0-w0: resuming experience collection (33200 times) +[2024-06-18 13:04:41,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2272215040. Throughput: 0: 42750.3. Samples: 2272364900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:04:41,994][12645] Avg episode reward: [(0, '0.585')] +[2024-06-18 13:04:44,792][12883] Updated weights for policy 0, policy_version 138693 (0.0022) +[2024-06-18 13:04:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2272411648. Throughput: 0: 42593.7. Samples: 2272484700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:04:46,994][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 13:04:49,434][12883] Updated weights for policy 0, policy_version 138703 (0.0034) +[2024-06-18 13:04:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2272624640. Throughput: 0: 42586.7. Samples: 2272734700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 13:04:51,994][12645] Avg episode reward: [(0, '0.514')] +[2024-06-18 13:04:52,785][12883] Updated weights for policy 0, policy_version 138713 (0.0031) +[2024-06-18 13:04:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 2272821248. Throughput: 0: 42647.7. Samples: 2272999920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 13:04:56,994][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 13:04:57,015][12883] Updated weights for policy 0, policy_version 138723 (0.0026) +[2024-06-18 13:05:00,356][12883] Updated weights for policy 0, policy_version 138733 (0.0041) +[2024-06-18 13:05:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 2273050624. Throughput: 0: 42549.3. Samples: 2273122340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 13:05:02,000][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 13:05:04,817][12883] Updated weights for policy 0, policy_version 138743 (0.0027) +[2024-06-18 13:05:06,994][12645] Fps is (10 sec: 44235.1, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 2273263616. Throughput: 0: 42494.9. Samples: 2273372600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 13:05:07,000][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 13:05:08,434][12883] Updated weights for policy 0, policy_version 138753 (0.0029) +[2024-06-18 13:05:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2273460224. Throughput: 0: 42595.3. 
Samples: 2273641000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 13:05:11,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 13:05:12,234][12883] Updated weights for policy 0, policy_version 138763 (0.0028) +[2024-06-18 13:05:15,962][12883] Updated weights for policy 0, policy_version 138773 (0.0035) +[2024-06-18 13:05:16,994][12645] Fps is (10 sec: 42599.9, 60 sec: 43144.6, 300 sec: 42543.2). Total num frames: 2273689600. Throughput: 0: 42441.8. Samples: 2273759860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 13:05:16,994][12645] Avg episode reward: [(0, '0.624')] +[2024-06-18 13:05:19,916][12883] Updated weights for policy 0, policy_version 138783 (0.0032) +[2024-06-18 13:05:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 2273902592. Throughput: 0: 42440.0. Samples: 2274012300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 13:05:21,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 13:05:23,599][12883] Updated weights for policy 0, policy_version 138793 (0.0033) +[2024-06-18 13:05:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 2274099200. Throughput: 0: 42455.6. Samples: 2274275400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 13:05:26,994][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 13:05:27,384][12883] Updated weights for policy 0, policy_version 138803 (0.0026) +[2024-06-18 13:05:31,168][12883] Updated weights for policy 0, policy_version 138813 (0.0030) +[2024-06-18 13:05:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2274344960. Throughput: 0: 42584.0. Samples: 2274400980. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 13:05:31,994][12645] Avg episode reward: [(0, '0.591')] +[2024-06-18 13:05:34,872][12883] Updated weights for policy 0, policy_version 138823 (0.0036) +[2024-06-18 13:05:36,526][12862] Signal inference workers to stop experience collection... (33250 times) +[2024-06-18 13:05:36,527][12862] Signal inference workers to resume experience collection... (33250 times) +[2024-06-18 13:05:36,568][12883] InferenceWorker_p0-w0: stopping experience collection (33250 times) +[2024-06-18 13:05:36,568][12883] InferenceWorker_p0-w0: resuming experience collection (33250 times) +[2024-06-18 13:05:36,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2274557952. Throughput: 0: 42756.0. Samples: 2274658720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 13:05:36,994][12645] Avg episode reward: [(0, '0.451')] +[2024-06-18 13:05:38,755][12883] Updated weights for policy 0, policy_version 138833 (0.0038) +[2024-06-18 13:05:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2274754560. Throughput: 0: 42716.8. Samples: 2274922180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 13:05:41,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 13:05:42,599][12883] Updated weights for policy 0, policy_version 138843 (0.0048) +[2024-06-18 13:05:46,420][12883] Updated weights for policy 0, policy_version 138853 (0.0026) +[2024-06-18 13:05:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2274983936. Throughput: 0: 42681.4. Samples: 2275043000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 13:05:46,994][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 13:05:50,121][12883] Updated weights for policy 0, policy_version 138863 (0.0036) +[2024-06-18 13:05:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2275180544. 
Throughput: 0: 42796.7. Samples: 2275298440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) +[2024-06-18 13:05:51,994][12645] Avg episode reward: [(0, '0.729')] +[2024-06-18 13:05:54,143][12883] Updated weights for policy 0, policy_version 138873 (0.0033) +[2024-06-18 13:05:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2275393536. Throughput: 0: 42614.6. Samples: 2275558660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 13:05:56,994][12645] Avg episode reward: [(0, '0.772')] +[2024-06-18 13:05:58,203][12883] Updated weights for policy 0, policy_version 138883 (0.0039) +[2024-06-18 13:06:01,913][12883] Updated weights for policy 0, policy_version 138893 (0.0053) +[2024-06-18 13:06:01,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42869.9, 300 sec: 42598.1). Total num frames: 2275622912. Throughput: 0: 42760.0. Samples: 2275684160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 13:06:01,997][12645] Avg episode reward: [(0, '0.758')] +[2024-06-18 13:06:05,811][12883] Updated weights for policy 0, policy_version 138903 (0.0032) +[2024-06-18 13:06:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 2275819520. Throughput: 0: 42763.5. Samples: 2275936660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 13:06:06,994][12645] Avg episode reward: [(0, '0.758')] +[2024-06-18 13:06:09,715][12883] Updated weights for policy 0, policy_version 138913 (0.0032) +[2024-06-18 13:06:11,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2276032512. Throughput: 0: 42672.8. Samples: 2276195680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 13:06:11,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 13:06:13,419][12883] Updated weights for policy 0, policy_version 138923 (0.0045) +[2024-06-18 13:06:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.4). 
Total num frames: 2276245504. Throughput: 0: 42601.4. Samples: 2276318040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 13:06:16,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 13:06:17,362][12883] Updated weights for policy 0, policy_version 138933 (0.0040) +[2024-06-18 13:06:20,939][12883] Updated weights for policy 0, policy_version 138943 (0.0038) +[2024-06-18 13:06:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42654.8). Total num frames: 2276458496. Throughput: 0: 42576.3. Samples: 2276574660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 13:06:21,994][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 13:06:25,033][12883] Updated weights for policy 0, policy_version 138953 (0.0033) +[2024-06-18 13:06:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2276655104. Throughput: 0: 42386.2. Samples: 2276829560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 13:06:26,994][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 13:06:28,787][12883] Updated weights for policy 0, policy_version 138963 (0.0036) +[2024-06-18 13:06:31,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2276884480. Throughput: 0: 42573.9. Samples: 2276958820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 13:06:31,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 13:06:32,960][12883] Updated weights for policy 0, policy_version 138973 (0.0035) +[2024-06-18 13:06:36,486][12883] Updated weights for policy 0, policy_version 138983 (0.0027) +[2024-06-18 13:06:37,000][12645] Fps is (10 sec: 44209.1, 60 sec: 42320.9, 300 sec: 42597.5). Total num frames: 2277097472. Throughput: 0: 42469.7. Samples: 2277209840. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 13:06:37,000][12645] Avg episode reward: [(0, '0.641')] +[2024-06-18 13:06:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138983_2277097472.pth... +[2024-06-18 13:06:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138358_2266857472.pth +[2024-06-18 13:06:40,419][12883] Updated weights for policy 0, policy_version 138993 (0.0039) +[2024-06-18 13:06:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2277310464. Throughput: 0: 42369.2. Samples: 2277465280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 13:06:41,994][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 13:06:44,112][12883] Updated weights for policy 0, policy_version 139003 (0.0039) +[2024-06-18 13:06:46,994][12645] Fps is (10 sec: 42625.2, 60 sec: 42325.4, 300 sec: 42654.5). Total num frames: 2277523456. Throughput: 0: 42349.7. Samples: 2277589800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 13:06:46,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 13:06:48,085][12883] Updated weights for policy 0, policy_version 139013 (0.0042) +[2024-06-18 13:06:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2277736448. Throughput: 0: 42394.3. Samples: 2277844400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 13:06:51,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 13:06:52,222][12883] Updated weights for policy 0, policy_version 139023 (0.0022) +[2024-06-18 13:06:56,328][12883] Updated weights for policy 0, policy_version 139033 (0.0041) +[2024-06-18 13:06:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2277933056. Throughput: 0: 42333.8. Samples: 2278100700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 13:06:56,994][12645] Avg episode reward: [(0, '0.538')] +[2024-06-18 13:06:59,889][12883] Updated weights for policy 0, policy_version 139043 (0.0041) +[2024-06-18 13:07:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41780.8, 300 sec: 42542.9). Total num frames: 2278129664. Throughput: 0: 42349.5. Samples: 2278223760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 13:07:01,994][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 13:07:03,935][12883] Updated weights for policy 0, policy_version 139053 (0.0036) +[2024-06-18 13:07:05,861][12862] Signal inference workers to stop experience collection... (33300 times) +[2024-06-18 13:07:05,904][12883] InferenceWorker_p0-w0: stopping experience collection (33300 times) +[2024-06-18 13:07:05,920][12862] Signal inference workers to resume experience collection... (33300 times) +[2024-06-18 13:07:05,932][12883] InferenceWorker_p0-w0: resuming experience collection (33300 times) +[2024-06-18 13:07:06,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2278391808. Throughput: 0: 42397.9. Samples: 2278482560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 13:07:06,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 13:07:07,581][12883] Updated weights for policy 0, policy_version 139063 (0.0036) +[2024-06-18 13:07:11,646][12883] Updated weights for policy 0, policy_version 139073 (0.0030) +[2024-06-18 13:07:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2278572032. Throughput: 0: 42377.3. Samples: 2278736540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 13:07:11,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 13:07:15,323][12883] Updated weights for policy 0, policy_version 139083 (0.0026) +[2024-06-18 13:07:16,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2278768640. 
Throughput: 0: 42302.1. Samples: 2278862420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 13:07:16,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 13:07:19,508][12883] Updated weights for policy 0, policy_version 139093 (0.0032) +[2024-06-18 13:07:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2279014400. Throughput: 0: 42508.5. Samples: 2279122460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 13:07:21,994][12645] Avg episode reward: [(0, '0.103')] +[2024-06-18 13:07:22,932][12883] Updated weights for policy 0, policy_version 139103 (0.0033) +[2024-06-18 13:07:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2279211008. Throughput: 0: 42523.2. Samples: 2279378820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 13:07:26,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 13:07:27,104][12883] Updated weights for policy 0, policy_version 139113 (0.0038) +[2024-06-18 13:07:30,642][12883] Updated weights for policy 0, policy_version 139123 (0.0035) +[2024-06-18 13:07:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 2279407616. Throughput: 0: 42450.6. Samples: 2279500080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 13:07:31,994][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 13:07:34,946][12883] Updated weights for policy 0, policy_version 139133 (0.0037) +[2024-06-18 13:07:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42875.9, 300 sec: 42598.4). Total num frames: 2279669760. Throughput: 0: 42526.6. Samples: 2279758100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 13:07:36,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 13:07:38,486][12883] Updated weights for policy 0, policy_version 139143 (0.0043) +[2024-06-18 13:07:41,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 2279849984. 
Throughput: 0: 42561.4. Samples: 2280016060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 13:07:41,996][12645] Avg episode reward: [(0, '0.624')] +[2024-06-18 13:07:42,737][12883] Updated weights for policy 0, policy_version 139153 (0.0031) +[2024-06-18 13:07:46,251][12883] Updated weights for policy 0, policy_version 139163 (0.0038) +[2024-06-18 13:07:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2280062976. Throughput: 0: 42571.1. Samples: 2280139460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 13:07:46,994][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 13:07:50,413][12883] Updated weights for policy 0, policy_version 139173 (0.0039) +[2024-06-18 13:07:51,994][12645] Fps is (10 sec: 44246.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2280292352. Throughput: 0: 42602.2. Samples: 2280399660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 13:07:51,994][12645] Avg episode reward: [(0, '0.287')] +[2024-06-18 13:07:54,071][12883] Updated weights for policy 0, policy_version 139183 (0.0028) +[2024-06-18 13:07:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2280488960. Throughput: 0: 42672.9. Samples: 2280656820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) +[2024-06-18 13:07:56,994][12645] Avg episode reward: [(0, '0.652')] +[2024-06-18 13:07:58,032][12883] Updated weights for policy 0, policy_version 139193 (0.0027) +[2024-06-18 13:08:01,663][12883] Updated weights for policy 0, policy_version 139203 (0.0034) +[2024-06-18 13:08:01,995][12645] Fps is (10 sec: 40955.7, 60 sec: 42870.6, 300 sec: 42542.7). Total num frames: 2280701952. Throughput: 0: 42619.9. Samples: 2280780360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:08:01,995][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 13:08:05,822][12883] Updated weights for policy 0, policy_version 139213 (0.0028) +[2024-06-18 13:08:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2280914944. Throughput: 0: 42594.6. Samples: 2281039220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:08:06,994][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 13:08:09,227][12883] Updated weights for policy 0, policy_version 139223 (0.0029) +[2024-06-18 13:08:11,994][12645] Fps is (10 sec: 44241.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2281144320. Throughput: 0: 42627.5. Samples: 2281297060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:08:11,994][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 13:08:13,329][12883] Updated weights for policy 0, policy_version 139233 (0.0028) +[2024-06-18 13:08:16,855][12883] Updated weights for policy 0, policy_version 139243 (0.0033) +[2024-06-18 13:08:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2281357312. Throughput: 0: 42741.2. Samples: 2281423440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:08:16,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 13:08:20,875][12883] Updated weights for policy 0, policy_version 139253 (0.0030) +[2024-06-18 13:08:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2281553920. Throughput: 0: 42732.4. Samples: 2281681060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:08:21,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 13:08:24,450][12883] Updated weights for policy 0, policy_version 139263 (0.0027) +[2024-06-18 13:08:26,994][12645] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2281799680. Throughput: 0: 42725.8. 
Samples: 2281938620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:08:26,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 13:08:28,589][12883] Updated weights for policy 0, policy_version 139273 (0.0030) +[2024-06-18 13:08:31,981][12883] Updated weights for policy 0, policy_version 139283 (0.0035) +[2024-06-18 13:08:31,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2282012672. Throughput: 0: 42892.3. Samples: 2282069620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:08:31,995][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 13:08:36,219][12883] Updated weights for policy 0, policy_version 139293 (0.0035) +[2024-06-18 13:08:36,996][12645] Fps is (10 sec: 39312.3, 60 sec: 42050.7, 300 sec: 42431.5). Total num frames: 2282192896. Throughput: 0: 42705.9. Samples: 2282321520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:08:36,997][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 13:08:37,053][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000139295_2282209280.pth... +[2024-06-18 13:08:37,105][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138673_2272018432.pth +[2024-06-18 13:08:39,915][12883] Updated weights for policy 0, policy_version 139303 (0.0035) +[2024-06-18 13:08:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 2282422272. Throughput: 0: 42641.4. Samples: 2282575680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:08:41,994][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 13:08:43,847][12883] Updated weights for policy 0, policy_version 139313 (0.0024) +[2024-06-18 13:08:45,215][12862] Signal inference workers to stop experience collection... (33350 times) +[2024-06-18 13:08:45,215][12862] Signal inference workers to resume experience collection... 
(33350 times) +[2024-06-18 13:08:45,229][12883] InferenceWorker_p0-w0: stopping experience collection (33350 times) +[2024-06-18 13:08:45,229][12883] InferenceWorker_p0-w0: resuming experience collection (33350 times) +[2024-06-18 13:08:46,994][12645] Fps is (10 sec: 44247.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2282635264. Throughput: 0: 42759.3. Samples: 2282704480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:08:46,994][12645] Avg episode reward: [(0, '0.679')] +[2024-06-18 13:08:47,474][12883] Updated weights for policy 0, policy_version 139323 (0.0046) +[2024-06-18 13:08:51,569][12883] Updated weights for policy 0, policy_version 139333 (0.0031) +[2024-06-18 13:08:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2282831872. Throughput: 0: 42603.6. Samples: 2282956380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:08:51,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 13:08:55,182][12883] Updated weights for policy 0, policy_version 139343 (0.0025) +[2024-06-18 13:08:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2283061248. Throughput: 0: 42514.8. Samples: 2283210220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:08:56,994][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 13:08:59,230][12883] Updated weights for policy 0, policy_version 139353 (0.0027) +[2024-06-18 13:09:01,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43145.4, 300 sec: 42654.0). Total num frames: 2283290624. Throughput: 0: 42561.1. Samples: 2283338680. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:09:01,994][12645] Avg episode reward: [(0, '0.704')] +[2024-06-18 13:09:02,999][12883] Updated weights for policy 0, policy_version 139363 (0.0039) +[2024-06-18 13:09:06,827][12883] Updated weights for policy 0, policy_version 139373 (0.0038) +[2024-06-18 13:09:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2283487232. Throughput: 0: 42608.0. Samples: 2283598420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 13:09:06,994][12645] Avg episode reward: [(0, '0.699')] +[2024-06-18 13:09:10,678][12883] Updated weights for policy 0, policy_version 139383 (0.0032) +[2024-06-18 13:09:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2283700224. Throughput: 0: 42492.4. Samples: 2283850780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 13:09:11,994][12645] Avg episode reward: [(0, '0.609')] +[2024-06-18 13:09:14,522][12883] Updated weights for policy 0, policy_version 139393 (0.0030) +[2024-06-18 13:09:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2283896832. Throughput: 0: 42347.2. Samples: 2283975240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 13:09:16,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 13:09:18,273][12883] Updated weights for policy 0, policy_version 139403 (0.0035) +[2024-06-18 13:09:21,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2284109824. Throughput: 0: 42531.0. Samples: 2284235320. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 13:09:21,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 13:09:22,159][12883] Updated weights for policy 0, policy_version 139413 (0.0036) +[2024-06-18 13:09:25,796][12883] Updated weights for policy 0, policy_version 139423 (0.0031) +[2024-06-18 13:09:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2284339200. Throughput: 0: 42489.7. Samples: 2284487720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 13:09:26,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 13:09:29,785][12883] Updated weights for policy 0, policy_version 139433 (0.0031) +[2024-06-18 13:09:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2284535808. Throughput: 0: 42571.1. Samples: 2284620180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 13:09:31,994][12645] Avg episode reward: [(0, '0.603')] +[2024-06-18 13:09:33,366][12883] Updated weights for policy 0, policy_version 139443 (0.0032) +[2024-06-18 13:09:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 2284748800. Throughput: 0: 42632.1. Samples: 2284874820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 13:09:36,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 13:09:37,557][12883] Updated weights for policy 0, policy_version 139453 (0.0027) +[2024-06-18 13:09:41,047][12883] Updated weights for policy 0, policy_version 139463 (0.0025) +[2024-06-18 13:09:41,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2284994560. Throughput: 0: 42578.7. Samples: 2285126260. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 13:09:41,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 13:09:45,080][12883] Updated weights for policy 0, policy_version 139473 (0.0033) +[2024-06-18 13:09:47,000][12645] Fps is (10 sec: 44208.6, 60 sec: 42593.9, 300 sec: 42597.5). Total num frames: 2285191168. Throughput: 0: 42879.7. Samples: 2285268540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 13:09:47,001][12645] Avg episode reward: [(0, '0.605')] +[2024-06-18 13:09:48,793][12883] Updated weights for policy 0, policy_version 139483 (0.0028) +[2024-06-18 13:09:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2285387776. Throughput: 0: 42735.2. Samples: 2285521500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 13:09:51,994][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 13:09:52,979][12883] Updated weights for policy 0, policy_version 139493 (0.0034) +[2024-06-18 13:09:56,440][12862] Signal inference workers to stop experience collection... (33400 times) +[2024-06-18 13:09:56,440][12862] Signal inference workers to resume experience collection... (33400 times) +[2024-06-18 13:09:56,486][12883] InferenceWorker_p0-w0: stopping experience collection (33400 times) +[2024-06-18 13:09:56,486][12883] InferenceWorker_p0-w0: resuming experience collection (33400 times) +[2024-06-18 13:09:56,575][12883] Updated weights for policy 0, policy_version 139503 (0.0022) +[2024-06-18 13:09:56,994][12645] Fps is (10 sec: 45903.7, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2285649920. Throughput: 0: 42855.4. Samples: 2285779280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 13:09:56,994][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 13:10:00,545][12883] Updated weights for policy 0, policy_version 139513 (0.0033) +[2024-06-18 13:10:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 2285813760. 
Throughput: 0: 43004.4. Samples: 2285910440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) +[2024-06-18 13:10:01,995][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 13:10:04,279][12883] Updated weights for policy 0, policy_version 139523 (0.0039) +[2024-06-18 13:10:06,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2286043136. Throughput: 0: 42877.9. Samples: 2286164820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:10:06,994][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 13:10:08,293][12883] Updated weights for policy 0, policy_version 139533 (0.0036) +[2024-06-18 13:10:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2286256128. Throughput: 0: 42890.7. Samples: 2286417800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:10:11,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 13:10:12,049][12883] Updated weights for policy 0, policy_version 139543 (0.0048) +[2024-06-18 13:10:15,980][12883] Updated weights for policy 0, policy_version 139553 (0.0033) +[2024-06-18 13:10:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2286469120. Throughput: 0: 42837.4. Samples: 2286547860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:10:16,994][12645] Avg episode reward: [(0, '0.598')] +[2024-06-18 13:10:19,642][12883] Updated weights for policy 0, policy_version 139563 (0.0034) +[2024-06-18 13:10:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2286698496. Throughput: 0: 42908.4. Samples: 2286805700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:10:21,994][12645] Avg episode reward: [(0, '0.616')] +[2024-06-18 13:10:23,648][12883] Updated weights for policy 0, policy_version 139573 (0.0028) +[2024-06-18 13:10:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42542.9). 
Total num frames: 2286895104. Throughput: 0: 43053.7. Samples: 2287063680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:10:26,994][12645] Avg episode reward: [(0, '0.678')] +[2024-06-18 13:10:27,226][12883] Updated weights for policy 0, policy_version 139583 (0.0041) +[2024-06-18 13:10:31,513][12883] Updated weights for policy 0, policy_version 139593 (0.0041) +[2024-06-18 13:10:31,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42869.9, 300 sec: 42542.5). Total num frames: 2287108096. Throughput: 0: 42654.0. Samples: 2287187800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:10:31,997][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 13:10:35,103][12883] Updated weights for policy 0, policy_version 139603 (0.0033) +[2024-06-18 13:10:36,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2287337472. Throughput: 0: 42741.2. Samples: 2287444860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:10:36,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 13:10:37,142][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000139609_2287353856.pth... +[2024-06-18 13:10:37,199][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138983_2277097472.pth +[2024-06-18 13:10:39,035][12883] Updated weights for policy 0, policy_version 139613 (0.0028) +[2024-06-18 13:10:41,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2287550464. Throughput: 0: 42662.2. Samples: 2287699080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:10:41,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 13:10:43,282][12883] Updated weights for policy 0, policy_version 139623 (0.0032) +[2024-06-18 13:10:46,882][12883] Updated weights for policy 0, policy_version 139633 (0.0029) +[2024-06-18 13:10:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42602.8, 300 sec: 42598.4). 
Total num frames: 2287747072. Throughput: 0: 42700.9. Samples: 2287831980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:10:46,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 13:10:51,042][12883] Updated weights for policy 0, policy_version 139643 (0.0027) +[2024-06-18 13:10:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2287960064. Throughput: 0: 42872.9. Samples: 2288094100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:10:51,994][12645] Avg episode reward: [(0, '0.605')] +[2024-06-18 13:10:54,312][12883] Updated weights for policy 0, policy_version 139653 (0.0045) +[2024-06-18 13:10:56,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 2288205824. Throughput: 0: 42664.9. Samples: 2288337720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:10:56,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 13:10:58,623][12883] Updated weights for policy 0, policy_version 139663 (0.0033) +[2024-06-18 13:11:01,841][12883] Updated weights for policy 0, policy_version 139673 (0.0036) +[2024-06-18 13:11:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2288402432. Throughput: 0: 42711.5. Samples: 2288469880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:11:01,994][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 13:11:06,176][12883] Updated weights for policy 0, policy_version 139683 (0.0029) +[2024-06-18 13:11:06,996][12645] Fps is (10 sec: 39310.7, 60 sec: 42596.4, 300 sec: 42598.0). Total num frames: 2288599040. Throughput: 0: 42737.8. Samples: 2288729020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:11:06,997][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 13:11:09,526][12883] Updated weights for policy 0, policy_version 139693 (0.0032) +[2024-06-18 13:11:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2288828416. Throughput: 0: 42529.8. Samples: 2288977520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:11:11,994][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 13:11:13,797][12883] Updated weights for policy 0, policy_version 139703 (0.0031) +[2024-06-18 13:11:16,994][12645] Fps is (10 sec: 42610.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2289025024. Throughput: 0: 42703.4. Samples: 2289109360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:11:16,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 13:11:17,423][12883] Updated weights for policy 0, policy_version 139713 (0.0038) +[2024-06-18 13:11:17,800][12862] Signal inference workers to stop experience collection... (33450 times) +[2024-06-18 13:11:17,800][12862] Signal inference workers to resume experience collection... (33450 times) +[2024-06-18 13:11:17,812][12883] InferenceWorker_p0-w0: stopping experience collection (33450 times) +[2024-06-18 13:11:17,812][12883] InferenceWorker_p0-w0: resuming experience collection (33450 times) +[2024-06-18 13:11:21,395][12883] Updated weights for policy 0, policy_version 139723 (0.0040) +[2024-06-18 13:11:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2289238016. Throughput: 0: 42620.6. Samples: 2289362780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:11:21,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 13:11:24,992][12883] Updated weights for policy 0, policy_version 139733 (0.0025) +[2024-06-18 13:11:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2289467392. 
Throughput: 0: 42651.5. Samples: 2289618400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:11:26,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 13:11:29,003][12883] Updated weights for policy 0, policy_version 139743 (0.0040) +[2024-06-18 13:11:31,994][12645] Fps is (10 sec: 44235.6, 60 sec: 42873.0, 300 sec: 42654.8). Total num frames: 2289680384. Throughput: 0: 42513.2. Samples: 2289745080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:11:31,994][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 13:11:32,573][12883] Updated weights for policy 0, policy_version 139753 (0.0035) +[2024-06-18 13:11:36,782][12883] Updated weights for policy 0, policy_version 139763 (0.0056) +[2024-06-18 13:11:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2289876992. Throughput: 0: 42368.9. Samples: 2290000700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:11:36,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 13:11:40,162][12883] Updated weights for policy 0, policy_version 139773 (0.0027) +[2024-06-18 13:11:41,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2290106368. Throughput: 0: 42638.3. Samples: 2290256440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:11:41,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 13:11:44,495][12883] Updated weights for policy 0, policy_version 139783 (0.0030) +[2024-06-18 13:11:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2290319360. Throughput: 0: 42580.9. Samples: 2290386020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:11:46,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 13:11:47,802][12883] Updated weights for policy 0, policy_version 139793 (0.0036) +[2024-06-18 13:11:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42598.4). 
Total num frames: 2290499584. Throughput: 0: 42408.9. Samples: 2290637300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:11:51,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 13:11:52,236][12883] Updated weights for policy 0, policy_version 139803 (0.0037) +[2024-06-18 13:11:55,349][12883] Updated weights for policy 0, policy_version 139813 (0.0037) +[2024-06-18 13:11:57,000][12645] Fps is (10 sec: 42572.0, 60 sec: 42320.9, 300 sec: 42764.1). Total num frames: 2290745344. Throughput: 0: 42496.7. Samples: 2290890140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:11:57,001][12645] Avg episode reward: [(0, '0.526')] +[2024-06-18 13:11:59,896][12883] Updated weights for policy 0, policy_version 139823 (0.0040) +[2024-06-18 13:12:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2290925568. Throughput: 0: 42520.9. Samples: 2291022800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:12:01,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 13:12:03,430][12883] Updated weights for policy 0, policy_version 139833 (0.0032) +[2024-06-18 13:12:06,996][12645] Fps is (10 sec: 40976.4, 60 sec: 42598.8, 300 sec: 42653.6). Total num frames: 2291154944. Throughput: 0: 42494.7. Samples: 2291275140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:12:06,996][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 13:12:07,686][12883] Updated weights for policy 0, policy_version 139843 (0.0030) +[2024-06-18 13:12:10,999][12883] Updated weights for policy 0, policy_version 139853 (0.0031) +[2024-06-18 13:12:11,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2291384320. Throughput: 0: 42485.9. Samples: 2291530260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:12:11,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 13:12:15,314][12883] Updated weights for policy 0, policy_version 139863 (0.0031) +[2024-06-18 13:12:16,995][12645] Fps is (10 sec: 40964.0, 60 sec: 42324.5, 300 sec: 42542.7). Total num frames: 2291564544. Throughput: 0: 42554.5. Samples: 2291660080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:12:16,995][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 13:12:18,543][12883] Updated weights for policy 0, policy_version 139873 (0.0040) +[2024-06-18 13:12:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2291793920. Throughput: 0: 42551.6. Samples: 2291915520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:12:21,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 13:12:22,825][12883] Updated weights for policy 0, policy_version 139883 (0.0040) +[2024-06-18 13:12:26,223][12883] Updated weights for policy 0, policy_version 139893 (0.0033) +[2024-06-18 13:12:26,996][12645] Fps is (10 sec: 45870.6, 60 sec: 42596.9, 300 sec: 42764.7). Total num frames: 2292023296. Throughput: 0: 42527.6. Samples: 2292170280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:12:26,996][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 13:12:30,514][12883] Updated weights for policy 0, policy_version 139903 (0.0038) +[2024-06-18 13:12:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2292219904. Throughput: 0: 42486.2. Samples: 2292297900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:12:31,994][12645] Avg episode reward: [(0, '0.573')] +[2024-06-18 13:12:34,103][12883] Updated weights for policy 0, policy_version 139913 (0.0022) +[2024-06-18 13:12:36,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 2292432896. Throughput: 0: 42638.5. 
Samples: 2292556040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:12:36,994][12645] Avg episode reward: [(0, '0.620')] +[2024-06-18 13:12:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000139920_2292449280.pth... +[2024-06-18 13:12:37,051][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000139295_2282209280.pth +[2024-06-18 13:12:38,010][12883] Updated weights for policy 0, policy_version 139923 (0.0033) +[2024-06-18 13:12:41,712][12883] Updated weights for policy 0, policy_version 139933 (0.0032) +[2024-06-18 13:12:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2292678656. Throughput: 0: 42769.0. Samples: 2292814480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:12:41,994][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 13:12:45,543][12883] Updated weights for policy 0, policy_version 139943 (0.0037) +[2024-06-18 13:12:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2292875264. Throughput: 0: 42706.3. Samples: 2292944580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:12:46,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 13:12:49,406][12883] Updated weights for policy 0, policy_version 139953 (0.0032) +[2024-06-18 13:12:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2293088256. Throughput: 0: 42716.8. Samples: 2293197300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:12:51,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 13:12:53,071][12883] Updated weights for policy 0, policy_version 139963 (0.0051) +[2024-06-18 13:12:56,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42602.8, 300 sec: 42709.6). Total num frames: 2293301248. Throughput: 0: 42965.3. Samples: 2293463700. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:12:56,994][12645] Avg episode reward: [(0, '0.626')] +[2024-06-18 13:12:57,283][12883] Updated weights for policy 0, policy_version 139973 (0.0039) +[2024-06-18 13:12:57,902][12862] Signal inference workers to stop experience collection... (33500 times) +[2024-06-18 13:12:57,902][12862] Signal inference workers to resume experience collection... (33500 times) +[2024-06-18 13:12:57,922][12883] InferenceWorker_p0-w0: stopping experience collection (33500 times) +[2024-06-18 13:12:57,922][12883] InferenceWorker_p0-w0: resuming experience collection (33500 times) +[2024-06-18 13:13:00,787][12883] Updated weights for policy 0, policy_version 139983 (0.0035) +[2024-06-18 13:13:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2293514240. Throughput: 0: 42874.9. Samples: 2293589400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:13:01,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 13:13:04,806][12883] Updated weights for policy 0, policy_version 139993 (0.0035) +[2024-06-18 13:13:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42873.1, 300 sec: 42654.0). Total num frames: 2293727232. Throughput: 0: 42873.3. Samples: 2293844820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:13:06,994][12645] Avg episode reward: [(0, '0.287')] +[2024-06-18 13:13:08,248][12883] Updated weights for policy 0, policy_version 140003 (0.0034) +[2024-06-18 13:13:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2293923840. Throughput: 0: 43163.0. Samples: 2294112520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:13:11,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 13:13:12,602][12883] Updated weights for policy 0, policy_version 140013 (0.0051) +[2024-06-18 13:13:16,161][12883] Updated weights for policy 0, policy_version 140023 (0.0023) +[2024-06-18 13:13:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43145.5, 300 sec: 42709.5). Total num frames: 2294153216. Throughput: 0: 42928.2. Samples: 2294229660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:13:16,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 13:13:20,150][12883] Updated weights for policy 0, policy_version 140033 (0.0023) +[2024-06-18 13:13:21,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2294382592. Throughput: 0: 42892.1. Samples: 2294486180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:13:21,994][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 13:13:24,003][12883] Updated weights for policy 0, policy_version 140043 (0.0035) +[2024-06-18 13:13:26,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2294579200. Throughput: 0: 42968.9. Samples: 2294748080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:13:26,994][12645] Avg episode reward: [(0, '0.654')] +[2024-06-18 13:13:27,683][12883] Updated weights for policy 0, policy_version 140053 (0.0033) +[2024-06-18 13:13:31,440][12883] Updated weights for policy 0, policy_version 140063 (0.0032) +[2024-06-18 13:13:31,996][12645] Fps is (10 sec: 42589.0, 60 sec: 43143.0, 300 sec: 42765.0). Total num frames: 2294808576. Throughput: 0: 42916.9. Samples: 2294875940. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:13:31,997][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 13:13:35,805][12883] Updated weights for policy 0, policy_version 140073 (0.0041) +[2024-06-18 13:13:36,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2295021568. Throughput: 0: 43078.5. Samples: 2295135840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:13:36,994][12645] Avg episode reward: [(0, '0.552')] +[2024-06-18 13:13:39,062][12883] Updated weights for policy 0, policy_version 140083 (0.0028) +[2024-06-18 13:13:41,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2295218176. Throughput: 0: 42696.1. Samples: 2295385020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:13:41,994][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 13:13:43,393][12883] Updated weights for policy 0, policy_version 140093 (0.0025) +[2024-06-18 13:13:46,768][12883] Updated weights for policy 0, policy_version 140103 (0.0041) +[2024-06-18 13:13:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 2295463936. Throughput: 0: 42612.4. Samples: 2295506960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:13:46,994][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 13:13:51,176][12883] Updated weights for policy 0, policy_version 140113 (0.0036) +[2024-06-18 13:13:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2295644160. Throughput: 0: 42791.6. Samples: 2295770440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:13:51,994][12645] Avg episode reward: [(0, '0.591')] +[2024-06-18 13:13:54,311][12883] Updated weights for policy 0, policy_version 140123 (0.0023) +[2024-06-18 13:13:56,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 2295840768. Throughput: 0: 42502.7. 
Samples: 2296025140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:13:56,994][12645] Avg episode reward: [(0, '0.568')] +[2024-06-18 13:13:58,748][12883] Updated weights for policy 0, policy_version 140133 (0.0042) +[2024-06-18 13:14:01,950][12883] Updated weights for policy 0, policy_version 140143 (0.0033) +[2024-06-18 13:14:01,996][12645] Fps is (10 sec: 45864.6, 60 sec: 43143.0, 300 sec: 42764.7). Total num frames: 2296102912. Throughput: 0: 42581.3. Samples: 2296145920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:14:01,996][12645] Avg episode reward: [(0, '0.613')] +[2024-06-18 13:14:06,347][12883] Updated weights for policy 0, policy_version 140153 (0.0038) +[2024-06-18 13:14:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2296283136. Throughput: 0: 42686.7. Samples: 2296407080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:14:06,994][12645] Avg episode reward: [(0, '0.735')] +[2024-06-18 13:14:09,506][12883] Updated weights for policy 0, policy_version 140163 (0.0035) +[2024-06-18 13:14:11,994][12645] Fps is (10 sec: 39330.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2296496128. Throughput: 0: 42513.8. Samples: 2296661200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:14:11,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 13:14:12,814][12862] Signal inference workers to stop experience collection... (33550 times) +[2024-06-18 13:14:12,841][12883] InferenceWorker_p0-w0: stopping experience collection (33550 times) +[2024-06-18 13:14:12,876][12862] Signal inference workers to resume experience collection... (33550 times) +[2024-06-18 13:14:12,878][12883] InferenceWorker_p0-w0: resuming experience collection (33550 times) +[2024-06-18 13:14:14,029][12883] Updated weights for policy 0, policy_version 140173 (0.0041) +[2024-06-18 13:14:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42820.6). 
Total num frames: 2296741888. Throughput: 0: 42559.9. Samples: 2296791040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:14:16,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 13:14:17,292][12883] Updated weights for policy 0, policy_version 140183 (0.0043) +[2024-06-18 13:14:21,692][12883] Updated weights for policy 0, policy_version 140193 (0.0047) +[2024-06-18 13:14:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2296922112. Throughput: 0: 42483.6. Samples: 2297047600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 13:14:21,994][12645] Avg episode reward: [(0, '0.287')] +[2024-06-18 13:14:25,291][12883] Updated weights for policy 0, policy_version 140203 (0.0032) +[2024-06-18 13:14:26,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2297118720. Throughput: 0: 42548.0. Samples: 2297299680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 13:14:26,994][12645] Avg episode reward: [(0, '0.797')] +[2024-06-18 13:14:29,211][12883] Updated weights for policy 0, policy_version 140213 (0.0026) +[2024-06-18 13:14:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 2297380864. Throughput: 0: 42686.3. Samples: 2297427840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 13:14:31,994][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 13:14:32,885][12883] Updated weights for policy 0, policy_version 140223 (0.0042) +[2024-06-18 13:14:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2297561088. Throughput: 0: 42647.1. Samples: 2297689560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 13:14:36,994][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 13:14:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000140232_2297561088.pth... 
+[2024-06-18 13:14:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000139609_2287353856.pth +[2024-06-18 13:14:37,364][12883] Updated weights for policy 0, policy_version 140233 (0.0035) +[2024-06-18 13:14:40,401][12883] Updated weights for policy 0, policy_version 140243 (0.0035) +[2024-06-18 13:14:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 2297774080. Throughput: 0: 42631.6. Samples: 2297943560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 13:14:41,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 13:14:45,002][12883] Updated weights for policy 0, policy_version 140253 (0.0032) +[2024-06-18 13:14:46,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2298036224. Throughput: 0: 42957.7. Samples: 2298078920. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 13:14:46,994][12645] Avg episode reward: [(0, '0.659')] +[2024-06-18 13:14:47,850][12883] Updated weights for policy 0, policy_version 140263 (0.0035) +[2024-06-18 13:14:52,000][12645] Fps is (10 sec: 42571.5, 60 sec: 42593.9, 300 sec: 42542.0). Total num frames: 2298200064. Throughput: 0: 42868.7. Samples: 2298336440. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 13:14:52,001][12645] Avg episode reward: [(0, '0.579')] +[2024-06-18 13:14:52,421][12883] Updated weights for policy 0, policy_version 140273 (0.0025) +[2024-06-18 13:14:55,743][12883] Updated weights for policy 0, policy_version 140283 (0.0033) +[2024-06-18 13:14:56,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2298413056. Throughput: 0: 42754.6. Samples: 2298585160. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 13:14:56,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 13:14:59,988][12883] Updated weights for policy 0, policy_version 140293 (0.0052) +[2024-06-18 13:15:01,994][12645] Fps is (10 sec: 45903.9, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 2298658816. Throughput: 0: 42828.0. Samples: 2298718300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 13:15:01,994][12645] Avg episode reward: [(0, '0.612')] +[2024-06-18 13:15:03,820][12883] Updated weights for policy 0, policy_version 140303 (0.0032) +[2024-06-18 13:15:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2298839040. Throughput: 0: 42749.3. Samples: 2298971320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 13:15:06,994][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 13:15:07,635][12883] Updated weights for policy 0, policy_version 140313 (0.0034) +[2024-06-18 13:15:11,439][12883] Updated weights for policy 0, policy_version 140323 (0.0028) +[2024-06-18 13:15:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 2299068416. Throughput: 0: 42807.4. Samples: 2299226020. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 13:15:12,003][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 13:15:15,404][12883] Updated weights for policy 0, policy_version 140333 (0.0044) +[2024-06-18 13:15:16,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2299281408. Throughput: 0: 42830.3. Samples: 2299355200. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) +[2024-06-18 13:15:16,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 13:15:19,066][12883] Updated weights for policy 0, policy_version 140343 (0.0032) +[2024-06-18 13:15:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2299478016. Throughput: 0: 42657.7. 
Samples: 2299609160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:15:21,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 13:15:23,082][12883] Updated weights for policy 0, policy_version 140353 (0.0033) +[2024-06-18 13:15:26,481][12883] Updated weights for policy 0, policy_version 140363 (0.0035) +[2024-06-18 13:15:26,994][12645] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 2299707392. Throughput: 0: 42693.6. Samples: 2299864780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:15:26,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 13:15:30,900][12883] Updated weights for policy 0, policy_version 140373 (0.0030) +[2024-06-18 13:15:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 2299920384. Throughput: 0: 42671.1. Samples: 2299999120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:15:31,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 13:15:32,063][12862] Signal inference workers to stop experience collection... (33600 times) +[2024-06-18 13:15:32,064][12862] Signal inference workers to resume experience collection... (33600 times) +[2024-06-18 13:15:32,089][12883] InferenceWorker_p0-w0: stopping experience collection (33600 times) +[2024-06-18 13:15:32,089][12883] InferenceWorker_p0-w0: resuming experience collection (33600 times) +[2024-06-18 13:15:34,126][12883] Updated weights for policy 0, policy_version 140383 (0.0033) +[2024-06-18 13:15:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2300116992. Throughput: 0: 42479.6. Samples: 2300247760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:15:36,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 13:15:38,604][12883] Updated weights for policy 0, policy_version 140393 (0.0035) +[2024-06-18 13:15:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). 
Total num frames: 2300346368. Throughput: 0: 42623.9. Samples: 2300503240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:15:41,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 13:15:42,281][12883] Updated weights for policy 0, policy_version 140403 (0.0035) +[2024-06-18 13:15:46,377][12883] Updated weights for policy 0, policy_version 140413 (0.0033) +[2024-06-18 13:15:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2300559360. Throughput: 0: 42591.6. Samples: 2300634920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:15:46,994][12645] Avg episode reward: [(0, '0.780')] +[2024-06-18 13:15:49,909][12883] Updated weights for policy 0, policy_version 140423 (0.0032) +[2024-06-18 13:15:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42875.9, 300 sec: 42598.4). Total num frames: 2300772352. Throughput: 0: 42513.3. Samples: 2300884420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:15:51,994][12645] Avg episode reward: [(0, '0.614')] +[2024-06-18 13:15:54,019][12883] Updated weights for policy 0, policy_version 140433 (0.0033) +[2024-06-18 13:15:56,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 2300985344. Throughput: 0: 42694.1. Samples: 2301147340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:15:56,996][12645] Avg episode reward: [(0, '0.646')] +[2024-06-18 13:15:57,511][12883] Updated weights for policy 0, policy_version 140443 (0.0029) +[2024-06-18 13:16:01,636][12883] Updated weights for policy 0, policy_version 140453 (0.0031) +[2024-06-18 13:16:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42709.9). Total num frames: 2301198336. Throughput: 0: 42664.4. Samples: 2301275100. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:16:01,994][12645] Avg episode reward: [(0, '0.692')] +[2024-06-18 13:16:05,259][12883] Updated weights for policy 0, policy_version 140463 (0.0028) +[2024-06-18 13:16:06,994][12645] Fps is (10 sec: 44246.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2301427712. Throughput: 0: 42731.2. Samples: 2301532060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:16:06,994][12645] Avg episode reward: [(0, '0.626')] +[2024-06-18 13:16:09,172][12883] Updated weights for policy 0, policy_version 140473 (0.0029) +[2024-06-18 13:16:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2301640704. Throughput: 0: 42626.8. Samples: 2301782980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:16:11,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 13:16:12,996][12883] Updated weights for policy 0, policy_version 140483 (0.0045) +[2024-06-18 13:16:16,881][12883] Updated weights for policy 0, policy_version 140493 (0.0033) +[2024-06-18 13:16:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2301837312. Throughput: 0: 42471.1. Samples: 2301910320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:16:16,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 13:16:20,913][12883] Updated weights for policy 0, policy_version 140503 (0.0037) +[2024-06-18 13:16:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2302050304. Throughput: 0: 42764.9. Samples: 2302172180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:16:21,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 13:16:24,833][12883] Updated weights for policy 0, policy_version 140513 (0.0041) +[2024-06-18 13:16:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2302279680. Throughput: 0: 42482.2. 
Samples: 2302414940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:16:26,994][12645] Avg episode reward: [(0, '0.380')] +[2024-06-18 13:16:28,650][12883] Updated weights for policy 0, policy_version 140523 (0.0033) +[2024-06-18 13:16:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2302443520. Throughput: 0: 42447.6. Samples: 2302545060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:16:31,994][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 13:16:32,862][12883] Updated weights for policy 0, policy_version 140533 (0.0035) +[2024-06-18 13:16:36,517][12883] Updated weights for policy 0, policy_version 140543 (0.0031) +[2024-06-18 13:16:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2302672896. Throughput: 0: 42513.8. Samples: 2302797540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:16:36,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 13:16:37,093][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000140545_2302689280.pth... +[2024-06-18 13:16:37,153][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000139920_2292449280.pth +[2024-06-18 13:16:40,625][12883] Updated weights for policy 0, policy_version 140553 (0.0038) +[2024-06-18 13:16:41,310][12862] Signal inference workers to stop experience collection... (33650 times) +[2024-06-18 13:16:41,360][12883] InferenceWorker_p0-w0: stopping experience collection (33650 times) +[2024-06-18 13:16:41,429][12862] Signal inference workers to resume experience collection... (33650 times) +[2024-06-18 13:16:41,429][12883] InferenceWorker_p0-w0: resuming experience collection (33650 times) +[2024-06-18 13:16:41,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2302918656. Throughput: 0: 42182.6. Samples: 2303045460. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:16:41,994][12645] Avg episode reward: [(0, '0.480')] +[2024-06-18 13:16:44,401][12883] Updated weights for policy 0, policy_version 140563 (0.0034) +[2024-06-18 13:16:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2303098880. Throughput: 0: 42304.4. Samples: 2303178800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:16:46,994][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 13:16:48,341][12883] Updated weights for policy 0, policy_version 140573 (0.0034) +[2024-06-18 13:16:51,996][12645] Fps is (10 sec: 37674.5, 60 sec: 42050.8, 300 sec: 42543.4). Total num frames: 2303295488. Throughput: 0: 42040.2. Samples: 2303423960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:16:51,997][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 13:16:52,048][12883] Updated weights for policy 0, policy_version 140583 (0.0034) +[2024-06-18 13:16:55,885][12883] Updated weights for policy 0, policy_version 140593 (0.0036) +[2024-06-18 13:16:56,995][12645] Fps is (10 sec: 42592.9, 60 sec: 42325.9, 300 sec: 42709.3). Total num frames: 2303524864. Throughput: 0: 42263.1. Samples: 2303684880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:16:56,996][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 13:16:59,852][12883] Updated weights for policy 0, policy_version 140603 (0.0030) +[2024-06-18 13:17:01,996][12645] Fps is (10 sec: 45875.2, 60 sec: 42596.8, 300 sec: 42709.5). Total num frames: 2303754240. Throughput: 0: 42250.8. Samples: 2303811700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:17:01,996][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 13:17:03,863][12883] Updated weights for policy 0, policy_version 140613 (0.0028) +[2024-06-18 13:17:06,997][12645] Fps is (10 sec: 40952.4, 60 sec: 41777.0, 300 sec: 42542.4). Total num frames: 2303934464. Throughput: 0: 41860.2. Samples: 2304056020. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:17:06,997][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 13:17:07,413][12883] Updated weights for policy 0, policy_version 140623 (0.0034) +[2024-06-18 13:17:11,763][12883] Updated weights for policy 0, policy_version 140633 (0.0031) +[2024-06-18 13:17:11,994][12645] Fps is (10 sec: 39330.1, 60 sec: 41779.1, 300 sec: 42654.1). Total num frames: 2304147456. Throughput: 0: 42406.1. Samples: 2304323220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:17:11,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 13:17:15,019][12883] Updated weights for policy 0, policy_version 140643 (0.0027) +[2024-06-18 13:17:16,994][12645] Fps is (10 sec: 42611.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2304360448. Throughput: 0: 42243.0. Samples: 2304446000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:17:16,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 13:17:19,357][12883] Updated weights for policy 0, policy_version 140653 (0.0038) +[2024-06-18 13:17:21,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42050.7, 300 sec: 42542.9). Total num frames: 2304573440. Throughput: 0: 42155.2. Samples: 2304694620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:17:21,997][12645] Avg episode reward: [(0, '0.282')] +[2024-06-18 13:17:22,621][12883] Updated weights for policy 0, policy_version 140663 (0.0036) +[2024-06-18 13:17:26,889][12883] Updated weights for policy 0, policy_version 140673 (0.0024) +[2024-06-18 13:17:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2304786432. Throughput: 0: 42519.0. Samples: 2304958820. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 13:17:26,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 13:17:30,154][12883] Updated weights for policy 0, policy_version 140683 (0.0032) +[2024-06-18 13:17:31,994][12645] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2304999424. Throughput: 0: 42287.2. Samples: 2305081720. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 13:17:31,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 13:17:34,482][12883] Updated weights for policy 0, policy_version 140693 (0.0039) +[2024-06-18 13:17:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2305228800. Throughput: 0: 42487.4. Samples: 2305335800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 13:17:36,994][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 13:17:38,014][12883] Updated weights for policy 0, policy_version 140703 (0.0040) +[2024-06-18 13:17:41,997][12645] Fps is (10 sec: 42583.7, 60 sec: 41776.8, 300 sec: 42542.4). Total num frames: 2305425408. Throughput: 0: 42470.5. Samples: 2305596140. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 13:17:41,998][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 13:17:42,172][12883] Updated weights for policy 0, policy_version 140713 (0.0026) +[2024-06-18 13:17:45,630][12883] Updated weights for policy 0, policy_version 140723 (0.0027) +[2024-06-18 13:17:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2305654784. Throughput: 0: 42500.0. Samples: 2305724100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 13:17:46,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 13:17:48,052][12862] Signal inference workers to stop experience collection... (33700 times) +[2024-06-18 13:17:48,053][12862] Signal inference workers to resume experience collection... 
(33700 times) +[2024-06-18 13:17:48,075][12883] InferenceWorker_p0-w0: stopping experience collection (33700 times) +[2024-06-18 13:17:48,075][12883] InferenceWorker_p0-w0: resuming experience collection (33700 times) +[2024-06-18 13:17:49,598][12883] Updated weights for policy 0, policy_version 140733 (0.0036) +[2024-06-18 13:17:51,994][12645] Fps is (10 sec: 42612.6, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 2305851392. Throughput: 0: 42674.5. Samples: 2305976240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 13:17:51,994][12645] Avg episode reward: [(0, '0.735')] +[2024-06-18 13:17:53,516][12883] Updated weights for policy 0, policy_version 140743 (0.0027) +[2024-06-18 13:17:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42326.3, 300 sec: 42542.9). Total num frames: 2306064384. Throughput: 0: 42621.9. Samples: 2306241200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 13:17:56,994][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 13:17:57,283][12883] Updated weights for policy 0, policy_version 140753 (0.0031) +[2024-06-18 13:18:00,936][12883] Updated weights for policy 0, policy_version 140763 (0.0040) +[2024-06-18 13:18:01,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42599.9, 300 sec: 42653.9). Total num frames: 2306310144. Throughput: 0: 42723.5. Samples: 2306368560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 13:18:01,994][12645] Avg episode reward: [(0, '0.207')] +[2024-06-18 13:18:04,934][12883] Updated weights for policy 0, policy_version 140773 (0.0032) +[2024-06-18 13:18:06,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42873.7, 300 sec: 42653.9). Total num frames: 2306506752. Throughput: 0: 42787.5. Samples: 2306619960. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 13:18:06,994][12645] Avg episode reward: [(0, '0.631')] +[2024-06-18 13:18:08,703][12883] Updated weights for policy 0, policy_version 140783 (0.0052) +[2024-06-18 13:18:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 2306703360. Throughput: 0: 42774.6. Samples: 2306883680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 13:18:11,994][12645] Avg episode reward: [(0, '0.670')] +[2024-06-18 13:18:12,552][12883] Updated weights for policy 0, policy_version 140793 (0.0043) +[2024-06-18 13:18:16,305][12883] Updated weights for policy 0, policy_version 140803 (0.0036) +[2024-06-18 13:18:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 2306932736. Throughput: 0: 42757.8. Samples: 2307005820. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 13:18:16,994][12645] Avg episode reward: [(0, '0.628')] +[2024-06-18 13:18:20,112][12883] Updated weights for policy 0, policy_version 140813 (0.0037) +[2024-06-18 13:18:21,994][12645] Fps is (10 sec: 45875.9, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 2307162112. Throughput: 0: 42771.2. Samples: 2307260500. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 13:18:21,994][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 13:18:24,294][12883] Updated weights for policy 0, policy_version 140823 (0.0036) +[2024-06-18 13:18:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42487.7). Total num frames: 2307342336. Throughput: 0: 42886.4. Samples: 2307525880. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) +[2024-06-18 13:18:26,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 13:18:27,747][12883] Updated weights for policy 0, policy_version 140833 (0.0035) +[2024-06-18 13:18:31,866][12883] Updated weights for policy 0, policy_version 140843 (0.0039) +[2024-06-18 13:18:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2307571712. Throughput: 0: 42703.5. Samples: 2307645760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 13:18:31,994][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 13:18:35,507][12883] Updated weights for policy 0, policy_version 140853 (0.0031) +[2024-06-18 13:18:36,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2307801088. Throughput: 0: 42844.4. Samples: 2307904240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 13:18:36,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 13:18:37,146][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000140858_2307817472.pth... +[2024-06-18 13:18:37,198][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000140232_2297561088.pth +[2024-06-18 13:18:39,328][12883] Updated weights for policy 0, policy_version 140863 (0.0041) +[2024-06-18 13:18:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42873.9, 300 sec: 42487.3). Total num frames: 2307997696. Throughput: 0: 42732.9. Samples: 2308164180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 13:18:41,994][12645] Avg episode reward: [(0, '0.714')] +[2024-06-18 13:18:43,233][12883] Updated weights for policy 0, policy_version 140873 (0.0036) +[2024-06-18 13:18:46,833][12883] Updated weights for policy 0, policy_version 140883 (0.0031) +[2024-06-18 13:18:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2308227072. Throughput: 0: 42640.5. Samples: 2308287380. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 13:18:46,994][12645] Avg episode reward: [(0, '0.624')] +[2024-06-18 13:18:48,003][12862] Signal inference workers to stop experience collection... (33750 times) +[2024-06-18 13:18:48,003][12862] Signal inference workers to resume experience collection... (33750 times) +[2024-06-18 13:18:48,021][12883] InferenceWorker_p0-w0: stopping experience collection (33750 times) +[2024-06-18 13:18:48,021][12883] InferenceWorker_p0-w0: resuming experience collection (33750 times) +[2024-06-18 13:18:50,957][12883] Updated weights for policy 0, policy_version 140893 (0.0026) +[2024-06-18 13:18:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2308440064. Throughput: 0: 42908.1. Samples: 2308550820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 13:18:51,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 13:18:54,658][12883] Updated weights for policy 0, policy_version 140903 (0.0037) +[2024-06-18 13:18:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42487.6). Total num frames: 2308636672. Throughput: 0: 42732.9. Samples: 2308806660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 13:18:56,994][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 13:18:58,694][12883] Updated weights for policy 0, policy_version 140913 (0.0035) +[2024-06-18 13:19:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2308866048. Throughput: 0: 42822.6. Samples: 2308932840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 13:19:01,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 13:19:02,291][12883] Updated weights for policy 0, policy_version 140923 (0.0037) +[2024-06-18 13:19:06,112][12883] Updated weights for policy 0, policy_version 140933 (0.0026) +[2024-06-18 13:19:06,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 2309095424. 
Throughput: 0: 43005.2. Samples: 2309195740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 13:19:06,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 13:19:09,828][12883] Updated weights for policy 0, policy_version 140943 (0.0030) +[2024-06-18 13:19:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 2309275648. Throughput: 0: 42887.6. Samples: 2309455820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 13:19:11,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 13:19:13,632][12883] Updated weights for policy 0, policy_version 140953 (0.0040) +[2024-06-18 13:19:17,000][12645] Fps is (10 sec: 40934.6, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 2309505024. Throughput: 0: 42854.4. Samples: 2309574480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 13:19:17,001][12645] Avg episode reward: [(0, '0.650')] +[2024-06-18 13:19:17,509][12883] Updated weights for policy 0, policy_version 140963 (0.0049) +[2024-06-18 13:19:21,150][12883] Updated weights for policy 0, policy_version 140973 (0.0045) +[2024-06-18 13:19:21,994][12645] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2309750784. Throughput: 0: 42999.6. Samples: 2309839220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 13:19:21,994][12645] Avg episode reward: [(0, '0.659')] +[2024-06-18 13:19:25,138][12883] Updated weights for policy 0, policy_version 140983 (0.0043) +[2024-06-18 13:19:26,994][12645] Fps is (10 sec: 40986.0, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2309914624. Throughput: 0: 42919.5. Samples: 2310095560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 13:19:26,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 13:19:28,703][12883] Updated weights for policy 0, policy_version 140993 (0.0035) +[2024-06-18 13:19:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2310144000. 
Throughput: 0: 42962.3. Samples: 2310220680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 13:19:31,994][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 13:19:32,877][12883] Updated weights for policy 0, policy_version 141003 (0.0038) +[2024-06-18 13:19:36,438][12883] Updated weights for policy 0, policy_version 141013 (0.0029) +[2024-06-18 13:19:36,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2310373376. Throughput: 0: 42919.6. Samples: 2310482200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 13:19:36,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 13:19:40,746][12883] Updated weights for policy 0, policy_version 141023 (0.0028) +[2024-06-18 13:19:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2310569984. Throughput: 0: 42988.1. Samples: 2310741120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 13:19:41,994][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 13:19:43,813][12883] Updated weights for policy 0, policy_version 141033 (0.0033) +[2024-06-18 13:19:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 2310799360. Throughput: 0: 43134.7. Samples: 2310873900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 13:19:46,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 13:19:48,256][12883] Updated weights for policy 0, policy_version 141043 (0.0036) +[2024-06-18 13:19:51,752][12883] Updated weights for policy 0, policy_version 141053 (0.0041) +[2024-06-18 13:19:51,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2311028736. Throughput: 0: 42998.0. Samples: 2311130640. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 13:19:51,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 13:19:55,889][12883] Updated weights for policy 0, policy_version 141063 (0.0042) +[2024-06-18 13:19:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2311225344. Throughput: 0: 42972.8. Samples: 2311389600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 13:19:56,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 13:19:59,181][12883] Updated weights for policy 0, policy_version 141073 (0.0028) +[2024-06-18 13:20:01,783][12862] Signal inference workers to stop experience collection... (33800 times) +[2024-06-18 13:20:01,784][12862] Signal inference workers to resume experience collection... (33800 times) +[2024-06-18 13:20:01,834][12883] InferenceWorker_p0-w0: stopping experience collection (33800 times) +[2024-06-18 13:20:01,834][12883] InferenceWorker_p0-w0: resuming experience collection (33800 times) +[2024-06-18 13:20:01,994][12645] Fps is (10 sec: 42597.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2311454720. Throughput: 0: 43168.9. Samples: 2311516820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 13:20:01,994][12645] Avg episode reward: [(0, '0.284')] +[2024-06-18 13:20:03,410][12883] Updated weights for policy 0, policy_version 141083 (0.0037) +[2024-06-18 13:20:06,770][12883] Updated weights for policy 0, policy_version 141093 (0.0028) +[2024-06-18 13:20:07,000][12645] Fps is (10 sec: 44209.2, 60 sec: 42867.1, 300 sec: 42708.6). Total num frames: 2311667712. Throughput: 0: 43071.8. Samples: 2311777720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 13:20:07,000][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 13:20:11,150][12883] Updated weights for policy 0, policy_version 141103 (0.0037) +[2024-06-18 13:20:11,994][12645] Fps is (10 sec: 40961.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2311864320. 
Throughput: 0: 43069.3. Samples: 2312033680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 13:20:11,994][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 13:20:14,307][12883] Updated weights for policy 0, policy_version 141113 (0.0025) +[2024-06-18 13:20:16,994][12645] Fps is (10 sec: 44264.3, 60 sec: 43422.1, 300 sec: 42820.6). Total num frames: 2312110080. Throughput: 0: 43015.4. Samples: 2312156380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 13:20:16,994][12645] Avg episode reward: [(0, '0.669')] +[2024-06-18 13:20:19,023][12883] Updated weights for policy 0, policy_version 141123 (0.0031) +[2024-06-18 13:20:21,981][12883] Updated weights for policy 0, policy_version 141133 (0.0033) +[2024-06-18 13:20:21,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2312323072. Throughput: 0: 43065.3. Samples: 2312420140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 13:20:21,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 13:20:26,536][12883] Updated weights for policy 0, policy_version 141143 (0.0046) +[2024-06-18 13:20:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2312519680. Throughput: 0: 43155.5. Samples: 2312683120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 13:20:26,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 13:20:29,758][12883] Updated weights for policy 0, policy_version 141153 (0.0022) +[2024-06-18 13:20:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2312749056. Throughput: 0: 42978.8. Samples: 2312807940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 13:20:31,994][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 13:20:34,117][12883] Updated weights for policy 0, policy_version 141163 (0.0043) +[2024-06-18 13:20:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2312929280. 
Throughput: 0: 42922.2. Samples: 2313062140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 13:20:36,994][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 13:20:37,040][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000141171_2312945664.pth... +[2024-06-18 13:20:37,112][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000140545_2302689280.pth +[2024-06-18 13:20:37,387][12883] Updated weights for policy 0, policy_version 141173 (0.0034) +[2024-06-18 13:20:41,694][12883] Updated weights for policy 0, policy_version 141183 (0.0035) +[2024-06-18 13:20:41,994][12645] Fps is (10 sec: 39320.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2313142272. Throughput: 0: 42935.9. Samples: 2313321720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 13:20:41,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 13:20:44,941][12883] Updated weights for policy 0, policy_version 141193 (0.0040) +[2024-06-18 13:20:46,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2313388032. Throughput: 0: 42914.4. Samples: 2313447960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 13:20:46,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 13:20:49,285][12883] Updated weights for policy 0, policy_version 141203 (0.0032) +[2024-06-18 13:20:51,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 2313568256. Throughput: 0: 42842.4. Samples: 2313705360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 13:20:51,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 13:20:52,891][12883] Updated weights for policy 0, policy_version 141213 (0.0029) +[2024-06-18 13:20:56,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2313781248. Throughput: 0: 42928.3. Samples: 2313965460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 13:20:56,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 13:20:57,271][12883] Updated weights for policy 0, policy_version 141223 (0.0030) +[2024-06-18 13:21:00,433][12883] Updated weights for policy 0, policy_version 141233 (0.0032) +[2024-06-18 13:21:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2314027008. Throughput: 0: 42877.4. Samples: 2314085860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 13:21:01,994][12645] Avg episode reward: [(0, '0.699')] +[2024-06-18 13:21:04,761][12883] Updated weights for policy 0, policy_version 141243 (0.0037) +[2024-06-18 13:21:06,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 2314223616. Throughput: 0: 42817.0. Samples: 2314346900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 13:21:06,994][12645] Avg episode reward: [(0, '0.733')] +[2024-06-18 13:21:08,107][12883] Updated weights for policy 0, policy_version 141253 (0.0032) +[2024-06-18 13:21:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2314436608. Throughput: 0: 42724.5. Samples: 2314605720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 13:21:11,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 13:21:12,256][12883] Updated weights for policy 0, policy_version 141263 (0.0032) +[2024-06-18 13:21:15,597][12883] Updated weights for policy 0, policy_version 141273 (0.0035) +[2024-06-18 13:21:16,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2314649600. Throughput: 0: 42809.5. Samples: 2314734380. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 13:21:16,994][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 13:21:19,643][12883] Updated weights for policy 0, policy_version 141283 (0.0033) +[2024-06-18 13:21:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2314878976. Throughput: 0: 42928.5. Samples: 2314993920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 13:21:21,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 13:21:23,555][12883] Updated weights for policy 0, policy_version 141293 (0.0041) +[2024-06-18 13:21:24,676][12862] Signal inference workers to stop experience collection... (33850 times) +[2024-06-18 13:21:24,712][12883] InferenceWorker_p0-w0: stopping experience collection (33850 times) +[2024-06-18 13:21:24,722][12862] Signal inference workers to resume experience collection... (33850 times) +[2024-06-18 13:21:24,727][12883] InferenceWorker_p0-w0: resuming experience collection (33850 times) +[2024-06-18 13:21:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2315075584. Throughput: 0: 42914.3. Samples: 2315252860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 13:21:26,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 13:21:27,367][12883] Updated weights for policy 0, policy_version 141303 (0.0021) +[2024-06-18 13:21:31,221][12883] Updated weights for policy 0, policy_version 141313 (0.0030) +[2024-06-18 13:21:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 2315304960. Throughput: 0: 42873.8. Samples: 2315377280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 13:21:31,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 13:21:35,103][12883] Updated weights for policy 0, policy_version 141323 (0.0036) +[2024-06-18 13:21:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2315517952. 
Throughput: 0: 42847.9. Samples: 2315633520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 13:21:36,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 13:21:38,926][12883] Updated weights for policy 0, policy_version 141333 (0.0031) +[2024-06-18 13:21:41,996][12645] Fps is (10 sec: 40951.1, 60 sec: 42870.0, 300 sec: 42764.7). Total num frames: 2315714560. Throughput: 0: 42724.7. Samples: 2315888160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 13:21:41,996][12645] Avg episode reward: [(0, '0.674')] +[2024-06-18 13:21:42,751][12883] Updated weights for policy 0, policy_version 141343 (0.0028) +[2024-06-18 13:21:46,548][12883] Updated weights for policy 0, policy_version 141353 (0.0027) +[2024-06-18 13:21:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 2315943936. Throughput: 0: 42914.1. Samples: 2316017000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 13:21:46,994][12645] Avg episode reward: [(0, '0.673')] +[2024-06-18 13:21:50,387][12883] Updated weights for policy 0, policy_version 141363 (0.0033) +[2024-06-18 13:21:51,994][12645] Fps is (10 sec: 45885.3, 60 sec: 43417.6, 300 sec: 42876.3). Total num frames: 2316173312. Throughput: 0: 42907.9. Samples: 2316277760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 13:21:51,994][12645] Avg episode reward: [(0, '0.535')] +[2024-06-18 13:21:54,074][12883] Updated weights for policy 0, policy_version 141373 (0.0031) +[2024-06-18 13:21:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 2316369920. Throughput: 0: 42942.2. Samples: 2316538120. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 13:21:56,994][12645] Avg episode reward: [(0, '0.619')] +[2024-06-18 13:21:57,869][12883] Updated weights for policy 0, policy_version 141383 (0.0039) +[2024-06-18 13:22:01,583][12883] Updated weights for policy 0, policy_version 141393 (0.0030) +[2024-06-18 13:22:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42932.1). Total num frames: 2316599296. Throughput: 0: 42844.7. Samples: 2316662380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 13:22:01,994][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 13:22:05,654][12883] Updated weights for policy 0, policy_version 141403 (0.0040) +[2024-06-18 13:22:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2316812288. Throughput: 0: 42978.2. Samples: 2316927940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 13:22:06,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 13:22:09,071][12883] Updated weights for policy 0, policy_version 141413 (0.0040) +[2024-06-18 13:22:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2317025280. Throughput: 0: 42862.7. Samples: 2317181680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 13:22:11,994][12645] Avg episode reward: [(0, '0.643')] +[2024-06-18 13:22:13,302][12883] Updated weights for policy 0, policy_version 141423 (0.0034) +[2024-06-18 13:22:16,675][12883] Updated weights for policy 0, policy_version 141433 (0.0029) +[2024-06-18 13:22:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42987.5). Total num frames: 2317254656. Throughput: 0: 42988.9. Samples: 2317311780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 13:22:16,994][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 13:22:20,898][12883] Updated weights for policy 0, policy_version 141443 (0.0037) +[2024-06-18 13:22:21,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42869.9, 300 sec: 42931.3). Total num frames: 2317451264. Throughput: 0: 42970.8. Samples: 2317567300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 13:22:21,996][12645] Avg episode reward: [(0, '0.644')] +[2024-06-18 13:22:24,372][12883] Updated weights for policy 0, policy_version 141453 (0.0039) +[2024-06-18 13:22:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2317647872. Throughput: 0: 43051.8. Samples: 2317825400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 13:22:26,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 13:22:28,649][12883] Updated weights for policy 0, policy_version 141463 (0.0042) +[2024-06-18 13:22:31,966][12883] Updated weights for policy 0, policy_version 141473 (0.0021) +[2024-06-18 13:22:31,994][12645] Fps is (10 sec: 44246.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2317893632. Throughput: 0: 42980.6. Samples: 2317951120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 13:22:31,994][12645] Avg episode reward: [(0, '0.632')] +[2024-06-18 13:22:36,264][12883] Updated weights for policy 0, policy_version 141483 (0.0040) +[2024-06-18 13:22:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42932.1). Total num frames: 2318090240. Throughput: 0: 42929.7. Samples: 2318209600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 13:22:36,994][12645] Avg episode reward: [(0, '0.723')] +[2024-06-18 13:22:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000141485_2318090240.pth... 
+[2024-06-18 13:22:37,058][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000140858_2307817472.pth +[2024-06-18 13:22:39,599][12883] Updated weights for policy 0, policy_version 141493 (0.0031) +[2024-06-18 13:22:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 2318303232. Throughput: 0: 42775.1. Samples: 2318463000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) +[2024-06-18 13:22:41,994][12645] Avg episode reward: [(0, '0.527')] +[2024-06-18 13:22:43,831][12883] Updated weights for policy 0, policy_version 141503 (0.0028) +[2024-06-18 13:22:46,165][12862] Signal inference workers to stop experience collection... (33900 times) +[2024-06-18 13:22:46,165][12862] Signal inference workers to resume experience collection... (33900 times) +[2024-06-18 13:22:46,187][12883] InferenceWorker_p0-w0: stopping experience collection (33900 times) +[2024-06-18 13:22:46,214][12883] InferenceWorker_p0-w0: resuming experience collection (33900 times) +[2024-06-18 13:22:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2318516224. Throughput: 0: 42812.7. Samples: 2318588960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 13:22:46,994][12645] Avg episode reward: [(0, '0.552')] +[2024-06-18 13:22:47,382][12883] Updated weights for policy 0, policy_version 141513 (0.0035) +[2024-06-18 13:22:51,510][12883] Updated weights for policy 0, policy_version 141523 (0.0036) +[2024-06-18 13:22:51,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42596.9, 300 sec: 42931.3). Total num frames: 2318729216. Throughput: 0: 42672.2. Samples: 2318848280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 13:22:51,996][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 13:22:54,991][12883] Updated weights for policy 0, policy_version 141533 (0.0034) +[2024-06-18 13:22:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.6). 
Total num frames: 2318942208. Throughput: 0: 42588.4. Samples: 2319098160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 13:22:57,003][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 13:22:59,183][12883] Updated weights for policy 0, policy_version 141543 (0.0039) +[2024-06-18 13:23:01,994][12645] Fps is (10 sec: 44247.0, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 2319171584. Throughput: 0: 42709.9. Samples: 2319233720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 13:23:01,994][12645] Avg episode reward: [(0, '0.561')] +[2024-06-18 13:23:02,630][12883] Updated weights for policy 0, policy_version 141553 (0.0028) +[2024-06-18 13:23:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2319351808. Throughput: 0: 42715.4. Samples: 2319489400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 13:23:06,994][12645] Avg episode reward: [(0, '0.648')] +[2024-06-18 13:23:07,201][12883] Updated weights for policy 0, policy_version 141563 (0.0045) +[2024-06-18 13:23:10,370][12883] Updated weights for policy 0, policy_version 141573 (0.0024) +[2024-06-18 13:23:11,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2319581184. Throughput: 0: 42412.4. Samples: 2319733960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 13:23:11,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 13:23:14,926][12883] Updated weights for policy 0, policy_version 141583 (0.0038) +[2024-06-18 13:23:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2319810560. Throughput: 0: 42632.9. Samples: 2319869600. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 13:23:16,994][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 13:23:17,832][12883] Updated weights for policy 0, policy_version 141593 (0.0037) +[2024-06-18 13:23:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42053.8, 300 sec: 42820.5). 
Total num frames: 2319974400. Throughput: 0: 42432.4. Samples: 2320119060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 13:23:21,994][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 13:23:22,644][12883] Updated weights for policy 0, policy_version 141603 (0.0039) +[2024-06-18 13:23:25,771][12883] Updated weights for policy 0, policy_version 141613 (0.0026) +[2024-06-18 13:23:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2320236544. Throughput: 0: 42253.7. Samples: 2320364420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 13:23:26,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 13:23:30,120][12883] Updated weights for policy 0, policy_version 141623 (0.0028) +[2024-06-18 13:23:31,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2320433152. Throughput: 0: 42674.0. Samples: 2320509280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 13:23:31,994][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 13:23:33,339][12883] Updated weights for policy 0, policy_version 141633 (0.0021) +[2024-06-18 13:23:36,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2320613376. Throughput: 0: 42354.9. Samples: 2320754160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 13:23:37,000][12645] Avg episode reward: [(0, '0.530')] +[2024-06-18 13:23:37,878][12883] Updated weights for policy 0, policy_version 141643 (0.0026) +[2024-06-18 13:23:41,333][12883] Updated weights for policy 0, policy_version 141653 (0.0037) +[2024-06-18 13:23:41,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2320891904. Throughput: 0: 42363.6. Samples: 2321004520. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) +[2024-06-18 13:23:41,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 13:23:45,476][12883] Updated weights for policy 0, policy_version 141663 (0.0036) +[2024-06-18 13:23:46,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 2321055744. Throughput: 0: 42546.2. Samples: 2321148400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 13:23:46,996][12645] Avg episode reward: [(0, '0.605')] +[2024-06-18 13:23:48,949][12883] Updated weights for policy 0, policy_version 141673 (0.0034) +[2024-06-18 13:23:51,994][12645] Fps is (10 sec: 36044.7, 60 sec: 42053.7, 300 sec: 42765.0). Total num frames: 2321252352. Throughput: 0: 42171.1. Samples: 2321387100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 13:23:51,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 13:23:53,576][12883] Updated weights for policy 0, policy_version 141683 (0.0047) +[2024-06-18 13:23:55,595][12862] Signal inference workers to stop experience collection... (33950 times) +[2024-06-18 13:23:55,595][12862] Signal inference workers to resume experience collection... (33950 times) +[2024-06-18 13:23:55,643][12883] InferenceWorker_p0-w0: stopping experience collection (33950 times) +[2024-06-18 13:23:55,648][12883] InferenceWorker_p0-w0: resuming experience collection (33950 times) +[2024-06-18 13:23:56,605][12883] Updated weights for policy 0, policy_version 141693 (0.0046) +[2024-06-18 13:23:56,994][12645] Fps is (10 sec: 45885.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2321514496. Throughput: 0: 42426.8. Samples: 2321643160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 13:23:56,994][12645] Avg episode reward: [(0, '0.600')] +[2024-06-18 13:24:01,101][12883] Updated weights for policy 0, policy_version 141703 (0.0033) +[2024-06-18 13:24:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 2321678336. 
Throughput: 0: 42452.5. Samples: 2321779960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 13:24:01,994][12645] Avg episode reward: [(0, '0.647')] +[2024-06-18 13:24:04,263][12883] Updated weights for policy 0, policy_version 141713 (0.0030) +[2024-06-18 13:24:06,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2321907712. Throughput: 0: 42408.4. Samples: 2322027440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 13:24:06,994][12645] Avg episode reward: [(0, '0.647')] +[2024-06-18 13:24:08,708][12883] Updated weights for policy 0, policy_version 141723 (0.0041) +[2024-06-18 13:24:11,966][12883] Updated weights for policy 0, policy_version 141733 (0.0038) +[2024-06-18 13:24:11,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42871.6, 300 sec: 42877.0). Total num frames: 2322153472. Throughput: 0: 42751.2. Samples: 2322288220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 13:24:11,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 13:24:16,139][12883] Updated weights for policy 0, policy_version 141743 (0.0030) +[2024-06-18 13:24:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2322333696. Throughput: 0: 42416.3. Samples: 2322418020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 13:24:16,994][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 13:24:19,752][12883] Updated weights for policy 0, policy_version 141753 (0.0038) +[2024-06-18 13:24:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2322563072. Throughput: 0: 42513.4. Samples: 2322667260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 13:24:21,994][12645] Avg episode reward: [(0, '0.192')] +[2024-06-18 13:24:23,865][12883] Updated weights for policy 0, policy_version 141763 (0.0025) +[2024-06-18 13:24:26,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 2322776064. 
Throughput: 0: 42873.8. Samples: 2322933840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 13:24:26,994][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 13:24:27,434][12883] Updated weights for policy 0, policy_version 141773 (0.0030) +[2024-06-18 13:24:31,280][12883] Updated weights for policy 0, policy_version 141783 (0.0042) +[2024-06-18 13:24:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2322972672. Throughput: 0: 42427.4. Samples: 2323057540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 13:24:31,994][12645] Avg episode reward: [(0, '0.669')] +[2024-06-18 13:24:35,137][12883] Updated weights for policy 0, policy_version 141793 (0.0027) +[2024-06-18 13:24:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2323202048. Throughput: 0: 42804.0. Samples: 2323313280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 13:24:36,994][12645] Avg episode reward: [(0, '0.591')] +[2024-06-18 13:24:37,097][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000141798_2323218432.pth... +[2024-06-18 13:24:37,156][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000141171_2312945664.pth +[2024-06-18 13:24:38,777][12883] Updated weights for policy 0, policy_version 141803 (0.0032) +[2024-06-18 13:24:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 2323398656. Throughput: 0: 43026.7. Samples: 2323579360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 13:24:41,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 13:24:42,846][12883] Updated weights for policy 0, policy_version 141813 (0.0035) +[2024-06-18 13:24:46,904][12883] Updated weights for policy 0, policy_version 141823 (0.0035) +[2024-06-18 13:24:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2323628032. Throughput: 0: 42633.7. 
Samples: 2323698480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) +[2024-06-18 13:24:46,994][12645] Avg episode reward: [(0, '0.797')] +[2024-06-18 13:24:50,421][12883] Updated weights for policy 0, policy_version 141833 (0.0038) +[2024-06-18 13:24:51,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2323857408. Throughput: 0: 42915.6. Samples: 2323958640. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) +[2024-06-18 13:24:51,998][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 13:24:54,333][12883] Updated weights for policy 0, policy_version 141843 (0.0027) +[2024-06-18 13:24:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 2324037632. Throughput: 0: 42946.6. Samples: 2324220820. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) +[2024-06-18 13:24:56,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 13:24:58,285][12883] Updated weights for policy 0, policy_version 141853 (0.0033) +[2024-06-18 13:25:01,883][12883] Updated weights for policy 0, policy_version 141863 (0.0035) +[2024-06-18 13:25:02,000][12645] Fps is (10 sec: 42571.8, 60 sec: 43413.0, 300 sec: 42765.0). Total num frames: 2324283392. Throughput: 0: 42700.4. Samples: 2324339800. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) +[2024-06-18 13:25:02,001][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 13:25:05,876][12883] Updated weights for policy 0, policy_version 141873 (0.0027) +[2024-06-18 13:25:06,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2324496384. Throughput: 0: 43041.2. Samples: 2324604120. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) +[2024-06-18 13:25:06,994][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 13:25:08,589][12862] Signal inference workers to stop experience collection... (34000 times) +[2024-06-18 13:25:08,589][12862] Signal inference workers to resume experience collection... 
(34000 times) +[2024-06-18 13:25:08,625][12883] InferenceWorker_p0-w0: stopping experience collection (34000 times) +[2024-06-18 13:25:08,625][12883] InferenceWorker_p0-w0: resuming experience collection (34000 times) +[2024-06-18 13:25:09,527][12883] Updated weights for policy 0, policy_version 141883 (0.0038) +[2024-06-18 13:25:11,994][12645] Fps is (10 sec: 40984.1, 60 sec: 42325.0, 300 sec: 42653.9). Total num frames: 2324692992. Throughput: 0: 42881.0. Samples: 2324863500. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) +[2024-06-18 13:25:11,995][12645] Avg episode reward: [(0, '0.754')] +[2024-06-18 13:25:13,644][12883] Updated weights for policy 0, policy_version 141893 (0.0036) +[2024-06-18 13:25:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2324922368. Throughput: 0: 42809.3. Samples: 2324983960. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) +[2024-06-18 13:25:16,994][12645] Avg episode reward: [(0, '0.630')] +[2024-06-18 13:25:17,088][12883] Updated weights for policy 0, policy_version 141903 (0.0042) +[2024-06-18 13:25:21,718][12883] Updated weights for policy 0, policy_version 141913 (0.0035) +[2024-06-18 13:25:21,994][12645] Fps is (10 sec: 42600.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2325118976. Throughput: 0: 42760.9. Samples: 2325237520. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) +[2024-06-18 13:25:21,994][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 13:25:24,945][12883] Updated weights for policy 0, policy_version 141923 (0.0038) +[2024-06-18 13:25:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2325331968. Throughput: 0: 42590.2. Samples: 2325495920. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) +[2024-06-18 13:25:26,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 13:25:29,298][12883] Updated weights for policy 0, policy_version 141933 (0.0038) +[2024-06-18 13:25:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2325544960. Throughput: 0: 42803.4. Samples: 2325624640. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) +[2024-06-18 13:25:31,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 13:25:32,393][12883] Updated weights for policy 0, policy_version 141943 (0.0036) +[2024-06-18 13:25:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2325741568. Throughput: 0: 42581.8. Samples: 2325874820. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) +[2024-06-18 13:25:36,994][12645] Avg episode reward: [(0, '0.784')] +[2024-06-18 13:25:37,047][12883] Updated weights for policy 0, policy_version 141953 (0.0029) +[2024-06-18 13:25:40,210][12883] Updated weights for policy 0, policy_version 141963 (0.0027) +[2024-06-18 13:25:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2325954560. Throughput: 0: 42456.8. Samples: 2326131380. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) +[2024-06-18 13:25:41,994][12645] Avg episode reward: [(0, '0.695')] +[2024-06-18 13:25:44,696][12883] Updated weights for policy 0, policy_version 141973 (0.0038) +[2024-06-18 13:25:46,998][12645] Fps is (10 sec: 45853.0, 60 sec: 42868.0, 300 sec: 42819.9). Total num frames: 2326200320. Throughput: 0: 42756.5. Samples: 2326263780. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) +[2024-06-18 13:25:46,999][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 13:25:47,655][12883] Updated weights for policy 0, policy_version 141983 (0.0028) +[2024-06-18 13:25:51,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42050.7, 300 sec: 42709.2). Total num frames: 2326380544. Throughput: 0: 42498.4. Samples: 2326516640. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) +[2024-06-18 13:25:51,996][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 13:25:52,332][12883] Updated weights for policy 0, policy_version 141993 (0.0041) +[2024-06-18 13:25:55,327][12883] Updated weights for policy 0, policy_version 142003 (0.0037) +[2024-06-18 13:25:56,994][12645] Fps is (10 sec: 40979.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2326609920. Throughput: 0: 42494.5. Samples: 2326775740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 13:25:56,994][12645] Avg episode reward: [(0, '0.545')] +[2024-06-18 13:25:59,866][12883] Updated weights for policy 0, policy_version 142013 (0.0029) +[2024-06-18 13:26:01,994][12645] Fps is (10 sec: 45885.7, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 2326839296. Throughput: 0: 42876.1. Samples: 2326913380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 13:26:01,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 13:26:02,873][12883] Updated weights for policy 0, policy_version 142023 (0.0061) +[2024-06-18 13:26:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2327035904. Throughput: 0: 42832.7. Samples: 2327165000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 13:26:06,994][12645] Avg episode reward: [(0, '0.689')] +[2024-06-18 13:26:07,371][12883] Updated weights for policy 0, policy_version 142033 (0.0025) +[2024-06-18 13:26:10,857][12883] Updated weights for policy 0, policy_version 142043 (0.0031) +[2024-06-18 13:26:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 2327265280. Throughput: 0: 42676.4. Samples: 2327416360. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 13:26:11,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 13:26:15,397][12883] Updated weights for policy 0, policy_version 142053 (0.0038) +[2024-06-18 13:26:16,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2327478272. Throughput: 0: 42789.8. Samples: 2327550180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 13:26:16,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 13:26:18,406][12883] Updated weights for policy 0, policy_version 142063 (0.0038) +[2024-06-18 13:26:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2327691264. Throughput: 0: 43031.5. Samples: 2327811240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 13:26:21,996][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 13:26:22,838][12883] Updated weights for policy 0, policy_version 142073 (0.0042) +[2024-06-18 13:26:26,060][12883] Updated weights for policy 0, policy_version 142083 (0.0034) +[2024-06-18 13:26:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2327920640. Throughput: 0: 42816.2. Samples: 2328058100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 13:26:26,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 13:26:30,831][12883] Updated weights for policy 0, policy_version 142093 (0.0031) +[2024-06-18 13:26:31,861][12862] Signal inference workers to stop experience collection... (34050 times) +[2024-06-18 13:26:31,902][12883] InferenceWorker_p0-w0: stopping experience collection (34050 times) +[2024-06-18 13:26:31,918][12862] Signal inference workers to resume experience collection... (34050 times) +[2024-06-18 13:26:31,924][12883] InferenceWorker_p0-w0: resuming experience collection (34050 times) +[2024-06-18 13:26:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2328117248. 
Throughput: 0: 42824.1. Samples: 2328190660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 13:26:31,996][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 13:26:33,810][12883] Updated weights for policy 0, policy_version 142103 (0.0035) +[2024-06-18 13:26:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 2328330240. Throughput: 0: 42981.7. Samples: 2328450720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 13:26:36,994][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 13:26:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000142110_2328330240.pth... +[2024-06-18 13:26:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000141485_2318090240.pth +[2024-06-18 13:26:38,261][12883] Updated weights for policy 0, policy_version 142113 (0.0029) +[2024-06-18 13:26:41,524][12883] Updated weights for policy 0, policy_version 142123 (0.0046) +[2024-06-18 13:26:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 2328559616. Throughput: 0: 42751.2. Samples: 2328699540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 13:26:41,994][12645] Avg episode reward: [(0, '0.574')] +[2024-06-18 13:26:46,017][12883] Updated weights for policy 0, policy_version 142133 (0.0038) +[2024-06-18 13:26:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42328.7, 300 sec: 42598.4). Total num frames: 2328739840. Throughput: 0: 42587.9. Samples: 2328829840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 13:26:46,994][12645] Avg episode reward: [(0, '0.587')] +[2024-06-18 13:26:49,171][12883] Updated weights for policy 0, policy_version 142143 (0.0044) +[2024-06-18 13:26:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43146.1, 300 sec: 42709.5). Total num frames: 2328969216. Throughput: 0: 42749.4. Samples: 2329088720. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) +[2024-06-18 13:26:51,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 13:26:53,599][12883] Updated weights for policy 0, policy_version 142153 (0.0040) +[2024-06-18 13:26:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2329182208. Throughput: 0: 42867.1. Samples: 2329345380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-18 13:26:56,998][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 13:26:57,040][12883] Updated weights for policy 0, policy_version 142163 (0.0042) +[2024-06-18 13:27:01,190][12883] Updated weights for policy 0, policy_version 142173 (0.0031) +[2024-06-18 13:27:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2329395200. Throughput: 0: 42762.8. Samples: 2329474500. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-18 13:27:01,994][12645] Avg episode reward: [(0, '0.401')] +[2024-06-18 13:27:04,730][12883] Updated weights for policy 0, policy_version 142183 (0.0031) +[2024-06-18 13:27:06,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2329608192. Throughput: 0: 42690.3. Samples: 2329732300. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-18 13:27:06,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 13:27:08,754][12883] Updated weights for policy 0, policy_version 142193 (0.0043) +[2024-06-18 13:27:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2329821184. Throughput: 0: 42723.1. Samples: 2329980640. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-18 13:27:11,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 13:27:12,415][12883] Updated weights for policy 0, policy_version 142203 (0.0032) +[2024-06-18 13:27:16,404][12883] Updated weights for policy 0, policy_version 142213 (0.0034) +[2024-06-18 13:27:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 2330017792. Throughput: 0: 42670.3. Samples: 2330110820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-18 13:27:16,994][12645] Avg episode reward: [(0, '0.630')] +[2024-06-18 13:27:20,402][12883] Updated weights for policy 0, policy_version 142223 (0.0034) +[2024-06-18 13:27:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2330263552. Throughput: 0: 42761.8. Samples: 2330375000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-18 13:27:21,994][12645] Avg episode reward: [(0, '0.636')] +[2024-06-18 13:27:24,222][12883] Updated weights for policy 0, policy_version 142233 (0.0022) +[2024-06-18 13:27:26,994][12645] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2330492928. Throughput: 0: 42813.4. Samples: 2330626140. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-18 13:27:26,994][12645] Avg episode reward: [(0, '0.616')] +[2024-06-18 13:27:27,907][12883] Updated weights for policy 0, policy_version 142243 (0.0037) +[2024-06-18 13:27:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2330656768. Throughput: 0: 42679.6. Samples: 2330750420. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-18 13:27:31,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 13:27:32,022][12883] Updated weights for policy 0, policy_version 142253 (0.0037) +[2024-06-18 13:27:35,763][12883] Updated weights for policy 0, policy_version 142263 (0.0036) +[2024-06-18 13:27:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2330918912. Throughput: 0: 42677.9. Samples: 2331009220. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-18 13:27:37,004][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 13:27:39,641][12883] Updated weights for policy 0, policy_version 142273 (0.0025) +[2024-06-18 13:27:41,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2331115520. Throughput: 0: 42606.3. Samples: 2331262660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-18 13:27:41,994][12645] Avg episode reward: [(0, '0.235')] +[2024-06-18 13:27:43,310][12883] Updated weights for policy 0, policy_version 142283 (0.0022) +[2024-06-18 13:27:46,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 2331295744. Throughput: 0: 42596.4. Samples: 2331391340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-18 13:27:46,994][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 13:27:47,539][12883] Updated weights for policy 0, policy_version 142293 (0.0036) +[2024-06-18 13:27:50,907][12883] Updated weights for policy 0, policy_version 142303 (0.0031) +[2024-06-18 13:27:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2331525120. Throughput: 0: 42609.3. Samples: 2331649720. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-18 13:27:51,994][12645] Avg episode reward: [(0, '0.669')] +[2024-06-18 13:27:55,099][12883] Updated weights for policy 0, policy_version 142313 (0.0023) +[2024-06-18 13:27:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2331738112. Throughput: 0: 42676.0. Samples: 2331901060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) +[2024-06-18 13:27:56,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 13:27:58,424][12883] Updated weights for policy 0, policy_version 142323 (0.0033) +[2024-06-18 13:27:59,729][12862] Signal inference workers to stop experience collection... (34100 times) +[2024-06-18 13:27:59,782][12862] Signal inference workers to resume experience collection... (34100 times) +[2024-06-18 13:27:59,783][12883] InferenceWorker_p0-w0: stopping experience collection (34100 times) +[2024-06-18 13:27:59,815][12883] InferenceWorker_p0-w0: resuming experience collection (34100 times) +[2024-06-18 13:28:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2331951104. Throughput: 0: 42706.8. Samples: 2332032620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 13:28:01,994][12645] Avg episode reward: [(0, '0.480')] +[2024-06-18 13:28:02,595][12883] Updated weights for policy 0, policy_version 142333 (0.0031) +[2024-06-18 13:28:06,061][12883] Updated weights for policy 0, policy_version 142343 (0.0036) +[2024-06-18 13:28:06,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 2332180480. Throughput: 0: 42469.4. Samples: 2332286220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 13:28:06,996][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 13:28:10,549][12883] Updated weights for policy 0, policy_version 142353 (0.0033) +[2024-06-18 13:28:11,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2332393472. 
Throughput: 0: 42567.4. Samples: 2332541680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 13:28:11,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 13:28:13,649][12883] Updated weights for policy 0, policy_version 142363 (0.0039) +[2024-06-18 13:28:16,994][12645] Fps is (10 sec: 40968.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2332590080. Throughput: 0: 42704.8. Samples: 2332672140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 13:28:16,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 13:28:18,299][12883] Updated weights for policy 0, policy_version 142373 (0.0040) +[2024-06-18 13:28:21,328][12883] Updated weights for policy 0, policy_version 142383 (0.0037) +[2024-06-18 13:28:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2332819456. Throughput: 0: 42590.6. Samples: 2332925800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 13:28:21,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 13:28:25,850][12883] Updated weights for policy 0, policy_version 142393 (0.0028) +[2024-06-18 13:28:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 2333016064. Throughput: 0: 42665.2. Samples: 2333182600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 13:28:26,994][12645] Avg episode reward: [(0, '0.245')] +[2024-06-18 13:28:29,071][12883] Updated weights for policy 0, policy_version 142403 (0.0040) +[2024-06-18 13:28:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2333212672. Throughput: 0: 42611.1. Samples: 2333308840. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 13:28:31,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 13:28:33,402][12883] Updated weights for policy 0, policy_version 142413 (0.0034) +[2024-06-18 13:28:36,923][12883] Updated weights for policy 0, policy_version 142423 (0.0034) +[2024-06-18 13:28:36,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2333458432. Throughput: 0: 42571.5. Samples: 2333565440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 13:28:36,994][12645] Avg episode reward: [(0, '0.699')] +[2024-06-18 13:28:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000142423_2333458432.pth... +[2024-06-18 13:28:37,054][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000141798_2323218432.pth +[2024-06-18 13:28:40,984][12883] Updated weights for policy 0, policy_version 142433 (0.0043) +[2024-06-18 13:28:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 2333671424. Throughput: 0: 42609.8. Samples: 2333818500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 13:28:41,994][12645] Avg episode reward: [(0, '0.652')] +[2024-06-18 13:28:44,421][12883] Updated weights for policy 0, policy_version 142443 (0.0031) +[2024-06-18 13:28:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2333851648. Throughput: 0: 42561.6. Samples: 2333947900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 13:28:46,994][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 13:28:48,594][12883] Updated weights for policy 0, policy_version 142453 (0.0024) +[2024-06-18 13:28:51,942][12883] Updated weights for policy 0, policy_version 142463 (0.0031) +[2024-06-18 13:28:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2334113792. Throughput: 0: 42680.8. Samples: 2334206760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 13:28:51,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 13:28:56,353][12883] Updated weights for policy 0, policy_version 142473 (0.0038) +[2024-06-18 13:28:56,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2334310400. Throughput: 0: 42691.7. Samples: 2334462800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 13:28:56,994][12645] Avg episode reward: [(0, '0.672')] +[2024-06-18 13:28:59,531][12883] Updated weights for policy 0, policy_version 142483 (0.0041) +[2024-06-18 13:29:01,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2334507008. Throughput: 0: 42556.0. Samples: 2334587160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:29:01,994][12645] Avg episode reward: [(0, '0.917')] +[2024-06-18 13:29:01,996][12862] Saving new best policy, reward=0.917! +[2024-06-18 13:29:03,925][12883] Updated weights for policy 0, policy_version 142493 (0.0026) +[2024-06-18 13:29:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2334752768. Throughput: 0: 42748.1. Samples: 2334849460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:29:06,994][12645] Avg episode reward: [(0, '0.649')] +[2024-06-18 13:29:07,717][12883] Updated weights for policy 0, policy_version 142503 (0.0048) +[2024-06-18 13:29:11,515][12883] Updated weights for policy 0, policy_version 142513 (0.0027) +[2024-06-18 13:29:11,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2334949376. Throughput: 0: 42636.6. Samples: 2335101240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:29:11,994][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 13:29:15,395][12883] Updated weights for policy 0, policy_version 142523 (0.0040) +[2024-06-18 13:29:16,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2335145984. Throughput: 0: 42655.4. Samples: 2335228340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:29:16,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 13:29:19,346][12883] Updated weights for policy 0, policy_version 142533 (0.0037) +[2024-06-18 13:29:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2335375360. Throughput: 0: 42690.4. Samples: 2335486500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:29:21,994][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 13:29:22,835][12883] Updated weights for policy 0, policy_version 142543 (0.0040) +[2024-06-18 13:29:26,986][12883] Updated weights for policy 0, policy_version 142553 (0.0030) +[2024-06-18 13:29:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2335588352. Throughput: 0: 42825.5. Samples: 2335745660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:29:26,994][12645] Avg episode reward: [(0, '0.645')] +[2024-06-18 13:29:30,705][12883] Updated weights for policy 0, policy_version 142563 (0.0037) +[2024-06-18 13:29:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2335801344. Throughput: 0: 42621.0. Samples: 2335865840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:29:31,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 13:29:34,628][12883] Updated weights for policy 0, policy_version 142573 (0.0040) +[2024-06-18 13:29:36,994][12645] Fps is (10 sec: 42599.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2336014336. Throughput: 0: 42632.9. 
Samples: 2336125240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:29:36,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 13:29:38,293][12883] Updated weights for policy 0, policy_version 142583 (0.0023) +[2024-06-18 13:29:40,453][12862] Signal inference workers to stop experience collection... (34150 times) +[2024-06-18 13:29:40,454][12862] Signal inference workers to resume experience collection... (34150 times) +[2024-06-18 13:29:40,466][12883] InferenceWorker_p0-w0: stopping experience collection (34150 times) +[2024-06-18 13:29:40,485][12883] InferenceWorker_p0-w0: resuming experience collection (34150 times) +[2024-06-18 13:29:41,995][12645] Fps is (10 sec: 42592.9, 60 sec: 42597.4, 300 sec: 42709.3). Total num frames: 2336227328. Throughput: 0: 42522.7. Samples: 2336376380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:29:41,996][12645] Avg episode reward: [(0, '0.569')] +[2024-06-18 13:29:42,586][12883] Updated weights for policy 0, policy_version 142593 (0.0022) +[2024-06-18 13:29:45,954][12883] Updated weights for policy 0, policy_version 142603 (0.0044) +[2024-06-18 13:29:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2336440320. Throughput: 0: 42648.0. Samples: 2336506320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:29:46,994][12645] Avg episode reward: [(0, '0.603')] +[2024-06-18 13:29:50,462][12883] Updated weights for policy 0, policy_version 142613 (0.0033) +[2024-06-18 13:29:51,994][12645] Fps is (10 sec: 40965.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2336636928. Throughput: 0: 42575.5. Samples: 2336765360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:29:51,994][12645] Avg episode reward: [(0, '0.581')] +[2024-06-18 13:29:53,596][12883] Updated weights for policy 0, policy_version 142623 (0.0025) +[2024-06-18 13:29:57,000][12645] Fps is (10 sec: 40934.7, 60 sec: 42320.9, 300 sec: 42598.4). 
Total num frames: 2336849920. Throughput: 0: 42501.2. Samples: 2337014060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:29:57,001][12645] Avg episode reward: [(0, '0.590')] +[2024-06-18 13:29:58,144][12883] Updated weights for policy 0, policy_version 142633 (0.0033) +[2024-06-18 13:30:01,454][12883] Updated weights for policy 0, policy_version 142643 (0.0028) +[2024-06-18 13:30:01,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2337079296. Throughput: 0: 42568.8. Samples: 2337143940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:30:01,995][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 13:30:05,857][12883] Updated weights for policy 0, policy_version 142653 (0.0038) +[2024-06-18 13:30:06,994][12645] Fps is (10 sec: 42625.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 2337275904. Throughput: 0: 42528.4. Samples: 2337400280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 13:30:06,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 13:30:09,256][12883] Updated weights for policy 0, policy_version 142663 (0.0032) +[2024-06-18 13:30:11,996][12645] Fps is (10 sec: 42589.8, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 2337505280. Throughput: 0: 42348.3. Samples: 2337651420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 13:30:11,997][12645] Avg episode reward: [(0, '0.516')] +[2024-06-18 13:30:13,604][12883] Updated weights for policy 0, policy_version 142673 (0.0038) +[2024-06-18 13:30:16,877][12883] Updated weights for policy 0, policy_version 142683 (0.0040) +[2024-06-18 13:30:16,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2337718272. Throughput: 0: 42610.2. Samples: 2337783300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 13:30:16,994][12645] Avg episode reward: [(0, '0.664')] +[2024-06-18 13:30:21,142][12883] Updated weights for policy 0, policy_version 142693 (0.0034) +[2024-06-18 13:30:21,996][12645] Fps is (10 sec: 40960.1, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 2337914880. Throughput: 0: 42514.3. Samples: 2338038480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 13:30:21,996][12645] Avg episode reward: [(0, '0.537')] +[2024-06-18 13:30:24,478][12883] Updated weights for policy 0, policy_version 142703 (0.0037) +[2024-06-18 13:30:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2338160640. Throughput: 0: 42611.9. Samples: 2338293860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 13:30:26,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 13:30:28,556][12883] Updated weights for policy 0, policy_version 142713 (0.0031) +[2024-06-18 13:30:31,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2338357248. Throughput: 0: 42723.2. Samples: 2338428860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 13:30:31,994][12645] Avg episode reward: [(0, '0.695')] +[2024-06-18 13:30:32,031][12883] Updated weights for policy 0, policy_version 142723 (0.0032) +[2024-06-18 13:30:36,402][12883] Updated weights for policy 0, policy_version 142733 (0.0031) +[2024-06-18 13:30:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2338553856. Throughput: 0: 42668.4. Samples: 2338685440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 13:30:36,994][12645] Avg episode reward: [(0, '0.651')] +[2024-06-18 13:30:37,065][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000142735_2338570240.pth... 
+[2024-06-18 13:30:37,130][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000142110_2328330240.pth +[2024-06-18 13:30:39,745][12883] Updated weights for policy 0, policy_version 142743 (0.0027) +[2024-06-18 13:30:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42599.3, 300 sec: 42654.6). Total num frames: 2338783232. Throughput: 0: 42789.0. Samples: 2338939300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 13:30:41,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 13:30:44,032][12883] Updated weights for policy 0, policy_version 142753 (0.0039) +[2024-06-18 13:30:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 2338996224. Throughput: 0: 42688.2. Samples: 2339064900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 13:30:46,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 13:30:47,571][12883] Updated weights for policy 0, policy_version 142763 (0.0031) +[2024-06-18 13:30:51,858][12883] Updated weights for policy 0, policy_version 142773 (0.0033) +[2024-06-18 13:30:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2339192832. Throughput: 0: 42716.7. Samples: 2339322540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 13:30:51,994][12645] Avg episode reward: [(0, '0.848')] +[2024-06-18 13:30:55,156][12883] Updated weights for policy 0, policy_version 142783 (0.0030) +[2024-06-18 13:30:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43149.0, 300 sec: 42709.5). Total num frames: 2339438592. Throughput: 0: 42747.9. Samples: 2339574980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 13:30:56,994][12645] Avg episode reward: [(0, '0.739')] +[2024-06-18 13:30:59,362][12883] Updated weights for policy 0, policy_version 142793 (0.0038) +[2024-06-18 13:30:59,595][12862] Signal inference workers to stop experience collection... 
(34200 times) +[2024-06-18 13:30:59,633][12883] InferenceWorker_p0-w0: stopping experience collection (34200 times) +[2024-06-18 13:30:59,642][12862] Signal inference workers to resume experience collection... (34200 times) +[2024-06-18 13:30:59,653][12883] InferenceWorker_p0-w0: resuming experience collection (34200 times) +[2024-06-18 13:31:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2339618816. Throughput: 0: 42843.0. Samples: 2339711240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 13:31:01,995][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 13:31:02,700][12883] Updated weights for policy 0, policy_version 142803 (0.0030) +[2024-06-18 13:31:06,834][12883] Updated weights for policy 0, policy_version 142813 (0.0030) +[2024-06-18 13:31:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2339848192. Throughput: 0: 42851.5. Samples: 2339966700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 13:31:06,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 13:31:10,904][12883] Updated weights for policy 0, policy_version 142823 (0.0044) +[2024-06-18 13:31:11,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2340077568. Throughput: 0: 42767.6. Samples: 2340218400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 13:31:11,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 13:31:14,453][12883] Updated weights for policy 0, policy_version 142833 (0.0039) +[2024-06-18 13:31:16,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 2340257792. Throughput: 0: 42656.1. Samples: 2340348480. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 13:31:16,997][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 13:31:18,423][12883] Updated weights for policy 0, policy_version 142843 (0.0028) +[2024-06-18 13:31:21,958][12883] Updated weights for policy 0, policy_version 142853 (0.0039) +[2024-06-18 13:31:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 2340503552. Throughput: 0: 42522.2. Samples: 2340598940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 13:31:21,994][12645] Avg episode reward: [(0, '0.643')] +[2024-06-18 13:31:26,083][12883] Updated weights for policy 0, policy_version 142863 (0.0029) +[2024-06-18 13:31:26,994][12645] Fps is (10 sec: 44246.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2340700160. Throughput: 0: 42716.9. Samples: 2340861560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 13:31:26,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 13:31:29,623][12883] Updated weights for policy 0, policy_version 142873 (0.0032) +[2024-06-18 13:31:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2340913152. Throughput: 0: 42818.7. Samples: 2340991740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 13:31:31,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 13:31:33,813][12883] Updated weights for policy 0, policy_version 142883 (0.0035) +[2024-06-18 13:31:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2341142528. Throughput: 0: 42777.4. Samples: 2341247520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 13:31:36,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 13:31:37,109][12883] Updated weights for policy 0, policy_version 142893 (0.0042) +[2024-06-18 13:31:41,297][12883] Updated weights for policy 0, policy_version 142903 (0.0040) +[2024-06-18 13:31:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2341322752. Throughput: 0: 42912.0. Samples: 2341506020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 13:31:41,994][12645] Avg episode reward: [(0, '0.647')] +[2024-06-18 13:31:44,927][12883] Updated weights for policy 0, policy_version 142913 (0.0037) +[2024-06-18 13:31:46,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2341552128. Throughput: 0: 42634.7. Samples: 2341629800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 13:31:46,994][12645] Avg episode reward: [(0, '0.299')] +[2024-06-18 13:31:49,076][12883] Updated weights for policy 0, policy_version 142923 (0.0037) +[2024-06-18 13:31:51,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2341781504. Throughput: 0: 42699.9. Samples: 2341888200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 13:31:51,994][12645] Avg episode reward: [(0, '0.451')] +[2024-06-18 13:31:52,529][12883] Updated weights for policy 0, policy_version 142933 (0.0035) +[2024-06-18 13:31:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2341961728. Throughput: 0: 42831.9. Samples: 2342145840. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 13:31:56,995][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 13:31:57,083][12883] Updated weights for policy 0, policy_version 142943 (0.0031) +[2024-06-18 13:32:00,269][12883] Updated weights for policy 0, policy_version 142953 (0.0029) +[2024-06-18 13:32:01,996][12645] Fps is (10 sec: 42589.1, 60 sec: 43143.0, 300 sec: 42709.1). Total num frames: 2342207488. Throughput: 0: 42722.7. Samples: 2342271000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 13:32:01,996][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 13:32:04,634][12883] Updated weights for policy 0, policy_version 142963 (0.0039) +[2024-06-18 13:32:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2342387712. Throughput: 0: 42868.0. Samples: 2342528000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 13:32:06,995][12645] Avg episode reward: [(0, '0.607')] +[2024-06-18 13:32:08,119][12883] Updated weights for policy 0, policy_version 142973 (0.0029) +[2024-06-18 13:32:11,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2342600704. Throughput: 0: 42619.5. Samples: 2342779440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 13:32:11,994][12645] Avg episode reward: [(0, '0.657')] +[2024-06-18 13:32:12,209][12883] Updated weights for policy 0, policy_version 142983 (0.0027) +[2024-06-18 13:32:15,729][12862] Signal inference workers to stop experience collection... (34250 times) +[2024-06-18 13:32:15,779][12883] InferenceWorker_p0-w0: stopping experience collection (34250 times) +[2024-06-18 13:32:15,788][12862] Signal inference workers to resume experience collection... 
(34250 times) +[2024-06-18 13:32:15,796][12883] InferenceWorker_p0-w0: resuming experience collection (34250 times) +[2024-06-18 13:32:15,921][12883] Updated weights for policy 0, policy_version 142993 (0.0029) +[2024-06-18 13:32:16,995][12645] Fps is (10 sec: 45867.5, 60 sec: 43144.9, 300 sec: 42653.7). Total num frames: 2342846464. Throughput: 0: 42608.1. Samples: 2342909180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:32:16,996][12645] Avg episode reward: [(0, '0.438')] +[2024-06-18 13:32:19,862][12883] Updated weights for policy 0, policy_version 143003 (0.0033) +[2024-06-18 13:32:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 2343043072. Throughput: 0: 42648.9. Samples: 2343166720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:32:21,994][12645] Avg episode reward: [(0, '0.438')] +[2024-06-18 13:32:23,633][12883] Updated weights for policy 0, policy_version 143013 (0.0043) +[2024-06-18 13:32:26,994][12645] Fps is (10 sec: 39328.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2343239680. Throughput: 0: 42367.8. Samples: 2343412580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:32:26,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 13:32:27,758][12883] Updated weights for policy 0, policy_version 143023 (0.0049) +[2024-06-18 13:32:31,454][12883] Updated weights for policy 0, policy_version 143033 (0.0025) +[2024-06-18 13:32:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2343469056. Throughput: 0: 42518.8. Samples: 2343543140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:32:31,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 13:32:35,852][12883] Updated weights for policy 0, policy_version 143043 (0.0044) +[2024-06-18 13:32:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 2343665664. Throughput: 0: 42356.4. 
Samples: 2343794240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:32:36,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 13:32:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143046_2343665664.pth... +[2024-06-18 13:32:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000142423_2333458432.pth +[2024-06-18 13:32:39,510][12883] Updated weights for policy 0, policy_version 143053 (0.0029) +[2024-06-18 13:32:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2343895040. Throughput: 0: 42094.8. Samples: 2344040100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:32:41,994][12645] Avg episode reward: [(0, '0.664')] +[2024-06-18 13:32:43,426][12883] Updated weights for policy 0, policy_version 143063 (0.0041) +[2024-06-18 13:32:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2344091648. Throughput: 0: 42268.3. Samples: 2344172980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:32:46,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 13:32:47,236][12883] Updated weights for policy 0, policy_version 143073 (0.0029) +[2024-06-18 13:32:51,318][12883] Updated weights for policy 0, policy_version 143083 (0.0033) +[2024-06-18 13:32:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2344321024. Throughput: 0: 42214.3. Samples: 2344427640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:32:51,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 13:32:55,039][12883] Updated weights for policy 0, policy_version 143093 (0.0031) +[2024-06-18 13:32:56,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2344534016. Throughput: 0: 42152.5. Samples: 2344676300. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:32:56,995][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 13:32:58,842][12883] Updated weights for policy 0, policy_version 143103 (0.0042) +[2024-06-18 13:33:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42053.8, 300 sec: 42543.2). Total num frames: 2344730624. Throughput: 0: 42159.8. Samples: 2344806300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:33:01,994][12645] Avg episode reward: [(0, '0.535')] +[2024-06-18 13:33:02,626][12883] Updated weights for policy 0, policy_version 143113 (0.0031) +[2024-06-18 13:33:06,217][12883] Updated weights for policy 0, policy_version 143123 (0.0031) +[2024-06-18 13:33:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2344927232. Throughput: 0: 42213.8. Samples: 2345066340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:33:06,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 13:33:10,306][12883] Updated weights for policy 0, policy_version 143133 (0.0046) +[2024-06-18 13:33:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2345172992. Throughput: 0: 42381.0. Samples: 2345319720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 13:33:11,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 13:33:13,890][12883] Updated weights for policy 0, policy_version 143143 (0.0035) +[2024-06-18 13:33:16,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41780.4, 300 sec: 42487.3). Total num frames: 2345353216. Throughput: 0: 42334.2. Samples: 2345448180. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 13:33:16,995][12645] Avg episode reward: [(0, '0.218')] +[2024-06-18 13:33:18,022][12883] Updated weights for policy 0, policy_version 143153 (0.0039) +[2024-06-18 13:33:21,537][12883] Updated weights for policy 0, policy_version 143163 (0.0034) +[2024-06-18 13:33:22,000][12645] Fps is (10 sec: 40934.2, 60 sec: 42320.9, 300 sec: 42597.5). Total num frames: 2345582592. Throughput: 0: 42328.9. Samples: 2345699300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 13:33:22,000][12645] Avg episode reward: [(0, '0.676')] +[2024-06-18 13:33:25,574][12883] Updated weights for policy 0, policy_version 143173 (0.0029) +[2024-06-18 13:33:26,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2345811968. Throughput: 0: 42595.5. Samples: 2345956900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 13:33:26,994][12645] Avg episode reward: [(0, '0.676')] +[2024-06-18 13:33:29,113][12883] Updated weights for policy 0, policy_version 143183 (0.0035) +[2024-06-18 13:33:31,994][12645] Fps is (10 sec: 42624.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2346008576. Throughput: 0: 42580.9. Samples: 2346089120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 13:33:31,994][12645] Avg episode reward: [(0, '0.621')] +[2024-06-18 13:33:33,302][12883] Updated weights for policy 0, policy_version 143193 (0.0034) +[2024-06-18 13:33:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 2346221568. Throughput: 0: 42495.9. Samples: 2346339960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 13:33:36,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 13:33:37,174][12883] Updated weights for policy 0, policy_version 143203 (0.0027) +[2024-06-18 13:33:37,916][12862] Signal inference workers to stop experience collection... 
(34300 times) +[2024-06-18 13:33:37,916][12862] Signal inference workers to resume experience collection... (34300 times) +[2024-06-18 13:33:37,968][12883] InferenceWorker_p0-w0: stopping experience collection (34300 times) +[2024-06-18 13:33:37,968][12883] InferenceWorker_p0-w0: resuming experience collection (34300 times) +[2024-06-18 13:33:41,216][12883] Updated weights for policy 0, policy_version 143213 (0.0032) +[2024-06-18 13:33:41,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2346434560. Throughput: 0: 42581.8. Samples: 2346592480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 13:33:41,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 13:33:44,870][12883] Updated weights for policy 0, policy_version 143223 (0.0032) +[2024-06-18 13:33:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2346631168. Throughput: 0: 42510.2. Samples: 2346719260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 13:33:46,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 13:33:49,054][12883] Updated weights for policy 0, policy_version 143233 (0.0040) +[2024-06-18 13:33:51,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 2346860544. Throughput: 0: 42411.4. Samples: 2346974860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 13:33:51,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 13:33:52,669][12883] Updated weights for policy 0, policy_version 143243 (0.0027) +[2024-06-18 13:33:56,516][12883] Updated weights for policy 0, policy_version 143253 (0.0040) +[2024-06-18 13:33:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2347089920. Throughput: 0: 42629.2. Samples: 2347238040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 13:33:56,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 13:34:00,409][12883] Updated weights for policy 0, policy_version 143263 (0.0053) +[2024-06-18 13:34:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2347270144. Throughput: 0: 42545.8. Samples: 2347362740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 13:34:01,994][12645] Avg episode reward: [(0, '0.248')] +[2024-06-18 13:34:04,038][12883] Updated weights for policy 0, policy_version 143273 (0.0051) +[2024-06-18 13:34:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2347515904. Throughput: 0: 42615.5. Samples: 2347616740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 13:34:06,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 13:34:08,014][12883] Updated weights for policy 0, policy_version 143283 (0.0028) +[2024-06-18 13:34:11,850][12883] Updated weights for policy 0, policy_version 143293 (0.0034) +[2024-06-18 13:34:11,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2347712512. Throughput: 0: 42549.4. Samples: 2347871620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 13:34:11,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 13:34:15,582][12883] Updated weights for policy 0, policy_version 143303 (0.0032) +[2024-06-18 13:34:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2347925504. Throughput: 0: 42329.7. Samples: 2347993960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 13:34:16,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 13:34:19,512][12883] Updated weights for policy 0, policy_version 143313 (0.0041) +[2024-06-18 13:34:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42876.0, 300 sec: 42598.4). Total num frames: 2348154880. Throughput: 0: 42549.4. 
Samples: 2348254680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 13:34:21,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 13:34:23,615][12883] Updated weights for policy 0, policy_version 143323 (0.0046) +[2024-06-18 13:34:26,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2348335104. Throughput: 0: 42615.6. Samples: 2348510180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 13:34:26,994][12645] Avg episode reward: [(0, '0.635')] +[2024-06-18 13:34:27,445][12883] Updated weights for policy 0, policy_version 143333 (0.0032) +[2024-06-18 13:34:31,206][12883] Updated weights for policy 0, policy_version 143343 (0.0034) +[2024-06-18 13:34:31,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 2348564480. Throughput: 0: 42473.9. Samples: 2348630680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 13:34:31,997][12645] Avg episode reward: [(0, '0.397')] +[2024-06-18 13:34:35,113][12883] Updated weights for policy 0, policy_version 143353 (0.0035) +[2024-06-18 13:34:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42543.1). Total num frames: 2348777472. Throughput: 0: 42557.5. Samples: 2348889940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 13:34:36,994][12645] Avg episode reward: [(0, '0.537')] +[2024-06-18 13:34:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143359_2348793856.pth... +[2024-06-18 13:34:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000142735_2338570240.pth +[2024-06-18 13:34:38,704][12883] Updated weights for policy 0, policy_version 143363 (0.0032) +[2024-06-18 13:34:41,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2348974080. Throughput: 0: 42371.6. Samples: 2349144760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 13:34:41,994][12645] Avg episode reward: [(0, '0.633')] +[2024-06-18 13:34:43,042][12883] Updated weights for policy 0, policy_version 143373 (0.0038) +[2024-06-18 13:34:46,633][12883] Updated weights for policy 0, policy_version 143383 (0.0039) +[2024-06-18 13:34:46,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 2349187072. Throughput: 0: 42284.6. Samples: 2349265640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 13:34:46,996][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 13:34:50,675][12883] Updated weights for policy 0, policy_version 143393 (0.0044) +[2024-06-18 13:34:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42654.8). Total num frames: 2349432832. Throughput: 0: 42364.1. Samples: 2349523120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 13:34:51,994][12645] Avg episode reward: [(0, '0.757')] +[2024-06-18 13:34:54,301][12883] Updated weights for policy 0, policy_version 143403 (0.0032) +[2024-06-18 13:34:56,994][12645] Fps is (10 sec: 40968.7, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 2349596672. Throughput: 0: 42379.8. Samples: 2349778720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 13:34:56,994][12645] Avg episode reward: [(0, '0.527')] +[2024-06-18 13:34:58,270][12883] Updated weights for policy 0, policy_version 143413 (0.0038) +[2024-06-18 13:35:01,876][12883] Updated weights for policy 0, policy_version 143423 (0.0038) +[2024-06-18 13:35:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2349842432. Throughput: 0: 42281.8. Samples: 2349896640. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 13:35:01,994][12645] Avg episode reward: [(0, '0.621')] +[2024-06-18 13:35:06,189][12883] Updated weights for policy 0, policy_version 143433 (0.0054) +[2024-06-18 13:35:06,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42325.5, 300 sec: 42543.2). Total num frames: 2350055424. Throughput: 0: 42306.7. Samples: 2350158480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 13:35:06,994][12645] Avg episode reward: [(0, '0.551')] +[2024-06-18 13:35:09,796][12883] Updated weights for policy 0, policy_version 143443 (0.0026) +[2024-06-18 13:35:11,996][12645] Fps is (10 sec: 39313.2, 60 sec: 42050.6, 300 sec: 42431.5). Total num frames: 2350235648. Throughput: 0: 42264.9. Samples: 2350412200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 13:35:11,997][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 13:35:13,966][12883] Updated weights for policy 0, policy_version 143453 (0.0038) +[2024-06-18 13:35:15,214][12862] Signal inference workers to stop experience collection... (34350 times) +[2024-06-18 13:35:15,215][12862] Signal inference workers to resume experience collection... (34350 times) +[2024-06-18 13:35:15,239][12883] InferenceWorker_p0-w0: stopping experience collection (34350 times) +[2024-06-18 13:35:15,240][12883] InferenceWorker_p0-w0: resuming experience collection (34350 times) +[2024-06-18 13:35:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 2350465024. Throughput: 0: 42310.1. Samples: 2350534540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 13:35:16,994][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 13:35:17,583][12883] Updated weights for policy 0, policy_version 143463 (0.0034) +[2024-06-18 13:35:21,519][12883] Updated weights for policy 0, policy_version 143473 (0.0046) +[2024-06-18 13:35:21,994][12645] Fps is (10 sec: 45885.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2350694400. 
Throughput: 0: 42383.0. Samples: 2350797180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) +[2024-06-18 13:35:21,994][12645] Avg episode reward: [(0, '0.570')] +[2024-06-18 13:35:25,163][12883] Updated weights for policy 0, policy_version 143483 (0.0032) +[2024-06-18 13:35:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2350874624. Throughput: 0: 42400.0. Samples: 2351052760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 13:35:26,994][12645] Avg episode reward: [(0, '0.570')] +[2024-06-18 13:35:29,070][12883] Updated weights for policy 0, policy_version 143493 (0.0036) +[2024-06-18 13:35:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2351120384. Throughput: 0: 42483.9. Samples: 2351177320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 13:35:31,994][12645] Avg episode reward: [(0, '0.609')] +[2024-06-18 13:35:32,666][12883] Updated weights for policy 0, policy_version 143503 (0.0038) +[2024-06-18 13:35:36,728][12883] Updated weights for policy 0, policy_version 143513 (0.0033) +[2024-06-18 13:35:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2351333376. Throughput: 0: 42701.3. Samples: 2351444680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 13:35:37,000][12645] Avg episode reward: [(0, '0.609')] +[2024-06-18 13:35:40,815][12883] Updated weights for policy 0, policy_version 143523 (0.0032) +[2024-06-18 13:35:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2351529984. Throughput: 0: 42544.5. Samples: 2351693220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 13:35:41,994][12645] Avg episode reward: [(0, '0.611')] +[2024-06-18 13:35:44,240][12883] Updated weights for policy 0, policy_version 143533 (0.0037) +[2024-06-18 13:35:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42873.1, 300 sec: 42598.4). 
Total num frames: 2351759360. Throughput: 0: 42690.3. Samples: 2351817700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 13:35:46,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 13:35:48,531][12883] Updated weights for policy 0, policy_version 143543 (0.0038) +[2024-06-18 13:35:51,784][12883] Updated weights for policy 0, policy_version 143553 (0.0031) +[2024-06-18 13:35:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2351972352. Throughput: 0: 42761.7. Samples: 2352082760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 13:35:51,994][12645] Avg episode reward: [(0, '0.178')] +[2024-06-18 13:35:56,173][12883] Updated weights for policy 0, policy_version 143563 (0.0042) +[2024-06-18 13:35:56,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42596.9, 300 sec: 42487.0). Total num frames: 2352152576. Throughput: 0: 42735.6. Samples: 2352335300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 13:35:56,996][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 13:35:59,425][12883] Updated weights for policy 0, policy_version 143573 (0.0032) +[2024-06-18 13:36:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2352381952. Throughput: 0: 42571.5. Samples: 2352450260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 13:36:01,994][12645] Avg episode reward: [(0, '0.691')] +[2024-06-18 13:36:04,317][12883] Updated weights for policy 0, policy_version 143583 (0.0031) +[2024-06-18 13:36:06,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2352611328. Throughput: 0: 42613.9. Samples: 2352714800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 13:36:06,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 13:36:07,041][12883] Updated weights for policy 0, policy_version 143593 (0.0034) +[2024-06-18 13:36:11,892][12883] Updated weights for policy 0, policy_version 143603 (0.0037) +[2024-06-18 13:36:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42600.0, 300 sec: 42487.7). Total num frames: 2352791552. Throughput: 0: 42608.5. Samples: 2352970140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 13:36:11,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 13:36:14,641][12883] Updated weights for policy 0, policy_version 143613 (0.0032) +[2024-06-18 13:36:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 2353004544. Throughput: 0: 42572.1. Samples: 2353093060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 13:36:16,994][12645] Avg episode reward: [(0, '0.324')] +[2024-06-18 13:36:19,307][12883] Updated weights for policy 0, policy_version 143623 (0.0033) +[2024-06-18 13:36:21,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 2353250304. Throughput: 0: 42467.1. Samples: 2353355700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) +[2024-06-18 13:36:21,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 13:36:22,341][12883] Updated weights for policy 0, policy_version 143633 (0.0039) +[2024-06-18 13:36:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2353430528. Throughput: 0: 42597.0. Samples: 2353610080. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 13:36:26,994][12645] Avg episode reward: [(0, '0.631')] +[2024-06-18 13:36:27,049][12883] Updated weights for policy 0, policy_version 143643 (0.0032) +[2024-06-18 13:36:29,963][12883] Updated weights for policy 0, policy_version 143653 (0.0024) +[2024-06-18 13:36:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 2353659904. Throughput: 0: 42595.0. Samples: 2353734480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 13:36:31,994][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 13:36:34,927][12883] Updated weights for policy 0, policy_version 143663 (0.0031) +[2024-06-18 13:36:36,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2353889280. Throughput: 0: 42497.9. Samples: 2353995160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 13:36:36,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 13:36:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143670_2353889280.pth... +[2024-06-18 13:36:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143046_2343665664.pth +[2024-06-18 13:36:37,825][12862] Signal inference workers to stop experience collection... (34400 times) +[2024-06-18 13:36:37,825][12862] Signal inference workers to resume experience collection... (34400 times) +[2024-06-18 13:36:37,872][12883] InferenceWorker_p0-w0: stopping experience collection (34400 times) +[2024-06-18 13:36:37,873][12883] InferenceWorker_p0-w0: resuming experience collection (34400 times) +[2024-06-18 13:36:37,959][12883] Updated weights for policy 0, policy_version 143673 (0.0023) +[2024-06-18 13:36:41,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2354085888. Throughput: 0: 42680.8. Samples: 2354255840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 13:36:41,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 13:36:42,379][12883] Updated weights for policy 0, policy_version 143683 (0.0033) +[2024-06-18 13:36:45,556][12883] Updated weights for policy 0, policy_version 143693 (0.0039) +[2024-06-18 13:36:46,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2354315264. Throughput: 0: 42859.5. Samples: 2354378940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 13:36:46,994][12645] Avg episode reward: [(0, '0.642')] +[2024-06-18 13:36:49,925][12883] Updated weights for policy 0, policy_version 143703 (0.0042) +[2024-06-18 13:36:52,000][12645] Fps is (10 sec: 44208.9, 60 sec: 42594.0, 300 sec: 42597.5). Total num frames: 2354528256. Throughput: 0: 42669.1. Samples: 2354635180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 13:36:52,001][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 13:36:53,164][12883] Updated weights for policy 0, policy_version 143713 (0.0030) +[2024-06-18 13:36:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42873.0, 300 sec: 42432.1). Total num frames: 2354724864. Throughput: 0: 42827.1. Samples: 2354897360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 13:36:56,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 13:36:57,333][12883] Updated weights for policy 0, policy_version 143723 (0.0053) +[2024-06-18 13:37:01,051][12883] Updated weights for policy 0, policy_version 143733 (0.0033) +[2024-06-18 13:37:01,994][12645] Fps is (10 sec: 42625.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2354954240. Throughput: 0: 42962.1. Samples: 2355026360. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 13:37:01,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 13:37:05,055][12883] Updated weights for policy 0, policy_version 143743 (0.0035) +[2024-06-18 13:37:06,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 2355150848. Throughput: 0: 42816.4. Samples: 2355282440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 13:37:06,995][12645] Avg episode reward: [(0, '0.580')] +[2024-06-18 13:37:08,607][12883] Updated weights for policy 0, policy_version 143753 (0.0034) +[2024-06-18 13:37:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42432.0). Total num frames: 2355363840. Throughput: 0: 42826.2. Samples: 2355537260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 13:37:11,994][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 13:37:12,773][12883] Updated weights for policy 0, policy_version 143763 (0.0030) +[2024-06-18 13:37:16,312][12883] Updated weights for policy 0, policy_version 143773 (0.0034) +[2024-06-18 13:37:16,994][12645] Fps is (10 sec: 44237.5, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 2355593216. Throughput: 0: 42865.8. Samples: 2355663440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 13:37:16,994][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 13:37:20,441][12883] Updated weights for policy 0, policy_version 143783 (0.0032) +[2024-06-18 13:37:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2355822592. Throughput: 0: 42819.6. Samples: 2355922040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 13:37:21,994][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 13:37:23,959][12883] Updated weights for policy 0, policy_version 143793 (0.0039) +[2024-06-18 13:37:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 2356019200. Throughput: 0: 42695.4. Samples: 2356177140. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) +[2024-06-18 13:37:26,994][12645] Avg episode reward: [(0, '0.562')] +[2024-06-18 13:37:28,132][12883] Updated weights for policy 0, policy_version 143803 (0.0033) +[2024-06-18 13:37:31,582][12883] Updated weights for policy 0, policy_version 143813 (0.0022) +[2024-06-18 13:37:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2356248576. Throughput: 0: 42726.8. Samples: 2356301640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 13:37:31,994][12645] Avg episode reward: [(0, '0.562')] +[2024-06-18 13:37:35,630][12883] Updated weights for policy 0, policy_version 143823 (0.0033) +[2024-06-18 13:37:36,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2356477952. Throughput: 0: 42962.3. Samples: 2356568220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 13:37:36,994][12645] Avg episode reward: [(0, '0.789')] +[2024-06-18 13:37:39,120][12883] Updated weights for policy 0, policy_version 143833 (0.0030) +[2024-06-18 13:37:41,998][12645] Fps is (10 sec: 39303.4, 60 sec: 42595.1, 300 sec: 42542.2). Total num frames: 2356641792. Throughput: 0: 42626.7. Samples: 2356815760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 13:37:41,999][12645] Avg episode reward: [(0, '0.628')] +[2024-06-18 13:37:42,096][12862] Signal inference workers to stop experience collection... (34450 times) +[2024-06-18 13:37:42,152][12862] Signal inference workers to resume experience collection... 
(34450 times) +[2024-06-18 13:37:42,152][12883] InferenceWorker_p0-w0: stopping experience collection (34450 times) +[2024-06-18 13:37:42,173][12883] InferenceWorker_p0-w0: resuming experience collection (34450 times) +[2024-06-18 13:37:43,244][12883] Updated weights for policy 0, policy_version 143843 (0.0026) +[2024-06-18 13:37:46,710][12883] Updated weights for policy 0, policy_version 143853 (0.0028) +[2024-06-18 13:37:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2356887552. Throughput: 0: 42599.1. Samples: 2356943320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 13:37:46,994][12645] Avg episode reward: [(0, '0.718')] +[2024-06-18 13:37:50,798][12883] Updated weights for policy 0, policy_version 143863 (0.0037) +[2024-06-18 13:37:51,994][12645] Fps is (10 sec: 44257.6, 60 sec: 42602.9, 300 sec: 42542.9). Total num frames: 2357084160. Throughput: 0: 42701.6. Samples: 2357204000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 13:37:51,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 13:37:54,558][12883] Updated weights for policy 0, policy_version 143873 (0.0033) +[2024-06-18 13:37:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2357297152. Throughput: 0: 42763.0. Samples: 2357461600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 13:37:56,994][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 13:37:58,850][12883] Updated weights for policy 0, policy_version 143883 (0.0038) +[2024-06-18 13:38:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2357510144. Throughput: 0: 42711.6. Samples: 2357585460. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 13:38:01,994][12645] Avg episode reward: [(0, '0.561')] +[2024-06-18 13:38:02,202][12883] Updated weights for policy 0, policy_version 143893 (0.0039) +[2024-06-18 13:38:06,381][12883] Updated weights for policy 0, policy_version 143903 (0.0032) +[2024-06-18 13:38:06,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.7, 300 sec: 42542.9). Total num frames: 2357723136. Throughput: 0: 42828.4. Samples: 2357849320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 13:38:06,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 13:38:09,845][12883] Updated weights for policy 0, policy_version 143913 (0.0035) +[2024-06-18 13:38:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2357936128. Throughput: 0: 42789.0. Samples: 2358102640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 13:38:11,994][12645] Avg episode reward: [(0, '0.619')] +[2024-06-18 13:38:13,942][12883] Updated weights for policy 0, policy_version 143923 (0.0035) +[2024-06-18 13:38:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 2358165504. Throughput: 0: 42800.5. Samples: 2358227660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 13:38:16,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 13:38:17,458][12883] Updated weights for policy 0, policy_version 143933 (0.0030) +[2024-06-18 13:38:21,709][12883] Updated weights for policy 0, policy_version 143943 (0.0031) +[2024-06-18 13:38:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2358362112. Throughput: 0: 42721.0. Samples: 2358490660. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 13:38:21,994][12645] Avg episode reward: [(0, '0.238')] +[2024-06-18 13:38:25,350][12883] Updated weights for policy 0, policy_version 143953 (0.0038) +[2024-06-18 13:38:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2358591488. Throughput: 0: 42928.5. Samples: 2358747340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 13:38:26,994][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 13:38:29,206][12883] Updated weights for policy 0, policy_version 143963 (0.0031) +[2024-06-18 13:38:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2358820864. Throughput: 0: 42984.1. Samples: 2358877600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) +[2024-06-18 13:38:31,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 13:38:32,905][12883] Updated weights for policy 0, policy_version 143973 (0.0037) +[2024-06-18 13:38:36,866][12883] Updated weights for policy 0, policy_version 143983 (0.0032) +[2024-06-18 13:38:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 2359017472. Throughput: 0: 42840.5. Samples: 2359131820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:38:36,994][12645] Avg episode reward: [(0, '0.641')] +[2024-06-18 13:38:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143983_2359017472.pth... +[2024-06-18 13:38:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143359_2348793856.pth +[2024-06-18 13:38:40,482][12883] Updated weights for policy 0, policy_version 143993 (0.0031) +[2024-06-18 13:38:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43147.9, 300 sec: 42709.5). Total num frames: 2359230464. Throughput: 0: 42833.0. Samples: 2359389080. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:38:41,994][12645] Avg episode reward: [(0, '0.641')] +[2024-06-18 13:38:44,449][12883] Updated weights for policy 0, policy_version 144003 (0.0025) +[2024-06-18 13:38:46,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2359459840. Throughput: 0: 42958.1. Samples: 2359518580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:38:46,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 13:38:48,142][12883] Updated weights for policy 0, policy_version 144013 (0.0039) +[2024-06-18 13:38:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2359656448. Throughput: 0: 42753.4. Samples: 2359773220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:38:51,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 13:38:52,134][12883] Updated weights for policy 0, policy_version 144023 (0.0042) +[2024-06-18 13:38:56,166][12883] Updated weights for policy 0, policy_version 144033 (0.0033) +[2024-06-18 13:38:56,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2359869440. Throughput: 0: 42716.1. Samples: 2360024860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:38:56,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 13:38:59,889][12883] Updated weights for policy 0, policy_version 144043 (0.0045) +[2024-06-18 13:39:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2360082432. Throughput: 0: 42797.3. Samples: 2360153540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:39:01,994][12645] Avg episode reward: [(0, '0.713')] +[2024-06-18 13:39:03,712][12883] Updated weights for policy 0, policy_version 144053 (0.0040) +[2024-06-18 13:39:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2360295424. Throughput: 0: 42608.9. 
Samples: 2360408060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:39:06,996][12645] Avg episode reward: [(0, '0.380')] +[2024-06-18 13:39:07,381][12883] Updated weights for policy 0, policy_version 144063 (0.0032) +[2024-06-18 13:39:11,440][12883] Updated weights for policy 0, policy_version 144073 (0.0043) +[2024-06-18 13:39:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2360524800. Throughput: 0: 42553.4. Samples: 2360662240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:39:11,994][12645] Avg episode reward: [(0, '0.447')] +[2024-06-18 13:39:15,345][12883] Updated weights for policy 0, policy_version 144083 (0.0054) +[2024-06-18 13:39:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2360705024. Throughput: 0: 42416.4. Samples: 2360786340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:39:16,994][12645] Avg episode reward: [(0, '0.579')] +[2024-06-18 13:39:19,349][12883] Updated weights for policy 0, policy_version 144093 (0.0023) +[2024-06-18 13:39:19,631][12862] Signal inference workers to stop experience collection... (34500 times) +[2024-06-18 13:39:19,679][12883] InferenceWorker_p0-w0: stopping experience collection (34500 times) +[2024-06-18 13:39:19,684][12862] Signal inference workers to resume experience collection... (34500 times) +[2024-06-18 13:39:19,693][12883] InferenceWorker_p0-w0: resuming experience collection (34500 times) +[2024-06-18 13:39:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2360934400. Throughput: 0: 42518.7. Samples: 2361045160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:39:21,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 13:39:22,806][12883] Updated weights for policy 0, policy_version 144103 (0.0027) +[2024-06-18 13:39:26,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42598.7). 
Total num frames: 2361131008. Throughput: 0: 42588.1. Samples: 2361305540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:39:26,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 13:39:27,055][12883] Updated weights for policy 0, policy_version 144113 (0.0030) +[2024-06-18 13:39:30,581][12883] Updated weights for policy 0, policy_version 144123 (0.0026) +[2024-06-18 13:39:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2361360384. Throughput: 0: 42424.9. Samples: 2361427700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:39:31,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 13:39:34,499][12883] Updated weights for policy 0, policy_version 144133 (0.0033) +[2024-06-18 13:39:36,996][12645] Fps is (10 sec: 45864.4, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2361589760. Throughput: 0: 42580.0. Samples: 2361689420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 13:39:36,997][12645] Avg episode reward: [(0, '0.722')] +[2024-06-18 13:39:38,137][12883] Updated weights for policy 0, policy_version 144143 (0.0039) +[2024-06-18 13:39:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 2361769984. Throughput: 0: 42696.4. Samples: 2361946200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 13:39:41,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 13:39:42,210][12883] Updated weights for policy 0, policy_version 144153 (0.0033) +[2024-06-18 13:39:45,755][12883] Updated weights for policy 0, policy_version 144163 (0.0040) +[2024-06-18 13:39:46,994][12645] Fps is (10 sec: 40968.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2361999360. Throughput: 0: 42580.3. Samples: 2362069660. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 13:39:46,994][12645] Avg episode reward: [(0, '0.660')] +[2024-06-18 13:39:49,730][12883] Updated weights for policy 0, policy_version 144173 (0.0043) +[2024-06-18 13:39:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2362212352. Throughput: 0: 42697.0. Samples: 2362329420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 13:39:51,994][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 13:39:53,327][12883] Updated weights for policy 0, policy_version 144183 (0.0026) +[2024-06-18 13:39:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 2362425344. Throughput: 0: 42758.0. Samples: 2362586360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 13:39:57,008][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 13:39:57,707][12883] Updated weights for policy 0, policy_version 144193 (0.0026) +[2024-06-18 13:40:00,929][12883] Updated weights for policy 0, policy_version 144203 (0.0028) +[2024-06-18 13:40:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2362638336. Throughput: 0: 42838.8. Samples: 2362714080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 13:40:01,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 13:40:05,213][12883] Updated weights for policy 0, policy_version 144213 (0.0029) +[2024-06-18 13:40:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 2362867712. Throughput: 0: 42791.4. Samples: 2362970780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 13:40:06,994][12645] Avg episode reward: [(0, '0.709')] +[2024-06-18 13:40:08,755][12883] Updated weights for policy 0, policy_version 144223 (0.0025) +[2024-06-18 13:40:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2363047936. Throughput: 0: 42737.2. 
Samples: 2363228720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 13:40:11,994][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 13:40:12,850][12883] Updated weights for policy 0, policy_version 144233 (0.0048) +[2024-06-18 13:40:16,272][12883] Updated weights for policy 0, policy_version 144243 (0.0036) +[2024-06-18 13:40:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2363293696. Throughput: 0: 42813.8. Samples: 2363354320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 13:40:16,994][12645] Avg episode reward: [(0, '0.711')] +[2024-06-18 13:40:20,484][12883] Updated weights for policy 0, policy_version 144253 (0.0024) +[2024-06-18 13:40:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2363490304. Throughput: 0: 42648.3. Samples: 2363608500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 13:40:21,994][12645] Avg episode reward: [(0, '0.193')] +[2024-06-18 13:40:23,927][12883] Updated weights for policy 0, policy_version 144263 (0.0040) +[2024-06-18 13:40:26,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2363670528. Throughput: 0: 42769.7. Samples: 2363870840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 13:40:26,994][12645] Avg episode reward: [(0, '0.705')] +[2024-06-18 13:40:28,574][12883] Updated weights for policy 0, policy_version 144273 (0.0031) +[2024-06-18 13:40:31,417][12883] Updated weights for policy 0, policy_version 144283 (0.0042) +[2024-06-18 13:40:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2363932672. Throughput: 0: 42689.5. Samples: 2363990680. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 13:40:31,994][12645] Avg episode reward: [(0, '0.805')] +[2024-06-18 13:40:36,031][12883] Updated weights for policy 0, policy_version 144293 (0.0024) +[2024-06-18 13:40:36,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 2364129280. Throughput: 0: 42695.1. Samples: 2364250700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) +[2024-06-18 13:40:36,994][12645] Avg episode reward: [(0, '0.622')] +[2024-06-18 13:40:37,073][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000144296_2364145664.pth... +[2024-06-18 13:40:37,117][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143670_2353889280.pth +[2024-06-18 13:40:39,348][12883] Updated weights for policy 0, policy_version 144303 (0.0048) +[2024-06-18 13:40:41,996][12645] Fps is (10 sec: 37674.9, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 2364309504. Throughput: 0: 42726.0. Samples: 2364509120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 13:40:41,996][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 13:40:43,586][12883] Updated weights for policy 0, policy_version 144313 (0.0027) +[2024-06-18 13:40:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2364571648. Throughput: 0: 42663.1. Samples: 2364633920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 13:40:46,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 13:40:47,048][12883] Updated weights for policy 0, policy_version 144323 (0.0036) +[2024-06-18 13:40:51,415][12883] Updated weights for policy 0, policy_version 144333 (0.0039) +[2024-06-18 13:40:51,994][12645] Fps is (10 sec: 45885.5, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 2364768256. Throughput: 0: 42747.6. Samples: 2364894420. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 13:40:51,994][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 13:40:54,625][12862] Signal inference workers to stop experience collection... (34550 times) +[2024-06-18 13:40:54,626][12862] Signal inference workers to resume experience collection... (34550 times) +[2024-06-18 13:40:54,674][12883] InferenceWorker_p0-w0: stopping experience collection (34550 times) +[2024-06-18 13:40:54,674][12883] InferenceWorker_p0-w0: resuming experience collection (34550 times) +[2024-06-18 13:40:54,769][12883] Updated weights for policy 0, policy_version 144343 (0.0037) +[2024-06-18 13:40:56,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2364964864. Throughput: 0: 42612.3. Samples: 2365146280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 13:40:56,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 13:40:59,067][12883] Updated weights for policy 0, policy_version 144353 (0.0038) +[2024-06-18 13:41:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2365210624. Throughput: 0: 42744.1. Samples: 2365277800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 13:41:01,994][12645] Avg episode reward: [(0, '0.480')] +[2024-06-18 13:41:02,227][12883] Updated weights for policy 0, policy_version 144363 (0.0028) +[2024-06-18 13:41:06,527][12883] Updated weights for policy 0, policy_version 144373 (0.0024) +[2024-06-18 13:41:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2365407232. Throughput: 0: 42821.8. Samples: 2365535480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 13:41:06,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 13:41:10,457][12883] Updated weights for policy 0, policy_version 144383 (0.0021) +[2024-06-18 13:41:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2365620224. 
Throughput: 0: 42631.1. Samples: 2365789240. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 13:41:11,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 13:41:14,165][12883] Updated weights for policy 0, policy_version 144393 (0.0035) +[2024-06-18 13:41:16,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2365865984. Throughput: 0: 42832.7. Samples: 2365918160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 13:41:16,995][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 13:41:18,051][12883] Updated weights for policy 0, policy_version 144403 (0.0031) +[2024-06-18 13:41:21,764][12883] Updated weights for policy 0, policy_version 144413 (0.0037) +[2024-06-18 13:41:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2366062592. Throughput: 0: 42809.7. Samples: 2366177140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 13:41:21,994][12645] Avg episode reward: [(0, '0.591')] +[2024-06-18 13:41:25,695][12883] Updated weights for policy 0, policy_version 144423 (0.0028) +[2024-06-18 13:41:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2366275584. Throughput: 0: 42630.0. Samples: 2366427380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 13:41:26,994][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 13:41:29,646][12883] Updated weights for policy 0, policy_version 144433 (0.0042) +[2024-06-18 13:41:31,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2366504960. Throughput: 0: 42715.6. Samples: 2366556120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 13:41:31,994][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 13:41:33,350][12883] Updated weights for policy 0, policy_version 144443 (0.0031) +[2024-06-18 13:41:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). 
Total num frames: 2366685184. Throughput: 0: 42642.2. Samples: 2366813320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 13:41:36,994][12645] Avg episode reward: [(0, '0.657')] +[2024-06-18 13:41:37,469][12883] Updated weights for policy 0, policy_version 144453 (0.0033) +[2024-06-18 13:41:41,004][12883] Updated weights for policy 0, policy_version 144463 (0.0031) +[2024-06-18 13:41:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43419.2, 300 sec: 42709.5). Total num frames: 2366914560. Throughput: 0: 42669.4. Samples: 2367066400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) +[2024-06-18 13:41:41,994][12645] Avg episode reward: [(0, '0.538')] +[2024-06-18 13:41:45,090][12883] Updated weights for policy 0, policy_version 144473 (0.0035) +[2024-06-18 13:41:46,995][12645] Fps is (10 sec: 42594.6, 60 sec: 42324.7, 300 sec: 42654.7). Total num frames: 2367111168. Throughput: 0: 42577.7. Samples: 2367193840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:41:46,995][12645] Avg episode reward: [(0, '0.680')] +[2024-06-18 13:41:48,593][12883] Updated weights for policy 0, policy_version 144483 (0.0032) +[2024-06-18 13:41:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2367340544. Throughput: 0: 42585.3. Samples: 2367451820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:41:51,994][12645] Avg episode reward: [(0, '0.702')] +[2024-06-18 13:41:52,766][12883] Updated weights for policy 0, policy_version 144493 (0.0029) +[2024-06-18 13:41:56,467][12883] Updated weights for policy 0, policy_version 144503 (0.0038) +[2024-06-18 13:41:56,994][12645] Fps is (10 sec: 42602.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2367537152. Throughput: 0: 42424.8. Samples: 2367698360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:41:56,994][12645] Avg episode reward: [(0, '0.748')] +[2024-06-18 13:42:00,503][12883] Updated weights for policy 0, policy_version 144513 (0.0038) +[2024-06-18 13:42:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2367750144. Throughput: 0: 42472.1. Samples: 2367829400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:42:01,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 13:42:04,378][12883] Updated weights for policy 0, policy_version 144523 (0.0037) +[2024-06-18 13:42:06,996][12645] Fps is (10 sec: 42589.6, 60 sec: 42596.9, 300 sec: 42709.2). Total num frames: 2367963136. Throughput: 0: 42354.0. Samples: 2368083160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:42:06,996][12645] Avg episode reward: [(0, '0.651')] +[2024-06-18 13:42:08,227][12883] Updated weights for policy 0, policy_version 144533 (0.0042) +[2024-06-18 13:42:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2368159744. Throughput: 0: 42294.3. Samples: 2368330620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:42:11,994][12645] Avg episode reward: [(0, '0.573')] +[2024-06-18 13:42:12,186][12883] Updated weights for policy 0, policy_version 144543 (0.0037) +[2024-06-18 13:42:16,030][12883] Updated weights for policy 0, policy_version 144553 (0.0035) +[2024-06-18 13:42:16,994][12645] Fps is (10 sec: 42607.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2368389120. Throughput: 0: 42300.8. Samples: 2368459660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:42:16,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 13:42:18,262][12862] Signal inference workers to stop experience collection... 
(34600 times) +[2024-06-18 13:42:18,300][12883] InferenceWorker_p0-w0: stopping experience collection (34600 times) +[2024-06-18 13:42:18,322][12862] Signal inference workers to resume experience collection... (34600 times) +[2024-06-18 13:42:18,323][12883] InferenceWorker_p0-w0: resuming experience collection (34600 times) +[2024-06-18 13:42:20,371][12883] Updated weights for policy 0, policy_version 144563 (0.0027) +[2024-06-18 13:42:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 2368585728. Throughput: 0: 42202.3. Samples: 2368712420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:42:21,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 13:42:23,635][12883] Updated weights for policy 0, policy_version 144573 (0.0035) +[2024-06-18 13:42:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2368815104. Throughput: 0: 42200.9. Samples: 2368965440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:42:26,994][12645] Avg episode reward: [(0, '0.672')] +[2024-06-18 13:42:28,082][12883] Updated weights for policy 0, policy_version 144583 (0.0024) +[2024-06-18 13:42:31,198][12883] Updated weights for policy 0, policy_version 144593 (0.0040) +[2024-06-18 13:42:31,994][12645] Fps is (10 sec: 45874.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2369044480. Throughput: 0: 42344.3. Samples: 2369099300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:42:31,994][12645] Avg episode reward: [(0, '0.615')] +[2024-06-18 13:42:35,654][12883] Updated weights for policy 0, policy_version 144603 (0.0027) +[2024-06-18 13:42:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42599.1). Total num frames: 2369208320. Throughput: 0: 42167.1. Samples: 2369349340. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:42:36,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 13:42:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000144605_2369208320.pth... +[2024-06-18 13:42:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143983_2359017472.pth +[2024-06-18 13:42:38,811][12883] Updated weights for policy 0, policy_version 144613 (0.0031) +[2024-06-18 13:42:41,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 2369437696. Throughput: 0: 42210.4. Samples: 2369597820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 13:42:41,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 13:42:43,302][12883] Updated weights for policy 0, policy_version 144623 (0.0032) +[2024-06-18 13:42:46,630][12883] Updated weights for policy 0, policy_version 144633 (0.0027) +[2024-06-18 13:42:46,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.9, 300 sec: 42653.9). Total num frames: 2369667072. Throughput: 0: 42243.5. Samples: 2369730360. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) +[2024-06-18 13:42:46,994][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 13:42:50,882][12883] Updated weights for policy 0, policy_version 144643 (0.0038) +[2024-06-18 13:42:51,996][12645] Fps is (10 sec: 40950.6, 60 sec: 41777.7, 300 sec: 42542.6). Total num frames: 2369847296. Throughput: 0: 42196.8. Samples: 2369982020. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) +[2024-06-18 13:42:51,996][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 13:42:54,270][12883] Updated weights for policy 0, policy_version 144653 (0.0043) +[2024-06-18 13:42:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2370060288. Throughput: 0: 42340.4. Samples: 2370235940. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) +[2024-06-18 13:42:56,994][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 13:42:58,572][12883] Updated weights for policy 0, policy_version 144663 (0.0045) +[2024-06-18 13:43:01,994][12645] Fps is (10 sec: 45885.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2370306048. Throughput: 0: 42225.1. Samples: 2370359780. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) +[2024-06-18 13:43:01,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 13:43:02,083][12883] Updated weights for policy 0, policy_version 144673 (0.0035) +[2024-06-18 13:43:06,206][12883] Updated weights for policy 0, policy_version 144683 (0.0043) +[2024-06-18 13:43:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42326.8, 300 sec: 42598.4). Total num frames: 2370502656. Throughput: 0: 42262.9. Samples: 2370614260. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) +[2024-06-18 13:43:06,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 13:43:10,247][12883] Updated weights for policy 0, policy_version 144693 (0.0035) +[2024-06-18 13:43:11,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2370699264. Throughput: 0: 42255.1. Samples: 2370866920. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) +[2024-06-18 13:43:11,994][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 13:43:14,098][12883] Updated weights for policy 0, policy_version 144703 (0.0031) +[2024-06-18 13:43:16,996][12645] Fps is (10 sec: 44227.4, 60 sec: 42596.9, 300 sec: 42653.6). Total num frames: 2370945024. Throughput: 0: 42103.4. Samples: 2370994040. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) +[2024-06-18 13:43:16,996][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 13:43:17,741][12883] Updated weights for policy 0, policy_version 144713 (0.0030) +[2024-06-18 13:43:21,985][12883] Updated weights for policy 0, policy_version 144723 (0.0039) +[2024-06-18 13:43:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2371141632. Throughput: 0: 42361.9. Samples: 2371255620. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) +[2024-06-18 13:43:22,008][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 13:43:24,137][12862] Signal inference workers to stop experience collection... (34650 times) +[2024-06-18 13:43:24,192][12883] InferenceWorker_p0-w0: stopping experience collection (34650 times) +[2024-06-18 13:43:24,252][12862] Signal inference workers to resume experience collection... (34650 times) +[2024-06-18 13:43:24,252][12883] InferenceWorker_p0-w0: resuming experience collection (34650 times) +[2024-06-18 13:43:25,583][12883] Updated weights for policy 0, policy_version 144733 (0.0028) +[2024-06-18 13:43:26,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2371354624. Throughput: 0: 42473.7. Samples: 2371509140. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) +[2024-06-18 13:43:26,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 13:43:29,729][12883] Updated weights for policy 0, policy_version 144743 (0.0031) +[2024-06-18 13:43:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 2371567616. Throughput: 0: 42566.0. Samples: 2371645820. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) +[2024-06-18 13:43:31,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 13:43:33,185][12883] Updated weights for policy 0, policy_version 144753 (0.0033) +[2024-06-18 13:43:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2371764224. 
Throughput: 0: 42570.6. Samples: 2371897600. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) +[2024-06-18 13:43:36,994][12645] Avg episode reward: [(0, '0.688')] +[2024-06-18 13:43:37,530][12883] Updated weights for policy 0, policy_version 144763 (0.0036) +[2024-06-18 13:43:40,779][12883] Updated weights for policy 0, policy_version 144773 (0.0030) +[2024-06-18 13:43:41,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 2372009984. Throughput: 0: 42479.4. Samples: 2372147520. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) +[2024-06-18 13:43:41,995][12645] Avg episode reward: [(0, '0.623')] +[2024-06-18 13:43:45,226][12883] Updated weights for policy 0, policy_version 144783 (0.0040) +[2024-06-18 13:43:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2372206592. Throughput: 0: 42783.9. Samples: 2372285060. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) +[2024-06-18 13:43:46,994][12645] Avg episode reward: [(0, '0.623')] +[2024-06-18 13:43:48,175][12883] Updated weights for policy 0, policy_version 144793 (0.0036) +[2024-06-18 13:43:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42873.0, 300 sec: 42542.8). Total num frames: 2372419584. Throughput: 0: 42820.9. Samples: 2372541200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 13:43:51,994][12645] Avg episode reward: [(0, '0.199')] +[2024-06-18 13:43:52,755][12883] Updated weights for policy 0, policy_version 144803 (0.0036) +[2024-06-18 13:43:55,600][12883] Updated weights for policy 0, policy_version 144813 (0.0029) +[2024-06-18 13:43:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2372648960. Throughput: 0: 42928.5. Samples: 2372798700. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 13:43:56,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 13:44:00,357][12883] Updated weights for policy 0, policy_version 144823 (0.0032) +[2024-06-18 13:44:01,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 2372861952. Throughput: 0: 43093.7. Samples: 2372933260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 13:44:01,997][12645] Avg episode reward: [(0, '0.247')] +[2024-06-18 13:44:03,311][12883] Updated weights for policy 0, policy_version 144833 (0.0033) +[2024-06-18 13:44:06,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2373074944. Throughput: 0: 43007.0. Samples: 2373190940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 13:44:06,994][12645] Avg episode reward: [(0, '0.331')] +[2024-06-18 13:44:07,852][12883] Updated weights for policy 0, policy_version 144843 (0.0037) +[2024-06-18 13:44:10,974][12883] Updated weights for policy 0, policy_version 144853 (0.0054) +[2024-06-18 13:44:11,994][12645] Fps is (10 sec: 44246.2, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2373304320. Throughput: 0: 43016.2. Samples: 2373444880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 13:44:12,000][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 13:44:15,396][12883] Updated weights for policy 0, policy_version 144863 (0.0041) +[2024-06-18 13:44:16,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42599.8, 300 sec: 42598.4). Total num frames: 2373500928. Throughput: 0: 42873.5. Samples: 2373575140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 13:44:16,995][12645] Avg episode reward: [(0, '0.627')] +[2024-06-18 13:44:18,613][12883] Updated weights for policy 0, policy_version 144873 (0.0040) +[2024-06-18 13:44:21,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2373697536. Throughput: 0: 43005.2. 
Samples: 2373832840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 13:44:21,994][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 13:44:23,366][12883] Updated weights for policy 0, policy_version 144883 (0.0029) +[2024-06-18 13:44:26,321][12883] Updated weights for policy 0, policy_version 144893 (0.0029) +[2024-06-18 13:44:26,994][12645] Fps is (10 sec: 45876.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2373959680. Throughput: 0: 42918.4. Samples: 2374078840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 13:44:26,994][12645] Avg episode reward: [(0, '0.594')] +[2024-06-18 13:44:30,995][12883] Updated weights for policy 0, policy_version 144903 (0.0033) +[2024-06-18 13:44:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 2374123520. Throughput: 0: 42869.8. Samples: 2374214200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 13:44:31,994][12645] Avg episode reward: [(0, '0.594')] +[2024-06-18 13:44:33,943][12883] Updated weights for policy 0, policy_version 144913 (0.0036) +[2024-06-18 13:44:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2374352896. Throughput: 0: 42796.9. Samples: 2374467060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 13:44:36,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 13:44:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000144919_2374352896.pth... +[2024-06-18 13:44:37,070][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000144296_2364145664.pth +[2024-06-18 13:44:38,743][12883] Updated weights for policy 0, policy_version 144923 (0.0033) +[2024-06-18 13:44:41,801][12883] Updated weights for policy 0, policy_version 144933 (0.0028) +[2024-06-18 13:44:41,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2374582272. Throughput: 0: 42759.0. Samples: 2374722860. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 13:44:41,994][12645] Avg episode reward: [(0, '0.621')] +[2024-06-18 13:44:46,159][12883] Updated weights for policy 0, policy_version 144943 (0.0026) +[2024-06-18 13:44:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2374778880. Throughput: 0: 42710.2. Samples: 2374855120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 13:44:46,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 13:44:49,565][12883] Updated weights for policy 0, policy_version 144953 (0.0029) +[2024-06-18 13:44:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2374991872. Throughput: 0: 42596.9. Samples: 2375107800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) +[2024-06-18 13:44:51,995][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 13:44:54,065][12883] Updated weights for policy 0, policy_version 144963 (0.0023) +[2024-06-18 13:44:55,917][12862] Signal inference workers to stop experience collection... (34700 times) +[2024-06-18 13:44:55,917][12862] Signal inference workers to resume experience collection... (34700 times) +[2024-06-18 13:44:55,949][12883] InferenceWorker_p0-w0: stopping experience collection (34700 times) +[2024-06-18 13:44:55,949][12883] InferenceWorker_p0-w0: resuming experience collection (34700 times) +[2024-06-18 13:44:56,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 2375221248. Throughput: 0: 42629.2. Samples: 2375363280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 13:44:56,996][12645] Avg episode reward: [(0, '0.435')] +[2024-06-18 13:44:57,122][12883] Updated weights for policy 0, policy_version 144973 (0.0030) +[2024-06-18 13:45:01,775][12883] Updated weights for policy 0, policy_version 144983 (0.0036) +[2024-06-18 13:45:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 2375401472. 
Throughput: 0: 42663.2. Samples: 2375494980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 13:45:01,994][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 13:45:04,685][12883] Updated weights for policy 0, policy_version 144993 (0.0039) +[2024-06-18 13:45:06,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2375630848. Throughput: 0: 42556.4. Samples: 2375747880. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 13:45:06,994][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 13:45:09,543][12883] Updated weights for policy 0, policy_version 145003 (0.0045) +[2024-06-18 13:45:11,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 2375860224. Throughput: 0: 42750.7. Samples: 2376002620. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 13:45:11,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 13:45:12,521][12883] Updated weights for policy 0, policy_version 145013 (0.0044) +[2024-06-18 13:45:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2376040448. Throughput: 0: 42726.6. Samples: 2376136900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 13:45:16,994][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 13:45:17,121][12883] Updated weights for policy 0, policy_version 145023 (0.0038) +[2024-06-18 13:45:20,136][12883] Updated weights for policy 0, policy_version 145033 (0.0031) +[2024-06-18 13:45:21,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2376269824. Throughput: 0: 42591.9. Samples: 2376383700. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 13:45:21,994][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 13:45:24,663][12883] Updated weights for policy 0, policy_version 145043 (0.0034) +[2024-06-18 13:45:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42598.4). 
Total num frames: 2376499200. Throughput: 0: 42877.7. Samples: 2376652360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 13:45:26,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 13:45:27,695][12883] Updated weights for policy 0, policy_version 145053 (0.0031) +[2024-06-18 13:45:31,994][12645] Fps is (10 sec: 42599.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2376695808. Throughput: 0: 42794.4. Samples: 2376780860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 13:45:31,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 13:45:32,119][12883] Updated weights for policy 0, policy_version 145063 (0.0042) +[2024-06-18 13:45:35,334][12883] Updated weights for policy 0, policy_version 145073 (0.0029) +[2024-06-18 13:45:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 2376925184. Throughput: 0: 42713.3. Samples: 2377029900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 13:45:36,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 13:45:39,570][12883] Updated weights for policy 0, policy_version 145083 (0.0030) +[2024-06-18 13:45:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2377138176. Throughput: 0: 43045.8. Samples: 2377300240. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 13:45:41,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 13:45:42,760][12883] Updated weights for policy 0, policy_version 145093 (0.0038) +[2024-06-18 13:45:46,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2377351168. Throughput: 0: 42868.6. Samples: 2377424060. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 13:45:46,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 13:45:47,104][12883] Updated weights for policy 0, policy_version 145103 (0.0035) +[2024-06-18 13:45:50,703][12883] Updated weights for policy 0, policy_version 145113 (0.0040) +[2024-06-18 13:45:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2377580544. Throughput: 0: 42972.9. Samples: 2377681660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) +[2024-06-18 13:45:51,994][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 13:45:54,918][12883] Updated weights for policy 0, policy_version 145123 (0.0031) +[2024-06-18 13:45:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 2377777152. Throughput: 0: 43187.6. Samples: 2377946060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 13:45:56,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 13:45:58,261][12883] Updated weights for policy 0, policy_version 145133 (0.0047) +[2024-06-18 13:46:00,579][12862] Signal inference workers to stop experience collection... (34750 times) +[2024-06-18 13:46:00,580][12862] Signal inference workers to resume experience collection... (34750 times) +[2024-06-18 13:46:00,591][12883] InferenceWorker_p0-w0: stopping experience collection (34750 times) +[2024-06-18 13:46:00,621][12883] InferenceWorker_p0-w0: resuming experience collection (34750 times) +[2024-06-18 13:46:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2377990144. Throughput: 0: 42945.2. Samples: 2378069440. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 13:46:01,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 13:46:02,522][12883] Updated weights for policy 0, policy_version 145143 (0.0023) +[2024-06-18 13:46:05,891][12883] Updated weights for policy 0, policy_version 145153 (0.0045) +[2024-06-18 13:46:06,994][12645] Fps is (10 sec: 45874.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2378235904. Throughput: 0: 43276.0. Samples: 2378331120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 13:46:06,994][12645] Avg episode reward: [(0, '0.520')] +[2024-06-18 13:46:10,017][12883] Updated weights for policy 0, policy_version 145163 (0.0021) +[2024-06-18 13:46:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2378416128. Throughput: 0: 42932.9. Samples: 2378584340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 13:46:11,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 13:46:13,694][12883] Updated weights for policy 0, policy_version 145173 (0.0038) +[2024-06-18 13:46:16,994][12645] Fps is (10 sec: 39322.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2378629120. Throughput: 0: 42758.2. Samples: 2378704980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 13:46:16,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 13:46:17,489][12883] Updated weights for policy 0, policy_version 145183 (0.0029) +[2024-06-18 13:46:21,157][12883] Updated weights for policy 0, policy_version 145193 (0.0030) +[2024-06-18 13:46:21,994][12645] Fps is (10 sec: 45875.7, 60 sec: 43417.8, 300 sec: 42709.5). Total num frames: 2378874880. Throughput: 0: 43179.8. Samples: 2378972980. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 13:46:21,994][12645] Avg episode reward: [(0, '0.283')] +[2024-06-18 13:46:25,254][12883] Updated weights for policy 0, policy_version 145203 (0.0031) +[2024-06-18 13:46:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2379055104. Throughput: 0: 42925.8. Samples: 2379231900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 13:46:26,994][12645] Avg episode reward: [(0, '0.580')] +[2024-06-18 13:46:28,759][12883] Updated weights for policy 0, policy_version 145213 (0.0042) +[2024-06-18 13:46:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2379284480. Throughput: 0: 42864.9. Samples: 2379352980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 13:46:31,994][12645] Avg episode reward: [(0, '0.802')] +[2024-06-18 13:46:32,848][12883] Updated weights for policy 0, policy_version 145223 (0.0039) +[2024-06-18 13:46:36,277][12883] Updated weights for policy 0, policy_version 145233 (0.0035) +[2024-06-18 13:46:36,994][12645] Fps is (10 sec: 47513.5, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 2379530240. Throughput: 0: 43064.1. Samples: 2379619540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 13:46:36,994][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 13:46:37,106][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000145236_2379546624.pth... +[2024-06-18 13:46:37,154][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000144605_2369208320.pth +[2024-06-18 13:46:41,050][12883] Updated weights for policy 0, policy_version 145243 (0.0036) +[2024-06-18 13:46:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 2379694080. Throughput: 0: 42811.5. Samples: 2379872580. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 13:46:41,994][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 13:46:43,988][12883] Updated weights for policy 0, policy_version 145253 (0.0037) +[2024-06-18 13:46:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2379923456. Throughput: 0: 42756.2. Samples: 2379993460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 13:46:46,994][12645] Avg episode reward: [(0, '0.085')] +[2024-06-18 13:46:48,605][12883] Updated weights for policy 0, policy_version 145263 (0.0032) +[2024-06-18 13:46:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2380136448. Throughput: 0: 42758.8. Samples: 2380255260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 13:46:51,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 13:46:52,090][12883] Updated weights for policy 0, policy_version 145273 (0.0037) +[2024-06-18 13:46:56,193][12883] Updated weights for policy 0, policy_version 145283 (0.0032) +[2024-06-18 13:46:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2380333056. Throughput: 0: 42888.9. Samples: 2380514340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) +[2024-06-18 13:46:56,994][12645] Avg episode reward: [(0, '0.533')] +[2024-06-18 13:46:59,722][12883] Updated weights for policy 0, policy_version 145293 (0.0035) +[2024-06-18 13:47:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2380562432. Throughput: 0: 42960.8. Samples: 2380638220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 13:47:01,994][12645] Avg episode reward: [(0, '0.508')] +[2024-06-18 13:47:04,157][12883] Updated weights for policy 0, policy_version 145303 (0.0045) +[2024-06-18 13:47:06,437][12862] Signal inference workers to stop experience collection... 
(34800 times) +[2024-06-18 13:47:06,444][12862] Signal inference workers to resume experience collection... (34800 times) +[2024-06-18 13:47:06,487][12883] InferenceWorker_p0-w0: stopping experience collection (34800 times) +[2024-06-18 13:47:06,487][12883] InferenceWorker_p0-w0: resuming experience collection (34800 times) +[2024-06-18 13:47:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2380775424. Throughput: 0: 42649.7. Samples: 2380892220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 13:47:06,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 13:47:07,396][12883] Updated weights for policy 0, policy_version 145313 (0.0043) +[2024-06-18 13:47:11,851][12883] Updated weights for policy 0, policy_version 145323 (0.0032) +[2024-06-18 13:47:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2380972032. Throughput: 0: 42557.2. Samples: 2381146980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 13:47:11,994][12645] Avg episode reward: [(0, '0.619')] +[2024-06-18 13:47:15,397][12883] Updated weights for policy 0, policy_version 145333 (0.0041) +[2024-06-18 13:47:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2381217792. Throughput: 0: 42606.2. Samples: 2381270260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 13:47:16,994][12645] Avg episode reward: [(0, '0.733')] +[2024-06-18 13:47:19,462][12883] Updated weights for policy 0, policy_version 145343 (0.0048) +[2024-06-18 13:47:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2381398016. Throughput: 0: 42401.3. Samples: 2381527600. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 13:47:21,994][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 13:47:23,055][12883] Updated weights for policy 0, policy_version 145353 (0.0028) +[2024-06-18 13:47:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2381611008. Throughput: 0: 42410.1. Samples: 2381781040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 13:47:26,994][12645] Avg episode reward: [(0, '0.514')] +[2024-06-18 13:47:27,174][12883] Updated weights for policy 0, policy_version 145363 (0.0044) +[2024-06-18 13:47:30,650][12883] Updated weights for policy 0, policy_version 145373 (0.0040) +[2024-06-18 13:47:31,994][12645] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2381873152. Throughput: 0: 42600.3. Samples: 2381910480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 13:47:31,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 13:47:34,889][12883] Updated weights for policy 0, policy_version 145383 (0.0041) +[2024-06-18 13:47:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 2382036992. Throughput: 0: 42415.6. Samples: 2382163960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 13:47:36,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 13:47:38,591][12883] Updated weights for policy 0, policy_version 145393 (0.0042) +[2024-06-18 13:47:41,994][12645] Fps is (10 sec: 36044.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2382233600. Throughput: 0: 42176.4. Samples: 2382412280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 13:47:41,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 13:47:42,837][12883] Updated weights for policy 0, policy_version 145403 (0.0035) +[2024-06-18 13:47:46,282][12883] Updated weights for policy 0, policy_version 145413 (0.0042) +[2024-06-18 13:47:46,998][12645] Fps is (10 sec: 44219.2, 60 sec: 42595.6, 300 sec: 42820.3). Total num frames: 2382479360. Throughput: 0: 42208.4. Samples: 2382537760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 13:47:46,998][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 13:47:50,771][12883] Updated weights for policy 0, policy_version 145423 (0.0037) +[2024-06-18 13:47:51,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42050.7, 300 sec: 42709.2). Total num frames: 2382659584. Throughput: 0: 42314.4. Samples: 2382796460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 13:47:51,996][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 13:47:54,047][12883] Updated weights for policy 0, policy_version 145433 (0.0028) +[2024-06-18 13:47:56,994][12645] Fps is (10 sec: 40975.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2382888960. Throughput: 0: 42246.5. Samples: 2383048080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) +[2024-06-18 13:47:56,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 13:47:58,406][12883] Updated weights for policy 0, policy_version 145443 (0.0037) +[2024-06-18 13:48:01,734][12883] Updated weights for policy 0, policy_version 145453 (0.0024) +[2024-06-18 13:48:01,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2383118336. Throughput: 0: 42399.5. Samples: 2383178240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:48:01,998][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 13:48:06,054][12883] Updated weights for policy 0, policy_version 145463 (0.0023) +[2024-06-18 13:48:06,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2383314944. Throughput: 0: 42472.0. Samples: 2383438840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:48:06,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 13:48:09,266][12883] Updated weights for policy 0, policy_version 145473 (0.0035) +[2024-06-18 13:48:11,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 2383511552. Throughput: 0: 42341.0. Samples: 2383686380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:48:11,994][12645] Avg episode reward: [(0, '0.516')] +[2024-06-18 13:48:13,671][12883] Updated weights for policy 0, policy_version 145483 (0.0034) +[2024-06-18 13:48:16,979][12883] Updated weights for policy 0, policy_version 145493 (0.0033) +[2024-06-18 13:48:16,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 2383757312. Throughput: 0: 42289.0. Samples: 2383813580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:48:16,997][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 13:48:21,379][12883] Updated weights for policy 0, policy_version 145503 (0.0037) +[2024-06-18 13:48:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2383953920. Throughput: 0: 42433.4. Samples: 2384073460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:48:21,994][12645] Avg episode reward: [(0, '0.200')] +[2024-06-18 13:48:24,529][12883] Updated weights for policy 0, policy_version 145513 (0.0039) +[2024-06-18 13:48:26,996][12645] Fps is (10 sec: 40960.2, 60 sec: 42596.9, 300 sec: 42709.1). Total num frames: 2384166912. Throughput: 0: 42426.8. 
Samples: 2384321580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:48:26,996][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 13:48:29,162][12883] Updated weights for policy 0, policy_version 145523 (0.0030) +[2024-06-18 13:48:30,668][12862] Signal inference workers to stop experience collection... (34850 times) +[2024-06-18 13:48:30,669][12862] Signal inference workers to resume experience collection... (34850 times) +[2024-06-18 13:48:30,689][12883] InferenceWorker_p0-w0: stopping experience collection (34850 times) +[2024-06-18 13:48:30,690][12883] InferenceWorker_p0-w0: resuming experience collection (34850 times) +[2024-06-18 13:48:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.3, 300 sec: 42765.0). Total num frames: 2384379904. Throughput: 0: 42501.9. Samples: 2384450180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:48:31,994][12645] Avg episode reward: [(0, '0.510')] +[2024-06-18 13:48:32,244][12883] Updated weights for policy 0, policy_version 145533 (0.0028) +[2024-06-18 13:48:36,782][12883] Updated weights for policy 0, policy_version 145543 (0.0033) +[2024-06-18 13:48:36,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2384592896. Throughput: 0: 42520.8. Samples: 2384709800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:48:36,994][12645] Avg episode reward: [(0, '0.503')] +[2024-06-18 13:48:37,104][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000145545_2384609280.pth... +[2024-06-18 13:48:37,167][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000144919_2374352896.pth +[2024-06-18 13:48:39,816][12883] Updated weights for policy 0, policy_version 145553 (0.0041) +[2024-06-18 13:48:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2384822272. Throughput: 0: 42465.4. Samples: 2384959020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:48:42,000][12645] Avg episode reward: [(0, '0.640')] +[2024-06-18 13:48:44,525][12883] Updated weights for policy 0, policy_version 145563 (0.0037) +[2024-06-18 13:48:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42328.1, 300 sec: 42709.5). Total num frames: 2385018880. Throughput: 0: 42620.4. Samples: 2385096160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:48:46,994][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 13:48:47,403][12883] Updated weights for policy 0, policy_version 145573 (0.0054) +[2024-06-18 13:48:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2385215488. Throughput: 0: 42420.5. Samples: 2385347760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:48:51,994][12645] Avg episode reward: [(0, '0.209')] +[2024-06-18 13:48:52,099][12883] Updated weights for policy 0, policy_version 145583 (0.0036) +[2024-06-18 13:48:55,296][12883] Updated weights for policy 0, policy_version 145593 (0.0045) +[2024-06-18 13:48:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2385461248. Throughput: 0: 42602.5. Samples: 2385603500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:48:56,994][12645] Avg episode reward: [(0, '0.107')] +[2024-06-18 13:48:59,727][12883] Updated weights for policy 0, policy_version 145603 (0.0040) +[2024-06-18 13:49:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2385657856. Throughput: 0: 42787.0. Samples: 2385738900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 13:49:01,998][12645] Avg episode reward: [(0, '0.202')] +[2024-06-18 13:49:03,020][12883] Updated weights for policy 0, policy_version 145613 (0.0034) +[2024-06-18 13:49:06,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42052.3, 300 sec: 42487.4). Total num frames: 2385838080. Throughput: 0: 42505.3. 
Samples: 2385986200. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) +[2024-06-18 13:49:06,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 13:49:07,532][12883] Updated weights for policy 0, policy_version 145623 (0.0024) +[2024-06-18 13:49:10,659][12883] Updated weights for policy 0, policy_version 145633 (0.0033) +[2024-06-18 13:49:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2386083840. Throughput: 0: 42740.7. Samples: 2386244820. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) +[2024-06-18 13:49:11,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 13:49:15,370][12883] Updated weights for policy 0, policy_version 145643 (0.0035) +[2024-06-18 13:49:16,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 2386296832. Throughput: 0: 42918.6. Samples: 2386381520. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) +[2024-06-18 13:49:16,994][12645] Avg episode reward: [(0, '0.350')] +[2024-06-18 13:49:18,198][12883] Updated weights for policy 0, policy_version 145653 (0.0032) +[2024-06-18 13:49:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2386493440. Throughput: 0: 42559.2. Samples: 2386624960. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) +[2024-06-18 13:49:21,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 13:49:23,103][12883] Updated weights for policy 0, policy_version 145663 (0.0030) +[2024-06-18 13:49:25,708][12883] Updated weights for policy 0, policy_version 145673 (0.0023) +[2024-06-18 13:49:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 2386739200. Throughput: 0: 42771.0. Samples: 2386883720. 
Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) +[2024-06-18 13:49:26,994][12645] Avg episode reward: [(0, '0.251')] +[2024-06-18 13:49:30,838][12883] Updated weights for policy 0, policy_version 145683 (0.0032) +[2024-06-18 13:49:31,065][12862] Signal inference workers to stop experience collection... (34900 times) +[2024-06-18 13:49:31,071][12862] Signal inference workers to resume experience collection... (34900 times) +[2024-06-18 13:49:31,096][12883] InferenceWorker_p0-w0: stopping experience collection (34900 times) +[2024-06-18 13:49:31,096][12883] InferenceWorker_p0-w0: resuming experience collection (34900 times) +[2024-06-18 13:49:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2386919424. Throughput: 0: 42670.7. Samples: 2387016340. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) +[2024-06-18 13:49:31,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 13:49:33,267][12883] Updated weights for policy 0, policy_version 145693 (0.0027) +[2024-06-18 13:49:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2387148800. Throughput: 0: 42708.0. Samples: 2387269620. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) +[2024-06-18 13:49:36,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 13:49:38,299][12883] Updated weights for policy 0, policy_version 145703 (0.0025) +[2024-06-18 13:49:41,211][12883] Updated weights for policy 0, policy_version 145713 (0.0032) +[2024-06-18 13:49:41,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2387394560. Throughput: 0: 42705.8. Samples: 2387525260. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) +[2024-06-18 13:49:41,994][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 13:49:45,858][12883] Updated weights for policy 0, policy_version 145723 (0.0034) +[2024-06-18 13:49:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2387574784. 
Throughput: 0: 42692.0. Samples: 2387660040. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) +[2024-06-18 13:49:46,994][12645] Avg episode reward: [(0, '0.763')] +[2024-06-18 13:49:48,699][12883] Updated weights for policy 0, policy_version 145733 (0.0042) +[2024-06-18 13:49:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 2387804160. Throughput: 0: 42851.1. Samples: 2387914500. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) +[2024-06-18 13:49:51,994][12645] Avg episode reward: [(0, '0.585')] +[2024-06-18 13:49:53,493][12883] Updated weights for policy 0, policy_version 145743 (0.0034) +[2024-06-18 13:49:56,397][12883] Updated weights for policy 0, policy_version 145753 (0.0028) +[2024-06-18 13:49:56,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2388033536. Throughput: 0: 42804.2. Samples: 2388171000. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) +[2024-06-18 13:49:56,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 13:50:01,174][12883] Updated weights for policy 0, policy_version 145763 (0.0028) +[2024-06-18 13:50:01,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 2388213760. Throughput: 0: 42644.2. Samples: 2388300600. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) +[2024-06-18 13:50:01,996][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 13:50:04,066][12883] Updated weights for policy 0, policy_version 145773 (0.0036) +[2024-06-18 13:50:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2388426752. Throughput: 0: 42767.9. Samples: 2388549520. 
Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) +[2024-06-18 13:50:06,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 13:50:08,691][12883] Updated weights for policy 0, policy_version 145783 (0.0036) +[2024-06-18 13:50:11,955][12883] Updated weights for policy 0, policy_version 145793 (0.0026) +[2024-06-18 13:50:11,994][12645] Fps is (10 sec: 45885.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2388672512. Throughput: 0: 42745.4. Samples: 2388807260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 13:50:11,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 13:50:16,243][12883] Updated weights for policy 0, policy_version 145803 (0.0031) +[2024-06-18 13:50:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2388852736. Throughput: 0: 42769.3. Samples: 2388940960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 13:50:16,994][12645] Avg episode reward: [(0, '0.687')] +[2024-06-18 13:50:19,856][12883] Updated weights for policy 0, policy_version 145813 (0.0034) +[2024-06-18 13:50:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2389098496. Throughput: 0: 42777.7. Samples: 2389194620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 13:50:21,994][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 13:50:23,883][12883] Updated weights for policy 0, policy_version 145823 (0.0033) +[2024-06-18 13:50:26,999][12645] Fps is (10 sec: 44211.7, 60 sec: 42594.5, 300 sec: 42708.6). Total num frames: 2389295104. Throughput: 0: 42808.4. Samples: 2389451880. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 13:50:27,000][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 13:50:27,504][12883] Updated weights for policy 0, policy_version 145833 (0.0040) +[2024-06-18 13:50:31,383][12883] Updated weights for policy 0, policy_version 145843 (0.0029) +[2024-06-18 13:50:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2389491712. Throughput: 0: 42701.7. Samples: 2389581620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 13:50:31,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 13:50:35,075][12883] Updated weights for policy 0, policy_version 145853 (0.0045) +[2024-06-18 13:50:36,994][12645] Fps is (10 sec: 45901.0, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2389753856. Throughput: 0: 42736.4. Samples: 2389837640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 13:50:36,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 13:50:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000145859_2389753856.pth... +[2024-06-18 13:50:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000145236_2379546624.pth +[2024-06-18 13:50:39,365][12883] Updated weights for policy 0, policy_version 145863 (0.0042) +[2024-06-18 13:50:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2389950464. Throughput: 0: 42887.4. Samples: 2390100940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 13:50:41,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 13:50:42,687][12883] Updated weights for policy 0, policy_version 145873 (0.0035) +[2024-06-18 13:50:46,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2390130688. Throughput: 0: 42738.7. Samples: 2390223740. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 13:50:46,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 13:50:47,025][12883] Updated weights for policy 0, policy_version 145883 (0.0047) +[2024-06-18 13:50:50,383][12883] Updated weights for policy 0, policy_version 145893 (0.0036) +[2024-06-18 13:50:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2390376448. Throughput: 0: 42985.2. Samples: 2390483860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 13:50:51,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 13:50:54,925][12883] Updated weights for policy 0, policy_version 145903 (0.0033) +[2024-06-18 13:50:56,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2390589440. Throughput: 0: 42937.3. Samples: 2390739440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 13:50:56,994][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 13:50:58,053][12883] Updated weights for policy 0, policy_version 145913 (0.0041) +[2024-06-18 13:50:58,484][12862] Signal inference workers to stop experience collection... (34950 times) +[2024-06-18 13:50:58,488][12862] Signal inference workers to resume experience collection... (34950 times) +[2024-06-18 13:50:58,505][12883] InferenceWorker_p0-w0: stopping experience collection (34950 times) +[2024-06-18 13:50:58,505][12883] InferenceWorker_p0-w0: resuming experience collection (34950 times) +[2024-06-18 13:51:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42873.0, 300 sec: 42542.9). Total num frames: 2390786048. Throughput: 0: 42766.1. Samples: 2390865440. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 13:51:01,994][12645] Avg episode reward: [(0, '0.709')] +[2024-06-18 13:51:02,481][12883] Updated weights for policy 0, policy_version 145923 (0.0032) +[2024-06-18 13:51:05,693][12883] Updated weights for policy 0, policy_version 145933 (0.0043) +[2024-06-18 13:51:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2391015424. Throughput: 0: 42817.8. Samples: 2391121420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 13:51:06,994][12645] Avg episode reward: [(0, '0.256')] +[2024-06-18 13:51:10,212][12883] Updated weights for policy 0, policy_version 145943 (0.0035) +[2024-06-18 13:51:11,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2391228416. Throughput: 0: 42854.8. Samples: 2391380100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:51:11,994][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 13:51:13,356][12883] Updated weights for policy 0, policy_version 145953 (0.0030) +[2024-06-18 13:51:16,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2391408640. Throughput: 0: 42745.3. Samples: 2391505160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:51:16,994][12645] Avg episode reward: [(0, '0.822')] +[2024-06-18 13:51:17,719][12883] Updated weights for policy 0, policy_version 145963 (0.0030) +[2024-06-18 13:51:20,954][12883] Updated weights for policy 0, policy_version 145973 (0.0046) +[2024-06-18 13:51:21,995][12645] Fps is (10 sec: 42591.3, 60 sec: 42597.3, 300 sec: 42709.2). Total num frames: 2391654400. Throughput: 0: 42868.7. Samples: 2391766800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:51:21,996][12645] Avg episode reward: [(0, '0.822')] +[2024-06-18 13:51:25,243][12883] Updated weights for policy 0, policy_version 145983 (0.0027) +[2024-06-18 13:51:26,994][12645] Fps is (10 sec: 47514.2, 60 sec: 43148.6, 300 sec: 42709.5). Total num frames: 2391883776. Throughput: 0: 42864.5. Samples: 2392029840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:51:26,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 13:51:28,603][12883] Updated weights for policy 0, policy_version 145993 (0.0022) +[2024-06-18 13:51:31,994][12645] Fps is (10 sec: 40966.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2392064000. Throughput: 0: 43023.0. Samples: 2392159780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:51:31,994][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 13:51:32,733][12883] Updated weights for policy 0, policy_version 146003 (0.0031) +[2024-06-18 13:51:36,349][12883] Updated weights for policy 0, policy_version 146013 (0.0030) +[2024-06-18 13:51:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2392293376. Throughput: 0: 42860.6. Samples: 2392412580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:51:36,994][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 13:51:40,227][12883] Updated weights for policy 0, policy_version 146023 (0.0027) +[2024-06-18 13:51:41,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2392522752. Throughput: 0: 43001.8. Samples: 2392674520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:51:41,994][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 13:51:43,937][12883] Updated weights for policy 0, policy_version 146033 (0.0029) +[2024-06-18 13:51:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2392719360. Throughput: 0: 43048.9. Samples: 2392802640. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:51:46,994][12645] Avg episode reward: [(0, '0.647')] +[2024-06-18 13:51:47,947][12883] Updated weights for policy 0, policy_version 146043 (0.0037) +[2024-06-18 13:51:51,757][12883] Updated weights for policy 0, policy_version 146053 (0.0032) +[2024-06-18 13:51:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2392932352. Throughput: 0: 42921.6. Samples: 2393052900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:51:51,995][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 13:51:55,440][12883] Updated weights for policy 0, policy_version 146063 (0.0037) +[2024-06-18 13:51:56,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2393178112. Throughput: 0: 42883.0. Samples: 2393309840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:51:56,994][12645] Avg episode reward: [(0, '0.480')] +[2024-06-18 13:51:59,395][12883] Updated weights for policy 0, policy_version 146073 (0.0026) +[2024-06-18 13:52:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2393358336. Throughput: 0: 43083.1. Samples: 2393443900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:52:01,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 13:52:03,407][12883] Updated weights for policy 0, policy_version 146083 (0.0035) +[2024-06-18 13:52:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2393571328. Throughput: 0: 42756.6. Samples: 2393690780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:52:06,994][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 13:52:07,350][12883] Updated weights for policy 0, policy_version 146093 (0.0029) +[2024-06-18 13:52:11,221][12883] Updated weights for policy 0, policy_version 146103 (0.0030) +[2024-06-18 13:52:11,994][12645] Fps is (10 sec: 45876.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2393817088. Throughput: 0: 42618.7. Samples: 2393947680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) +[2024-06-18 13:52:11,994][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 13:52:14,994][12883] Updated weights for policy 0, policy_version 146113 (0.0032) +[2024-06-18 13:52:16,609][12862] Signal inference workers to stop experience collection... (35000 times) +[2024-06-18 13:52:16,609][12862] Signal inference workers to resume experience collection... (35000 times) +[2024-06-18 13:52:16,632][12883] InferenceWorker_p0-w0: stopping experience collection (35000 times) +[2024-06-18 13:52:16,632][12883] InferenceWorker_p0-w0: resuming experience collection (35000 times) +[2024-06-18 13:52:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 2393997312. Throughput: 0: 42766.7. Samples: 2394084280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:52:16,994][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 13:52:18,692][12883] Updated weights for policy 0, policy_version 146123 (0.0040) +[2024-06-18 13:52:21,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42599.5, 300 sec: 42709.5). Total num frames: 2394210304. Throughput: 0: 42593.3. Samples: 2394329280. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:52:21,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 13:52:22,670][12883] Updated weights for policy 0, policy_version 146133 (0.0026) +[2024-06-18 13:52:26,199][12883] Updated weights for policy 0, policy_version 146143 (0.0048) +[2024-06-18 13:52:26,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2394456064. Throughput: 0: 42543.6. Samples: 2394588980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:52:26,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 13:52:30,239][12883] Updated weights for policy 0, policy_version 146153 (0.0039) +[2024-06-18 13:52:31,994][12645] Fps is (10 sec: 42597.1, 60 sec: 42871.2, 300 sec: 42709.4). Total num frames: 2394636288. Throughput: 0: 42693.1. Samples: 2394723840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:52:31,995][12645] Avg episode reward: [(0, '0.599')] +[2024-06-18 13:52:33,780][12883] Updated weights for policy 0, policy_version 146163 (0.0028) +[2024-06-18 13:52:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2394865664. Throughput: 0: 42714.3. Samples: 2394975040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:52:36,994][12645] Avg episode reward: [(0, '0.611')] +[2024-06-18 13:52:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000146171_2394865664.pth... +[2024-06-18 13:52:37,070][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000145545_2384609280.pth +[2024-06-18 13:52:37,775][12883] Updated weights for policy 0, policy_version 146173 (0.0032) +[2024-06-18 13:52:41,228][12883] Updated weights for policy 0, policy_version 146183 (0.0024) +[2024-06-18 13:52:41,994][12645] Fps is (10 sec: 45876.5, 60 sec: 42871.4, 300 sec: 42765.6). Total num frames: 2395095040. Throughput: 0: 42760.0. Samples: 2395234040. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:52:41,994][12645] Avg episode reward: [(0, '0.626')] +[2024-06-18 13:52:45,577][12883] Updated weights for policy 0, policy_version 146193 (0.0035) +[2024-06-18 13:52:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 2395275264. Throughput: 0: 42579.7. Samples: 2395359980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:52:46,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 13:52:49,576][12883] Updated weights for policy 0, policy_version 146203 (0.0042) +[2024-06-18 13:52:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2395521024. Throughput: 0: 42556.9. Samples: 2395605840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:52:51,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 13:52:53,252][12883] Updated weights for policy 0, policy_version 146213 (0.0034) +[2024-06-18 13:52:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2395701248. Throughput: 0: 42725.7. Samples: 2395870340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:52:56,994][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 13:52:57,352][12883] Updated weights for policy 0, policy_version 146223 (0.0042) +[2024-06-18 13:53:00,829][12883] Updated weights for policy 0, policy_version 146233 (0.0028) +[2024-06-18 13:53:01,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2395897856. Throughput: 0: 42350.1. Samples: 2395990040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:53:01,994][12645] Avg episode reward: [(0, '0.609')] +[2024-06-18 13:53:04,963][12883] Updated weights for policy 0, policy_version 146243 (0.0036) +[2024-06-18 13:53:06,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2396160000. Throughput: 0: 42656.4. 
Samples: 2396248820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:53:07,003][12645] Avg episode reward: [(0, '0.754')] +[2024-06-18 13:53:08,421][12883] Updated weights for policy 0, policy_version 146253 (0.0041) +[2024-06-18 13:53:11,996][12645] Fps is (10 sec: 44227.3, 60 sec: 42050.7, 300 sec: 42653.9). Total num frames: 2396340224. Throughput: 0: 42827.6. Samples: 2396516320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:53:11,997][12645] Avg episode reward: [(0, '0.661')] +[2024-06-18 13:53:12,621][12883] Updated weights for policy 0, policy_version 146263 (0.0031) +[2024-06-18 13:53:15,997][12883] Updated weights for policy 0, policy_version 146273 (0.0031) +[2024-06-18 13:53:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 2396553216. Throughput: 0: 42485.1. Samples: 2396635660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 13:53:16,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 13:53:20,227][12883] Updated weights for policy 0, policy_version 146283 (0.0030) +[2024-06-18 13:53:21,994][12645] Fps is (10 sec: 45885.2, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 2396798976. Throughput: 0: 42616.0. Samples: 2396892760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 13:53:21,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 13:53:23,467][12883] Updated weights for policy 0, policy_version 146293 (0.0036) +[2024-06-18 13:53:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 2396962816. Throughput: 0: 42750.6. Samples: 2397157820. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 13:53:26,994][12645] Avg episode reward: [(0, '0.628')] +[2024-06-18 13:53:27,855][12883] Updated weights for policy 0, policy_version 146303 (0.0032) +[2024-06-18 13:53:31,168][12883] Updated weights for policy 0, policy_version 146313 (0.0032) +[2024-06-18 13:53:31,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.7, 300 sec: 42709.5). Total num frames: 2397192192. Throughput: 0: 42397.4. Samples: 2397267860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 13:53:31,994][12645] Avg episode reward: [(0, '0.718')] +[2024-06-18 13:53:35,228][12862] Signal inference workers to stop experience collection... (35050 times) +[2024-06-18 13:53:35,264][12883] InferenceWorker_p0-w0: stopping experience collection (35050 times) +[2024-06-18 13:53:35,282][12862] Signal inference workers to resume experience collection... (35050 times) +[2024-06-18 13:53:35,283][12883] InferenceWorker_p0-w0: resuming experience collection (35050 times) +[2024-06-18 13:53:35,427][12883] Updated weights for policy 0, policy_version 146323 (0.0047) +[2024-06-18 13:53:36,996][12645] Fps is (10 sec: 47504.8, 60 sec: 42870.1, 300 sec: 42764.7). Total num frames: 2397437952. Throughput: 0: 42809.8. Samples: 2397532360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 13:53:36,996][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 13:53:38,671][12883] Updated weights for policy 0, policy_version 146333 (0.0029) +[2024-06-18 13:53:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 2397601792. Throughput: 0: 42713.0. Samples: 2397792420. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 13:53:41,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 13:53:43,445][12883] Updated weights for policy 0, policy_version 146343 (0.0038) +[2024-06-18 13:53:46,587][12883] Updated weights for policy 0, policy_version 146353 (0.0028) +[2024-06-18 13:53:46,994][12645] Fps is (10 sec: 40968.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2397847552. Throughput: 0: 42724.1. Samples: 2397912620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 13:53:46,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 13:53:50,879][12883] Updated weights for policy 0, policy_version 146363 (0.0037) +[2024-06-18 13:53:51,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2398076928. Throughput: 0: 42887.6. Samples: 2398178760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 13:53:51,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 13:53:54,159][12883] Updated weights for policy 0, policy_version 146373 (0.0038) +[2024-06-18 13:53:56,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2398240768. Throughput: 0: 42623.0. Samples: 2398434260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 13:53:56,994][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 13:53:58,425][12883] Updated weights for policy 0, policy_version 146383 (0.0026) +[2024-06-18 13:54:01,663][12883] Updated weights for policy 0, policy_version 146393 (0.0029) +[2024-06-18 13:54:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 2398502912. Throughput: 0: 42679.2. Samples: 2398556220. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 13:54:01,994][12645] Avg episode reward: [(0, '0.537')] +[2024-06-18 13:54:05,983][12883] Updated weights for policy 0, policy_version 146403 (0.0037) +[2024-06-18 13:54:06,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2398715904. Throughput: 0: 42806.7. Samples: 2398819060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 13:54:06,994][12645] Avg episode reward: [(0, '0.669')] +[2024-06-18 13:54:09,306][12883] Updated weights for policy 0, policy_version 146413 (0.0028) +[2024-06-18 13:54:11,996][12645] Fps is (10 sec: 37674.8, 60 sec: 42325.3, 300 sec: 42653.6). Total num frames: 2398879744. Throughput: 0: 42720.2. Samples: 2399080320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 13:54:11,996][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 13:54:13,678][12883] Updated weights for policy 0, policy_version 146423 (0.0035) +[2024-06-18 13:54:16,969][12883] Updated weights for policy 0, policy_version 146433 (0.0038) +[2024-06-18 13:54:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 2399158272. Throughput: 0: 42958.5. Samples: 2399201000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) +[2024-06-18 13:54:16,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 13:54:21,284][12883] Updated weights for policy 0, policy_version 146443 (0.0046) +[2024-06-18 13:54:21,993][12645] Fps is (10 sec: 47525.0, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 2399354880. Throughput: 0: 42854.5. Samples: 2399460720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) +[2024-06-18 13:54:21,994][12645] Avg episode reward: [(0, '0.692')] +[2024-06-18 13:54:24,867][12883] Updated weights for policy 0, policy_version 146453 (0.0028) +[2024-06-18 13:54:26,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2399535104. Throughput: 0: 42748.3. 
Samples: 2399716100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) +[2024-06-18 13:54:26,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 13:54:28,888][12883] Updated weights for policy 0, policy_version 146463 (0.0039) +[2024-06-18 13:54:31,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2399764480. Throughput: 0: 42762.2. Samples: 2399836920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) +[2024-06-18 13:54:31,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 13:54:32,431][12883] Updated weights for policy 0, policy_version 146473 (0.0028) +[2024-06-18 13:54:36,417][12883] Updated weights for policy 0, policy_version 146483 (0.0034) +[2024-06-18 13:54:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42599.7, 300 sec: 42709.5). Total num frames: 2399993856. Throughput: 0: 42744.8. Samples: 2400102280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) +[2024-06-18 13:54:36,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 13:54:37,001][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000146485_2400010240.pth... +[2024-06-18 13:54:37,055][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000145859_2389753856.pth +[2024-06-18 13:54:40,375][12883] Updated weights for policy 0, policy_version 146493 (0.0026) +[2024-06-18 13:54:42,000][12645] Fps is (10 sec: 42571.6, 60 sec: 43140.0, 300 sec: 42764.1). Total num frames: 2400190464. Throughput: 0: 42756.3. Samples: 2400358560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) +[2024-06-18 13:54:42,001][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 13:54:44,150][12883] Updated weights for policy 0, policy_version 146503 (0.0027) +[2024-06-18 13:54:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2400419840. Throughput: 0: 42825.8. Samples: 2400483380. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) +[2024-06-18 13:54:46,994][12645] Avg episode reward: [(0, '0.255')] +[2024-06-18 13:54:48,019][12883] Updated weights for policy 0, policy_version 146513 (0.0032) +[2024-06-18 13:54:51,901][12883] Updated weights for policy 0, policy_version 146523 (0.0031) +[2024-06-18 13:54:51,994][12645] Fps is (10 sec: 44264.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2400632832. Throughput: 0: 42829.4. Samples: 2400746380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) +[2024-06-18 13:54:51,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 13:54:55,754][12883] Updated weights for policy 0, policy_version 146533 (0.0040) +[2024-06-18 13:54:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 2400829440. Throughput: 0: 42620.8. Samples: 2400998160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) +[2024-06-18 13:54:56,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 13:54:59,350][12862] Signal inference workers to stop experience collection... (35100 times) +[2024-06-18 13:54:59,378][12883] InferenceWorker_p0-w0: stopping experience collection (35100 times) +[2024-06-18 13:54:59,416][12862] Signal inference workers to resume experience collection... (35100 times) +[2024-06-18 13:54:59,419][12883] InferenceWorker_p0-w0: resuming experience collection (35100 times) +[2024-06-18 13:54:59,563][12883] Updated weights for policy 0, policy_version 146543 (0.0034) +[2024-06-18 13:55:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2401058816. Throughput: 0: 42742.2. Samples: 2401124400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) +[2024-06-18 13:55:01,994][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 13:55:03,367][12883] Updated weights for policy 0, policy_version 146553 (0.0030) +[2024-06-18 13:55:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2401271808. 
Throughput: 0: 42750.9. Samples: 2401384520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) +[2024-06-18 13:55:06,994][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 13:55:07,116][12883] Updated weights for policy 0, policy_version 146563 (0.0032) +[2024-06-18 13:55:11,137][12883] Updated weights for policy 0, policy_version 146573 (0.0041) +[2024-06-18 13:55:12,000][12645] Fps is (10 sec: 42571.9, 60 sec: 43414.6, 300 sec: 42819.6). Total num frames: 2401484800. Throughput: 0: 42571.0. Samples: 2401632060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) +[2024-06-18 13:55:12,001][12645] Avg episode reward: [(0, '0.605')] +[2024-06-18 13:55:15,034][12883] Updated weights for policy 0, policy_version 146583 (0.0027) +[2024-06-18 13:55:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2401697792. Throughput: 0: 42806.1. Samples: 2401763200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) +[2024-06-18 13:55:16,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 13:55:18,943][12883] Updated weights for policy 0, policy_version 146593 (0.0029) +[2024-06-18 13:55:21,994][12645] Fps is (10 sec: 42625.6, 60 sec: 42598.3, 300 sec: 42765.8). Total num frames: 2401910784. Throughput: 0: 42658.4. Samples: 2402021900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) +[2024-06-18 13:55:21,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 13:55:22,812][12883] Updated weights for policy 0, policy_version 146603 (0.0032) +[2024-06-18 13:55:26,652][12883] Updated weights for policy 0, policy_version 146613 (0.0031) +[2024-06-18 13:55:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2402107392. Throughput: 0: 42545.0. Samples: 2402272820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 13:55:26,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 13:55:30,348][12883] Updated weights for policy 0, policy_version 146623 (0.0046) +[2024-06-18 13:55:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2402320384. Throughput: 0: 42534.2. Samples: 2402397420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 13:55:31,994][12645] Avg episode reward: [(0, '0.580')] +[2024-06-18 13:55:34,654][12883] Updated weights for policy 0, policy_version 146633 (0.0035) +[2024-06-18 13:55:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2402549760. Throughput: 0: 42439.0. Samples: 2402656140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 13:55:36,994][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 13:55:38,239][12883] Updated weights for policy 0, policy_version 146643 (0.0027) +[2024-06-18 13:55:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42329.7, 300 sec: 42709.4). Total num frames: 2402729984. Throughput: 0: 42567.9. Samples: 2402913720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 13:55:41,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 13:55:42,315][12883] Updated weights for policy 0, policy_version 146653 (0.0032) +[2024-06-18 13:55:45,689][12883] Updated weights for policy 0, policy_version 146663 (0.0033) +[2024-06-18 13:55:46,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 2402975744. Throughput: 0: 42531.3. Samples: 2403038400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 13:55:46,996][12645] Avg episode reward: [(0, '0.401')] +[2024-06-18 13:55:49,881][12883] Updated weights for policy 0, policy_version 146673 (0.0032) +[2024-06-18 13:55:51,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2403172352. Throughput: 0: 42580.0. 
Samples: 2403300620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 13:55:51,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 13:55:53,393][12883] Updated weights for policy 0, policy_version 146683 (0.0023) +[2024-06-18 13:55:56,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2403385344. Throughput: 0: 42674.8. Samples: 2403552160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 13:55:56,994][12645] Avg episode reward: [(0, '0.297')] +[2024-06-18 13:55:57,358][12883] Updated weights for policy 0, policy_version 146693 (0.0030) +[2024-06-18 13:56:00,808][12883] Updated weights for policy 0, policy_version 146703 (0.0035) +[2024-06-18 13:56:01,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2403631104. Throughput: 0: 42681.4. Samples: 2403683860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 13:56:01,997][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 13:56:05,552][12883] Updated weights for policy 0, policy_version 146713 (0.0028) +[2024-06-18 13:56:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2403827712. Throughput: 0: 42798.1. Samples: 2403947820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 13:56:06,994][12645] Avg episode reward: [(0, '0.623')] +[2024-06-18 13:56:08,646][12883] Updated weights for policy 0, policy_version 146723 (0.0028) +[2024-06-18 13:56:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42602.9, 300 sec: 42820.6). Total num frames: 2404040704. Throughput: 0: 42806.7. Samples: 2404199120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 13:56:11,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 13:56:12,991][12883] Updated weights for policy 0, policy_version 146733 (0.0046) +[2024-06-18 13:56:16,365][12883] Updated weights for policy 0, policy_version 146743 (0.0028) +[2024-06-18 13:56:16,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42765.3). Total num frames: 2404270080. Throughput: 0: 42962.7. Samples: 2404330740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 13:56:16,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 13:56:20,392][12883] Updated weights for policy 0, policy_version 146753 (0.0047) +[2024-06-18 13:56:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2404450304. Throughput: 0: 42965.0. Samples: 2404589560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 13:56:21,994][12645] Avg episode reward: [(0, '0.359')] +[2024-06-18 13:56:24,008][12883] Updated weights for policy 0, policy_version 146763 (0.0036) +[2024-06-18 13:56:26,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2404679680. Throughput: 0: 42845.3. Samples: 2404841760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 13:56:26,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 13:56:27,770][12883] Updated weights for policy 0, policy_version 146773 (0.0035) +[2024-06-18 13:56:30,900][12862] Signal inference workers to stop experience collection... (35150 times) +[2024-06-18 13:56:30,948][12883] InferenceWorker_p0-w0: stopping experience collection (35150 times) +[2024-06-18 13:56:30,957][12862] Signal inference workers to resume experience collection... 
(35150 times) +[2024-06-18 13:56:30,965][12883] InferenceWorker_p0-w0: resuming experience collection (35150 times) +[2024-06-18 13:56:31,578][12883] Updated weights for policy 0, policy_version 146783 (0.0028) +[2024-06-18 13:56:31,994][12645] Fps is (10 sec: 47513.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 2404925440. Throughput: 0: 42966.5. Samples: 2404971800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 13:56:31,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 13:56:35,371][12883] Updated weights for policy 0, policy_version 146793 (0.0043) +[2024-06-18 13:56:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2405105664. Throughput: 0: 42759.1. Samples: 2405224780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 13:56:36,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 13:56:37,113][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000146797_2405122048.pth... +[2024-06-18 13:56:37,162][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000146171_2394865664.pth +[2024-06-18 13:56:39,058][12883] Updated weights for policy 0, policy_version 146803 (0.0028) +[2024-06-18 13:56:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 2405318656. Throughput: 0: 43031.2. Samples: 2405488560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 13:56:41,994][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 13:56:43,159][12883] Updated weights for policy 0, policy_version 146813 (0.0030) +[2024-06-18 13:56:46,706][12883] Updated weights for policy 0, policy_version 146823 (0.0036) +[2024-06-18 13:56:46,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43146.0, 300 sec: 42820.6). Total num frames: 2405564416. Throughput: 0: 42849.3. Samples: 2405612080. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 13:56:46,994][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 13:56:50,929][12883] Updated weights for policy 0, policy_version 146833 (0.0037) +[2024-06-18 13:56:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2405744640. Throughput: 0: 42800.1. Samples: 2405873820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 13:56:51,994][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 13:56:54,382][12883] Updated weights for policy 0, policy_version 146843 (0.0028) +[2024-06-18 13:56:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2405957632. Throughput: 0: 42780.9. Samples: 2406124260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 13:56:56,994][12645] Avg episode reward: [(0, '0.611')] +[2024-06-18 13:56:58,718][12883] Updated weights for policy 0, policy_version 146853 (0.0028) +[2024-06-18 13:57:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2406187008. Throughput: 0: 42734.6. Samples: 2406253800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 13:57:01,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 13:57:02,046][12883] Updated weights for policy 0, policy_version 146863 (0.0023) +[2024-06-18 13:57:06,251][12883] Updated weights for policy 0, policy_version 146873 (0.0025) +[2024-06-18 13:57:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2406383616. Throughput: 0: 42666.1. Samples: 2406509540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 13:57:06,994][12645] Avg episode reward: [(0, '0.615')] +[2024-06-18 13:57:09,845][12883] Updated weights for policy 0, policy_version 146883 (0.0035) +[2024-06-18 13:57:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2406612992. Throughput: 0: 42670.7. Samples: 2406761940. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 13:57:11,996][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 13:57:13,801][12883] Updated weights for policy 0, policy_version 146893 (0.0031) +[2024-06-18 13:57:17,000][12645] Fps is (10 sec: 42572.1, 60 sec: 42320.9, 300 sec: 42708.6). Total num frames: 2406809600. Throughput: 0: 42727.0. Samples: 2406894780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 13:57:17,001][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 13:57:17,730][12883] Updated weights for policy 0, policy_version 146903 (0.0035) +[2024-06-18 13:57:21,383][12883] Updated weights for policy 0, policy_version 146913 (0.0031) +[2024-06-18 13:57:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2407022592. Throughput: 0: 42733.2. Samples: 2407147780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 13:57:21,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 13:57:25,386][12883] Updated weights for policy 0, policy_version 146923 (0.0030) +[2024-06-18 13:57:26,998][12645] Fps is (10 sec: 44243.2, 60 sec: 42868.1, 300 sec: 42764.4). Total num frames: 2407251968. Throughput: 0: 42416.8. Samples: 2407397520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 13:57:26,999][12645] Avg episode reward: [(0, '0.485')] +[2024-06-18 13:57:29,483][12883] Updated weights for policy 0, policy_version 146933 (0.0032) +[2024-06-18 13:57:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2407432192. Throughput: 0: 42620.0. Samples: 2407529980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 13:57:31,995][12645] Avg episode reward: [(0, '0.558')] +[2024-06-18 13:57:32,190][12862] Signal inference workers to stop experience collection... (35200 times) +[2024-06-18 13:57:32,194][12862] Signal inference workers to resume experience collection... 
(35200 times) +[2024-06-18 13:57:32,242][12883] InferenceWorker_p0-w0: stopping experience collection (35200 times) +[2024-06-18 13:57:32,242][12883] InferenceWorker_p0-w0: resuming experience collection (35200 times) +[2024-06-18 13:57:32,947][12883] Updated weights for policy 0, policy_version 146943 (0.0035) +[2024-06-18 13:57:36,994][12645] Fps is (10 sec: 40979.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2407661568. Throughput: 0: 42493.9. Samples: 2407786040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 13:57:36,994][12645] Avg episode reward: [(0, '0.315')] +[2024-06-18 13:57:37,052][12883] Updated weights for policy 0, policy_version 146953 (0.0032) +[2024-06-18 13:57:40,685][12883] Updated weights for policy 0, policy_version 146963 (0.0040) +[2024-06-18 13:57:41,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2407890944. Throughput: 0: 42450.3. Samples: 2408034520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 13:57:41,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 13:57:44,738][12883] Updated weights for policy 0, policy_version 146973 (0.0031) +[2024-06-18 13:57:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 2408071168. Throughput: 0: 42631.2. Samples: 2408172200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 13:57:46,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 13:57:48,258][12883] Updated weights for policy 0, policy_version 146983 (0.0045) +[2024-06-18 13:57:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2408284160. Throughput: 0: 42594.7. Samples: 2408426300. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 13:57:51,994][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 13:57:52,443][12883] Updated weights for policy 0, policy_version 146993 (0.0038) +[2024-06-18 13:57:55,887][12883] Updated weights for policy 0, policy_version 147003 (0.0031) +[2024-06-18 13:57:56,994][12645] Fps is (10 sec: 47512.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2408546304. Throughput: 0: 42426.1. Samples: 2408671120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 13:57:56,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 13:58:00,070][12883] Updated weights for policy 0, policy_version 147013 (0.0034) +[2024-06-18 13:58:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2408710144. Throughput: 0: 42490.4. Samples: 2408806580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 13:58:01,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 13:58:03,720][12883] Updated weights for policy 0, policy_version 147023 (0.0046) +[2024-06-18 13:58:06,995][12645] Fps is (10 sec: 39315.1, 60 sec: 42597.2, 300 sec: 42709.5). Total num frames: 2408939520. Throughput: 0: 42411.7. Samples: 2409056380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 13:58:06,996][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 13:58:07,710][12883] Updated weights for policy 0, policy_version 147033 (0.0041) +[2024-06-18 13:58:11,511][12883] Updated weights for policy 0, policy_version 147043 (0.0032) +[2024-06-18 13:58:11,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2409168896. Throughput: 0: 42572.1. Samples: 2409313060. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 13:58:11,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 13:58:15,456][12883] Updated weights for policy 0, policy_version 147053 (0.0027) +[2024-06-18 13:58:16,994][12645] Fps is (10 sec: 42605.9, 60 sec: 42602.8, 300 sec: 42598.4). Total num frames: 2409365504. Throughput: 0: 42630.7. Samples: 2409448360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 13:58:16,994][12645] Avg episode reward: [(0, '0.729')] +[2024-06-18 13:58:19,364][12883] Updated weights for policy 0, policy_version 147063 (0.0035) +[2024-06-18 13:58:21,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2409562112. Throughput: 0: 42556.0. Samples: 2409701060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 13:58:21,994][12645] Avg episode reward: [(0, '0.528')] +[2024-06-18 13:58:22,981][12883] Updated weights for policy 0, policy_version 147073 (0.0041) +[2024-06-18 13:58:26,805][12883] Updated weights for policy 0, policy_version 147083 (0.0040) +[2024-06-18 13:58:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42601.9, 300 sec: 42765.0). Total num frames: 2409807872. Throughput: 0: 42838.2. Samples: 2409962240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 13:58:26,994][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 13:58:30,506][12883] Updated weights for policy 0, policy_version 147093 (0.0045) +[2024-06-18 13:58:31,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 2410004480. Throughput: 0: 42699.0. Samples: 2410093660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) +[2024-06-18 13:58:31,994][12645] Avg episode reward: [(0, '0.223')] +[2024-06-18 13:58:34,411][12883] Updated weights for policy 0, policy_version 147103 (0.0027) +[2024-06-18 13:58:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2410217472. Throughput: 0: 42638.7. Samples: 2410345040. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:58:36,994][12645] Avg episode reward: [(0, '0.231')] +[2024-06-18 13:58:37,070][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000147109_2410233856.pth... +[2024-06-18 13:58:37,129][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000146485_2400010240.pth +[2024-06-18 13:58:38,195][12883] Updated weights for policy 0, policy_version 147113 (0.0027) +[2024-06-18 13:58:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2410446848. Throughput: 0: 42902.3. Samples: 2410601720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:58:41,994][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 13:58:42,346][12883] Updated weights for policy 0, policy_version 147123 (0.0032) +[2024-06-18 13:58:45,936][12883] Updated weights for policy 0, policy_version 147133 (0.0036) +[2024-06-18 13:58:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2410643456. Throughput: 0: 42639.1. Samples: 2410725340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:58:46,994][12645] Avg episode reward: [(0, '0.313')] +[2024-06-18 13:58:49,953][12883] Updated weights for policy 0, policy_version 147143 (0.0036) +[2024-06-18 13:58:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2410872832. Throughput: 0: 42899.1. Samples: 2410986760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:58:51,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 13:58:52,708][12862] Signal inference workers to stop experience collection... (35250 times) +[2024-06-18 13:58:52,708][12862] Signal inference workers to resume experience collection... 
(35250 times) +[2024-06-18 13:58:52,742][12883] InferenceWorker_p0-w0: stopping experience collection (35250 times) +[2024-06-18 13:58:52,742][12883] InferenceWorker_p0-w0: resuming experience collection (35250 times) +[2024-06-18 13:58:53,664][12883] Updated weights for policy 0, policy_version 147153 (0.0030) +[2024-06-18 13:58:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 2411085824. Throughput: 0: 42813.9. Samples: 2411239680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:58:56,994][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 13:58:57,497][12883] Updated weights for policy 0, policy_version 147163 (0.0039) +[2024-06-18 13:59:01,666][12883] Updated weights for policy 0, policy_version 147173 (0.0038) +[2024-06-18 13:59:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2411282432. Throughput: 0: 42698.2. Samples: 2411369780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:59:01,994][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 13:59:05,032][12883] Updated weights for policy 0, policy_version 147183 (0.0033) +[2024-06-18 13:59:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42872.7, 300 sec: 42820.9). Total num frames: 2411511808. Throughput: 0: 42774.1. Samples: 2411625900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:59:06,999][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 13:59:09,308][12883] Updated weights for policy 0, policy_version 147193 (0.0042) +[2024-06-18 13:59:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2411741184. Throughput: 0: 42740.3. Samples: 2411885560. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:59:11,994][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 13:59:12,733][12883] Updated weights for policy 0, policy_version 147203 (0.0046) +[2024-06-18 13:59:16,815][12883] Updated weights for policy 0, policy_version 147213 (0.0031) +[2024-06-18 13:59:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2411937792. Throughput: 0: 42737.8. Samples: 2412016860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:59:16,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 13:59:20,272][12883] Updated weights for policy 0, policy_version 147223 (0.0037) +[2024-06-18 13:59:21,994][12645] Fps is (10 sec: 42599.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2412167168. Throughput: 0: 42865.4. Samples: 2412273980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:59:21,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 13:59:24,382][12883] Updated weights for policy 0, policy_version 147233 (0.0041) +[2024-06-18 13:59:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2412380160. Throughput: 0: 42869.3. Samples: 2412530840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:59:26,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 13:59:28,047][12883] Updated weights for policy 0, policy_version 147243 (0.0028) +[2024-06-18 13:59:31,997][12883] Updated weights for policy 0, policy_version 147253 (0.0042) +[2024-06-18 13:59:31,998][12645] Fps is (10 sec: 42580.5, 60 sec: 43141.6, 300 sec: 42708.9). Total num frames: 2412593152. Throughput: 0: 42922.2. Samples: 2412657020. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:59:31,998][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 13:59:35,757][12883] Updated weights for policy 0, policy_version 147263 (0.0039) +[2024-06-18 13:59:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 2412789760. Throughput: 0: 42710.1. Samples: 2412908720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 13:59:36,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 13:59:39,896][12883] Updated weights for policy 0, policy_version 147273 (0.0026) +[2024-06-18 13:59:41,994][12645] Fps is (10 sec: 42616.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2413019136. Throughput: 0: 42797.8. Samples: 2413165580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:59:41,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 13:59:43,487][12883] Updated weights for policy 0, policy_version 147283 (0.0039) +[2024-06-18 13:59:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2413215744. Throughput: 0: 42681.7. Samples: 2413290460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:59:46,994][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 13:59:47,822][12883] Updated weights for policy 0, policy_version 147293 (0.0038) +[2024-06-18 13:59:51,215][12883] Updated weights for policy 0, policy_version 147303 (0.0035) +[2024-06-18 13:59:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2413428736. Throughput: 0: 42589.9. Samples: 2413542440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:59:51,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 13:59:55,359][12883] Updated weights for policy 0, policy_version 147313 (0.0034) +[2024-06-18 13:59:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2413625344. Throughput: 0: 42659.5. 
Samples: 2413805240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 13:59:56,994][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 13:59:58,776][12883] Updated weights for policy 0, policy_version 147323 (0.0031) +[2024-06-18 14:00:01,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2413854720. Throughput: 0: 42456.8. Samples: 2413927420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 14:00:01,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 14:00:03,243][12883] Updated weights for policy 0, policy_version 147333 (0.0039) +[2024-06-18 14:00:06,671][12883] Updated weights for policy 0, policy_version 147343 (0.0046) +[2024-06-18 14:00:07,000][12645] Fps is (10 sec: 44209.9, 60 sec: 42594.0, 300 sec: 42654.0). Total num frames: 2414067712. Throughput: 0: 42434.9. Samples: 2414183820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 14:00:07,001][12645] Avg episode reward: [(0, '0.480')] +[2024-06-18 14:00:11,001][12883] Updated weights for policy 0, policy_version 147353 (0.0031) +[2024-06-18 14:00:11,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 2414280704. Throughput: 0: 42360.2. Samples: 2414437040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 14:00:11,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 14:00:14,318][12883] Updated weights for policy 0, policy_version 147363 (0.0035) +[2024-06-18 14:00:16,994][12645] Fps is (10 sec: 44264.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2414510080. Throughput: 0: 42401.7. Samples: 2414564920. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 14:00:16,994][12645] Avg episode reward: [(0, '0.725')] +[2024-06-18 14:00:18,746][12883] Updated weights for policy 0, policy_version 147373 (0.0032) +[2024-06-18 14:00:21,990][12883] Updated weights for policy 0, policy_version 147383 (0.0034) +[2024-06-18 14:00:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2414723072. Throughput: 0: 42483.7. Samples: 2414820480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 14:00:21,994][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 14:00:26,286][12883] Updated weights for policy 0, policy_version 147393 (0.0042) +[2024-06-18 14:00:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 2414903296. Throughput: 0: 42497.8. Samples: 2415077980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 14:00:26,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 14:00:29,744][12883] Updated weights for policy 0, policy_version 147403 (0.0035) +[2024-06-18 14:00:31,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42055.1, 300 sec: 42598.4). Total num frames: 2415116288. Throughput: 0: 42501.8. Samples: 2415203040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 14:00:31,994][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 14:00:33,730][12883] Updated weights for policy 0, policy_version 147413 (0.0028) +[2024-06-18 14:00:36,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2415362048. Throughput: 0: 42631.0. Samples: 2415460840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 14:00:36,994][12645] Avg episode reward: [(0, '0.579')] +[2024-06-18 14:00:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000147422_2415362048.pth... 
+[2024-06-18 14:00:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000146797_2405122048.pth +[2024-06-18 14:00:37,424][12883] Updated weights for policy 0, policy_version 147423 (0.0022) +[2024-06-18 14:00:41,406][12883] Updated weights for policy 0, policy_version 147433 (0.0035) +[2024-06-18 14:00:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 2415542272. Throughput: 0: 42473.5. Samples: 2415716540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) +[2024-06-18 14:00:41,994][12645] Avg episode reward: [(0, '0.627')] +[2024-06-18 14:00:44,943][12883] Updated weights for policy 0, policy_version 147443 (0.0038) +[2024-06-18 14:00:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2415771648. Throughput: 0: 42652.2. Samples: 2415846760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 14:00:46,994][12645] Avg episode reward: [(0, '0.643')] +[2024-06-18 14:00:49,152][12883] Updated weights for policy 0, policy_version 147453 (0.0032) +[2024-06-18 14:00:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2415984640. Throughput: 0: 42601.5. Samples: 2416100620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 14:00:51,994][12645] Avg episode reward: [(0, '0.555')] +[2024-06-18 14:00:53,062][12883] Updated weights for policy 0, policy_version 147463 (0.0032) +[2024-06-18 14:00:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2416181248. Throughput: 0: 42670.6. Samples: 2416357220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 14:00:56,994][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 14:00:57,141][12883] Updated weights for policy 0, policy_version 147473 (0.0036) +[2024-06-18 14:00:59,099][12862] Signal inference workers to stop experience collection... 
(35300 times) +[2024-06-18 14:00:59,099][12862] Signal inference workers to resume experience collection... (35300 times) +[2024-06-18 14:00:59,121][12883] InferenceWorker_p0-w0: stopping experience collection (35300 times) +[2024-06-18 14:00:59,121][12883] InferenceWorker_p0-w0: resuming experience collection (35300 times) +[2024-06-18 14:01:00,659][12883] Updated weights for policy 0, policy_version 147483 (0.0030) +[2024-06-18 14:01:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2416410624. Throughput: 0: 42624.0. Samples: 2416483000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 14:01:01,994][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 14:01:04,691][12883] Updated weights for policy 0, policy_version 147493 (0.0039) +[2024-06-18 14:01:06,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42875.9, 300 sec: 42709.5). Total num frames: 2416640000. Throughput: 0: 42574.2. Samples: 2416736320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 14:01:06,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 14:01:08,558][12883] Updated weights for policy 0, policy_version 147503 (0.0043) +[2024-06-18 14:01:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2416820224. Throughput: 0: 42710.6. Samples: 2416999960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 14:01:11,994][12645] Avg episode reward: [(0, '0.592')] +[2024-06-18 14:01:12,335][12883] Updated weights for policy 0, policy_version 147513 (0.0039) +[2024-06-18 14:01:15,904][12883] Updated weights for policy 0, policy_version 147523 (0.0035) +[2024-06-18 14:01:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2417065984. Throughput: 0: 42729.0. Samples: 2417125840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 14:01:16,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 14:01:19,808][12883] Updated weights for policy 0, policy_version 147533 (0.0033) +[2024-06-18 14:01:21,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2417278976. Throughput: 0: 42816.0. Samples: 2417387560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 14:01:21,994][12645] Avg episode reward: [(0, '0.545')] +[2024-06-18 14:01:23,445][12883] Updated weights for policy 0, policy_version 147543 (0.0041) +[2024-06-18 14:01:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2417475584. Throughput: 0: 42763.5. Samples: 2417640900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 14:01:26,994][12645] Avg episode reward: [(0, '0.640')] +[2024-06-18 14:01:27,411][12883] Updated weights for policy 0, policy_version 147553 (0.0031) +[2024-06-18 14:01:31,051][12883] Updated weights for policy 0, policy_version 147563 (0.0031) +[2024-06-18 14:01:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2417704960. Throughput: 0: 42662.1. Samples: 2417766560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 14:01:31,994][12645] Avg episode reward: [(0, '0.619')] +[2024-06-18 14:01:35,057][12883] Updated weights for policy 0, policy_version 147573 (0.0039) +[2024-06-18 14:01:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2417917952. Throughput: 0: 42754.6. Samples: 2418024580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 14:01:36,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 14:01:38,858][12883] Updated weights for policy 0, policy_version 147583 (0.0031) +[2024-06-18 14:01:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2418114560. Throughput: 0: 42689.4. 
Samples: 2418278240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) +[2024-06-18 14:01:41,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 14:01:42,593][12883] Updated weights for policy 0, policy_version 147593 (0.0042) +[2024-06-18 14:01:46,603][12883] Updated weights for policy 0, policy_version 147603 (0.0036) +[2024-06-18 14:01:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2418327552. Throughput: 0: 42830.6. Samples: 2418410380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 14:01:46,994][12645] Avg episode reward: [(0, '0.504')] +[2024-06-18 14:01:50,540][12883] Updated weights for policy 0, policy_version 147613 (0.0040) +[2024-06-18 14:01:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2418540544. Throughput: 0: 42801.0. Samples: 2418662360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 14:01:51,994][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 14:01:54,254][12883] Updated weights for policy 0, policy_version 147623 (0.0030) +[2024-06-18 14:01:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2418737152. Throughput: 0: 42640.0. Samples: 2418918760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 14:01:56,994][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 14:01:58,253][12883] Updated weights for policy 0, policy_version 147633 (0.0043) +[2024-06-18 14:02:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2418966528. Throughput: 0: 42648.4. Samples: 2419045020. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 14:02:01,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 14:02:02,184][12883] Updated weights for policy 0, policy_version 147643 (0.0037) +[2024-06-18 14:02:06,039][12883] Updated weights for policy 0, policy_version 147653 (0.0025) +[2024-06-18 14:02:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2419179520. Throughput: 0: 42508.9. Samples: 2419300460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 14:02:06,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 14:02:09,709][12883] Updated weights for policy 0, policy_version 147663 (0.0034) +[2024-06-18 14:02:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42543.7). Total num frames: 2419359744. Throughput: 0: 42513.3. Samples: 2419554000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 14:02:11,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 14:02:13,862][12883] Updated weights for policy 0, policy_version 147673 (0.0030) +[2024-06-18 14:02:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2419621888. Throughput: 0: 42509.9. Samples: 2419679500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 14:02:16,996][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 14:02:17,284][12883] Updated weights for policy 0, policy_version 147683 (0.0030) +[2024-06-18 14:02:21,460][12883] Updated weights for policy 0, policy_version 147693 (0.0030) +[2024-06-18 14:02:21,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42325.4, 300 sec: 42599.1). Total num frames: 2419818496. Throughput: 0: 42508.1. Samples: 2419937440. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 14:02:21,994][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 14:02:24,775][12883] Updated weights for policy 0, policy_version 147703 (0.0037) +[2024-06-18 14:02:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2420015104. Throughput: 0: 42599.4. Samples: 2420195220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 14:02:26,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 14:02:27,816][12862] Signal inference workers to stop experience collection... (35350 times) +[2024-06-18 14:02:27,864][12883] InferenceWorker_p0-w0: stopping experience collection (35350 times) +[2024-06-18 14:02:27,876][12862] Signal inference workers to resume experience collection... (35350 times) +[2024-06-18 14:02:27,886][12883] InferenceWorker_p0-w0: resuming experience collection (35350 times) +[2024-06-18 14:02:29,487][12883] Updated weights for policy 0, policy_version 147713 (0.0032) +[2024-06-18 14:02:31,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42596.9, 300 sec: 42709.1). Total num frames: 2420260864. Throughput: 0: 42538.3. Samples: 2420324700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 14:02:31,997][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 14:02:32,302][12883] Updated weights for policy 0, policy_version 147723 (0.0038) +[2024-06-18 14:02:36,907][12883] Updated weights for policy 0, policy_version 147733 (0.0023) +[2024-06-18 14:02:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2420457472. Throughput: 0: 42723.5. Samples: 2420584920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 14:02:36,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 14:02:37,171][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000147735_2420490240.pth... 
+[2024-06-18 14:02:37,224][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000147109_2410233856.pth +[2024-06-18 14:02:40,346][12883] Updated weights for policy 0, policy_version 147743 (0.0032) +[2024-06-18 14:02:41,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2420670464. Throughput: 0: 42793.8. Samples: 2420844480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 14:02:41,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 14:02:44,338][12883] Updated weights for policy 0, policy_version 147753 (0.0036) +[2024-06-18 14:02:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2420899840. Throughput: 0: 42752.4. Samples: 2420968880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) +[2024-06-18 14:02:46,994][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 14:02:47,757][12883] Updated weights for policy 0, policy_version 147763 (0.0034) +[2024-06-18 14:02:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2421096448. Throughput: 0: 42934.7. Samples: 2421232520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:02:51,994][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 14:02:52,029][12883] Updated weights for policy 0, policy_version 147773 (0.0042) +[2024-06-18 14:02:55,447][12883] Updated weights for policy 0, policy_version 147783 (0.0046) +[2024-06-18 14:02:57,000][12645] Fps is (10 sec: 42571.8, 60 sec: 43140.1, 300 sec: 42764.1). Total num frames: 2421325824. Throughput: 0: 42915.5. Samples: 2421485460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:02:57,000][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 14:02:59,783][12883] Updated weights for policy 0, policy_version 147793 (0.0039) +[2024-06-18 14:03:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 2421555200. Throughput: 0: 43085.9. 
Samples: 2421618360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:03:01,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 14:03:02,831][12883] Updated weights for policy 0, policy_version 147803 (0.0024) +[2024-06-18 14:03:06,998][12645] Fps is (10 sec: 40966.5, 60 sec: 42595.1, 300 sec: 42597.7). Total num frames: 2421735424. Throughput: 0: 43071.5. Samples: 2421875860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:03:06,999][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 14:03:07,199][12883] Updated weights for policy 0, policy_version 147813 (0.0033) +[2024-06-18 14:03:10,735][12883] Updated weights for policy 0, policy_version 147823 (0.0040) +[2024-06-18 14:03:11,994][12645] Fps is (10 sec: 40959.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2421964800. Throughput: 0: 43044.0. Samples: 2422132200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:03:11,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 14:03:14,764][12883] Updated weights for policy 0, policy_version 147833 (0.0035) +[2024-06-18 14:03:16,994][12645] Fps is (10 sec: 45896.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2422194176. Throughput: 0: 42948.8. Samples: 2422257300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:03:16,994][12645] Avg episode reward: [(0, '0.286')] +[2024-06-18 14:03:18,203][12883] Updated weights for policy 0, policy_version 147843 (0.0032) +[2024-06-18 14:03:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2422390784. Throughput: 0: 42927.5. Samples: 2422516660. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:03:21,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 14:03:22,344][12883] Updated weights for policy 0, policy_version 147853 (0.0028) +[2024-06-18 14:03:25,944][12883] Updated weights for policy 0, policy_version 147863 (0.0033) +[2024-06-18 14:03:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2422603776. Throughput: 0: 42850.6. Samples: 2422772760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:03:26,994][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 14:03:30,114][12883] Updated weights for policy 0, policy_version 147873 (0.0033) +[2024-06-18 14:03:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 2422849536. Throughput: 0: 42986.2. Samples: 2422903260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:03:31,994][12645] Avg episode reward: [(0, '0.568')] +[2024-06-18 14:03:33,629][12883] Updated weights for policy 0, policy_version 147883 (0.0039) +[2024-06-18 14:03:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2423029760. Throughput: 0: 42716.3. Samples: 2423154760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:03:36,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 14:03:37,630][12883] Updated weights for policy 0, policy_version 147893 (0.0029) +[2024-06-18 14:03:41,872][12883] Updated weights for policy 0, policy_version 147903 (0.0033) +[2024-06-18 14:03:41,908][12862] Signal inference workers to stop experience collection... (35400 times) +[2024-06-18 14:03:41,908][12862] Signal inference workers to resume experience collection... 
(35400 times) +[2024-06-18 14:03:41,936][12883] InferenceWorker_p0-w0: stopping experience collection (35400 times) +[2024-06-18 14:03:41,936][12883] InferenceWorker_p0-w0: resuming experience collection (35400 times) +[2024-06-18 14:03:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2423242752. Throughput: 0: 42875.8. Samples: 2423414600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:03:41,994][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 14:03:45,453][12883] Updated weights for policy 0, policy_version 147913 (0.0037) +[2024-06-18 14:03:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2423472128. Throughput: 0: 42626.2. Samples: 2423536540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:03:46,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 14:03:49,426][12883] Updated weights for policy 0, policy_version 147923 (0.0028) +[2024-06-18 14:03:51,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2423668736. Throughput: 0: 42558.1. Samples: 2423790780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:03:51,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 14:03:53,392][12883] Updated weights for policy 0, policy_version 147933 (0.0037) +[2024-06-18 14:03:56,983][12883] Updated weights for policy 0, policy_version 147943 (0.0037) +[2024-06-18 14:03:56,993][12645] Fps is (10 sec: 42598.6, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 2423898112. Throughput: 0: 42698.9. Samples: 2424053640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 14:03:56,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 14:04:01,051][12883] Updated weights for policy 0, policy_version 147953 (0.0039) +[2024-06-18 14:04:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2424111104. Throughput: 0: 42723.2. 
Samples: 2424179840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 14:04:01,994][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 14:04:04,500][12883] Updated weights for policy 0, policy_version 147963 (0.0044) +[2024-06-18 14:04:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42874.9, 300 sec: 42598.4). Total num frames: 2424307712. Throughput: 0: 42550.8. Samples: 2424431440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 14:04:06,994][12645] Avg episode reward: [(0, '0.685')] +[2024-06-18 14:04:08,567][12883] Updated weights for policy 0, policy_version 147973 (0.0036) +[2024-06-18 14:04:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2424520704. Throughput: 0: 42577.5. Samples: 2424688740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 14:04:11,994][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 14:04:12,469][12883] Updated weights for policy 0, policy_version 147983 (0.0030) +[2024-06-18 14:04:16,202][12883] Updated weights for policy 0, policy_version 147993 (0.0046) +[2024-06-18 14:04:16,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2424750080. Throughput: 0: 42542.7. Samples: 2424817680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 14:04:16,994][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 14:04:19,985][12883] Updated weights for policy 0, policy_version 148003 (0.0032) +[2024-06-18 14:04:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2424946688. Throughput: 0: 42679.7. Samples: 2425075340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 14:04:21,994][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 14:04:23,815][12883] Updated weights for policy 0, policy_version 148013 (0.0028) +[2024-06-18 14:04:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.5). Total num frames: 2425176064. 
Throughput: 0: 42558.2. Samples: 2425329720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 14:04:26,994][12645] Avg episode reward: [(0, '0.453')] +[2024-06-18 14:04:28,138][12883] Updated weights for policy 0, policy_version 148023 (0.0040) +[2024-06-18 14:04:31,308][12883] Updated weights for policy 0, policy_version 148033 (0.0030) +[2024-06-18 14:04:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2425389056. Throughput: 0: 42760.4. Samples: 2425460760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 14:04:31,994][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 14:04:35,624][12883] Updated weights for policy 0, policy_version 148043 (0.0048) +[2024-06-18 14:04:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2425585664. Throughput: 0: 42857.8. Samples: 2425719380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 14:04:36,994][12645] Avg episode reward: [(0, '0.642')] +[2024-06-18 14:04:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148047_2425602048.pth... +[2024-06-18 14:04:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000147422_2415362048.pth +[2024-06-18 14:04:39,063][12883] Updated weights for policy 0, policy_version 148053 (0.0038) +[2024-06-18 14:04:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2425815040. Throughput: 0: 42507.8. Samples: 2425966500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 14:04:41,994][12645] Avg episode reward: [(0, '0.679')] +[2024-06-18 14:04:43,174][12883] Updated weights for policy 0, policy_version 148063 (0.0030) +[2024-06-18 14:04:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2426028032. Throughput: 0: 42697.8. Samples: 2426101240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 14:04:46,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 14:04:46,999][12883] Updated weights for policy 0, policy_version 148073 (0.0033) +[2024-06-18 14:04:50,671][12883] Updated weights for policy 0, policy_version 148083 (0.0041) +[2024-06-18 14:04:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2426224640. Throughput: 0: 42818.6. Samples: 2426358280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 14:04:51,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 14:04:54,554][12883] Updated weights for policy 0, policy_version 148093 (0.0024) +[2024-06-18 14:04:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2426470400. Throughput: 0: 42701.7. Samples: 2426610320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 14:04:56,994][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 14:04:58,313][12883] Updated weights for policy 0, policy_version 148103 (0.0038) +[2024-06-18 14:05:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.8). Total num frames: 2426650624. Throughput: 0: 42796.9. Samples: 2426743540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 14:05:01,994][12645] Avg episode reward: [(0, '0.545')] +[2024-06-18 14:05:02,280][12883] Updated weights for policy 0, policy_version 148113 (0.0035) +[2024-06-18 14:05:05,851][12883] Updated weights for policy 0, policy_version 148123 (0.0027) +[2024-06-18 14:05:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 2426863616. Throughput: 0: 42679.4. Samples: 2426995920. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 14:05:06,994][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 14:05:09,869][12883] Updated weights for policy 0, policy_version 148133 (0.0032) +[2024-06-18 14:05:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2427092992. Throughput: 0: 42587.1. Samples: 2427246140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 14:05:11,994][12645] Avg episode reward: [(0, '0.218')] +[2024-06-18 14:05:13,482][12883] Updated weights for policy 0, policy_version 148143 (0.0033) +[2024-06-18 14:05:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2427305984. Throughput: 0: 42608.4. Samples: 2427378140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 14:05:16,994][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 14:05:17,431][12862] Signal inference workers to stop experience collection... (35450 times) +[2024-06-18 14:05:17,431][12862] Signal inference workers to resume experience collection... (35450 times) +[2024-06-18 14:05:17,468][12883] InferenceWorker_p0-w0: stopping experience collection (35450 times) +[2024-06-18 14:05:17,468][12883] InferenceWorker_p0-w0: resuming experience collection (35450 times) +[2024-06-18 14:05:17,573][12883] Updated weights for policy 0, policy_version 148153 (0.0031) +[2024-06-18 14:05:21,162][12883] Updated weights for policy 0, policy_version 148163 (0.0034) +[2024-06-18 14:05:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2427518976. Throughput: 0: 42421.3. Samples: 2427628340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 14:05:21,994][12645] Avg episode reward: [(0, '0.755')] +[2024-06-18 14:05:25,667][12883] Updated weights for policy 0, policy_version 148173 (0.0034) +[2024-06-18 14:05:26,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2427731968. 
Throughput: 0: 42586.4. Samples: 2427882880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 14:05:26,994][12645] Avg episode reward: [(0, '0.779')] +[2024-06-18 14:05:28,833][12883] Updated weights for policy 0, policy_version 148183 (0.0036) +[2024-06-18 14:05:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2427944960. Throughput: 0: 42488.8. Samples: 2428013240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 14:05:31,994][12645] Avg episode reward: [(0, '0.876')] +[2024-06-18 14:05:33,138][12883] Updated weights for policy 0, policy_version 148193 (0.0028) +[2024-06-18 14:05:36,636][12883] Updated weights for policy 0, policy_version 148203 (0.0040) +[2024-06-18 14:05:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2428157952. Throughput: 0: 42400.9. Samples: 2428266320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 14:05:36,994][12645] Avg episode reward: [(0, '0.596')] +[2024-06-18 14:05:41,056][12883] Updated weights for policy 0, policy_version 148213 (0.0032) +[2024-06-18 14:05:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2428370944. Throughput: 0: 42523.1. Samples: 2428523860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 14:05:41,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 14:05:44,190][12883] Updated weights for policy 0, policy_version 148223 (0.0024) +[2024-06-18 14:05:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2428583936. Throughput: 0: 42391.5. Samples: 2428651160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 14:05:46,995][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 14:05:48,670][12883] Updated weights for policy 0, policy_version 148233 (0.0029) +[2024-06-18 14:05:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). 
Total num frames: 2428796928. Throughput: 0: 42443.1. Samples: 2428905860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 14:05:51,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 14:05:52,195][12883] Updated weights for policy 0, policy_version 148243 (0.0037) +[2024-06-18 14:05:56,271][12883] Updated weights for policy 0, policy_version 148253 (0.0036) +[2024-06-18 14:05:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2429009920. Throughput: 0: 42554.6. Samples: 2429161100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) +[2024-06-18 14:05:56,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 14:05:59,702][12883] Updated weights for policy 0, policy_version 148263 (0.0039) +[2024-06-18 14:06:01,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2429190144. Throughput: 0: 42427.3. Samples: 2429287360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:06:01,994][12645] Avg episode reward: [(0, '0.768')] +[2024-06-18 14:06:03,942][12883] Updated weights for policy 0, policy_version 148273 (0.0038) +[2024-06-18 14:06:06,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2429419520. Throughput: 0: 42509.5. Samples: 2429541260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:06:06,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 14:06:07,539][12883] Updated weights for policy 0, policy_version 148283 (0.0028) +[2024-06-18 14:06:11,786][12883] Updated weights for policy 0, policy_version 148293 (0.0048) +[2024-06-18 14:06:11,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2429632512. Throughput: 0: 42538.5. Samples: 2429797120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:06:11,994][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 14:06:15,267][12883] Updated weights for policy 0, policy_version 148303 (0.0033) +[2024-06-18 14:06:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2429845504. Throughput: 0: 42409.7. Samples: 2429921680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:06:16,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 14:06:19,466][12883] Updated weights for policy 0, policy_version 148313 (0.0028) +[2024-06-18 14:06:21,996][12645] Fps is (10 sec: 42589.4, 60 sec: 42323.8, 300 sec: 42653.6). Total num frames: 2430058496. Throughput: 0: 42550.8. Samples: 2430181200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:06:21,996][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 14:06:22,887][12883] Updated weights for policy 0, policy_version 148323 (0.0040) +[2024-06-18 14:06:26,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 2430255104. Throughput: 0: 42369.4. Samples: 2430430480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:06:26,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 14:06:27,725][12883] Updated weights for policy 0, policy_version 148333 (0.0057) +[2024-06-18 14:06:30,516][12883] Updated weights for policy 0, policy_version 148343 (0.0033) +[2024-06-18 14:06:31,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2430500864. Throughput: 0: 42317.3. Samples: 2430555440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:06:31,994][12645] Avg episode reward: [(0, '0.650')] +[2024-06-18 14:06:35,523][12883] Updated weights for policy 0, policy_version 148353 (0.0032) +[2024-06-18 14:06:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2430681088. Throughput: 0: 42322.7. 
Samples: 2430810380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:06:36,994][12645] Avg episode reward: [(0, '0.592')] +[2024-06-18 14:06:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148357_2430681088.pth... +[2024-06-18 14:06:37,085][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000147735_2420490240.pth +[2024-06-18 14:06:38,412][12883] Updated weights for policy 0, policy_version 148363 (0.0041) +[2024-06-18 14:06:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2430894080. Throughput: 0: 42196.1. Samples: 2431059920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:06:41,994][12645] Avg episode reward: [(0, '0.658')] +[2024-06-18 14:06:43,247][12883] Updated weights for policy 0, policy_version 148373 (0.0026) +[2024-06-18 14:06:46,367][12883] Updated weights for policy 0, policy_version 148383 (0.0026) +[2024-06-18 14:06:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2431123456. Throughput: 0: 42298.1. Samples: 2431190780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:06:46,994][12645] Avg episode reward: [(0, '0.667')] +[2024-06-18 14:06:50,871][12883] Updated weights for policy 0, policy_version 148393 (0.0037) +[2024-06-18 14:06:51,344][12862] Signal inference workers to stop experience collection... (35500 times) +[2024-06-18 14:06:51,344][12862] Signal inference workers to resume experience collection... (35500 times) +[2024-06-18 14:06:51,357][12883] InferenceWorker_p0-w0: stopping experience collection (35500 times) +[2024-06-18 14:06:51,389][12883] InferenceWorker_p0-w0: resuming experience collection (35500 times) +[2024-06-18 14:06:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2431320064. Throughput: 0: 42347.9. Samples: 2431446920. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:06:51,994][12645] Avg episode reward: [(0, '0.222')] +[2024-06-18 14:06:53,960][12883] Updated weights for policy 0, policy_version 148403 (0.0031) +[2024-06-18 14:06:56,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 2431516672. Throughput: 0: 42188.9. Samples: 2431695620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:06:56,994][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 14:06:58,532][12883] Updated weights for policy 0, policy_version 148413 (0.0038) +[2024-06-18 14:07:01,613][12883] Updated weights for policy 0, policy_version 148423 (0.0037) +[2024-06-18 14:07:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2431762432. Throughput: 0: 42141.5. Samples: 2431818040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:07:01,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 14:07:06,284][12883] Updated weights for policy 0, policy_version 148433 (0.0039) +[2024-06-18 14:07:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 2431959040. Throughput: 0: 42136.2. Samples: 2432077240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:07:06,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 14:07:09,681][12883] Updated weights for policy 0, policy_version 148443 (0.0032) +[2024-06-18 14:07:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2432155648. Throughput: 0: 42206.6. Samples: 2432329780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:07:11,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 14:07:14,001][12883] Updated weights for policy 0, policy_version 148453 (0.0030) +[2024-06-18 14:07:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2432385024. Throughput: 0: 42288.4. 
Samples: 2432458420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:07:16,994][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 14:07:17,416][12883] Updated weights for policy 0, policy_version 148463 (0.0039) +[2024-06-18 14:07:21,665][12883] Updated weights for policy 0, policy_version 148473 (0.0035) +[2024-06-18 14:07:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42326.9, 300 sec: 42654.0). Total num frames: 2432598016. Throughput: 0: 42357.9. Samples: 2432716480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:07:21,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 14:07:25,249][12883] Updated weights for policy 0, policy_version 148483 (0.0025) +[2024-06-18 14:07:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 2432794624. Throughput: 0: 42416.4. Samples: 2432968660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:07:26,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 14:07:29,314][12883] Updated weights for policy 0, policy_version 148493 (0.0029) +[2024-06-18 14:07:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2433024000. Throughput: 0: 42463.1. Samples: 2433101620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:07:31,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 14:07:32,887][12883] Updated weights for policy 0, policy_version 148503 (0.0023) +[2024-06-18 14:07:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2433220608. Throughput: 0: 42268.0. Samples: 2433348980. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:07:36,994][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 14:07:37,159][12883] Updated weights for policy 0, policy_version 148513 (0.0034) +[2024-06-18 14:07:40,638][12883] Updated weights for policy 0, policy_version 148523 (0.0036) +[2024-06-18 14:07:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2433449984. Throughput: 0: 42278.2. Samples: 2433598140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:07:41,994][12645] Avg episode reward: [(0, '0.604')] +[2024-06-18 14:07:44,885][12883] Updated weights for policy 0, policy_version 148533 (0.0045) +[2024-06-18 14:07:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2433662976. Throughput: 0: 42514.6. Samples: 2433731200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:07:46,994][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 14:07:48,392][12883] Updated weights for policy 0, policy_version 148543 (0.0042) +[2024-06-18 14:07:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42543.8). Total num frames: 2433875968. Throughput: 0: 42374.3. Samples: 2433984080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:07:51,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 14:07:52,427][12883] Updated weights for policy 0, policy_version 148553 (0.0036) +[2024-06-18 14:07:56,023][12883] Updated weights for policy 0, policy_version 148563 (0.0040) +[2024-06-18 14:07:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2434088960. Throughput: 0: 42340.9. Samples: 2434235120. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:07:56,994][12645] Avg episode reward: [(0, '0.608')] +[2024-06-18 14:08:00,051][12883] Updated weights for policy 0, policy_version 148573 (0.0030) +[2024-06-18 14:08:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42599.1). Total num frames: 2434301952. Throughput: 0: 42440.5. Samples: 2434368240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:08:01,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 14:08:03,911][12883] Updated weights for policy 0, policy_version 148583 (0.0033) +[2024-06-18 14:08:06,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 2434514944. Throughput: 0: 42313.3. Samples: 2434620680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:08:06,997][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 14:08:08,210][12883] Updated weights for policy 0, policy_version 148593 (0.0040) +[2024-06-18 14:08:11,727][12883] Updated weights for policy 0, policy_version 148603 (0.0036) +[2024-06-18 14:08:11,997][12645] Fps is (10 sec: 40947.1, 60 sec: 42596.2, 300 sec: 42431.3). Total num frames: 2434711552. Throughput: 0: 42181.0. Samples: 2434866940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:08:11,997][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 14:08:15,882][12883] Updated weights for policy 0, policy_version 148613 (0.0041) +[2024-06-18 14:08:16,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2434924544. Throughput: 0: 42001.8. Samples: 2434991700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:08:16,994][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 14:08:19,951][12883] Updated weights for policy 0, policy_version 148623 (0.0027) +[2024-06-18 14:08:21,994][12645] Fps is (10 sec: 42611.5, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 2435137536. Throughput: 0: 42255.0. 
Samples: 2435250460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:08:21,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 14:08:23,552][12883] Updated weights for policy 0, policy_version 148633 (0.0034) +[2024-06-18 14:08:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 2435334144. Throughput: 0: 42332.6. Samples: 2435503100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:08:26,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 14:08:27,607][12883] Updated weights for policy 0, policy_version 148643 (0.0031) +[2024-06-18 14:08:28,933][12862] Signal inference workers to stop experience collection... (35550 times) +[2024-06-18 14:08:28,933][12862] Signal inference workers to resume experience collection... (35550 times) +[2024-06-18 14:08:28,956][12883] InferenceWorker_p0-w0: stopping experience collection (35550 times) +[2024-06-18 14:08:28,957][12883] InferenceWorker_p0-w0: resuming experience collection (35550 times) +[2024-06-18 14:08:31,151][12883] Updated weights for policy 0, policy_version 148653 (0.0027) +[2024-06-18 14:08:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2435563520. Throughput: 0: 42015.0. Samples: 2435621880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:08:31,994][12645] Avg episode reward: [(0, '0.435')] +[2024-06-18 14:08:35,365][12883] Updated weights for policy 0, policy_version 148663 (0.0038) +[2024-06-18 14:08:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2435776512. Throughput: 0: 42220.5. Samples: 2435884000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:08:36,994][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 14:08:37,108][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148669_2435792896.pth... 
+[2024-06-18 14:08:37,173][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148047_2425602048.pth +[2024-06-18 14:08:38,775][12883] Updated weights for policy 0, policy_version 148673 (0.0042) +[2024-06-18 14:08:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 2435973120. Throughput: 0: 42187.6. Samples: 2436133560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:08:41,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 14:08:43,069][12883] Updated weights for policy 0, policy_version 148683 (0.0033) +[2024-06-18 14:08:46,481][12883] Updated weights for policy 0, policy_version 148693 (0.0031) +[2024-06-18 14:08:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2436202496. Throughput: 0: 42086.7. Samples: 2436262140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:08:46,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 14:08:50,874][12883] Updated weights for policy 0, policy_version 148703 (0.0023) +[2024-06-18 14:08:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2436415488. Throughput: 0: 42302.6. Samples: 2436524200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:08:51,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 14:08:54,077][12883] Updated weights for policy 0, policy_version 148713 (0.0037) +[2024-06-18 14:08:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 2436595712. Throughput: 0: 42397.2. Samples: 2436774680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:08:56,995][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 14:08:58,722][12883] Updated weights for policy 0, policy_version 148723 (0.0025) +[2024-06-18 14:09:01,787][12883] Updated weights for policy 0, policy_version 148733 (0.0028) +[2024-06-18 14:09:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2436841472. Throughput: 0: 42390.7. Samples: 2436899280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:09:01,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 14:09:06,318][12883] Updated weights for policy 0, policy_version 148743 (0.0049) +[2024-06-18 14:09:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42053.9, 300 sec: 42431.8). Total num frames: 2437038080. Throughput: 0: 42377.0. Samples: 2437157420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:09:06,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 14:09:09,667][12883] Updated weights for policy 0, policy_version 148753 (0.0034) +[2024-06-18 14:09:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42327.6, 300 sec: 42376.3). Total num frames: 2437251072. Throughput: 0: 42502.2. Samples: 2437415700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 14:09:11,994][12645] Avg episode reward: [(0, '0.196')] +[2024-06-18 14:09:13,898][12883] Updated weights for policy 0, policy_version 148763 (0.0033) +[2024-06-18 14:09:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2437464064. Throughput: 0: 42560.1. Samples: 2437537080. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 14:09:16,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 14:09:17,347][12883] Updated weights for policy 0, policy_version 148773 (0.0042) +[2024-06-18 14:09:21,663][12883] Updated weights for policy 0, policy_version 148783 (0.0028) +[2024-06-18 14:09:21,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 2437660672. Throughput: 0: 42405.6. Samples: 2437792260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 14:09:21,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 14:09:24,845][12883] Updated weights for policy 0, policy_version 148793 (0.0034) +[2024-06-18 14:09:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 2437890048. Throughput: 0: 42508.1. Samples: 2438046420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 14:09:26,994][12645] Avg episode reward: [(0, '0.281')] +[2024-06-18 14:09:29,365][12883] Updated weights for policy 0, policy_version 148803 (0.0045) +[2024-06-18 14:09:31,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2438103040. Throughput: 0: 42526.7. Samples: 2438175840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 14:09:31,994][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 14:09:32,445][12883] Updated weights for policy 0, policy_version 148813 (0.0035) +[2024-06-18 14:09:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 2438299648. Throughput: 0: 42364.1. Samples: 2438430580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 14:09:36,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 14:09:37,094][12883] Updated weights for policy 0, policy_version 148823 (0.0032) +[2024-06-18 14:09:39,994][12883] Updated weights for policy 0, policy_version 148833 (0.0033) +[2024-06-18 14:09:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 2438529024. Throughput: 0: 42571.1. Samples: 2438690380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 14:09:41,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 14:09:44,688][12883] Updated weights for policy 0, policy_version 148843 (0.0030) +[2024-06-18 14:09:46,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2438758400. Throughput: 0: 42747.5. Samples: 2438822920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 14:09:46,994][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 14:09:47,634][12883] Updated weights for policy 0, policy_version 148853 (0.0034) +[2024-06-18 14:09:49,639][12862] Signal inference workers to stop experience collection... (35600 times) +[2024-06-18 14:09:49,681][12883] InferenceWorker_p0-w0: stopping experience collection (35600 times) +[2024-06-18 14:09:49,692][12862] Signal inference workers to resume experience collection... (35600 times) +[2024-06-18 14:09:49,702][12883] InferenceWorker_p0-w0: resuming experience collection (35600 times) +[2024-06-18 14:09:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 2438938624. Throughput: 0: 42605.8. Samples: 2439074680. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 14:09:51,994][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 14:09:52,431][12883] Updated weights for policy 0, policy_version 148863 (0.0029) +[2024-06-18 14:09:55,705][12883] Updated weights for policy 0, policy_version 148873 (0.0039) +[2024-06-18 14:09:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2439168000. Throughput: 0: 42452.0. Samples: 2439326040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 14:09:56,994][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 14:10:00,119][12883] Updated weights for policy 0, policy_version 148883 (0.0046) +[2024-06-18 14:10:01,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42323.8, 300 sec: 42431.5). Total num frames: 2439380992. Throughput: 0: 42750.8. Samples: 2439460960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 14:10:01,996][12645] Avg episode reward: [(0, '0.545')] +[2024-06-18 14:10:03,260][12883] Updated weights for policy 0, policy_version 148893 (0.0044) +[2024-06-18 14:10:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 2439593984. Throughput: 0: 42678.4. Samples: 2439712780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 14:10:06,994][12645] Avg episode reward: [(0, '0.652')] +[2024-06-18 14:10:07,918][12883] Updated weights for policy 0, policy_version 148903 (0.0040) +[2024-06-18 14:10:10,810][12883] Updated weights for policy 0, policy_version 148913 (0.0040) +[2024-06-18 14:10:11,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 2439806976. Throughput: 0: 42723.9. Samples: 2439969000. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 14:10:11,994][12645] Avg episode reward: [(0, '0.646')] +[2024-06-18 14:10:15,539][12883] Updated weights for policy 0, policy_version 148923 (0.0034) +[2024-06-18 14:10:17,000][12645] Fps is (10 sec: 44208.6, 60 sec: 42867.0, 300 sec: 42430.9). Total num frames: 2440036352. Throughput: 0: 42782.4. Samples: 2440101320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 14:10:17,001][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 14:10:18,477][12883] Updated weights for policy 0, policy_version 148933 (0.0038) +[2024-06-18 14:10:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 2440249344. Throughput: 0: 42836.8. Samples: 2440358240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 14:10:21,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 14:10:23,164][12883] Updated weights for policy 0, policy_version 148943 (0.0032) +[2024-06-18 14:10:26,216][12883] Updated weights for policy 0, policy_version 148953 (0.0035) +[2024-06-18 14:10:26,994][12645] Fps is (10 sec: 42625.6, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2440462336. Throughput: 0: 42681.0. Samples: 2440611020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 14:10:26,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 14:10:30,640][12883] Updated weights for policy 0, policy_version 148963 (0.0036) +[2024-06-18 14:10:31,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42596.8, 300 sec: 42375.9). Total num frames: 2440658944. Throughput: 0: 42752.6. Samples: 2440746880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 14:10:31,997][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 14:10:33,780][12883] Updated weights for policy 0, policy_version 148973 (0.0036) +[2024-06-18 14:10:36,994][12645] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42487.3). Total num frames: 2440904704. Throughput: 0: 42850.5. 
Samples: 2441002960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 14:10:36,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 14:10:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148981_2440904704.pth... +[2024-06-18 14:10:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148357_2430681088.pth +[2024-06-18 14:10:38,575][12883] Updated weights for policy 0, policy_version 148983 (0.0031) +[2024-06-18 14:10:41,238][12883] Updated weights for policy 0, policy_version 148993 (0.0029) +[2024-06-18 14:10:41,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2441101312. Throughput: 0: 42927.9. Samples: 2441257800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 14:10:41,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 14:10:46,201][12883] Updated weights for policy 0, policy_version 149003 (0.0031) +[2024-06-18 14:10:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 2441297920. Throughput: 0: 42802.1. Samples: 2441386960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 14:10:46,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 14:10:49,323][12883] Updated weights for policy 0, policy_version 149013 (0.0030) +[2024-06-18 14:10:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42487.3). Total num frames: 2441543680. Throughput: 0: 42938.9. Samples: 2441645040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 14:10:51,995][12645] Avg episode reward: [(0, '0.696')] +[2024-06-18 14:10:54,043][12883] Updated weights for policy 0, policy_version 149023 (0.0034) +[2024-06-18 14:10:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2441740288. Throughput: 0: 43019.6. Samples: 2441904880. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 14:10:56,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 14:10:57,015][12883] Updated weights for policy 0, policy_version 149033 (0.0023) +[2024-06-18 14:11:01,556][12883] Updated weights for policy 0, policy_version 149043 (0.0027) +[2024-06-18 14:11:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42873.1, 300 sec: 42487.3). Total num frames: 2441953280. Throughput: 0: 42901.6. Samples: 2442031620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 14:11:01,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 14:11:04,458][12883] Updated weights for policy 0, policy_version 149053 (0.0025) +[2024-06-18 14:11:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 2442182656. Throughput: 0: 43072.0. Samples: 2442296480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 14:11:06,994][12645] Avg episode reward: [(0, '0.280')] +[2024-06-18 14:11:09,024][12883] Updated weights for policy 0, policy_version 149063 (0.0041) +[2024-06-18 14:11:09,130][12862] Signal inference workers to stop experience collection... (35650 times) +[2024-06-18 14:11:09,179][12883] InferenceWorker_p0-w0: stopping experience collection (35650 times) +[2024-06-18 14:11:09,184][12862] Signal inference workers to resume experience collection... (35650 times) +[2024-06-18 14:11:09,188][12883] InferenceWorker_p0-w0: resuming experience collection (35650 times) +[2024-06-18 14:11:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 2442395648. Throughput: 0: 43103.6. Samples: 2442550680. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 14:11:11,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 14:11:12,035][12883] Updated weights for policy 0, policy_version 149073 (0.0031) +[2024-06-18 14:11:16,614][12883] Updated weights for policy 0, policy_version 149083 (0.0040) +[2024-06-18 14:11:16,999][12645] Fps is (10 sec: 42575.5, 60 sec: 42872.1, 300 sec: 42542.4). Total num frames: 2442608640. Throughput: 0: 42883.3. Samples: 2442676760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) +[2024-06-18 14:11:17,000][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 14:11:19,919][12883] Updated weights for policy 0, policy_version 149093 (0.0032) +[2024-06-18 14:11:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2442821632. Throughput: 0: 42916.1. Samples: 2442934180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 14:11:21,994][12645] Avg episode reward: [(0, '0.638')] +[2024-06-18 14:11:24,204][12883] Updated weights for policy 0, policy_version 149103 (0.0036) +[2024-06-18 14:11:26,994][12645] Fps is (10 sec: 40981.3, 60 sec: 42598.2, 300 sec: 42431.8). Total num frames: 2443018240. Throughput: 0: 42991.0. Samples: 2443192400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 14:11:26,994][12645] Avg episode reward: [(0, '0.670')] +[2024-06-18 14:11:27,528][12883] Updated weights for policy 0, policy_version 149113 (0.0037) +[2024-06-18 14:11:31,625][12883] Updated weights for policy 0, policy_version 149123 (0.0038) +[2024-06-18 14:11:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42873.1, 300 sec: 42542.9). Total num frames: 2443231232. Throughput: 0: 43008.0. Samples: 2443322320. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 14:11:31,994][12645] Avg episode reward: [(0, '0.795')] +[2024-06-18 14:11:35,016][12883] Updated weights for policy 0, policy_version 149133 (0.0030) +[2024-06-18 14:11:37,000][12645] Fps is (10 sec: 44210.4, 60 sec: 42594.1, 300 sec: 42597.5). Total num frames: 2443460608. Throughput: 0: 42884.9. Samples: 2443575120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 14:11:37,000][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 14:11:38,987][12883] Updated weights for policy 0, policy_version 149143 (0.0027) +[2024-06-18 14:11:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2443673600. Throughput: 0: 42865.6. Samples: 2443833840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 14:11:41,994][12645] Avg episode reward: [(0, '0.640')] +[2024-06-18 14:11:42,570][12883] Updated weights for policy 0, policy_version 149153 (0.0041) +[2024-06-18 14:11:46,994][12645] Fps is (10 sec: 40985.7, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 2443870208. Throughput: 0: 42928.1. Samples: 2443963380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 14:11:46,994][12645] Avg episode reward: [(0, '0.784')] +[2024-06-18 14:11:47,010][12883] Updated weights for policy 0, policy_version 149163 (0.0035) +[2024-06-18 14:11:50,006][12883] Updated weights for policy 0, policy_version 149173 (0.0027) +[2024-06-18 14:11:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2444099584. Throughput: 0: 42646.6. Samples: 2444215580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 14:11:51,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 14:11:54,630][12883] Updated weights for policy 0, policy_version 149183 (0.0036) +[2024-06-18 14:11:56,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2444328960. Throughput: 0: 42763.5. 
Samples: 2444475040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 14:11:56,994][12645] Avg episode reward: [(0, '0.485')] +[2024-06-18 14:11:57,670][12883] Updated weights for policy 0, policy_version 149193 (0.0036) +[2024-06-18 14:12:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2444525568. Throughput: 0: 42790.4. Samples: 2444602100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 14:12:01,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 14:12:02,176][12883] Updated weights for policy 0, policy_version 149203 (0.0035) +[2024-06-18 14:12:05,780][12883] Updated weights for policy 0, policy_version 149213 (0.0042) +[2024-06-18 14:12:07,000][12645] Fps is (10 sec: 42571.8, 60 sec: 42867.0, 300 sec: 42708.6). Total num frames: 2444754944. Throughput: 0: 42738.9. Samples: 2444857700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 14:12:07,000][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 14:12:10,281][12883] Updated weights for policy 0, policy_version 149223 (0.0026) +[2024-06-18 14:12:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2444967936. Throughput: 0: 42620.2. Samples: 2445110300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 14:12:11,994][12645] Avg episode reward: [(0, '0.607')] +[2024-06-18 14:12:13,554][12883] Updated weights for policy 0, policy_version 149233 (0.0025) +[2024-06-18 14:12:16,994][12645] Fps is (10 sec: 39346.5, 60 sec: 42329.2, 300 sec: 42542.9). Total num frames: 2445148160. Throughput: 0: 42625.0. Samples: 2445240440. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 14:12:16,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 14:12:17,774][12883] Updated weights for policy 0, policy_version 149243 (0.0026) +[2024-06-18 14:12:21,297][12883] Updated weights for policy 0, policy_version 149253 (0.0027) +[2024-06-18 14:12:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2445393920. Throughput: 0: 42716.9. Samples: 2445497120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) +[2024-06-18 14:12:21,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 14:12:25,286][12883] Updated weights for policy 0, policy_version 149263 (0.0036) +[2024-06-18 14:12:26,109][12862] Signal inference workers to stop experience collection... (35700 times) +[2024-06-18 14:12:26,109][12862] Signal inference workers to resume experience collection... (35700 times) +[2024-06-18 14:12:26,119][12883] InferenceWorker_p0-w0: stopping experience collection (35700 times) +[2024-06-18 14:12:26,120][12883] InferenceWorker_p0-w0: resuming experience collection (35700 times) +[2024-06-18 14:12:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.7, 300 sec: 42598.4). Total num frames: 2445590528. Throughput: 0: 42687.8. Samples: 2445754780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 14:12:26,994][12645] Avg episode reward: [(0, '0.464')] +[2024-06-18 14:12:28,865][12883] Updated weights for policy 0, policy_version 149273 (0.0031) +[2024-06-18 14:12:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2445803520. Throughput: 0: 42654.9. Samples: 2445882860. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 14:12:31,994][12645] Avg episode reward: [(0, '0.377')] +[2024-06-18 14:12:32,861][12883] Updated weights for policy 0, policy_version 149283 (0.0038) +[2024-06-18 14:12:36,427][12883] Updated weights for policy 0, policy_version 149293 (0.0033) +[2024-06-18 14:12:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42875.9, 300 sec: 42654.0). Total num frames: 2446032896. Throughput: 0: 42774.8. Samples: 2446140440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 14:12:36,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 14:12:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000149295_2446049280.pth... +[2024-06-18 14:12:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148669_2435792896.pth +[2024-06-18 14:12:40,394][12883] Updated weights for policy 0, policy_version 149303 (0.0027) +[2024-06-18 14:12:42,000][12645] Fps is (10 sec: 44209.7, 60 sec: 42867.1, 300 sec: 42653.0). Total num frames: 2446245888. Throughput: 0: 42879.9. Samples: 2446404900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 14:12:42,000][12645] Avg episode reward: [(0, '0.141')] +[2024-06-18 14:12:43,956][12883] Updated weights for policy 0, policy_version 149313 (0.0043) +[2024-06-18 14:12:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2446458880. Throughput: 0: 42809.7. Samples: 2446528540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 14:12:46,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 14:12:48,537][12883] Updated weights for policy 0, policy_version 149323 (0.0028) +[2024-06-18 14:12:51,595][12883] Updated weights for policy 0, policy_version 149333 (0.0035) +[2024-06-18 14:12:51,996][12645] Fps is (10 sec: 42615.4, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 2446671872. Throughput: 0: 42818.5. Samples: 2446784360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 14:12:51,997][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 14:12:56,114][12883] Updated weights for policy 0, policy_version 149343 (0.0036) +[2024-06-18 14:12:56,996][12645] Fps is (10 sec: 42587.3, 60 sec: 42596.5, 300 sec: 42653.6). Total num frames: 2446884864. Throughput: 0: 42990.7. Samples: 2447045000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 14:12:56,997][12645] Avg episode reward: [(0, '0.451')] +[2024-06-18 14:12:59,212][12883] Updated weights for policy 0, policy_version 149353 (0.0044) +[2024-06-18 14:13:01,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 2447097856. Throughput: 0: 42833.6. Samples: 2447167960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 14:13:01,994][12645] Avg episode reward: [(0, '0.630')] +[2024-06-18 14:13:03,664][12883] Updated weights for policy 0, policy_version 149363 (0.0039) +[2024-06-18 14:13:06,994][12645] Fps is (10 sec: 42610.0, 60 sec: 42602.9, 300 sec: 42709.9). Total num frames: 2447310848. Throughput: 0: 42760.5. Samples: 2447421340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 14:13:06,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 14:13:07,052][12883] Updated weights for policy 0, policy_version 149373 (0.0039) +[2024-06-18 14:13:11,277][12883] Updated weights for policy 0, policy_version 149383 (0.0032) +[2024-06-18 14:13:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2447523840. Throughput: 0: 42787.5. Samples: 2447680220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 14:13:11,994][12645] Avg episode reward: [(0, '0.274')] +[2024-06-18 14:13:14,932][12883] Updated weights for policy 0, policy_version 149393 (0.0042) +[2024-06-18 14:13:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2447736832. Throughput: 0: 42796.9. 
Samples: 2447808720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 14:13:16,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 14:13:18,801][12883] Updated weights for policy 0, policy_version 149403 (0.0041) +[2024-06-18 14:13:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2447949824. Throughput: 0: 42643.9. Samples: 2448059420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) +[2024-06-18 14:13:21,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 14:13:22,967][12883] Updated weights for policy 0, policy_version 149413 (0.0035) +[2024-06-18 14:13:26,310][12883] Updated weights for policy 0, policy_version 149423 (0.0045) +[2024-06-18 14:13:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 2448162816. Throughput: 0: 42488.9. Samples: 2448316640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:13:26,994][12645] Avg episode reward: [(0, '0.401')] +[2024-06-18 14:13:30,719][12883] Updated weights for policy 0, policy_version 149433 (0.0044) +[2024-06-18 14:13:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2448359424. Throughput: 0: 42518.2. Samples: 2448441860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:13:31,994][12645] Avg episode reward: [(0, '0.516')] +[2024-06-18 14:13:34,296][12883] Updated weights for policy 0, policy_version 149443 (0.0038) +[2024-06-18 14:13:36,994][12645] Fps is (10 sec: 40961.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2448572416. Throughput: 0: 42442.2. Samples: 2448694160. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:13:36,994][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 14:13:38,319][12883] Updated weights for policy 0, policy_version 149453 (0.0032) +[2024-06-18 14:13:41,794][12883] Updated weights for policy 0, policy_version 149463 (0.0043) +[2024-06-18 14:13:41,998][12645] Fps is (10 sec: 44216.2, 60 sec: 42599.4, 300 sec: 42708.8). Total num frames: 2448801792. Throughput: 0: 42438.5. Samples: 2448954820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:13:41,999][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 14:13:45,965][12883] Updated weights for policy 0, policy_version 149473 (0.0037) +[2024-06-18 14:13:46,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2448998400. Throughput: 0: 42594.3. Samples: 2449084700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:13:46,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 14:13:49,438][12883] Updated weights for policy 0, policy_version 149483 (0.0033) +[2024-06-18 14:13:51,994][12645] Fps is (10 sec: 40979.4, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 2449211392. Throughput: 0: 42468.8. Samples: 2449332440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:13:51,994][12645] Avg episode reward: [(0, '0.625')] +[2024-06-18 14:13:53,709][12883] Updated weights for policy 0, policy_version 149493 (0.0037) +[2024-06-18 14:13:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42327.3, 300 sec: 42653.9). Total num frames: 2449424384. Throughput: 0: 42412.9. Samples: 2449588800. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:13:56,994][12645] Avg episode reward: [(0, '0.626')] +[2024-06-18 14:13:57,376][12883] Updated weights for policy 0, policy_version 149503 (0.0044) +[2024-06-18 14:14:01,541][12883] Updated weights for policy 0, policy_version 149513 (0.0041) +[2024-06-18 14:14:01,994][12645] Fps is (10 sec: 42595.5, 60 sec: 42324.9, 300 sec: 42709.4). Total num frames: 2449637376. Throughput: 0: 42424.7. Samples: 2449717860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:14:01,995][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 14:14:04,979][12883] Updated weights for policy 0, policy_version 149523 (0.0034) +[2024-06-18 14:14:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2449866752. Throughput: 0: 42355.1. Samples: 2449965400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:14:06,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 14:14:09,363][12883] Updated weights for policy 0, policy_version 149533 (0.0032) +[2024-06-18 14:14:11,994][12645] Fps is (10 sec: 42601.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2450063360. Throughput: 0: 42593.0. Samples: 2450233320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:14:11,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 14:14:12,637][12883] Updated weights for policy 0, policy_version 149543 (0.0028) +[2024-06-18 14:14:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2450259968. Throughput: 0: 42514.3. Samples: 2450355000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:14:16,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 14:14:17,095][12883] Updated weights for policy 0, policy_version 149553 (0.0042) +[2024-06-18 14:14:20,339][12883] Updated weights for policy 0, policy_version 149563 (0.0030) +[2024-06-18 14:14:20,578][12862] Signal inference workers to stop experience collection... (35750 times) +[2024-06-18 14:14:20,579][12862] Signal inference workers to resume experience collection... (35750 times) +[2024-06-18 14:14:20,613][12883] InferenceWorker_p0-w0: stopping experience collection (35750 times) +[2024-06-18 14:14:20,613][12883] InferenceWorker_p0-w0: resuming experience collection (35750 times) +[2024-06-18 14:14:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2450505728. Throughput: 0: 42522.1. Samples: 2450607660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:14:21,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 14:14:24,732][12883] Updated weights for policy 0, policy_version 149573 (0.0033) +[2024-06-18 14:14:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 2450685952. Throughput: 0: 42509.4. Samples: 2450867540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:14:26,994][12645] Avg episode reward: [(0, '0.631')] +[2024-06-18 14:14:27,944][12883] Updated weights for policy 0, policy_version 149583 (0.0037) +[2024-06-18 14:14:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2450898944. Throughput: 0: 42353.3. Samples: 2450990600. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) +[2024-06-18 14:14:31,994][12645] Avg episode reward: [(0, '0.614')] +[2024-06-18 14:14:32,514][12883] Updated weights for policy 0, policy_version 149593 (0.0038) +[2024-06-18 14:14:35,562][12883] Updated weights for policy 0, policy_version 149603 (0.0034) +[2024-06-18 14:14:36,994][12645] Fps is (10 sec: 47513.3, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 2451161088. Throughput: 0: 42609.8. Samples: 2451249880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) +[2024-06-18 14:14:36,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 14:14:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000149607_2451161088.pth... +[2024-06-18 14:14:37,094][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148981_2440904704.pth +[2024-06-18 14:14:40,135][12883] Updated weights for policy 0, policy_version 149613 (0.0023) +[2024-06-18 14:14:41,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42055.7, 300 sec: 42598.4). Total num frames: 2451324928. Throughput: 0: 42597.9. Samples: 2451505700. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) +[2024-06-18 14:14:41,994][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 14:14:43,413][12883] Updated weights for policy 0, policy_version 149623 (0.0039) +[2024-06-18 14:14:46,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2451537920. Throughput: 0: 42303.3. Samples: 2451621480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) +[2024-06-18 14:14:46,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 14:14:48,396][12883] Updated weights for policy 0, policy_version 149633 (0.0038) +[2024-06-18 14:14:51,297][12883] Updated weights for policy 0, policy_version 149643 (0.0033) +[2024-06-18 14:14:51,994][12645] Fps is (10 sec: 47512.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2451800064. Throughput: 0: 42504.0. Samples: 2451878080. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) +[2024-06-18 14:14:51,994][12645] Avg episode reward: [(0, '0.253')] +[2024-06-18 14:14:56,323][12883] Updated weights for policy 0, policy_version 149653 (0.0034) +[2024-06-18 14:14:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 2451963904. Throughput: 0: 42290.2. Samples: 2452136380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) +[2024-06-18 14:14:56,994][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 14:14:58,904][12883] Updated weights for policy 0, policy_version 149663 (0.0027) +[2024-06-18 14:15:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.9, 300 sec: 42709.5). Total num frames: 2452193280. Throughput: 0: 42130.2. Samples: 2452250860. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) +[2024-06-18 14:15:01,994][12645] Avg episode reward: [(0, '0.667')] +[2024-06-18 14:15:03,872][12883] Updated weights for policy 0, policy_version 149673 (0.0028) +[2024-06-18 14:15:06,618][12883] Updated weights for policy 0, policy_version 149683 (0.0042) +[2024-06-18 14:15:06,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2452422656. Throughput: 0: 42428.8. Samples: 2452516960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) +[2024-06-18 14:15:06,994][12645] Avg episode reward: [(0, '0.775')] +[2024-06-18 14:15:11,528][12883] Updated weights for policy 0, policy_version 149693 (0.0025) +[2024-06-18 14:15:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42543.8). Total num frames: 2452586496. Throughput: 0: 42462.1. Samples: 2452778340. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) +[2024-06-18 14:15:11,994][12645] Avg episode reward: [(0, '0.787')] +[2024-06-18 14:15:14,320][12883] Updated weights for policy 0, policy_version 149703 (0.0032) +[2024-06-18 14:15:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2452848640. Throughput: 0: 42277.0. 
Samples: 2452893060. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) +[2024-06-18 14:15:16,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 14:15:19,176][12883] Updated weights for policy 0, policy_version 149713 (0.0050) +[2024-06-18 14:15:21,993][12645] Fps is (10 sec: 47514.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2453061632. Throughput: 0: 42395.3. Samples: 2453157660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) +[2024-06-18 14:15:21,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 14:15:22,002][12883] Updated weights for policy 0, policy_version 149723 (0.0041) +[2024-06-18 14:15:26,802][12883] Updated weights for policy 0, policy_version 149733 (0.0033) +[2024-06-18 14:15:26,994][12645] Fps is (10 sec: 37682.8, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 2453225472. Throughput: 0: 42512.3. Samples: 2453418760. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) +[2024-06-18 14:15:26,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 14:15:29,549][12883] Updated weights for policy 0, policy_version 149743 (0.0028) +[2024-06-18 14:15:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2453471232. Throughput: 0: 42464.1. Samples: 2453532360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) +[2024-06-18 14:15:31,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 14:15:34,654][12883] Updated weights for policy 0, policy_version 149753 (0.0033) +[2024-06-18 14:15:36,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2453684224. Throughput: 0: 42553.0. Samples: 2453792960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:15:36,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 14:15:37,558][12883] Updated weights for policy 0, policy_version 149763 (0.0034) +[2024-06-18 14:15:41,994][12645] Fps is (10 sec: 37682.7, 60 sec: 42052.1, 300 sec: 42542.9). Total num frames: 2453848064. 
Throughput: 0: 42428.8. Samples: 2454045680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:15:41,994][12645] Avg episode reward: [(0, '0.730')] +[2024-06-18 14:15:42,373][12883] Updated weights for policy 0, policy_version 149773 (0.0027) +[2024-06-18 14:15:45,404][12883] Updated weights for policy 0, policy_version 149783 (0.0022) +[2024-06-18 14:15:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2454110208. Throughput: 0: 42478.3. Samples: 2454162380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:15:46,994][12645] Avg episode reward: [(0, '0.605')] +[2024-06-18 14:15:49,989][12883] Updated weights for policy 0, policy_version 149793 (0.0039) +[2024-06-18 14:15:51,123][12862] Signal inference workers to stop experience collection... (35800 times) +[2024-06-18 14:15:51,153][12883] InferenceWorker_p0-w0: stopping experience collection (35800 times) +[2024-06-18 14:15:51,171][12862] Signal inference workers to resume experience collection... (35800 times) +[2024-06-18 14:15:51,173][12883] InferenceWorker_p0-w0: resuming experience collection (35800 times) +[2024-06-18 14:15:52,000][12645] Fps is (10 sec: 45846.6, 60 sec: 41774.9, 300 sec: 42597.5). Total num frames: 2454306816. Throughput: 0: 42448.8. Samples: 2454427420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:15:52,001][12645] Avg episode reward: [(0, '0.605')] +[2024-06-18 14:15:53,080][12883] Updated weights for policy 0, policy_version 149803 (0.0036) +[2024-06-18 14:15:56,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2454503424. Throughput: 0: 42282.7. Samples: 2454681060. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:15:56,994][12645] Avg episode reward: [(0, '0.516')] +[2024-06-18 14:15:57,531][12883] Updated weights for policy 0, policy_version 149813 (0.0038) +[2024-06-18 14:16:00,748][12883] Updated weights for policy 0, policy_version 149823 (0.0035) +[2024-06-18 14:16:01,994][12645] Fps is (10 sec: 42625.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2454732800. Throughput: 0: 42597.3. Samples: 2454809940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:16:01,994][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 14:16:05,031][12883] Updated weights for policy 0, policy_version 149833 (0.0038) +[2024-06-18 14:16:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 2454945792. Throughput: 0: 42349.5. Samples: 2455063400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:16:06,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 14:16:08,282][12883] Updated weights for policy 0, policy_version 149843 (0.0033) +[2024-06-18 14:16:11,999][12645] Fps is (10 sec: 40937.2, 60 sec: 42594.5, 300 sec: 42487.3). Total num frames: 2455142400. Throughput: 0: 42121.9. Samples: 2455314480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:16:12,000][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 14:16:12,607][12883] Updated weights for policy 0, policy_version 149853 (0.0046) +[2024-06-18 14:16:15,859][12883] Updated weights for policy 0, policy_version 149863 (0.0032) +[2024-06-18 14:16:16,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 2455355392. Throughput: 0: 42535.5. Samples: 2455446460. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:16:16,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 14:16:20,517][12883] Updated weights for policy 0, policy_version 149873 (0.0035) +[2024-06-18 14:16:21,994][12645] Fps is (10 sec: 42621.8, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 2455568384. Throughput: 0: 42358.6. Samples: 2455699100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:16:21,994][12645] Avg episode reward: [(0, '0.225')] +[2024-06-18 14:16:23,843][12883] Updated weights for policy 0, policy_version 149883 (0.0032) +[2024-06-18 14:16:27,000][12645] Fps is (10 sec: 42571.4, 60 sec: 42594.0, 300 sec: 42542.0). Total num frames: 2455781376. Throughput: 0: 42453.3. Samples: 2455956340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:16:27,000][12645] Avg episode reward: [(0, '0.241')] +[2024-06-18 14:16:28,205][12883] Updated weights for policy 0, policy_version 149893 (0.0026) +[2024-06-18 14:16:31,395][12883] Updated weights for policy 0, policy_version 149903 (0.0037) +[2024-06-18 14:16:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 42543.7). Total num frames: 2456010752. Throughput: 0: 42650.6. Samples: 2456081660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:16:31,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 14:16:36,070][12883] Updated weights for policy 0, policy_version 149913 (0.0029) +[2024-06-18 14:16:36,994][12645] Fps is (10 sec: 42625.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2456207360. Throughput: 0: 42502.0. Samples: 2456339740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:16:36,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 14:16:37,122][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000149916_2456223744.pth... 
+[2024-06-18 14:16:37,183][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000149295_2446049280.pth +[2024-06-18 14:16:39,008][12883] Updated weights for policy 0, policy_version 149923 (0.0037) +[2024-06-18 14:16:41,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42869.9, 300 sec: 42542.5). Total num frames: 2456420352. Throughput: 0: 42343.3. Samples: 2456586600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 14:16:41,996][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 14:16:43,897][12883] Updated weights for policy 0, policy_version 149933 (0.0029) +[2024-06-18 14:16:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2456649728. Throughput: 0: 42348.9. Samples: 2456715640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 14:16:46,994][12645] Avg episode reward: [(0, '0.763')] +[2024-06-18 14:16:47,091][12883] Updated weights for policy 0, policy_version 149943 (0.0031) +[2024-06-18 14:16:51,913][12883] Updated weights for policy 0, policy_version 149953 (0.0037) +[2024-06-18 14:16:51,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42056.6, 300 sec: 42376.2). Total num frames: 2456829952. Throughput: 0: 42243.1. Samples: 2456964340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 14:16:51,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 14:16:54,629][12862] Signal inference workers to stop experience collection... (35850 times) +[2024-06-18 14:16:54,630][12862] Signal inference workers to resume experience collection... (35850 times) +[2024-06-18 14:16:54,656][12883] InferenceWorker_p0-w0: stopping experience collection (35850 times) +[2024-06-18 14:16:54,656][12883] InferenceWorker_p0-w0: resuming experience collection (35850 times) +[2024-06-18 14:16:54,770][12883] Updated weights for policy 0, policy_version 149963 (0.0028) +[2024-06-18 14:16:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42487.3). 
Total num frames: 2457059328. Throughput: 0: 42512.9. Samples: 2457227320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 14:16:56,994][12645] Avg episode reward: [(0, '0.412')] +[2024-06-18 14:16:59,412][12883] Updated weights for policy 0, policy_version 149973 (0.0035) +[2024-06-18 14:17:01,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42488.2). Total num frames: 2457288704. Throughput: 0: 42491.9. Samples: 2457358600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 14:17:01,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 14:17:02,321][12883] Updated weights for policy 0, policy_version 149983 (0.0037) +[2024-06-18 14:17:06,855][12883] Updated weights for policy 0, policy_version 149993 (0.0030) +[2024-06-18 14:17:06,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2457485312. Throughput: 0: 42589.7. Samples: 2457615640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 14:17:06,995][12645] Avg episode reward: [(0, '0.412')] +[2024-06-18 14:17:10,282][12883] Updated weights for policy 0, policy_version 150003 (0.0037) +[2024-06-18 14:17:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42602.3, 300 sec: 42542.9). Total num frames: 2457698304. Throughput: 0: 42506.4. Samples: 2457868860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 14:17:11,994][12645] Avg episode reward: [(0, '0.642')] +[2024-06-18 14:17:14,599][12883] Updated weights for policy 0, policy_version 150013 (0.0038) +[2024-06-18 14:17:16,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2457927680. Throughput: 0: 42646.7. Samples: 2458000760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 14:17:16,994][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 14:17:17,996][12883] Updated weights for policy 0, policy_version 150023 (0.0032) +[2024-06-18 14:17:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42487.3). 
Total num frames: 2458124288. Throughput: 0: 42554.6. Samples: 2458254700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 14:17:21,996][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 14:17:22,046][12883] Updated weights for policy 0, policy_version 150033 (0.0029) +[2024-06-18 14:17:25,507][12883] Updated weights for policy 0, policy_version 150043 (0.0031) +[2024-06-18 14:17:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42876.0, 300 sec: 42542.9). Total num frames: 2458353664. Throughput: 0: 42900.4. Samples: 2458517020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 14:17:26,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 14:17:29,600][12883] Updated weights for policy 0, policy_version 150053 (0.0037) +[2024-06-18 14:17:31,996][12645] Fps is (10 sec: 45865.2, 60 sec: 42869.9, 300 sec: 42542.5). Total num frames: 2458583040. Throughput: 0: 42896.9. Samples: 2458646100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 14:17:31,996][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 14:17:33,124][12883] Updated weights for policy 0, policy_version 150063 (0.0040) +[2024-06-18 14:17:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42432.7). Total num frames: 2458763264. Throughput: 0: 43056.5. Samples: 2458901880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) +[2024-06-18 14:17:36,994][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 14:17:37,259][12883] Updated weights for policy 0, policy_version 150073 (0.0033) +[2024-06-18 14:17:40,906][12883] Updated weights for policy 0, policy_version 150083 (0.0045) +[2024-06-18 14:17:41,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42873.0, 300 sec: 42487.3). Total num frames: 2458992640. Throughput: 0: 42846.5. Samples: 2459155420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:17:41,994][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 14:17:44,824][12883] Updated weights for policy 0, policy_version 150093 (0.0045) +[2024-06-18 14:17:46,994][12645] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42598.7). Total num frames: 2459238400. Throughput: 0: 42832.4. Samples: 2459286060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:17:46,994][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 14:17:48,774][12883] Updated weights for policy 0, policy_version 150103 (0.0035) +[2024-06-18 14:17:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42487.7). Total num frames: 2459418624. Throughput: 0: 42746.6. Samples: 2459539240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:17:51,994][12645] Avg episode reward: [(0, '0.627')] +[2024-06-18 14:17:52,429][12883] Updated weights for policy 0, policy_version 150113 (0.0034) +[2024-06-18 14:17:56,825][12883] Updated weights for policy 0, policy_version 150123 (0.0039) +[2024-06-18 14:17:56,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42869.8, 300 sec: 42487.0). Total num frames: 2459631616. Throughput: 0: 42800.5. Samples: 2459794980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:17:56,996][12645] Avg episode reward: [(0, '0.270')] +[2024-06-18 14:18:00,481][12883] Updated weights for policy 0, policy_version 150133 (0.0037) +[2024-06-18 14:18:01,994][12645] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2459877376. Throughput: 0: 42624.4. Samples: 2459918860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:18:01,999][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 14:18:04,450][12883] Updated weights for policy 0, policy_version 150143 (0.0036) +[2024-06-18 14:18:06,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2460057600. Throughput: 0: 42819.1. 
Samples: 2460181560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:18:06,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 14:18:08,096][12883] Updated weights for policy 0, policy_version 150153 (0.0031) +[2024-06-18 14:18:11,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2460254208. Throughput: 0: 42642.6. Samples: 2460435940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:18:11,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 14:18:12,199][12883] Updated weights for policy 0, policy_version 150163 (0.0038) +[2024-06-18 14:18:12,668][12862] Signal inference workers to stop experience collection... (35900 times) +[2024-06-18 14:18:12,669][12862] Signal inference workers to resume experience collection... (35900 times) +[2024-06-18 14:18:12,698][12883] InferenceWorker_p0-w0: stopping experience collection (35900 times) +[2024-06-18 14:18:12,699][12883] InferenceWorker_p0-w0: resuming experience collection (35900 times) +[2024-06-18 14:18:15,833][12883] Updated weights for policy 0, policy_version 150173 (0.0029) +[2024-06-18 14:18:16,994][12645] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 2460532736. Throughput: 0: 42511.0. Samples: 2460559000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:18:16,994][12645] Avg episode reward: [(0, '0.400')] +[2024-06-18 14:18:20,122][12883] Updated weights for policy 0, policy_version 150183 (0.0046) +[2024-06-18 14:18:21,994][12645] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42542.9). Total num frames: 2460712960. Throughput: 0: 42630.9. Samples: 2460820280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:18:21,994][12645] Avg episode reward: [(0, '0.508')] +[2024-06-18 14:18:23,428][12883] Updated weights for policy 0, policy_version 150193 (0.0026) +[2024-06-18 14:18:26,994][12645] Fps is (10 sec: 36045.2, 60 sec: 42325.4, 300 sec: 42487.4). 
Total num frames: 2460893184. Throughput: 0: 42651.7. Samples: 2461074740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:18:26,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 14:18:27,754][12883] Updated weights for policy 0, policy_version 150203 (0.0034) +[2024-06-18 14:18:31,177][12883] Updated weights for policy 0, policy_version 150213 (0.0036) +[2024-06-18 14:18:31,994][12645] Fps is (10 sec: 45876.2, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 2461171712. Throughput: 0: 42509.8. Samples: 2461199000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:18:31,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 14:18:35,567][12883] Updated weights for policy 0, policy_version 150223 (0.0033) +[2024-06-18 14:18:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42432.5). Total num frames: 2461319168. Throughput: 0: 42695.8. Samples: 2461460540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:18:36,994][12645] Avg episode reward: [(0, '0.768')] +[2024-06-18 14:18:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000150227_2461319168.pth... +[2024-06-18 14:18:37,072][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000149607_2451161088.pth +[2024-06-18 14:18:38,843][12883] Updated weights for policy 0, policy_version 150233 (0.0034) +[2024-06-18 14:18:41,993][12645] Fps is (10 sec: 37683.6, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 2461548544. Throughput: 0: 42400.0. Samples: 2461702880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 14:18:41,994][12645] Avg episode reward: [(0, '0.558')] +[2024-06-18 14:18:43,354][12883] Updated weights for policy 0, policy_version 150243 (0.0029) +[2024-06-18 14:18:46,386][12883] Updated weights for policy 0, policy_version 150253 (0.0039) +[2024-06-18 14:18:46,996][12645] Fps is (10 sec: 45864.5, 60 sec: 42323.8, 300 sec: 42598.1). 
Total num frames: 2461777920. Throughput: 0: 42689.9. Samples: 2461840000. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) +[2024-06-18 14:18:46,996][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 14:18:50,919][12883] Updated weights for policy 0, policy_version 150263 (0.0044) +[2024-06-18 14:18:51,994][12645] Fps is (10 sec: 39320.5, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 2461941760. Throughput: 0: 42533.7. Samples: 2462095580. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) +[2024-06-18 14:18:51,994][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 14:18:53,866][12883] Updated weights for policy 0, policy_version 150273 (0.0027) +[2024-06-18 14:18:56,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42873.1, 300 sec: 42598.5). Total num frames: 2462203904. Throughput: 0: 42333.8. Samples: 2462340960. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) +[2024-06-18 14:18:56,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 14:18:58,840][12883] Updated weights for policy 0, policy_version 150283 (0.0044) +[2024-06-18 14:19:01,483][12883] Updated weights for policy 0, policy_version 150293 (0.0024) +[2024-06-18 14:19:01,993][12645] Fps is (10 sec: 47515.0, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2462416896. Throughput: 0: 42726.8. Samples: 2462481700. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) +[2024-06-18 14:19:01,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 14:19:03,148][12862] Signal inference workers to stop experience collection... (35950 times) +[2024-06-18 14:19:03,192][12883] InferenceWorker_p0-w0: stopping experience collection (35950 times) +[2024-06-18 14:19:03,197][12862] Signal inference workers to resume experience collection... 
(35950 times) +[2024-06-18 14:19:03,203][12883] InferenceWorker_p0-w0: resuming experience collection (35950 times) +[2024-06-18 14:19:06,603][12883] Updated weights for policy 0, policy_version 150303 (0.0031) +[2024-06-18 14:19:06,996][12645] Fps is (10 sec: 37674.7, 60 sec: 42050.8, 300 sec: 42431.5). Total num frames: 2462580736. Throughput: 0: 42572.3. Samples: 2462736120. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) +[2024-06-18 14:19:06,997][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 14:19:09,097][12883] Updated weights for policy 0, policy_version 150313 (0.0026) +[2024-06-18 14:19:11,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2462842880. Throughput: 0: 42394.6. Samples: 2462982500. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) +[2024-06-18 14:19:11,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 14:19:14,121][12883] Updated weights for policy 0, policy_version 150323 (0.0028) +[2024-06-18 14:19:16,786][12883] Updated weights for policy 0, policy_version 150333 (0.0038) +[2024-06-18 14:19:16,994][12645] Fps is (10 sec: 49162.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2463072256. Throughput: 0: 42870.5. Samples: 2463128180. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) +[2024-06-18 14:19:16,994][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 14:19:21,727][12883] Updated weights for policy 0, policy_version 150343 (0.0030) +[2024-06-18 14:19:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41779.4, 300 sec: 42487.3). Total num frames: 2463219712. Throughput: 0: 42622.7. Samples: 2463378560. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) +[2024-06-18 14:19:21,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 14:19:24,438][12883] Updated weights for policy 0, policy_version 150353 (0.0033) +[2024-06-18 14:19:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2463498240. Throughput: 0: 42645.6. Samples: 2463621940. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) +[2024-06-18 14:19:26,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 14:19:29,371][12883] Updated weights for policy 0, policy_version 150363 (0.0037) +[2024-06-18 14:19:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 2463678464. Throughput: 0: 42754.2. Samples: 2463763840. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) +[2024-06-18 14:19:31,994][12645] Avg episode reward: [(0, '0.688')] +[2024-06-18 14:19:32,222][12883] Updated weights for policy 0, policy_version 150373 (0.0032) +[2024-06-18 14:19:37,000][12645] Fps is (10 sec: 36022.6, 60 sec: 42320.9, 300 sec: 42486.4). Total num frames: 2463858688. Throughput: 0: 42442.2. Samples: 2464005740. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) +[2024-06-18 14:19:37,000][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 14:19:37,092][12883] Updated weights for policy 0, policy_version 150383 (0.0036) +[2024-06-18 14:19:39,859][12883] Updated weights for policy 0, policy_version 150393 (0.0030) +[2024-06-18 14:19:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2464120832. Throughput: 0: 42612.9. Samples: 2464258540. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) +[2024-06-18 14:19:41,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 14:19:44,699][12883] Updated weights for policy 0, policy_version 150403 (0.0025) +[2024-06-18 14:19:46,994][12645] Fps is (10 sec: 47543.6, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 2464333824. Throughput: 0: 42665.2. Samples: 2464401640. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) +[2024-06-18 14:19:46,994][12645] Avg episode reward: [(0, '0.607')] +[2024-06-18 14:19:47,568][12883] Updated weights for policy 0, policy_version 150413 (0.0033) +[2024-06-18 14:19:51,994][12645] Fps is (10 sec: 37682.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2464497664. Throughput: 0: 42564.7. Samples: 2464651440. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 14:19:51,994][12645] Avg episode reward: [(0, '0.742')] +[2024-06-18 14:19:52,312][12883] Updated weights for policy 0, policy_version 150423 (0.0042) +[2024-06-18 14:19:55,044][12883] Updated weights for policy 0, policy_version 150433 (0.0028) +[2024-06-18 14:19:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2464759808. Throughput: 0: 42671.5. Samples: 2464902720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 14:19:56,994][12645] Avg episode reward: [(0, '0.649')] +[2024-06-18 14:20:00,055][12883] Updated weights for policy 0, policy_version 150443 (0.0023) +[2024-06-18 14:20:00,679][12862] Signal inference workers to stop experience collection... (36000 times) +[2024-06-18 14:20:00,679][12862] Signal inference workers to resume experience collection... (36000 times) +[2024-06-18 14:20:00,723][12883] InferenceWorker_p0-w0: stopping experience collection (36000 times) +[2024-06-18 14:20:00,724][12883] InferenceWorker_p0-w0: resuming experience collection (36000 times) +[2024-06-18 14:20:01,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2464972800. Throughput: 0: 42522.8. Samples: 2465041700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 14:20:01,994][12645] Avg episode reward: [(0, '0.705')] +[2024-06-18 14:20:02,559][12883] Updated weights for policy 0, policy_version 150453 (0.0027) +[2024-06-18 14:20:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43146.2, 300 sec: 42654.0). Total num frames: 2465169408. Throughput: 0: 42561.7. Samples: 2465293840. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 14:20:06,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 14:20:07,719][12883] Updated weights for policy 0, policy_version 150463 (0.0030) +[2024-06-18 14:20:10,492][12883] Updated weights for policy 0, policy_version 150473 (0.0049) +[2024-06-18 14:20:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2465398784. Throughput: 0: 42741.0. Samples: 2465545280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 14:20:11,994][12645] Avg episode reward: [(0, '0.555')] +[2024-06-18 14:20:15,353][12883] Updated weights for policy 0, policy_version 150483 (0.0042) +[2024-06-18 14:20:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2465595392. Throughput: 0: 42477.6. Samples: 2465675340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 14:20:16,994][12645] Avg episode reward: [(0, '0.537')] +[2024-06-18 14:20:18,189][12883] Updated weights for policy 0, policy_version 150493 (0.0037) +[2024-06-18 14:20:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2465808384. Throughput: 0: 42777.9. Samples: 2465930480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 14:20:21,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 14:20:23,056][12883] Updated weights for policy 0, policy_version 150503 (0.0035) +[2024-06-18 14:20:25,866][12883] Updated weights for policy 0, policy_version 150513 (0.0030) +[2024-06-18 14:20:26,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2466054144. Throughput: 0: 42696.8. Samples: 2466179900. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 14:20:26,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 14:20:30,842][12883] Updated weights for policy 0, policy_version 150523 (0.0032) +[2024-06-18 14:20:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2466234368. Throughput: 0: 42490.7. Samples: 2466313720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 14:20:31,994][12645] Avg episode reward: [(0, '0.627')] +[2024-06-18 14:20:33,570][12883] Updated weights for policy 0, policy_version 150533 (0.0039) +[2024-06-18 14:20:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43422.1, 300 sec: 42765.0). Total num frames: 2466463744. Throughput: 0: 42670.7. Samples: 2466571620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 14:20:36,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 14:20:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000150541_2466463744.pth... +[2024-06-18 14:20:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000149916_2456223744.pth +[2024-06-18 14:20:38,552][12883] Updated weights for policy 0, policy_version 150543 (0.0042) +[2024-06-18 14:20:41,346][12883] Updated weights for policy 0, policy_version 150553 (0.0032) +[2024-06-18 14:20:41,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2466693120. Throughput: 0: 42578.2. Samples: 2466818740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 14:20:41,994][12645] Avg episode reward: [(0, '0.717')] +[2024-06-18 14:20:46,338][12883] Updated weights for policy 0, policy_version 150563 (0.0044) +[2024-06-18 14:20:46,996][12645] Fps is (10 sec: 39313.0, 60 sec: 42050.7, 300 sec: 42543.4). Total num frames: 2466856960. Throughput: 0: 42306.8. Samples: 2466945600. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) +[2024-06-18 14:20:46,996][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 14:20:49,262][12883] Updated weights for policy 0, policy_version 150573 (0.0036) +[2024-06-18 14:20:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2467086336. Throughput: 0: 42422.6. Samples: 2467202860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 14:20:51,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 14:20:53,978][12883] Updated weights for policy 0, policy_version 150583 (0.0029) +[2024-06-18 14:20:56,787][12883] Updated weights for policy 0, policy_version 150593 (0.0051) +[2024-06-18 14:20:56,994][12645] Fps is (10 sec: 45885.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2467315712. Throughput: 0: 42411.1. Samples: 2467453780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 14:20:56,994][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 14:21:01,828][12883] Updated weights for policy 0, policy_version 150603 (0.0045) +[2024-06-18 14:21:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 2467479552. Throughput: 0: 42587.2. Samples: 2467591760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 14:21:01,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 14:21:04,393][12883] Updated weights for policy 0, policy_version 150613 (0.0031) +[2024-06-18 14:21:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42654.7). Total num frames: 2467725312. Throughput: 0: 42543.1. Samples: 2467844920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 14:21:06,994][12645] Avg episode reward: [(0, '0.303')] +[2024-06-18 14:21:09,326][12883] Updated weights for policy 0, policy_version 150623 (0.0037) +[2024-06-18 14:21:10,883][12862] Signal inference workers to stop experience collection... 
(36050 times) +[2024-06-18 14:21:10,934][12862] Signal inference workers to resume experience collection... (36050 times) +[2024-06-18 14:21:10,935][12883] InferenceWorker_p0-w0: stopping experience collection (36050 times) +[2024-06-18 14:21:10,950][12883] InferenceWorker_p0-w0: resuming experience collection (36050 times) +[2024-06-18 14:21:11,895][12883] Updated weights for policy 0, policy_version 150633 (0.0021) +[2024-06-18 14:21:11,994][12645] Fps is (10 sec: 49152.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2467971072. Throughput: 0: 42715.6. Samples: 2468102100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 14:21:11,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 14:21:16,922][12883] Updated weights for policy 0, policy_version 150643 (0.0034) +[2024-06-18 14:21:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2468134912. Throughput: 0: 42715.1. Samples: 2468235900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 14:21:16,994][12645] Avg episode reward: [(0, '0.597')] +[2024-06-18 14:21:19,417][12883] Updated weights for policy 0, policy_version 150653 (0.0032) +[2024-06-18 14:21:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 2468380672. Throughput: 0: 42513.3. Samples: 2468484720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 14:21:21,995][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 14:21:24,487][12883] Updated weights for policy 0, policy_version 150663 (0.0032) +[2024-06-18 14:21:26,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2468610048. Throughput: 0: 42719.6. Samples: 2468741120. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 14:21:26,994][12645] Avg episode reward: [(0, '0.520')] +[2024-06-18 14:21:27,070][12883] Updated weights for policy 0, policy_version 150673 (0.0027) +[2024-06-18 14:21:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2468773888. Throughput: 0: 42768.8. Samples: 2468870100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 14:21:32,000][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 14:21:32,219][12883] Updated weights for policy 0, policy_version 150683 (0.0037) +[2024-06-18 14:21:34,926][12883] Updated weights for policy 0, policy_version 150693 (0.0042) +[2024-06-18 14:21:36,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 2469036032. Throughput: 0: 42740.8. Samples: 2469126200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 14:21:36,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 14:21:39,724][12883] Updated weights for policy 0, policy_version 150703 (0.0036) +[2024-06-18 14:21:41,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2469232640. Throughput: 0: 42932.6. Samples: 2469385740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 14:21:41,994][12645] Avg episode reward: [(0, '0.249')] +[2024-06-18 14:21:42,917][12883] Updated weights for policy 0, policy_version 150713 (0.0046) +[2024-06-18 14:21:46,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42599.9, 300 sec: 42653.9). Total num frames: 2469412864. Throughput: 0: 42620.8. Samples: 2469509700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 14:21:46,994][12645] Avg episode reward: [(0, '0.628')] +[2024-06-18 14:21:47,583][12883] Updated weights for policy 0, policy_version 150723 (0.0038) +[2024-06-18 14:21:50,648][12883] Updated weights for policy 0, policy_version 150733 (0.0038) +[2024-06-18 14:21:51,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2469658624. Throughput: 0: 42787.1. Samples: 2469770340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) +[2024-06-18 14:21:51,994][12645] Avg episode reward: [(0, '0.636')] +[2024-06-18 14:21:54,978][12883] Updated weights for policy 0, policy_version 150743 (0.0031) +[2024-06-18 14:21:56,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2469871616. Throughput: 0: 42776.5. Samples: 2470027040. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-18 14:21:56,994][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 14:21:58,571][12883] Updated weights for policy 0, policy_version 150753 (0.0041) +[2024-06-18 14:22:01,996][12645] Fps is (10 sec: 39313.4, 60 sec: 42870.0, 300 sec: 42598.1). Total num frames: 2470051840. Throughput: 0: 42654.4. Samples: 2470155440. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-18 14:22:01,996][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 14:22:02,843][12883] Updated weights for policy 0, policy_version 150763 (0.0032) +[2024-06-18 14:22:05,993][12883] Updated weights for policy 0, policy_version 150773 (0.0038) +[2024-06-18 14:22:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2470297600. Throughput: 0: 42861.4. Samples: 2470413480. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-18 14:22:06,994][12645] Avg episode reward: [(0, '0.473')] +[2024-06-18 14:22:10,437][12883] Updated weights for policy 0, policy_version 150783 (0.0025) +[2024-06-18 14:22:11,994][12645] Fps is (10 sec: 45884.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2470510592. Throughput: 0: 42739.4. Samples: 2470664400. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-18 14:22:11,994][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 14:22:13,369][12862] Signal inference workers to stop experience collection... (36100 times) +[2024-06-18 14:22:13,370][12862] Signal inference workers to resume experience collection... (36100 times) +[2024-06-18 14:22:13,380][12883] InferenceWorker_p0-w0: stopping experience collection (36100 times) +[2024-06-18 14:22:13,380][12883] InferenceWorker_p0-w0: resuming experience collection (36100 times) +[2024-06-18 14:22:13,518][12883] Updated weights for policy 0, policy_version 150793 (0.0031) +[2024-06-18 14:22:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2470707200. Throughput: 0: 42802.7. Samples: 2470796220. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-18 14:22:16,994][12645] Avg episode reward: [(0, '0.547')] +[2024-06-18 14:22:17,907][12883] Updated weights for policy 0, policy_version 150803 (0.0035) +[2024-06-18 14:22:21,699][12883] Updated weights for policy 0, policy_version 150813 (0.0032) +[2024-06-18 14:22:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2470936576. Throughput: 0: 42797.9. Samples: 2471052100. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-18 14:22:21,994][12645] Avg episode reward: [(0, '0.713')] +[2024-06-18 14:22:25,397][12883] Updated weights for policy 0, policy_version 150823 (0.0036) +[2024-06-18 14:22:26,994][12645] Fps is (10 sec: 45873.9, 60 sec: 42598.2, 300 sec: 42654.2). Total num frames: 2471165952. 
Throughput: 0: 42740.6. Samples: 2471309080. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-18 14:22:26,994][12645] Avg episode reward: [(0, '0.604')] +[2024-06-18 14:22:29,347][12883] Updated weights for policy 0, policy_version 150833 (0.0030) +[2024-06-18 14:22:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2471362560. Throughput: 0: 42817.4. Samples: 2471436480. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-18 14:22:31,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 14:22:33,236][12883] Updated weights for policy 0, policy_version 150843 (0.0035) +[2024-06-18 14:22:36,843][12883] Updated weights for policy 0, policy_version 150853 (0.0042) +[2024-06-18 14:22:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2471575552. Throughput: 0: 42761.4. Samples: 2471694600. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-18 14:22:36,994][12645] Avg episode reward: [(0, '0.571')] +[2024-06-18 14:22:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000150853_2471575552.pth... +[2024-06-18 14:22:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000150227_2461319168.pth +[2024-06-18 14:22:40,938][12883] Updated weights for policy 0, policy_version 150863 (0.0032) +[2024-06-18 14:22:41,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 2471804928. Throughput: 0: 42784.0. Samples: 2471952420. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-18 14:22:41,996][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 14:22:44,499][12883] Updated weights for policy 0, policy_version 150873 (0.0031) +[2024-06-18 14:22:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2472001536. Throughput: 0: 42736.2. Samples: 2472078480. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-18 14:22:46,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 14:22:48,479][12883] Updated weights for policy 0, policy_version 150883 (0.0045) +[2024-06-18 14:22:51,994][12645] Fps is (10 sec: 39330.5, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 2472198144. Throughput: 0: 42766.2. Samples: 2472337960. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-18 14:22:51,994][12645] Avg episode reward: [(0, '0.463')] +[2024-06-18 14:22:52,169][12883] Updated weights for policy 0, policy_version 150893 (0.0037) +[2024-06-18 14:22:56,089][12883] Updated weights for policy 0, policy_version 150903 (0.0030) +[2024-06-18 14:22:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2472427520. Throughput: 0: 42782.7. Samples: 2472589620. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) +[2024-06-18 14:22:56,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 14:23:00,039][12883] Updated weights for policy 0, policy_version 150913 (0.0037) +[2024-06-18 14:23:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 2472640512. Throughput: 0: 42816.8. Samples: 2472722980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 14:23:01,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 14:23:03,658][12883] Updated weights for policy 0, policy_version 150923 (0.0041) +[2024-06-18 14:23:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2472853504. Throughput: 0: 42689.7. Samples: 2472973140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 14:23:06,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 14:23:07,650][12883] Updated weights for policy 0, policy_version 150933 (0.0027) +[2024-06-18 14:23:11,277][12883] Updated weights for policy 0, policy_version 150943 (0.0047) +[2024-06-18 14:23:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2473082880. Throughput: 0: 42692.5. Samples: 2473230240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 14:23:11,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 14:23:15,311][12883] Updated weights for policy 0, policy_version 150953 (0.0038) +[2024-06-18 14:23:17,000][12645] Fps is (10 sec: 42572.1, 60 sec: 42866.9, 300 sec: 42597.5). Total num frames: 2473279488. Throughput: 0: 42734.5. Samples: 2473359800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 14:23:17,001][12645] Avg episode reward: [(0, '0.649')] +[2024-06-18 14:23:18,850][12883] Updated weights for policy 0, policy_version 150963 (0.0051) +[2024-06-18 14:23:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2473492480. Throughput: 0: 42579.6. Samples: 2473610680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 14:23:21,994][12645] Avg episode reward: [(0, '0.789')] +[2024-06-18 14:23:22,825][12862] Signal inference workers to stop experience collection... (36150 times) +[2024-06-18 14:23:22,826][12862] Signal inference workers to resume experience collection... 
(36150 times) +[2024-06-18 14:23:22,852][12883] InferenceWorker_p0-w0: stopping experience collection (36150 times) +[2024-06-18 14:23:22,852][12883] InferenceWorker_p0-w0: resuming experience collection (36150 times) +[2024-06-18 14:23:23,003][12883] Updated weights for policy 0, policy_version 150973 (0.0018) +[2024-06-18 14:23:26,902][12883] Updated weights for policy 0, policy_version 150983 (0.0032) +[2024-06-18 14:23:26,994][12645] Fps is (10 sec: 42625.1, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 2473705472. Throughput: 0: 42572.3. Samples: 2473868080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 14:23:26,994][12645] Avg episode reward: [(0, '0.662')] +[2024-06-18 14:23:30,710][12883] Updated weights for policy 0, policy_version 150993 (0.0027) +[2024-06-18 14:23:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2473918464. Throughput: 0: 42485.0. Samples: 2473990300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 14:23:31,994][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 14:23:34,606][12883] Updated weights for policy 0, policy_version 151003 (0.0025) +[2024-06-18 14:23:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2474131456. Throughput: 0: 42376.0. Samples: 2474244880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 14:23:36,994][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 14:23:38,234][12883] Updated weights for policy 0, policy_version 151013 (0.0034) +[2024-06-18 14:23:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42053.9, 300 sec: 42543.2). Total num frames: 2474328064. Throughput: 0: 42557.9. Samples: 2474504720. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 14:23:41,994][12645] Avg episode reward: [(0, '0.590')] +[2024-06-18 14:23:42,318][12883] Updated weights for policy 0, policy_version 151023 (0.0042) +[2024-06-18 14:23:46,258][12883] Updated weights for policy 0, policy_version 151033 (0.0034) +[2024-06-18 14:23:46,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2474557440. Throughput: 0: 42409.2. Samples: 2474631400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 14:23:46,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 14:23:49,900][12883] Updated weights for policy 0, policy_version 151043 (0.0031) +[2024-06-18 14:23:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2474770432. Throughput: 0: 42339.2. Samples: 2474878400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 14:23:51,994][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 14:23:53,913][12883] Updated weights for policy 0, policy_version 151053 (0.0027) +[2024-06-18 14:23:56,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42596.8, 300 sec: 42598.0). Total num frames: 2474983424. Throughput: 0: 42279.3. Samples: 2475132900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 14:23:56,997][12645] Avg episode reward: [(0, '0.632')] +[2024-06-18 14:23:57,584][12883] Updated weights for policy 0, policy_version 151063 (0.0038) +[2024-06-18 14:24:01,689][12883] Updated weights for policy 0, policy_version 151073 (0.0041) +[2024-06-18 14:24:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 2475180032. Throughput: 0: 42196.5. Samples: 2475258380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 14:24:01,994][12645] Avg episode reward: [(0, '0.702')] +[2024-06-18 14:24:05,222][12883] Updated weights for policy 0, policy_version 151083 (0.0031) +[2024-06-18 14:24:06,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2475409408. Throughput: 0: 42314.2. Samples: 2475514820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:24:06,994][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 14:24:09,340][12883] Updated weights for policy 0, policy_version 151093 (0.0041) +[2024-06-18 14:24:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2475622400. Throughput: 0: 42429.7. Samples: 2475777420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:24:11,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 14:24:12,967][12883] Updated weights for policy 0, policy_version 151103 (0.0032) +[2024-06-18 14:24:16,960][12883] Updated weights for policy 0, policy_version 151113 (0.0030) +[2024-06-18 14:24:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42602.8, 300 sec: 42765.0). Total num frames: 2475835392. Throughput: 0: 42436.4. Samples: 2475899940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:24:16,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 14:24:20,442][12883] Updated weights for policy 0, policy_version 151123 (0.0035) +[2024-06-18 14:24:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2476048384. Throughput: 0: 42415.0. Samples: 2476153560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:24:21,995][12645] Avg episode reward: [(0, '0.274')] +[2024-06-18 14:24:24,458][12883] Updated weights for policy 0, policy_version 151133 (0.0027) +[2024-06-18 14:24:26,999][12645] Fps is (10 sec: 42574.9, 60 sec: 42594.4, 300 sec: 42653.1). Total num frames: 2476261376. Throughput: 0: 42476.9. 
Samples: 2476416420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:24:27,000][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 14:24:28,113][12883] Updated weights for policy 0, policy_version 151143 (0.0034) +[2024-06-18 14:24:31,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42765.9). Total num frames: 2476474368. Throughput: 0: 42358.9. Samples: 2476537540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:24:31,994][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 14:24:32,088][12883] Updated weights for policy 0, policy_version 151153 (0.0038) +[2024-06-18 14:24:35,728][12883] Updated weights for policy 0, policy_version 151163 (0.0031) +[2024-06-18 14:24:36,994][12645] Fps is (10 sec: 40983.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2476670976. Throughput: 0: 42588.5. Samples: 2476794880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:24:36,994][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 14:24:37,169][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000151166_2476703744.pth... +[2024-06-18 14:24:37,226][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000150541_2466463744.pth +[2024-06-18 14:24:39,933][12883] Updated weights for policy 0, policy_version 151173 (0.0036) +[2024-06-18 14:24:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2476883968. Throughput: 0: 42779.9. Samples: 2477057900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:24:41,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 14:24:43,627][12883] Updated weights for policy 0, policy_version 151183 (0.0048) +[2024-06-18 14:24:44,352][12862] Signal inference workers to stop experience collection... (36200 times) +[2024-06-18 14:24:44,352][12862] Signal inference workers to resume experience collection... 
(36200 times) +[2024-06-18 14:24:44,369][12883] InferenceWorker_p0-w0: stopping experience collection (36200 times) +[2024-06-18 14:24:44,369][12883] InferenceWorker_p0-w0: resuming experience collection (36200 times) +[2024-06-18 14:24:46,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2477113344. Throughput: 0: 42738.2. Samples: 2477181600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:24:46,994][12645] Avg episode reward: [(0, '0.520')] +[2024-06-18 14:24:47,374][12883] Updated weights for policy 0, policy_version 151193 (0.0028) +[2024-06-18 14:24:51,162][12883] Updated weights for policy 0, policy_version 151203 (0.0041) +[2024-06-18 14:24:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2477326336. Throughput: 0: 42721.8. Samples: 2477437300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:24:51,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 14:24:55,215][12883] Updated weights for policy 0, policy_version 151213 (0.0027) +[2024-06-18 14:24:56,996][12645] Fps is (10 sec: 42589.4, 60 sec: 42598.4, 300 sec: 42598.1). Total num frames: 2477539328. Throughput: 0: 42735.7. Samples: 2477700620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:24:56,996][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 14:24:58,776][12883] Updated weights for policy 0, policy_version 151223 (0.0031) +[2024-06-18 14:25:01,998][12645] Fps is (10 sec: 42581.4, 60 sec: 42868.7, 300 sec: 42653.4). Total num frames: 2477752320. Throughput: 0: 42692.7. Samples: 2477821280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) +[2024-06-18 14:25:01,998][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 14:25:03,038][12883] Updated weights for policy 0, policy_version 151233 (0.0029) +[2024-06-18 14:25:06,470][12883] Updated weights for policy 0, policy_version 151243 (0.0033) +[2024-06-18 14:25:06,996][12645] Fps is (10 sec: 42598.3, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 2477965312. Throughput: 0: 42786.8. Samples: 2478079060. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-18 14:25:06,996][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 14:25:11,050][12883] Updated weights for policy 0, policy_version 151253 (0.0045) +[2024-06-18 14:25:11,996][12645] Fps is (10 sec: 40967.1, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 2478161920. Throughput: 0: 42691.2. Samples: 2478337380. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-18 14:25:11,997][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 14:25:14,259][12883] Updated weights for policy 0, policy_version 151263 (0.0027) +[2024-06-18 14:25:17,000][12645] Fps is (10 sec: 44218.9, 60 sec: 42867.0, 300 sec: 42708.6). Total num frames: 2478407680. Throughput: 0: 42785.9. Samples: 2478463180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-18 14:25:17,000][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 14:25:18,810][12883] Updated weights for policy 0, policy_version 151273 (0.0032) +[2024-06-18 14:25:21,881][12883] Updated weights for policy 0, policy_version 151283 (0.0047) +[2024-06-18 14:25:21,998][12645] Fps is (10 sec: 45867.7, 60 sec: 42868.7, 300 sec: 42597.8). Total num frames: 2478620672. Throughput: 0: 42778.5. Samples: 2478720080. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-18 14:25:21,998][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 14:25:26,710][12883] Updated weights for policy 0, policy_version 151293 (0.0038) +[2024-06-18 14:25:26,994][12645] Fps is (10 sec: 39346.3, 60 sec: 42329.3, 300 sec: 42598.4). Total num frames: 2478800896. Throughput: 0: 42852.4. Samples: 2478986260. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-18 14:25:26,994][12645] Avg episode reward: [(0, '0.601')] +[2024-06-18 14:25:29,802][12883] Updated weights for policy 0, policy_version 151303 (0.0032) +[2024-06-18 14:25:31,994][12645] Fps is (10 sec: 44254.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2479063040. Throughput: 0: 42793.5. Samples: 2479107300. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-18 14:25:31,994][12645] Avg episode reward: [(0, '0.651')] +[2024-06-18 14:25:34,380][12883] Updated weights for policy 0, policy_version 151313 (0.0034) +[2024-06-18 14:25:36,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2479259648. Throughput: 0: 42777.3. Samples: 2479362280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-18 14:25:36,994][12645] Avg episode reward: [(0, '0.605')] +[2024-06-18 14:25:37,286][12883] Updated weights for policy 0, policy_version 151323 (0.0036) +[2024-06-18 14:25:41,918][12883] Updated weights for policy 0, policy_version 151333 (0.0037) +[2024-06-18 14:25:41,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 2479439872. Throughput: 0: 43043.5. Samples: 2479637480. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-18 14:25:41,994][12645] Avg episode reward: [(0, '0.632')] +[2024-06-18 14:25:44,732][12883] Updated weights for policy 0, policy_version 151343 (0.0039) +[2024-06-18 14:25:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2479702016. Throughput: 0: 42923.8. 
Samples: 2479752680. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-18 14:25:46,994][12645] Avg episode reward: [(0, '0.579')] +[2024-06-18 14:25:49,468][12883] Updated weights for policy 0, policy_version 151353 (0.0033) +[2024-06-18 14:25:51,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2479898624. Throughput: 0: 43030.1. Samples: 2480015320. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-18 14:25:51,994][12645] Avg episode reward: [(0, '0.574')] +[2024-06-18 14:25:52,531][12883] Updated weights for policy 0, policy_version 151363 (0.0026) +[2024-06-18 14:25:56,994][12645] Fps is (10 sec: 36044.7, 60 sec: 42053.8, 300 sec: 42653.9). Total num frames: 2480062464. Throughput: 0: 43075.5. Samples: 2480275680. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-18 14:25:56,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 14:25:57,283][12883] Updated weights for policy 0, policy_version 151373 (0.0045) +[2024-06-18 14:26:00,125][12883] Updated weights for policy 0, policy_version 151383 (0.0024) +[2024-06-18 14:26:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43420.5, 300 sec: 42820.6). Total num frames: 2480357376. Throughput: 0: 42881.1. Samples: 2480392560. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-18 14:26:01,994][12645] Avg episode reward: [(0, '0.600')] +[2024-06-18 14:26:04,947][12883] Updated weights for policy 0, policy_version 151393 (0.0029) +[2024-06-18 14:26:05,204][12862] Signal inference workers to stop experience collection... (36250 times) +[2024-06-18 14:26:05,257][12862] Signal inference workers to resume experience collection... (36250 times) +[2024-06-18 14:26:05,258][12883] InferenceWorker_p0-w0: stopping experience collection (36250 times) +[2024-06-18 14:26:05,271][12883] InferenceWorker_p0-w0: resuming experience collection (36250 times) +[2024-06-18 14:26:06,994][12645] Fps is (10 sec: 49152.1, 60 sec: 43146.1, 300 sec: 42653.9). 
Total num frames: 2480553984. Throughput: 0: 42949.0. Samples: 2480652620. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) +[2024-06-18 14:26:06,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 14:26:07,629][12883] Updated weights for policy 0, policy_version 151403 (0.0041) +[2024-06-18 14:26:11,994][12645] Fps is (10 sec: 36044.8, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 2480717824. Throughput: 0: 42944.0. Samples: 2480918740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 14:26:11,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 14:26:12,394][12883] Updated weights for policy 0, policy_version 151413 (0.0028) +[2024-06-18 14:26:15,256][12883] Updated weights for policy 0, policy_version 151423 (0.0031) +[2024-06-18 14:26:16,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42874.3, 300 sec: 42709.2). Total num frames: 2480979968. Throughput: 0: 42847.5. Samples: 2481035540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 14:26:16,997][12645] Avg episode reward: [(0, '0.634')] +[2024-06-18 14:26:20,064][12883] Updated weights for policy 0, policy_version 151433 (0.0042) +[2024-06-18 14:26:21,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42874.3, 300 sec: 42653.9). Total num frames: 2481192960. Throughput: 0: 42831.3. Samples: 2481289680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 14:26:21,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 14:26:22,869][12883] Updated weights for policy 0, policy_version 151443 (0.0046) +[2024-06-18 14:26:26,994][12645] Fps is (10 sec: 39330.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2481373184. Throughput: 0: 42454.3. Samples: 2481547920. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 14:26:26,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 14:26:27,610][12883] Updated weights for policy 0, policy_version 151453 (0.0040) +[2024-06-18 14:26:30,772][12883] Updated weights for policy 0, policy_version 151463 (0.0030) +[2024-06-18 14:26:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2481602560. Throughput: 0: 42551.5. Samples: 2481667500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 14:26:31,994][12645] Avg episode reward: [(0, '0.587')] +[2024-06-18 14:26:35,065][12883] Updated weights for policy 0, policy_version 151473 (0.0032) +[2024-06-18 14:26:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2481815552. Throughput: 0: 42604.9. Samples: 2481932540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 14:26:36,994][12645] Avg episode reward: [(0, '0.640')] +[2024-06-18 14:26:37,139][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000151480_2481848320.pth... +[2024-06-18 14:26:37,191][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000150853_2471575552.pth +[2024-06-18 14:26:38,716][12883] Updated weights for policy 0, policy_version 151483 (0.0028) +[2024-06-18 14:26:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2482028544. Throughput: 0: 42429.7. Samples: 2482185020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 14:26:41,994][12645] Avg episode reward: [(0, '0.514')] +[2024-06-18 14:26:42,635][12883] Updated weights for policy 0, policy_version 151493 (0.0037) +[2024-06-18 14:26:46,319][12883] Updated weights for policy 0, policy_version 151503 (0.0023) +[2024-06-18 14:26:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2482257920. Throughput: 0: 42627.1. Samples: 2482310780. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 14:26:46,995][12645] Avg episode reward: [(0, '0.623')] +[2024-06-18 14:26:50,472][12883] Updated weights for policy 0, policy_version 151513 (0.0041) +[2024-06-18 14:26:51,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2482470912. Throughput: 0: 42494.2. Samples: 2482564860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 14:26:51,994][12645] Avg episode reward: [(0, '0.636')] +[2024-06-18 14:26:54,047][12883] Updated weights for policy 0, policy_version 151523 (0.0036) +[2024-06-18 14:26:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 2482651136. Throughput: 0: 42324.0. Samples: 2482823320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 14:26:56,994][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 14:26:58,370][12883] Updated weights for policy 0, policy_version 151533 (0.0034) +[2024-06-18 14:27:01,562][12883] Updated weights for policy 0, policy_version 151543 (0.0026) +[2024-06-18 14:27:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2482880512. Throughput: 0: 42485.2. Samples: 2482947280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 14:27:01,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 14:27:05,899][12883] Updated weights for policy 0, policy_version 151553 (0.0028) +[2024-06-18 14:27:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2483109888. Throughput: 0: 42638.5. Samples: 2483208420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 14:27:06,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 14:27:09,363][12883] Updated weights for policy 0, policy_version 151563 (0.0041) +[2024-06-18 14:27:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2483290112. Throughput: 0: 42556.4. 
Samples: 2483462960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) +[2024-06-18 14:27:11,994][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 14:27:12,655][12862] Signal inference workers to stop experience collection... (36300 times) +[2024-06-18 14:27:12,655][12862] Signal inference workers to resume experience collection... (36300 times) +[2024-06-18 14:27:12,673][12883] InferenceWorker_p0-w0: stopping experience collection (36300 times) +[2024-06-18 14:27:12,673][12883] InferenceWorker_p0-w0: resuming experience collection (36300 times) +[2024-06-18 14:27:13,480][12883] Updated weights for policy 0, policy_version 151573 (0.0039) +[2024-06-18 14:27:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42053.9, 300 sec: 42598.4). Total num frames: 2483503104. Throughput: 0: 42523.6. Samples: 2483581060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) +[2024-06-18 14:27:16,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 14:27:17,289][12883] Updated weights for policy 0, policy_version 151583 (0.0034) +[2024-06-18 14:27:21,132][12883] Updated weights for policy 0, policy_version 151593 (0.0043) +[2024-06-18 14:27:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2483716096. Throughput: 0: 42453.8. Samples: 2483842960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) +[2024-06-18 14:27:21,994][12645] Avg episode reward: [(0, '0.520')] +[2024-06-18 14:27:24,836][12883] Updated weights for policy 0, policy_version 151603 (0.0042) +[2024-06-18 14:27:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2483912704. Throughput: 0: 42463.6. Samples: 2484095880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) +[2024-06-18 14:27:26,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 14:27:28,777][12883] Updated weights for policy 0, policy_version 151613 (0.0041) +[2024-06-18 14:27:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). 
Total num frames: 2484142080. Throughput: 0: 42426.3. Samples: 2484219960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) +[2024-06-18 14:27:31,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 14:27:32,784][12883] Updated weights for policy 0, policy_version 151623 (0.0034) +[2024-06-18 14:27:36,420][12883] Updated weights for policy 0, policy_version 151633 (0.0039) +[2024-06-18 14:27:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 2484355072. Throughput: 0: 42540.5. Samples: 2484479180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) +[2024-06-18 14:27:36,994][12645] Avg episode reward: [(0, '0.043')] +[2024-06-18 14:27:40,547][12883] Updated weights for policy 0, policy_version 151643 (0.0040) +[2024-06-18 14:27:41,994][12645] Fps is (10 sec: 42597.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2484568064. Throughput: 0: 42409.7. Samples: 2484731760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) +[2024-06-18 14:27:41,994][12645] Avg episode reward: [(0, '0.057')] +[2024-06-18 14:27:43,998][12883] Updated weights for policy 0, policy_version 151653 (0.0046) +[2024-06-18 14:27:46,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 2484764672. Throughput: 0: 42542.2. Samples: 2484861680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) +[2024-06-18 14:27:46,994][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 14:27:48,257][12883] Updated weights for policy 0, policy_version 151663 (0.0029) +[2024-06-18 14:27:51,650][12883] Updated weights for policy 0, policy_version 151673 (0.0048) +[2024-06-18 14:27:51,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2485010432. Throughput: 0: 42476.1. Samples: 2485119840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) +[2024-06-18 14:27:51,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 14:27:55,903][12883] Updated weights for policy 0, policy_version 151683 (0.0043) +[2024-06-18 14:27:56,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2485223424. Throughput: 0: 42383.9. Samples: 2485370240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) +[2024-06-18 14:27:56,994][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 14:27:59,287][12883] Updated weights for policy 0, policy_version 151693 (0.0027) +[2024-06-18 14:28:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2485420032. Throughput: 0: 42636.0. Samples: 2485499680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) +[2024-06-18 14:28:01,994][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 14:28:03,447][12883] Updated weights for policy 0, policy_version 151703 (0.0034) +[2024-06-18 14:28:06,934][12883] Updated weights for policy 0, policy_version 151713 (0.0029) +[2024-06-18 14:28:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2485665792. Throughput: 0: 42631.5. Samples: 2485761380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) +[2024-06-18 14:28:06,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 14:28:11,270][12883] Updated weights for policy 0, policy_version 151723 (0.0042) +[2024-06-18 14:28:11,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.8). Total num frames: 2485862400. Throughput: 0: 42565.4. Samples: 2486011320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) +[2024-06-18 14:28:11,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 14:28:14,846][12883] Updated weights for policy 0, policy_version 151733 (0.0031) +[2024-06-18 14:28:16,994][12645] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2486042624. Throughput: 0: 42585.6. 
Samples: 2486136320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) +[2024-06-18 14:28:16,994][12645] Avg episode reward: [(0, '0.438')] +[2024-06-18 14:28:18,753][12883] Updated weights for policy 0, policy_version 151743 (0.0042) +[2024-06-18 14:28:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2486288384. Throughput: 0: 42520.3. Samples: 2486392600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:28:21,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 14:28:22,601][12883] Updated weights for policy 0, policy_version 151753 (0.0044) +[2024-06-18 14:28:26,747][12883] Updated weights for policy 0, policy_version 151763 (0.0033) +[2024-06-18 14:28:26,994][12645] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2486501376. Throughput: 0: 42730.8. Samples: 2486654640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:28:26,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 14:28:30,198][12883] Updated weights for policy 0, policy_version 151773 (0.0038) +[2024-06-18 14:28:31,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2486697984. Throughput: 0: 42595.3. Samples: 2486778460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:28:31,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 14:28:34,370][12883] Updated weights for policy 0, policy_version 151783 (0.0033) +[2024-06-18 14:28:36,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 2486927360. Throughput: 0: 42554.3. Samples: 2487034880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:28:36,996][12645] Avg episode reward: [(0, '0.376')] +[2024-06-18 14:28:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000151790_2486927360.pth... 
+[2024-06-18 14:28:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000151166_2476703744.pth +[2024-06-18 14:28:37,823][12883] Updated weights for policy 0, policy_version 151793 (0.0035) +[2024-06-18 14:28:41,897][12883] Updated weights for policy 0, policy_version 151803 (0.0042) +[2024-06-18 14:28:41,998][12645] Fps is (10 sec: 44218.8, 60 sec: 42868.8, 300 sec: 42653.4). Total num frames: 2487140352. Throughput: 0: 42746.1. Samples: 2487293980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:28:41,998][12645] Avg episode reward: [(0, '0.300')] +[2024-06-18 14:28:45,707][12883] Updated weights for policy 0, policy_version 151813 (0.0039) +[2024-06-18 14:28:46,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2487336960. Throughput: 0: 42583.1. Samples: 2487415920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:28:46,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 14:28:49,778][12883] Updated weights for policy 0, policy_version 151823 (0.0023) +[2024-06-18 14:28:49,807][12862] Signal inference workers to stop experience collection... (36350 times) +[2024-06-18 14:28:49,807][12862] Signal inference workers to resume experience collection... (36350 times) +[2024-06-18 14:28:49,825][12883] InferenceWorker_p0-w0: stopping experience collection (36350 times) +[2024-06-18 14:28:49,825][12883] InferenceWorker_p0-w0: resuming experience collection (36350 times) +[2024-06-18 14:28:51,994][12645] Fps is (10 sec: 40976.4, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 2487549952. Throughput: 0: 42381.8. Samples: 2487668560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:28:51,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 14:28:53,441][12883] Updated weights for policy 0, policy_version 151833 (0.0023) +[2024-06-18 14:28:56,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42050.8, 300 sec: 42598.1). 
Total num frames: 2487746560. Throughput: 0: 42618.3. Samples: 2487929240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:28:56,996][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 14:28:57,550][12883] Updated weights for policy 0, policy_version 151843 (0.0031) +[2024-06-18 14:29:01,093][12883] Updated weights for policy 0, policy_version 151853 (0.0042) +[2024-06-18 14:29:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2487975936. Throughput: 0: 42560.6. Samples: 2488051540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:29:01,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 14:29:05,109][12883] Updated weights for policy 0, policy_version 151863 (0.0027) +[2024-06-18 14:29:06,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2488188928. Throughput: 0: 42484.9. Samples: 2488304420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:29:06,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 14:29:09,015][12883] Updated weights for policy 0, policy_version 151873 (0.0033) +[2024-06-18 14:29:11,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2488401920. Throughput: 0: 42575.5. Samples: 2488570540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:29:11,994][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 14:29:12,847][12883] Updated weights for policy 0, policy_version 151883 (0.0034) +[2024-06-18 14:29:16,704][12883] Updated weights for policy 0, policy_version 151893 (0.0034) +[2024-06-18 14:29:16,996][12645] Fps is (10 sec: 45865.4, 60 sec: 43416.1, 300 sec: 42709.2). Total num frames: 2488647680. Throughput: 0: 42436.9. Samples: 2488688220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:29:16,996][12645] Avg episode reward: [(0, '0.361')] +[2024-06-18 14:29:20,736][12883] Updated weights for policy 0, policy_version 151903 (0.0045) +[2024-06-18 14:29:21,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42654.8). Total num frames: 2488844288. Throughput: 0: 42631.5. Samples: 2488953200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:29:21,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 14:29:24,133][12883] Updated weights for policy 0, policy_version 151913 (0.0030) +[2024-06-18 14:29:26,994][12645] Fps is (10 sec: 37691.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2489024512. Throughput: 0: 42469.6. Samples: 2489204940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:29:26,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 14:29:28,355][12883] Updated weights for policy 0, policy_version 151923 (0.0022) +[2024-06-18 14:29:31,745][12883] Updated weights for policy 0, policy_version 151933 (0.0040) +[2024-06-18 14:29:31,994][12645] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2489286656. Throughput: 0: 42587.9. Samples: 2489332380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:29:31,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 14:29:36,070][12883] Updated weights for policy 0, policy_version 151943 (0.0026) +[2024-06-18 14:29:36,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 2489483264. Throughput: 0: 42734.2. Samples: 2489591600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:29:36,994][12645] Avg episode reward: [(0, '0.719')] +[2024-06-18 14:29:39,427][12883] Updated weights for policy 0, policy_version 151953 (0.0034) +[2024-06-18 14:29:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42328.1, 300 sec: 42598.4). Total num frames: 2489679872. Throughput: 0: 42674.6. 
Samples: 2489849500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:29:41,994][12645] Avg episode reward: [(0, '0.769')] +[2024-06-18 14:29:43,870][12883] Updated weights for policy 0, policy_version 151963 (0.0027) +[2024-06-18 14:29:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2489909248. Throughput: 0: 42612.4. Samples: 2489969100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:29:46,994][12645] Avg episode reward: [(0, '0.698')] +[2024-06-18 14:29:47,346][12883] Updated weights for policy 0, policy_version 151973 (0.0032) +[2024-06-18 14:29:51,499][12883] Updated weights for policy 0, policy_version 151983 (0.0027) +[2024-06-18 14:29:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42654.2). Total num frames: 2490122240. Throughput: 0: 42825.3. Samples: 2490231560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:29:51,994][12645] Avg episode reward: [(0, '0.698')] +[2024-06-18 14:29:55,085][12883] Updated weights for policy 0, policy_version 151993 (0.0027) +[2024-06-18 14:29:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42600.1, 300 sec: 42543.5). Total num frames: 2490302464. Throughput: 0: 42633.5. Samples: 2490489040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:29:56,994][12645] Avg episode reward: [(0, '0.642')] +[2024-06-18 14:29:59,097][12883] Updated weights for policy 0, policy_version 152003 (0.0045) +[2024-06-18 14:30:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 2490548224. Throughput: 0: 42673.1. Samples: 2490608420. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:30:01,994][12645] Avg episode reward: [(0, '0.613')] +[2024-06-18 14:30:02,680][12883] Updated weights for policy 0, policy_version 152013 (0.0048) +[2024-06-18 14:30:06,822][12883] Updated weights for policy 0, policy_version 152023 (0.0046) +[2024-06-18 14:30:06,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 2490744832. Throughput: 0: 42489.7. Samples: 2490865240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:30:06,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 14:30:07,152][12862] Signal inference workers to stop experience collection... (36400 times) +[2024-06-18 14:30:07,206][12862] Signal inference workers to resume experience collection... (36400 times) +[2024-06-18 14:30:07,208][12883] InferenceWorker_p0-w0: stopping experience collection (36400 times) +[2024-06-18 14:30:07,231][12883] InferenceWorker_p0-w0: resuming experience collection (36400 times) +[2024-06-18 14:30:10,391][12883] Updated weights for policy 0, policy_version 152033 (0.0042) +[2024-06-18 14:30:11,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42325.5, 300 sec: 42488.2). Total num frames: 2490941440. Throughput: 0: 42516.0. Samples: 2491118160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:30:11,994][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 14:30:14,389][12883] Updated weights for policy 0, policy_version 152043 (0.0026) +[2024-06-18 14:30:17,000][12645] Fps is (10 sec: 42570.6, 60 sec: 42049.2, 300 sec: 42542.5). Total num frames: 2491170816. Throughput: 0: 42517.5. Samples: 2491245940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:30:17,001][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 14:30:18,063][12883] Updated weights for policy 0, policy_version 152053 (0.0032) +[2024-06-18 14:30:21,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2491367424. 
Throughput: 0: 42514.2. Samples: 2491504740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:30:21,994][12645] Avg episode reward: [(0, '0.413')] +[2024-06-18 14:30:22,394][12883] Updated weights for policy 0, policy_version 152063 (0.0028) +[2024-06-18 14:30:25,623][12883] Updated weights for policy 0, policy_version 152073 (0.0025) +[2024-06-18 14:30:26,994][12645] Fps is (10 sec: 42626.0, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2491596800. Throughput: 0: 42492.8. Samples: 2491761680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 14:30:26,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 14:30:29,954][12883] Updated weights for policy 0, policy_version 152083 (0.0042) +[2024-06-18 14:30:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2491809792. Throughput: 0: 42726.6. Samples: 2491891800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 14:30:31,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 14:30:33,534][12883] Updated weights for policy 0, policy_version 152093 (0.0029) +[2024-06-18 14:30:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2492022784. Throughput: 0: 42522.7. Samples: 2492145080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 14:30:36,994][12645] Avg episode reward: [(0, '0.677')] +[2024-06-18 14:30:37,000][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000152101_2492022784.pth... +[2024-06-18 14:30:37,057][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000151480_2481848320.pth +[2024-06-18 14:30:37,518][12883] Updated weights for policy 0, policy_version 152103 (0.0043) +[2024-06-18 14:30:41,355][12883] Updated weights for policy 0, policy_version 152113 (0.0033) +[2024-06-18 14:30:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2492235776. Throughput: 0: 42234.1. 
Samples: 2492389580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 14:30:41,994][12645] Avg episode reward: [(0, '0.431')] +[2024-06-18 14:30:45,313][12883] Updated weights for policy 0, policy_version 152123 (0.0037) +[2024-06-18 14:30:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2492448768. Throughput: 0: 42488.1. Samples: 2492520380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 14:30:46,994][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 14:30:49,067][12883] Updated weights for policy 0, policy_version 152133 (0.0042) +[2024-06-18 14:30:51,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 2492645376. Throughput: 0: 42456.6. Samples: 2492775780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 14:30:51,994][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 14:30:52,873][12883] Updated weights for policy 0, policy_version 152143 (0.0033) +[2024-06-18 14:30:56,566][12883] Updated weights for policy 0, policy_version 152153 (0.0030) +[2024-06-18 14:30:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 2492874752. Throughput: 0: 42497.2. Samples: 2493030540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 14:30:56,994][12645] Avg episode reward: [(0, '0.604')] +[2024-06-18 14:31:00,419][12883] Updated weights for policy 0, policy_version 152163 (0.0032) +[2024-06-18 14:31:01,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2493087744. Throughput: 0: 42604.8. Samples: 2493162880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 14:31:01,994][12645] Avg episode reward: [(0, '0.592')] +[2024-06-18 14:31:04,165][12883] Updated weights for policy 0, policy_version 152173 (0.0032) +[2024-06-18 14:31:06,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 2493300736. 
Throughput: 0: 42410.4. Samples: 2493413300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 14:31:06,996][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 14:31:08,593][12883] Updated weights for policy 0, policy_version 152183 (0.0044) +[2024-06-18 14:31:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42432.1). Total num frames: 2493497344. Throughput: 0: 42310.3. Samples: 2493665640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 14:31:11,994][12645] Avg episode reward: [(0, '0.668')] +[2024-06-18 14:31:12,410][12883] Updated weights for policy 0, policy_version 152193 (0.0042) +[2024-06-18 14:31:16,148][12883] Updated weights for policy 0, policy_version 152203 (0.0028) +[2024-06-18 14:31:16,994][12645] Fps is (10 sec: 42608.5, 60 sec: 42603.1, 300 sec: 42487.3). Total num frames: 2493726720. Throughput: 0: 42329.9. Samples: 2493796640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 14:31:16,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 14:31:20,204][12883] Updated weights for policy 0, policy_version 152213 (0.0030) +[2024-06-18 14:31:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2493923328. Throughput: 0: 42327.2. Samples: 2494049800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 14:31:21,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 14:31:23,750][12883] Updated weights for policy 0, policy_version 152223 (0.0040) +[2024-06-18 14:31:26,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2494136320. Throughput: 0: 42597.3. Samples: 2494306460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) +[2024-06-18 14:31:26,994][12645] Avg episode reward: [(0, '0.743')] +[2024-06-18 14:31:27,931][12883] Updated weights for policy 0, policy_version 152233 (0.0028) +[2024-06-18 14:31:31,423][12883] Updated weights for policy 0, policy_version 152243 (0.0041) +[2024-06-18 14:31:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2494365696. Throughput: 0: 42614.8. Samples: 2494438040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:31:31,994][12645] Avg episode reward: [(0, '0.709')] +[2024-06-18 14:31:35,554][12883] Updated weights for policy 0, policy_version 152253 (0.0041) +[2024-06-18 14:31:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 2494562304. Throughput: 0: 42598.8. Samples: 2494692740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:31:36,995][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 14:31:39,140][12883] Updated weights for policy 0, policy_version 152263 (0.0042) +[2024-06-18 14:31:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2494791680. Throughput: 0: 42468.1. Samples: 2494941600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:31:41,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 14:31:43,273][12883] Updated weights for policy 0, policy_version 152273 (0.0046) +[2024-06-18 14:31:43,699][12862] Signal inference workers to stop experience collection... (36450 times) +[2024-06-18 14:31:43,699][12862] Signal inference workers to resume experience collection... 
(36450 times) +[2024-06-18 14:31:43,747][12883] InferenceWorker_p0-w0: stopping experience collection (36450 times) +[2024-06-18 14:31:43,747][12883] InferenceWorker_p0-w0: resuming experience collection (36450 times) +[2024-06-18 14:31:46,735][12883] Updated weights for policy 0, policy_version 152283 (0.0024) +[2024-06-18 14:31:46,994][12645] Fps is (10 sec: 44237.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2495004672. Throughput: 0: 42477.0. Samples: 2495074340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:31:46,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 14:31:50,751][12883] Updated weights for policy 0, policy_version 152293 (0.0031) +[2024-06-18 14:31:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2495201280. Throughput: 0: 42643.6. Samples: 2495332160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:31:51,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 14:31:54,356][12883] Updated weights for policy 0, policy_version 152303 (0.0032) +[2024-06-18 14:31:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2495447040. Throughput: 0: 42672.0. Samples: 2495585880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:31:56,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 14:31:58,237][12883] Updated weights for policy 0, policy_version 152313 (0.0039) +[2024-06-18 14:32:01,980][12883] Updated weights for policy 0, policy_version 152323 (0.0039) +[2024-06-18 14:32:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 2495660032. Throughput: 0: 42764.0. Samples: 2495721020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:32:01,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 14:32:06,170][12883] Updated weights for policy 0, policy_version 152333 (0.0034) +[2024-06-18 14:32:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2495856640. Throughput: 0: 42766.6. Samples: 2495974300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:32:06,994][12645] Avg episode reward: [(0, '0.105')] +[2024-06-18 14:32:10,184][12883] Updated weights for policy 0, policy_version 152343 (0.0043) +[2024-06-18 14:32:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2496069632. Throughput: 0: 42520.2. Samples: 2496219860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:32:11,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 14:32:14,187][12883] Updated weights for policy 0, policy_version 152353 (0.0033) +[2024-06-18 14:32:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2496282624. Throughput: 0: 42534.9. Samples: 2496352120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:32:16,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 14:32:17,758][12883] Updated weights for policy 0, policy_version 152363 (0.0025) +[2024-06-18 14:32:21,713][12883] Updated weights for policy 0, policy_version 152373 (0.0038) +[2024-06-18 14:32:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2496479232. Throughput: 0: 42477.9. Samples: 2496604240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:32:22,003][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 14:32:25,321][12883] Updated weights for policy 0, policy_version 152383 (0.0028) +[2024-06-18 14:32:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2496708608. Throughput: 0: 42671.4. 
Samples: 2496861820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:32:26,994][12645] Avg episode reward: [(0, '0.813')] +[2024-06-18 14:32:29,621][12883] Updated weights for policy 0, policy_version 152393 (0.0038) +[2024-06-18 14:32:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2496921600. Throughput: 0: 42626.2. Samples: 2496992520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 14:32:31,994][12645] Avg episode reward: [(0, '0.671')] +[2024-06-18 14:32:32,909][12883] Updated weights for policy 0, policy_version 152403 (0.0034) +[2024-06-18 14:32:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 2497118208. Throughput: 0: 42482.6. Samples: 2497243880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:32:36,994][12645] Avg episode reward: [(0, '0.545')] +[2024-06-18 14:32:37,090][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000152413_2497134592.pth... +[2024-06-18 14:32:37,099][12883] Updated weights for policy 0, policy_version 152413 (0.0034) +[2024-06-18 14:32:37,142][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000151790_2486927360.pth +[2024-06-18 14:32:40,578][12883] Updated weights for policy 0, policy_version 152423 (0.0033) +[2024-06-18 14:32:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2497331200. Throughput: 0: 42505.4. Samples: 2497498620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:32:41,996][12645] Avg episode reward: [(0, '0.689')] +[2024-06-18 14:32:44,775][12883] Updated weights for policy 0, policy_version 152433 (0.0028) +[2024-06-18 14:32:46,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 2497576960. Throughput: 0: 42402.9. Samples: 2497629160. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:32:46,994][12645] Avg episode reward: [(0, '0.672')] +[2024-06-18 14:32:48,543][12883] Updated weights for policy 0, policy_version 152443 (0.0032) +[2024-06-18 14:32:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2497773568. Throughput: 0: 42459.2. Samples: 2497884960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:32:51,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 14:32:52,182][12883] Updated weights for policy 0, policy_version 152453 (0.0031) +[2024-06-18 14:32:56,240][12883] Updated weights for policy 0, policy_version 152463 (0.0037) +[2024-06-18 14:32:56,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2497986560. Throughput: 0: 42679.1. Samples: 2498140420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:32:56,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 14:32:59,788][12883] Updated weights for policy 0, policy_version 152473 (0.0034) +[2024-06-18 14:33:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 2498199552. Throughput: 0: 42574.2. Samples: 2498267960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:33:01,994][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 14:33:03,849][12883] Updated weights for policy 0, policy_version 152483 (0.0033) +[2024-06-18 14:33:06,036][12862] Signal inference workers to stop experience collection... (36500 times) +[2024-06-18 14:33:06,037][12862] Signal inference workers to resume experience collection... (36500 times) +[2024-06-18 14:33:06,072][12883] InferenceWorker_p0-w0: stopping experience collection (36500 times) +[2024-06-18 14:33:06,072][12883] InferenceWorker_p0-w0: resuming experience collection (36500 times) +[2024-06-18 14:33:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2498412544. 
Throughput: 0: 42676.4. Samples: 2498524680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:33:06,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 14:33:07,606][12883] Updated weights for policy 0, policy_version 152493 (0.0028) +[2024-06-18 14:33:11,450][12883] Updated weights for policy 0, policy_version 152503 (0.0033) +[2024-06-18 14:33:12,000][12645] Fps is (10 sec: 42572.1, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 2498625536. Throughput: 0: 42618.9. Samples: 2498779940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:33:12,001][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 14:33:15,253][12883] Updated weights for policy 0, policy_version 152513 (0.0046) +[2024-06-18 14:33:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2498838528. Throughput: 0: 42617.9. Samples: 2498910320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:33:16,994][12645] Avg episode reward: [(0, '0.227')] +[2024-06-18 14:33:18,978][12883] Updated weights for policy 0, policy_version 152523 (0.0033) +[2024-06-18 14:33:21,994][12645] Fps is (10 sec: 42625.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2499051520. Throughput: 0: 42799.0. Samples: 2499169840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:33:21,994][12645] Avg episode reward: [(0, '0.438')] +[2024-06-18 14:33:22,901][12883] Updated weights for policy 0, policy_version 152533 (0.0029) +[2024-06-18 14:33:26,589][12883] Updated weights for policy 0, policy_version 152543 (0.0033) +[2024-06-18 14:33:26,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2499264512. Throughput: 0: 42821.8. Samples: 2499425600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:33:26,994][12645] Avg episode reward: [(0, '0.635')] +[2024-06-18 14:33:30,359][12883] Updated weights for policy 0, policy_version 152553 (0.0029) +[2024-06-18 14:33:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 2499477504. Throughput: 0: 42899.3. Samples: 2499559620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:33:31,994][12645] Avg episode reward: [(0, '0.662')] +[2024-06-18 14:33:34,723][12883] Updated weights for policy 0, policy_version 152563 (0.0025) +[2024-06-18 14:33:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42487.9). Total num frames: 2499674112. Throughput: 0: 42832.8. Samples: 2499812440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) +[2024-06-18 14:33:36,994][12645] Avg episode reward: [(0, '0.494')] +[2024-06-18 14:33:38,081][12883] Updated weights for policy 0, policy_version 152573 (0.0038) +[2024-06-18 14:33:41,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2499903488. Throughput: 0: 42917.2. Samples: 2500071700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 14:33:41,995][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 14:33:42,186][12883] Updated weights for policy 0, policy_version 152583 (0.0030) +[2024-06-18 14:33:45,629][12883] Updated weights for policy 0, policy_version 152593 (0.0036) +[2024-06-18 14:33:46,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2500132864. Throughput: 0: 43084.5. Samples: 2500206760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 14:33:46,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 14:33:49,755][12883] Updated weights for policy 0, policy_version 152603 (0.0040) +[2024-06-18 14:33:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 2500329472. Throughput: 0: 42952.0. 
Samples: 2500457520. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 14:33:51,994][12645] Avg episode reward: [(0, '0.665')] +[2024-06-18 14:33:53,245][12883] Updated weights for policy 0, policy_version 152613 (0.0028) +[2024-06-18 14:33:56,995][12645] Fps is (10 sec: 42592.4, 60 sec: 42870.4, 300 sec: 42653.7). Total num frames: 2500558848. Throughput: 0: 43017.9. Samples: 2500715540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 14:33:56,996][12645] Avg episode reward: [(0, '0.656')] +[2024-06-18 14:33:57,314][12883] Updated weights for policy 0, policy_version 152623 (0.0032) +[2024-06-18 14:34:00,782][12883] Updated weights for policy 0, policy_version 152633 (0.0039) +[2024-06-18 14:34:01,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.7, 300 sec: 42654.0). Total num frames: 2500771840. Throughput: 0: 43077.3. Samples: 2500848800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 14:34:01,994][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 14:34:04,654][12883] Updated weights for policy 0, policy_version 152643 (0.0033) +[2024-06-18 14:34:06,998][12645] Fps is (10 sec: 40948.8, 60 sec: 42595.4, 300 sec: 42597.8). Total num frames: 2500968448. Throughput: 0: 43094.2. Samples: 2501109260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 14:34:06,998][12645] Avg episode reward: [(0, '0.545')] +[2024-06-18 14:34:08,416][12883] Updated weights for policy 0, policy_version 152653 (0.0038) +[2024-06-18 14:34:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42876.0, 300 sec: 42543.2). Total num frames: 2501197824. Throughput: 0: 43014.7. Samples: 2501361260. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 14:34:11,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 14:34:12,149][12883] Updated weights for policy 0, policy_version 152663 (0.0022) +[2024-06-18 14:34:16,026][12883] Updated weights for policy 0, policy_version 152673 (0.0035) +[2024-06-18 14:34:16,994][12645] Fps is (10 sec: 45894.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2501427200. Throughput: 0: 43034.6. Samples: 2501496180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 14:34:16,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 14:34:19,536][12883] Updated weights for policy 0, policy_version 152683 (0.0045) +[2024-06-18 14:34:21,998][12645] Fps is (10 sec: 42579.2, 60 sec: 42868.3, 300 sec: 42708.8). Total num frames: 2501623808. Throughput: 0: 43177.6. Samples: 2501755620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 14:34:21,998][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 14:34:23,397][12883] Updated weights for policy 0, policy_version 152693 (0.0038) +[2024-06-18 14:34:25,939][12862] Signal inference workers to stop experience collection... (36550 times) +[2024-06-18 14:34:25,944][12862] Signal inference workers to resume experience collection... (36550 times) +[2024-06-18 14:34:25,989][12883] InferenceWorker_p0-w0: stopping experience collection (36550 times) +[2024-06-18 14:34:25,992][12883] InferenceWorker_p0-w0: resuming experience collection (36550 times) +[2024-06-18 14:34:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2501853184. Throughput: 0: 42957.0. Samples: 2502004760. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 14:34:26,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 14:34:28,103][12883] Updated weights for policy 0, policy_version 152703 (0.0035) +[2024-06-18 14:34:31,290][12883] Updated weights for policy 0, policy_version 152713 (0.0038) +[2024-06-18 14:34:31,994][12645] Fps is (10 sec: 45895.1, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2502082560. Throughput: 0: 43036.0. Samples: 2502143380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 14:34:31,994][12645] Avg episode reward: [(0, '0.676')] +[2024-06-18 14:34:35,646][12883] Updated weights for policy 0, policy_version 152723 (0.0038) +[2024-06-18 14:34:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2502262784. Throughput: 0: 43098.3. Samples: 2502396940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) +[2024-06-18 14:34:36,994][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 14:34:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000152726_2502262784.pth... +[2024-06-18 14:34:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000152101_2492022784.pth +[2024-06-18 14:34:39,083][12883] Updated weights for policy 0, policy_version 152733 (0.0038) +[2024-06-18 14:34:41,996][12645] Fps is (10 sec: 42590.9, 60 sec: 43416.3, 300 sec: 42709.2). Total num frames: 2502508544. Throughput: 0: 42937.4. Samples: 2502647740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:34:41,996][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 14:34:43,093][12883] Updated weights for policy 0, policy_version 152743 (0.0038) +[2024-06-18 14:34:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2502688768. Throughput: 0: 42887.9. Samples: 2502778760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:34:46,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 14:34:47,059][12883] Updated weights for policy 0, policy_version 152753 (0.0040) +[2024-06-18 14:34:50,818][12883] Updated weights for policy 0, policy_version 152763 (0.0031) +[2024-06-18 14:34:51,994][12645] Fps is (10 sec: 39328.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2502901760. Throughput: 0: 42753.3. Samples: 2503032980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:34:52,003][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 14:34:54,654][12883] Updated weights for policy 0, policy_version 152773 (0.0035) +[2024-06-18 14:34:56,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43145.6, 300 sec: 42709.5). Total num frames: 2503147520. Throughput: 0: 42815.5. Samples: 2503287960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:34:56,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 14:34:58,295][12883] Updated weights for policy 0, policy_version 152783 (0.0029) +[2024-06-18 14:35:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2503344128. Throughput: 0: 42788.0. Samples: 2503421640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:35:01,994][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 14:35:02,439][12883] Updated weights for policy 0, policy_version 152793 (0.0029) +[2024-06-18 14:35:05,786][12883] Updated weights for policy 0, policy_version 152803 (0.0031) +[2024-06-18 14:35:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42874.5, 300 sec: 42709.5). Total num frames: 2503540736. Throughput: 0: 42674.5. Samples: 2503675780. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:35:06,994][12645] Avg episode reward: [(0, '0.295')] +[2024-06-18 14:35:10,229][12883] Updated weights for policy 0, policy_version 152813 (0.0034) +[2024-06-18 14:35:11,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42766.0). Total num frames: 2503786496. Throughput: 0: 42666.7. Samples: 2503924760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:35:11,994][12645] Avg episode reward: [(0, '0.636')] +[2024-06-18 14:35:13,919][12883] Updated weights for policy 0, policy_version 152823 (0.0029) +[2024-06-18 14:35:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2503983104. Throughput: 0: 42645.4. Samples: 2504062420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:35:16,994][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 14:35:17,778][12883] Updated weights for policy 0, policy_version 152833 (0.0037) +[2024-06-18 14:35:21,578][12883] Updated weights for policy 0, policy_version 152843 (0.0037) +[2024-06-18 14:35:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42601.5, 300 sec: 42654.0). Total num frames: 2504179712. Throughput: 0: 42571.6. Samples: 2504312660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:35:21,994][12645] Avg episode reward: [(0, '0.524')] +[2024-06-18 14:35:25,487][12883] Updated weights for policy 0, policy_version 152853 (0.0033) +[2024-06-18 14:35:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2504425472. Throughput: 0: 42652.0. Samples: 2504567000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:35:26,994][12645] Avg episode reward: [(0, '0.663')] +[2024-06-18 14:35:29,202][12883] Updated weights for policy 0, policy_version 152863 (0.0027) +[2024-06-18 14:35:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 2504622080. Throughput: 0: 42729.3. 
Samples: 2504701580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:35:31,994][12645] Avg episode reward: [(0, '0.690')] +[2024-06-18 14:35:33,107][12883] Updated weights for policy 0, policy_version 152873 (0.0029) +[2024-06-18 14:35:36,806][12883] Updated weights for policy 0, policy_version 152883 (0.0026) +[2024-06-18 14:35:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2504835072. Throughput: 0: 42708.1. Samples: 2504954840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:35:37,000][12645] Avg episode reward: [(0, '0.826')] +[2024-06-18 14:35:40,915][12883] Updated weights for policy 0, policy_version 152893 (0.0038) +[2024-06-18 14:35:41,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42599.7, 300 sec: 42765.0). Total num frames: 2505064448. Throughput: 0: 42744.4. Samples: 2505211460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) +[2024-06-18 14:35:41,994][12645] Avg episode reward: [(0, '0.579')] +[2024-06-18 14:35:42,078][12862] Signal inference workers to stop experience collection... (36600 times) +[2024-06-18 14:35:42,078][12862] Signal inference workers to resume experience collection... (36600 times) +[2024-06-18 14:35:42,105][12883] InferenceWorker_p0-w0: stopping experience collection (36600 times) +[2024-06-18 14:35:42,105][12883] InferenceWorker_p0-w0: resuming experience collection (36600 times) +[2024-06-18 14:35:44,428][12883] Updated weights for policy 0, policy_version 152903 (0.0041) +[2024-06-18 14:35:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2505261056. Throughput: 0: 42631.5. Samples: 2505340060. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 14:35:46,994][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 14:35:48,566][12883] Updated weights for policy 0, policy_version 152913 (0.0048) +[2024-06-18 14:35:51,839][12883] Updated weights for policy 0, policy_version 152923 (0.0040) +[2024-06-18 14:35:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2505490432. Throughput: 0: 42627.4. Samples: 2505594020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 14:35:51,994][12645] Avg episode reward: [(0, '0.530')] +[2024-06-18 14:35:56,216][12883] Updated weights for policy 0, policy_version 152933 (0.0032) +[2024-06-18 14:35:56,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2505719808. Throughput: 0: 43032.0. Samples: 2505861200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 14:35:56,994][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 14:35:59,299][12883] Updated weights for policy 0, policy_version 152943 (0.0030) +[2024-06-18 14:36:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 2505900032. Throughput: 0: 42761.7. Samples: 2505986700. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 14:36:01,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 14:36:03,756][12883] Updated weights for policy 0, policy_version 152953 (0.0037) +[2024-06-18 14:36:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2506129408. Throughput: 0: 42831.5. Samples: 2506240080. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 14:36:06,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 14:36:07,242][12883] Updated weights for policy 0, policy_version 152963 (0.0032) +[2024-06-18 14:36:11,538][12883] Updated weights for policy 0, policy_version 152973 (0.0035) +[2024-06-18 14:36:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2506342400. Throughput: 0: 43106.2. Samples: 2506506780. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 14:36:11,995][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 14:36:14,735][12883] Updated weights for policy 0, policy_version 152983 (0.0046) +[2024-06-18 14:36:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2506555392. Throughput: 0: 42832.8. Samples: 2506629060. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 14:36:16,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 14:36:19,219][12883] Updated weights for policy 0, policy_version 152993 (0.0029) +[2024-06-18 14:36:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2506784768. Throughput: 0: 42882.1. Samples: 2506884540. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 14:36:21,994][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 14:36:22,240][12883] Updated weights for policy 0, policy_version 153003 (0.0038) +[2024-06-18 14:36:26,727][12883] Updated weights for policy 0, policy_version 153013 (0.0041) +[2024-06-18 14:36:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2506964992. Throughput: 0: 43123.7. Samples: 2507152020. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 14:36:26,994][12645] Avg episode reward: [(0, '0.632')] +[2024-06-18 14:36:29,767][12883] Updated weights for policy 0, policy_version 153023 (0.0042) +[2024-06-18 14:36:31,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42869.8, 300 sec: 42820.3). Total num frames: 2507194368. Throughput: 0: 42923.6. Samples: 2507271720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 14:36:31,996][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 14:36:34,500][12883] Updated weights for policy 0, policy_version 153033 (0.0027) +[2024-06-18 14:36:36,994][12645] Fps is (10 sec: 47513.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2507440128. Throughput: 0: 42959.7. Samples: 2507527200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 14:36:36,994][12645] Avg episode reward: [(0, '0.527')] +[2024-06-18 14:36:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153042_2507440128.pth... +[2024-06-18 14:36:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000152413_2497134592.pth +[2024-06-18 14:36:37,260][12883] Updated weights for policy 0, policy_version 153043 (0.0038) +[2024-06-18 14:36:41,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2507603968. Throughput: 0: 42969.3. Samples: 2507794820. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 14:36:41,994][12645] Avg episode reward: [(0, '0.701')] +[2024-06-18 14:36:42,035][12883] Updated weights for policy 0, policy_version 153053 (0.0032) +[2024-06-18 14:36:45,400][12883] Updated weights for policy 0, policy_version 153063 (0.0032) +[2024-06-18 14:36:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2507833344. Throughput: 0: 42754.7. Samples: 2507910660. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) +[2024-06-18 14:36:46,994][12645] Avg episode reward: [(0, '0.545')] +[2024-06-18 14:36:49,655][12883] Updated weights for policy 0, policy_version 153073 (0.0026) +[2024-06-18 14:36:49,904][12862] Signal inference workers to stop experience collection... (36650 times) +[2024-06-18 14:36:49,905][12862] Signal inference workers to resume experience collection... (36650 times) +[2024-06-18 14:36:49,950][12883] InferenceWorker_p0-w0: stopping experience collection (36650 times) +[2024-06-18 14:36:49,951][12883] InferenceWorker_p0-w0: resuming experience collection (36650 times) +[2024-06-18 14:36:51,994][12645] Fps is (10 sec: 47513.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2508079104. Throughput: 0: 42883.6. Samples: 2508169840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 14:36:51,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 14:36:53,050][12883] Updated weights for policy 0, policy_version 153083 (0.0032) +[2024-06-18 14:36:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2508259328. Throughput: 0: 42856.9. Samples: 2508435340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 14:36:56,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 14:36:57,197][12883] Updated weights for policy 0, policy_version 153093 (0.0051) +[2024-06-18 14:37:00,543][12883] Updated weights for policy 0, policy_version 153103 (0.0036) +[2024-06-18 14:37:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2508472320. Throughput: 0: 42771.1. Samples: 2508553760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 14:37:01,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 14:37:04,832][12883] Updated weights for policy 0, policy_version 153113 (0.0026) +[2024-06-18 14:37:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2508718080. 
Throughput: 0: 42931.1. Samples: 2508816440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 14:37:06,994][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 14:37:08,048][12883] Updated weights for policy 0, policy_version 153123 (0.0027) +[2024-06-18 14:37:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2508898304. Throughput: 0: 42719.0. Samples: 2509074380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 14:37:11,999][12645] Avg episode reward: [(0, '0.666')] +[2024-06-18 14:37:12,354][12883] Updated weights for policy 0, policy_version 153133 (0.0039) +[2024-06-18 14:37:16,134][12883] Updated weights for policy 0, policy_version 153143 (0.0034) +[2024-06-18 14:37:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2509127680. Throughput: 0: 42728.9. Samples: 2509194420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 14:37:16,994][12645] Avg episode reward: [(0, '0.676')] +[2024-06-18 14:37:19,976][12883] Updated weights for policy 0, policy_version 153153 (0.0032) +[2024-06-18 14:37:21,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2509357056. Throughput: 0: 42858.7. Samples: 2509455840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 14:37:21,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 14:37:23,730][12883] Updated weights for policy 0, policy_version 153163 (0.0026) +[2024-06-18 14:37:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2509537280. Throughput: 0: 42793.3. Samples: 2509720520. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 14:37:26,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 14:37:27,703][12883] Updated weights for policy 0, policy_version 153173 (0.0027) +[2024-06-18 14:37:31,256][12883] Updated weights for policy 0, policy_version 153183 (0.0037) +[2024-06-18 14:37:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 2509783040. Throughput: 0: 42865.4. Samples: 2509839600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 14:37:31,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 14:37:35,398][12883] Updated weights for policy 0, policy_version 153193 (0.0039) +[2024-06-18 14:37:36,994][12645] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 2510012416. Throughput: 0: 42863.9. Samples: 2510098720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 14:37:36,995][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 14:37:38,753][12883] Updated weights for policy 0, policy_version 153203 (0.0041) +[2024-06-18 14:37:41,995][12645] Fps is (10 sec: 37678.9, 60 sec: 42597.6, 300 sec: 42653.8). Total num frames: 2510159872. Throughput: 0: 42727.5. Samples: 2510358120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 14:37:41,995][12645] Avg episode reward: [(0, '0.516')] +[2024-06-18 14:37:43,047][12883] Updated weights for policy 0, policy_version 153213 (0.0046) +[2024-06-18 14:37:46,633][12883] Updated weights for policy 0, policy_version 153223 (0.0042) +[2024-06-18 14:37:46,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2510405632. Throughput: 0: 42717.4. Samples: 2510476040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 14:37:46,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 14:37:51,014][12883] Updated weights for policy 0, policy_version 153233 (0.0041) +[2024-06-18 14:37:51,995][12645] Fps is (10 sec: 49149.9, 60 sec: 42870.4, 300 sec: 42931.4). Total num frames: 2510651392. Throughput: 0: 42774.6. Samples: 2510741360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) +[2024-06-18 14:37:51,996][12645] Avg episode reward: [(0, '0.676')] +[2024-06-18 14:37:54,152][12883] Updated weights for policy 0, policy_version 153243 (0.0040) +[2024-06-18 14:37:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2510815232. Throughput: 0: 42735.2. Samples: 2510997460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 14:37:56,994][12645] Avg episode reward: [(0, '0.685')] +[2024-06-18 14:37:58,604][12883] Updated weights for policy 0, policy_version 153253 (0.0046) +[2024-06-18 14:38:01,660][12883] Updated weights for policy 0, policy_version 153263 (0.0032) +[2024-06-18 14:38:01,994][12645] Fps is (10 sec: 40966.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2511060992. Throughput: 0: 42709.3. Samples: 2511116340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 14:38:01,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 14:38:06,339][12883] Updated weights for policy 0, policy_version 153273 (0.0043) +[2024-06-18 14:38:06,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42877.0). Total num frames: 2511273984. Throughput: 0: 42792.4. Samples: 2511381500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 14:38:06,994][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 14:38:09,291][12883] Updated weights for policy 0, policy_version 153283 (0.0028) +[2024-06-18 14:38:11,994][12645] Fps is (10 sec: 37682.3, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 2511437824. Throughput: 0: 42462.1. 
Samples: 2511631320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 14:38:11,994][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 14:38:13,971][12883] Updated weights for policy 0, policy_version 153293 (0.0029) +[2024-06-18 14:38:14,231][12862] Signal inference workers to stop experience collection... (36700 times) +[2024-06-18 14:38:14,267][12883] InferenceWorker_p0-w0: stopping experience collection (36700 times) +[2024-06-18 14:38:14,285][12862] Signal inference workers to resume experience collection... (36700 times) +[2024-06-18 14:38:14,291][12883] InferenceWorker_p0-w0: resuming experience collection (36700 times) +[2024-06-18 14:38:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2511699968. Throughput: 0: 42530.2. Samples: 2511753460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 14:38:16,994][12645] Avg episode reward: [(0, '0.633')] +[2024-06-18 14:38:17,048][12883] Updated weights for policy 0, policy_version 153303 (0.0037) +[2024-06-18 14:38:21,715][12883] Updated weights for policy 0, policy_version 153313 (0.0028) +[2024-06-18 14:38:21,994][12645] Fps is (10 sec: 45876.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2511896576. Throughput: 0: 42681.1. Samples: 2512019360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 14:38:21,994][12645] Avg episode reward: [(0, '0.809')] +[2024-06-18 14:38:25,074][12883] Updated weights for policy 0, policy_version 153323 (0.0035) +[2024-06-18 14:38:26,995][12645] Fps is (10 sec: 39316.1, 60 sec: 42597.5, 300 sec: 42764.8). Total num frames: 2512093184. Throughput: 0: 42320.2. Samples: 2512262540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 14:38:26,996][12645] Avg episode reward: [(0, '0.704')] +[2024-06-18 14:38:29,601][12883] Updated weights for policy 0, policy_version 153333 (0.0031) +[2024-06-18 14:38:31,994][12645] Fps is (10 sec: 42597.0, 60 sec: 42325.1, 300 sec: 42876.1). 
Total num frames: 2512322560. Throughput: 0: 42508.2. Samples: 2512388920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 14:38:31,994][12645] Avg episode reward: [(0, '0.700')] +[2024-06-18 14:38:32,845][12883] Updated weights for policy 0, policy_version 153343 (0.0053) +[2024-06-18 14:38:36,994][12645] Fps is (10 sec: 40965.0, 60 sec: 41506.1, 300 sec: 42709.5). Total num frames: 2512502784. Throughput: 0: 42440.4. Samples: 2512651120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 14:38:36,994][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 14:38:37,120][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153352_2512519168.pth... +[2024-06-18 14:38:37,184][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000152726_2502262784.pth +[2024-06-18 14:38:37,348][12883] Updated weights for policy 0, policy_version 153353 (0.0037) +[2024-06-18 14:38:40,571][12883] Updated weights for policy 0, policy_version 153363 (0.0038) +[2024-06-18 14:38:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42872.2, 300 sec: 42709.5). Total num frames: 2512732160. Throughput: 0: 41917.7. Samples: 2512883760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 14:38:41,994][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 14:38:45,274][12883] Updated weights for policy 0, policy_version 153373 (0.0039) +[2024-06-18 14:38:46,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2512977920. Throughput: 0: 42297.3. Samples: 2513019720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 14:38:46,994][12645] Avg episode reward: [(0, '0.664')] +[2024-06-18 14:38:48,295][12883] Updated weights for policy 0, policy_version 153383 (0.0031) +[2024-06-18 14:38:51,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40961.1, 300 sec: 42543.1). Total num frames: 2513108992. Throughput: 0: 42025.8. Samples: 2513272660. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) +[2024-06-18 14:38:51,994][12645] Avg episode reward: [(0, '0.529')] +[2024-06-18 14:38:52,925][12883] Updated weights for policy 0, policy_version 153393 (0.0038) +[2024-06-18 14:38:55,856][12883] Updated weights for policy 0, policy_version 153403 (0.0035) +[2024-06-18 14:38:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2513387520. Throughput: 0: 41951.7. Samples: 2513519140. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) +[2024-06-18 14:38:56,994][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 14:39:00,607][12883] Updated weights for policy 0, policy_version 153413 (0.0037) +[2024-06-18 14:39:01,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42052.3, 300 sec: 42765.6). Total num frames: 2513584128. Throughput: 0: 42345.4. Samples: 2513659000. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) +[2024-06-18 14:39:01,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 14:39:03,546][12883] Updated weights for policy 0, policy_version 153423 (0.0027) +[2024-06-18 14:39:06,994][12645] Fps is (10 sec: 34406.5, 60 sec: 40960.0, 300 sec: 42487.3). Total num frames: 2513731584. Throughput: 0: 41906.2. Samples: 2513905140. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) +[2024-06-18 14:39:06,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 14:39:08,020][12862] Signal inference workers to stop experience collection... (36750 times) +[2024-06-18 14:39:08,020][12862] Signal inference workers to resume experience collection... 
(36750 times) +[2024-06-18 14:39:08,055][12883] InferenceWorker_p0-w0: stopping experience collection (36750 times) +[2024-06-18 14:39:08,060][12883] InferenceWorker_p0-w0: resuming experience collection (36750 times) +[2024-06-18 14:39:08,346][12883] Updated weights for policy 0, policy_version 153433 (0.0031) +[2024-06-18 14:39:11,311][12883] Updated weights for policy 0, policy_version 153443 (0.0042) +[2024-06-18 14:39:11,996][12645] Fps is (10 sec: 44226.5, 60 sec: 43143.1, 300 sec: 42709.1). Total num frames: 2514026496. Throughput: 0: 42045.4. Samples: 2514154620. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) +[2024-06-18 14:39:11,996][12645] Avg episode reward: [(0, '0.200')] +[2024-06-18 14:39:15,972][12883] Updated weights for policy 0, policy_version 153453 (0.0029) +[2024-06-18 14:39:16,994][12645] Fps is (10 sec: 49151.6, 60 sec: 42052.2, 300 sec: 42710.1). Total num frames: 2514223104. Throughput: 0: 42494.4. Samples: 2514301160. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) +[2024-06-18 14:39:16,994][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 14:39:18,915][12883] Updated weights for policy 0, policy_version 153463 (0.0040) +[2024-06-18 14:39:21,994][12645] Fps is (10 sec: 36052.7, 60 sec: 41506.1, 300 sec: 42487.3). Total num frames: 2514386944. Throughput: 0: 42157.9. Samples: 2514548220. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) +[2024-06-18 14:39:21,994][12645] Avg episode reward: [(0, '0.664')] +[2024-06-18 14:39:23,662][12883] Updated weights for policy 0, policy_version 153473 (0.0033) +[2024-06-18 14:39:26,808][12883] Updated weights for policy 0, policy_version 153483 (0.0033) +[2024-06-18 14:39:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42872.5, 300 sec: 42654.0). Total num frames: 2514665472. Throughput: 0: 42584.6. Samples: 2514800060. 
Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) +[2024-06-18 14:39:26,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 14:39:31,285][12883] Updated weights for policy 0, policy_version 153493 (0.0041) +[2024-06-18 14:39:31,994][12645] Fps is (10 sec: 49151.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2514878464. Throughput: 0: 42736.9. Samples: 2514942880. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) +[2024-06-18 14:39:31,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 14:39:34,418][12883] Updated weights for policy 0, policy_version 153503 (0.0034) +[2024-06-18 14:39:36,994][12645] Fps is (10 sec: 37682.6, 60 sec: 42325.4, 300 sec: 42487.6). Total num frames: 2515042304. Throughput: 0: 42481.2. Samples: 2515184320. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) +[2024-06-18 14:39:36,994][12645] Avg episode reward: [(0, '0.473')] +[2024-06-18 14:39:39,051][12883] Updated weights for policy 0, policy_version 153513 (0.0028) +[2024-06-18 14:39:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2515304448. Throughput: 0: 42581.3. Samples: 2515435300. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) +[2024-06-18 14:39:41,994][12645] Avg episode reward: [(0, '0.681')] +[2024-06-18 14:39:42,493][12883] Updated weights for policy 0, policy_version 153523 (0.0030) +[2024-06-18 14:39:46,618][12883] Updated weights for policy 0, policy_version 153533 (0.0040) +[2024-06-18 14:39:46,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2515501056. Throughput: 0: 42618.6. Samples: 2515576840. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) +[2024-06-18 14:39:46,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 14:39:50,211][12883] Updated weights for policy 0, policy_version 153543 (0.0042) +[2024-06-18 14:39:51,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2515681280. Throughput: 0: 42670.3. 
Samples: 2515825300. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) +[2024-06-18 14:39:51,994][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 14:39:54,429][12883] Updated weights for policy 0, policy_version 153553 (0.0028) +[2024-06-18 14:39:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2515943424. Throughput: 0: 42709.7. Samples: 2516076460. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) +[2024-06-18 14:39:56,994][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 14:39:57,686][12883] Updated weights for policy 0, policy_version 153563 (0.0033) +[2024-06-18 14:40:01,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2516123648. Throughput: 0: 42654.2. Samples: 2516220600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:40:01,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 14:40:02,034][12883] Updated weights for policy 0, policy_version 153573 (0.0035) +[2024-06-18 14:40:05,230][12883] Updated weights for policy 0, policy_version 153583 (0.0037) +[2024-06-18 14:40:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 43417.5, 300 sec: 42542.8). Total num frames: 2516336640. Throughput: 0: 42678.6. Samples: 2516468760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:40:06,994][12645] Avg episode reward: [(0, '0.358')] +[2024-06-18 14:40:09,755][12883] Updated weights for policy 0, policy_version 153593 (0.0035) +[2024-06-18 14:40:10,888][12862] Signal inference workers to stop experience collection... (36800 times) +[2024-06-18 14:40:10,938][12883] InferenceWorker_p0-w0: stopping experience collection (36800 times) +[2024-06-18 14:40:10,943][12862] Signal inference workers to resume experience collection... (36800 times) +[2024-06-18 14:40:10,960][12883] InferenceWorker_p0-w0: resuming experience collection (36800 times) +[2024-06-18 14:40:11,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42873.1, 300 sec: 42765.0). 
Total num frames: 2516598784. Throughput: 0: 42612.8. Samples: 2516717640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:40:11,994][12645] Avg episode reward: [(0, '0.597')] +[2024-06-18 14:40:12,764][12883] Updated weights for policy 0, policy_version 153603 (0.0040) +[2024-06-18 14:40:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2516762624. Throughput: 0: 42609.4. Samples: 2516860300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:40:16,994][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 14:40:17,327][12883] Updated weights for policy 0, policy_version 153613 (0.0039) +[2024-06-18 14:40:20,388][12883] Updated weights for policy 0, policy_version 153623 (0.0032) +[2024-06-18 14:40:21,994][12645] Fps is (10 sec: 37683.1, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 2516975616. Throughput: 0: 42681.4. Samples: 2517104980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:40:21,994][12645] Avg episode reward: [(0, '0.324')] +[2024-06-18 14:40:24,878][12883] Updated weights for policy 0, policy_version 153633 (0.0028) +[2024-06-18 14:40:26,994][12645] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2517237760. Throughput: 0: 42821.4. Samples: 2517362260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:40:26,994][12645] Avg episode reward: [(0, '0.530')] +[2024-06-18 14:40:27,956][12883] Updated weights for policy 0, policy_version 153643 (0.0025) +[2024-06-18 14:40:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2517417984. Throughput: 0: 42772.8. Samples: 2517501620. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:40:31,994][12645] Avg episode reward: [(0, '0.561')] +[2024-06-18 14:40:32,471][12883] Updated weights for policy 0, policy_version 153653 (0.0035) +[2024-06-18 14:40:35,549][12883] Updated weights for policy 0, policy_version 153663 (0.0036) +[2024-06-18 14:40:36,994][12645] Fps is (10 sec: 39321.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2517630976. Throughput: 0: 42618.5. Samples: 2517743140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:40:36,994][12645] Avg episode reward: [(0, '0.535')] +[2024-06-18 14:40:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153664_2517630976.pth... +[2024-06-18 14:40:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153042_2507440128.pth +[2024-06-18 14:40:40,158][12883] Updated weights for policy 0, policy_version 153673 (0.0037) +[2024-06-18 14:40:41,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2517860352. Throughput: 0: 42812.9. Samples: 2518003040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:40:41,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 14:40:43,579][12883] Updated weights for policy 0, policy_version 153683 (0.0046) +[2024-06-18 14:40:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2518056960. Throughput: 0: 42543.7. Samples: 2518135060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:40:47,002][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 14:40:47,970][12883] Updated weights for policy 0, policy_version 153693 (0.0039) +[2024-06-18 14:40:51,185][12883] Updated weights for policy 0, policy_version 153703 (0.0038) +[2024-06-18 14:40:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 2518269952. Throughput: 0: 42434.7. Samples: 2518378320. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:40:52,006][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 14:40:55,543][12883] Updated weights for policy 0, policy_version 153713 (0.0041) +[2024-06-18 14:40:56,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2518482944. Throughput: 0: 42803.9. Samples: 2518643820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:40:56,994][12645] Avg episode reward: [(0, '0.733')] +[2024-06-18 14:40:59,525][12883] Updated weights for policy 0, policy_version 153723 (0.0024) +[2024-06-18 14:41:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2518695936. Throughput: 0: 42544.4. Samples: 2518774800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:41:01,994][12645] Avg episode reward: [(0, '0.738')] +[2024-06-18 14:41:03,071][12883] Updated weights for policy 0, policy_version 153733 (0.0034) +[2024-06-18 14:41:06,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2518908928. Throughput: 0: 42595.2. Samples: 2519021760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 14:41:06,994][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 14:41:07,120][12883] Updated weights for policy 0, policy_version 153743 (0.0030) +[2024-06-18 14:41:11,002][12883] Updated weights for policy 0, policy_version 153753 (0.0036) +[2024-06-18 14:41:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2519121920. Throughput: 0: 42783.1. Samples: 2519287500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 14:41:11,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 14:41:14,663][12883] Updated weights for policy 0, policy_version 153763 (0.0037) +[2024-06-18 14:41:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2519334912. Throughput: 0: 42517.4. 
Samples: 2519414900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 14:41:16,994][12645] Avg episode reward: [(0, '0.650')] +[2024-06-18 14:41:18,765][12883] Updated weights for policy 0, policy_version 153773 (0.0037) +[2024-06-18 14:41:19,978][12862] Signal inference workers to stop experience collection... (36850 times) +[2024-06-18 14:41:19,979][12862] Signal inference workers to resume experience collection... (36850 times) +[2024-06-18 14:41:19,995][12883] InferenceWorker_p0-w0: stopping experience collection (36850 times) +[2024-06-18 14:41:19,995][12883] InferenceWorker_p0-w0: resuming experience collection (36850 times) +[2024-06-18 14:41:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2519564288. Throughput: 0: 42785.4. Samples: 2519668480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 14:41:21,994][12645] Avg episode reward: [(0, '0.599')] +[2024-06-18 14:41:22,129][12883] Updated weights for policy 0, policy_version 153783 (0.0035) +[2024-06-18 14:41:26,742][12883] Updated weights for policy 0, policy_version 153793 (0.0033) +[2024-06-18 14:41:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 2519760896. Throughput: 0: 42893.8. Samples: 2519933260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 14:41:26,994][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 14:41:29,749][12883] Updated weights for policy 0, policy_version 153803 (0.0030) +[2024-06-18 14:41:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2519973888. Throughput: 0: 42645.8. Samples: 2520054120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 14:41:31,994][12645] Avg episode reward: [(0, '0.573')] +[2024-06-18 14:41:34,513][12883] Updated weights for policy 0, policy_version 153813 (0.0041) +[2024-06-18 14:41:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). 
Total num frames: 2520203264. Throughput: 0: 42859.2. Samples: 2520306980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 14:41:36,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 14:41:38,179][12883] Updated weights for policy 0, policy_version 153823 (0.0037) +[2024-06-18 14:41:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2520383488. Throughput: 0: 42871.3. Samples: 2520573020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 14:41:41,994][12645] Avg episode reward: [(0, '0.116')] +[2024-06-18 14:41:42,063][12883] Updated weights for policy 0, policy_version 153833 (0.0027) +[2024-06-18 14:41:45,657][12883] Updated weights for policy 0, policy_version 153843 (0.0031) +[2024-06-18 14:41:46,995][12645] Fps is (10 sec: 40953.1, 60 sec: 42597.2, 300 sec: 42487.1). Total num frames: 2520612864. Throughput: 0: 42590.0. Samples: 2520691420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 14:41:46,996][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 14:41:49,451][12883] Updated weights for policy 0, policy_version 153853 (0.0037) +[2024-06-18 14:41:51,994][12645] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2520858624. Throughput: 0: 42855.1. Samples: 2520950240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 14:41:51,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 14:41:53,145][12883] Updated weights for policy 0, policy_version 153863 (0.0031) +[2024-06-18 14:41:56,994][12645] Fps is (10 sec: 42605.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2521038848. Throughput: 0: 42683.8. Samples: 2521208280. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 14:41:56,994][12645] Avg episode reward: [(0, '0.503')] +[2024-06-18 14:41:57,112][12883] Updated weights for policy 0, policy_version 153873 (0.0037) +[2024-06-18 14:42:00,607][12883] Updated weights for policy 0, policy_version 153883 (0.0032) +[2024-06-18 14:42:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2521251840. Throughput: 0: 42549.3. Samples: 2521329620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 14:42:01,994][12645] Avg episode reward: [(0, '0.653')] +[2024-06-18 14:42:04,836][12883] Updated weights for policy 0, policy_version 153893 (0.0048) +[2024-06-18 14:42:06,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2521481216. Throughput: 0: 42673.9. Samples: 2521588800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 14:42:06,994][12645] Avg episode reward: [(0, '0.691')] +[2024-06-18 14:42:08,143][12883] Updated weights for policy 0, policy_version 153903 (0.0043) +[2024-06-18 14:42:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2521661440. Throughput: 0: 42532.0. Samples: 2521847200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 14:42:11,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 14:42:12,458][12883] Updated weights for policy 0, policy_version 153913 (0.0034) +[2024-06-18 14:42:15,971][12883] Updated weights for policy 0, policy_version 153923 (0.0035) +[2024-06-18 14:42:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2521907200. Throughput: 0: 42633.7. Samples: 2521972640. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 14:42:16,994][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 14:42:20,047][12883] Updated weights for policy 0, policy_version 153933 (0.0026) +[2024-06-18 14:42:21,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2522136576. Throughput: 0: 42906.6. Samples: 2522237780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 14:42:21,994][12645] Avg episode reward: [(0, '0.597')] +[2024-06-18 14:42:23,503][12883] Updated weights for policy 0, policy_version 153943 (0.0029) +[2024-06-18 14:42:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2522316800. Throughput: 0: 42671.0. Samples: 2522493220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 14:42:26,994][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 14:42:27,642][12883] Updated weights for policy 0, policy_version 153953 (0.0038) +[2024-06-18 14:42:29,756][12862] Signal inference workers to stop experience collection... (36900 times) +[2024-06-18 14:42:29,757][12862] Signal inference workers to resume experience collection... (36900 times) +[2024-06-18 14:42:29,781][12883] InferenceWorker_p0-w0: stopping experience collection (36900 times) +[2024-06-18 14:42:29,781][12883] InferenceWorker_p0-w0: resuming experience collection (36900 times) +[2024-06-18 14:42:31,267][12883] Updated weights for policy 0, policy_version 153963 (0.0035) +[2024-06-18 14:42:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 2522529792. Throughput: 0: 42806.5. Samples: 2522617640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 14:42:31,994][12645] Avg episode reward: [(0, '0.630')] +[2024-06-18 14:42:35,656][12883] Updated weights for policy 0, policy_version 153973 (0.0048) +[2024-06-18 14:42:36,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 2522759168. 
Throughput: 0: 42914.2. Samples: 2522881380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 14:42:36,994][12645] Avg episode reward: [(0, '0.528')] +[2024-06-18 14:42:37,021][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153978_2522775552.pth... +[2024-06-18 14:42:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153352_2512519168.pth +[2024-06-18 14:42:38,901][12883] Updated weights for policy 0, policy_version 153983 (0.0042) +[2024-06-18 14:42:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2522972160. Throughput: 0: 42759.6. Samples: 2523132460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 14:42:41,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 14:42:43,302][12883] Updated weights for policy 0, policy_version 153993 (0.0047) +[2024-06-18 14:42:46,799][12883] Updated weights for policy 0, policy_version 154003 (0.0038) +[2024-06-18 14:42:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42872.6, 300 sec: 42487.5). Total num frames: 2523185152. Throughput: 0: 42926.7. Samples: 2523261320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 14:42:46,994][12645] Avg episode reward: [(0, '0.651')] +[2024-06-18 14:42:50,844][12883] Updated weights for policy 0, policy_version 154013 (0.0026) +[2024-06-18 14:42:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2523398144. Throughput: 0: 42881.2. Samples: 2523518460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 14:42:51,994][12645] Avg episode reward: [(0, '0.676')] +[2024-06-18 14:42:54,499][12883] Updated weights for policy 0, policy_version 154023 (0.0027) +[2024-06-18 14:42:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2523594752. Throughput: 0: 42730.7. Samples: 2523770080. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 14:42:56,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 14:42:58,327][12883] Updated weights for policy 0, policy_version 154033 (0.0041) +[2024-06-18 14:43:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2523807744. Throughput: 0: 42780.0. Samples: 2523897740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 14:43:01,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 14:43:02,193][12883] Updated weights for policy 0, policy_version 154043 (0.0031) +[2024-06-18 14:43:05,861][12883] Updated weights for policy 0, policy_version 154053 (0.0031) +[2024-06-18 14:43:06,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 2524037120. Throughput: 0: 42615.0. Samples: 2524155460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 14:43:06,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 14:43:09,914][12883] Updated weights for policy 0, policy_version 154063 (0.0029) +[2024-06-18 14:43:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2524233728. Throughput: 0: 42725.4. Samples: 2524415860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 14:43:11,994][12645] Avg episode reward: [(0, '0.807')] +[2024-06-18 14:43:13,421][12883] Updated weights for policy 0, policy_version 154073 (0.0042) +[2024-06-18 14:43:16,994][12645] Fps is (10 sec: 42599.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2524463104. Throughput: 0: 42696.1. Samples: 2524538960. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 14:43:16,994][12645] Avg episode reward: [(0, '0.788')] +[2024-06-18 14:43:17,543][12883] Updated weights for policy 0, policy_version 154083 (0.0041) +[2024-06-18 14:43:20,857][12883] Updated weights for policy 0, policy_version 154093 (0.0040) +[2024-06-18 14:43:21,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42709.7). Total num frames: 2524692480. Throughput: 0: 42575.1. Samples: 2524797260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 14:43:21,994][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 14:43:25,474][12883] Updated weights for policy 0, policy_version 154103 (0.0035) +[2024-06-18 14:43:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42598.5). Total num frames: 2524889088. Throughput: 0: 42865.5. Samples: 2525061400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 14:43:26,994][12645] Avg episode reward: [(0, '0.637')] +[2024-06-18 14:43:28,296][12883] Updated weights for policy 0, policy_version 154113 (0.0035) +[2024-06-18 14:43:31,996][12645] Fps is (10 sec: 42589.0, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 2525118464. Throughput: 0: 42821.9. Samples: 2525188400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 14:43:31,996][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 14:43:33,019][12883] Updated weights for policy 0, policy_version 154123 (0.0031) +[2024-06-18 14:43:35,077][12862] Signal inference workers to stop experience collection... (36950 times) +[2024-06-18 14:43:35,077][12862] Signal inference workers to resume experience collection... 
(36950 times) +[2024-06-18 14:43:35,124][12883] InferenceWorker_p0-w0: stopping experience collection (36950 times) +[2024-06-18 14:43:35,124][12883] InferenceWorker_p0-w0: resuming experience collection (36950 times) +[2024-06-18 14:43:35,863][12883] Updated weights for policy 0, policy_version 154133 (0.0030) +[2024-06-18 14:43:36,997][12645] Fps is (10 sec: 42583.2, 60 sec: 42596.0, 300 sec: 42653.5). Total num frames: 2525315072. Throughput: 0: 42839.9. Samples: 2525446400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 14:43:36,998][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 14:43:40,536][12883] Updated weights for policy 0, policy_version 154143 (0.0031) +[2024-06-18 14:43:41,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2525544448. Throughput: 0: 43044.9. Samples: 2525707100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 14:43:41,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 14:43:43,662][12883] Updated weights for policy 0, policy_version 154153 (0.0031) +[2024-06-18 14:43:46,994][12645] Fps is (10 sec: 44251.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2525757440. Throughput: 0: 43083.1. Samples: 2525836480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 14:43:46,994][12645] Avg episode reward: [(0, '0.674')] +[2024-06-18 14:43:48,145][12883] Updated weights for policy 0, policy_version 154163 (0.0036) +[2024-06-18 14:43:51,846][12883] Updated weights for policy 0, policy_version 154173 (0.0037) +[2024-06-18 14:43:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2525970432. Throughput: 0: 43012.1. Samples: 2526091000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 14:43:51,994][12645] Avg episode reward: [(0, '0.590')] +[2024-06-18 14:43:55,608][12883] Updated weights for policy 0, policy_version 154183 (0.0038) +[2024-06-18 14:43:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2526183424. Throughput: 0: 43039.5. Samples: 2526352640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 14:43:56,994][12645] Avg episode reward: [(0, '0.645')] +[2024-06-18 14:43:59,333][12883] Updated weights for policy 0, policy_version 154193 (0.0053) +[2024-06-18 14:44:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2526412800. Throughput: 0: 43141.6. Samples: 2526480340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 14:44:01,994][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 14:44:03,268][12883] Updated weights for policy 0, policy_version 154203 (0.0036) +[2024-06-18 14:44:06,707][12883] Updated weights for policy 0, policy_version 154213 (0.0034) +[2024-06-18 14:44:06,994][12645] Fps is (10 sec: 44237.7, 60 sec: 43144.7, 300 sec: 42709.8). Total num frames: 2526625792. Throughput: 0: 43145.5. Samples: 2526738800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 14:44:06,994][12645] Avg episode reward: [(0, '0.675')] +[2024-06-18 14:44:11,087][12883] Updated weights for policy 0, policy_version 154223 (0.0025) +[2024-06-18 14:44:11,995][12645] Fps is (10 sec: 42591.9, 60 sec: 43416.5, 300 sec: 42764.8). Total num frames: 2526838784. Throughput: 0: 43081.0. Samples: 2527000120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) +[2024-06-18 14:44:11,996][12645] Avg episode reward: [(0, '0.653')] +[2024-06-18 14:44:14,274][12883] Updated weights for policy 0, policy_version 154233 (0.0029) +[2024-06-18 14:44:16,994][12645] Fps is (10 sec: 44235.9, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 2527068160. Throughput: 0: 43011.9. 
Samples: 2527123840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 14:44:16,994][12645] Avg episode reward: [(0, '0.567')] +[2024-06-18 14:44:18,672][12883] Updated weights for policy 0, policy_version 154243 (0.0027) +[2024-06-18 14:44:21,922][12883] Updated weights for policy 0, policy_version 154253 (0.0032) +[2024-06-18 14:44:21,994][12645] Fps is (10 sec: 44243.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2527281152. Throughput: 0: 43102.4. Samples: 2527385860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 14:44:21,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 14:44:26,139][12883] Updated weights for policy 0, policy_version 154263 (0.0034) +[2024-06-18 14:44:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43417.4, 300 sec: 42765.0). Total num frames: 2527494144. Throughput: 0: 43127.4. Samples: 2527647840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 14:44:26,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 14:44:29,502][12883] Updated weights for policy 0, policy_version 154273 (0.0033) +[2024-06-18 14:44:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 2527707136. Throughput: 0: 43141.4. Samples: 2527777840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 14:44:31,994][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 14:44:33,637][12883] Updated weights for policy 0, policy_version 154283 (0.0031) +[2024-06-18 14:44:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43146.9, 300 sec: 42709.5). Total num frames: 2527903744. Throughput: 0: 43202.6. Samples: 2528035120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 14:44:36,994][12645] Avg episode reward: [(0, '0.508')] +[2024-06-18 14:44:37,094][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000154292_2527920128.pth... 
+[2024-06-18 14:44:37,157][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153664_2517630976.pth +[2024-06-18 14:44:37,301][12883] Updated weights for policy 0, policy_version 154293 (0.0048) +[2024-06-18 14:44:41,279][12883] Updated weights for policy 0, policy_version 154303 (0.0035) +[2024-06-18 14:44:41,997][12645] Fps is (10 sec: 42582.1, 60 sec: 43141.8, 300 sec: 42820.0). Total num frames: 2528133120. Throughput: 0: 43111.1. Samples: 2528292800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 14:44:41,998][12645] Avg episode reward: [(0, '0.508')] +[2024-06-18 14:44:44,840][12883] Updated weights for policy 0, policy_version 154313 (0.0033) +[2024-06-18 14:44:46,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2528362496. Throughput: 0: 43191.1. Samples: 2528423940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 14:44:46,998][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 14:44:48,831][12883] Updated weights for policy 0, policy_version 154323 (0.0040) +[2024-06-18 14:44:51,994][12645] Fps is (10 sec: 40975.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2528542720. Throughput: 0: 42976.8. Samples: 2528672760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 14:44:51,994][12645] Avg episode reward: [(0, '0.321')] +[2024-06-18 14:44:52,558][12883] Updated weights for policy 0, policy_version 154333 (0.0034) +[2024-06-18 14:44:56,616][12883] Updated weights for policy 0, policy_version 154343 (0.0030) +[2024-06-18 14:44:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2528755712. Throughput: 0: 42890.8. Samples: 2528930140. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 14:44:56,994][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 14:45:00,221][12883] Updated weights for policy 0, policy_version 154353 (0.0032) +[2024-06-18 14:45:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2528985088. Throughput: 0: 42998.8. Samples: 2529058780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 14:45:01,994][12645] Avg episode reward: [(0, '0.547')] +[2024-06-18 14:45:04,413][12883] Updated weights for policy 0, policy_version 154363 (0.0032) +[2024-06-18 14:45:04,977][12862] Signal inference workers to stop experience collection... (37000 times) +[2024-06-18 14:45:05,031][12862] Signal inference workers to resume experience collection... (37000 times) +[2024-06-18 14:45:05,031][12883] InferenceWorker_p0-w0: stopping experience collection (37000 times) +[2024-06-18 14:45:05,058][12883] InferenceWorker_p0-w0: resuming experience collection (37000 times) +[2024-06-18 14:45:06,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 2529198080. Throughput: 0: 42789.7. Samples: 2529311400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 14:45:06,994][12645] Avg episode reward: [(0, '0.769')] +[2024-06-18 14:45:07,993][12883] Updated weights for policy 0, policy_version 154373 (0.0050) +[2024-06-18 14:45:11,902][12883] Updated weights for policy 0, policy_version 154383 (0.0023) +[2024-06-18 14:45:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42872.6, 300 sec: 42876.1). Total num frames: 2529411072. Throughput: 0: 42747.7. Samples: 2529571480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 14:45:11,994][12645] Avg episode reward: [(0, '0.683')] +[2024-06-18 14:45:15,575][12883] Updated weights for policy 0, policy_version 154393 (0.0024) +[2024-06-18 14:45:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2529624064. 
Throughput: 0: 42676.9. Samples: 2529698300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) +[2024-06-18 14:45:16,994][12645] Avg episode reward: [(0, '0.730')] +[2024-06-18 14:45:19,580][12883] Updated weights for policy 0, policy_version 154403 (0.0030) +[2024-06-18 14:45:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2529837056. Throughput: 0: 42636.7. Samples: 2529953760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 14:45:21,994][12645] Avg episode reward: [(0, '0.586')] +[2024-06-18 14:45:23,448][12883] Updated weights for policy 0, policy_version 154413 (0.0044) +[2024-06-18 14:45:26,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2530050048. Throughput: 0: 42701.7. Samples: 2530214220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 14:45:26,995][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 14:45:27,154][12883] Updated weights for policy 0, policy_version 154423 (0.0045) +[2024-06-18 14:45:30,822][12883] Updated weights for policy 0, policy_version 154433 (0.0049) +[2024-06-18 14:45:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2530263040. Throughput: 0: 42612.6. Samples: 2530341500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 14:45:31,994][12645] Avg episode reward: [(0, '0.581')] +[2024-06-18 14:45:34,802][12883] Updated weights for policy 0, policy_version 154443 (0.0037) +[2024-06-18 14:45:36,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2530476032. Throughput: 0: 42703.2. Samples: 2530594400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 14:45:36,994][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 14:45:38,866][12883] Updated weights for policy 0, policy_version 154453 (0.0041) +[2024-06-18 14:45:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42328.1, 300 sec: 42765.0). 
Total num frames: 2530672640. Throughput: 0: 42699.2. Samples: 2530851600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 14:45:41,994][12645] Avg episode reward: [(0, '0.253')] +[2024-06-18 14:45:42,467][12883] Updated weights for policy 0, policy_version 154463 (0.0027) +[2024-06-18 14:45:46,547][12883] Updated weights for policy 0, policy_version 154473 (0.0032) +[2024-06-18 14:45:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2530902016. Throughput: 0: 42585.3. Samples: 2530975120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 14:45:46,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 14:45:50,177][12883] Updated weights for policy 0, policy_version 154483 (0.0027) +[2024-06-18 14:45:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2531115008. Throughput: 0: 42754.9. Samples: 2531235360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 14:45:51,994][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 14:45:54,057][12883] Updated weights for policy 0, policy_version 154493 (0.0027) +[2024-06-18 14:45:56,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2531328000. Throughput: 0: 42791.9. Samples: 2531497120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 14:45:56,994][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 14:45:57,714][12883] Updated weights for policy 0, policy_version 154503 (0.0030) +[2024-06-18 14:46:01,503][12883] Updated weights for policy 0, policy_version 154513 (0.0033) +[2024-06-18 14:46:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2531540992. Throughput: 0: 42817.8. Samples: 2531625100. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 14:46:01,994][12645] Avg episode reward: [(0, '0.260')] +[2024-06-18 14:46:05,565][12883] Updated weights for policy 0, policy_version 154523 (0.0034) +[2024-06-18 14:46:06,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2531753984. Throughput: 0: 42748.4. Samples: 2531877440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 14:46:06,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 14:46:09,463][12883] Updated weights for policy 0, policy_version 154533 (0.0033) +[2024-06-18 14:46:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2531950592. Throughput: 0: 42748.6. Samples: 2532137900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 14:46:11,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 14:46:13,153][12883] Updated weights for policy 0, policy_version 154543 (0.0037) +[2024-06-18 14:46:16,933][12883] Updated weights for policy 0, policy_version 154553 (0.0032) +[2024-06-18 14:46:16,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 2532196352. Throughput: 0: 42731.3. Samples: 2532264420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 14:46:16,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 14:46:20,760][12883] Updated weights for policy 0, policy_version 154563 (0.0040) +[2024-06-18 14:46:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2532409344. Throughput: 0: 42851.6. Samples: 2532522720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) +[2024-06-18 14:46:21,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 14:46:24,448][12883] Updated weights for policy 0, policy_version 154573 (0.0033) +[2024-06-18 14:46:26,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 2532605952. Throughput: 0: 42796.0. Samples: 2532777420. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) +[2024-06-18 14:46:26,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 14:46:28,442][12883] Updated weights for policy 0, policy_version 154583 (0.0026) +[2024-06-18 14:46:30,418][12862] Signal inference workers to stop experience collection... (37050 times) +[2024-06-18 14:46:30,471][12883] InferenceWorker_p0-w0: stopping experience collection (37050 times) +[2024-06-18 14:46:30,533][12862] Signal inference workers to resume experience collection... (37050 times) +[2024-06-18 14:46:30,534][12883] InferenceWorker_p0-w0: resuming experience collection (37050 times) +[2024-06-18 14:46:31,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 2532835328. Throughput: 0: 43000.9. Samples: 2532910260. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) +[2024-06-18 14:46:31,996][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 14:46:32,345][12883] Updated weights for policy 0, policy_version 154593 (0.0031) +[2024-06-18 14:46:36,123][12883] Updated weights for policy 0, policy_version 154603 (0.0026) +[2024-06-18 14:46:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2533048320. Throughput: 0: 42895.4. Samples: 2533165660. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) +[2024-06-18 14:46:36,996][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 14:46:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000154605_2533048320.pth... +[2024-06-18 14:46:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153978_2522775552.pth +[2024-06-18 14:46:39,963][12883] Updated weights for policy 0, policy_version 154613 (0.0031) +[2024-06-18 14:46:41,996][12645] Fps is (10 sec: 44236.9, 60 sec: 43415.9, 300 sec: 42931.6). Total num frames: 2533277696. Throughput: 0: 42770.8. Samples: 2533421900. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) +[2024-06-18 14:46:41,996][12645] Avg episode reward: [(0, '0.465')] +[2024-06-18 14:46:43,675][12883] Updated weights for policy 0, policy_version 154623 (0.0033) +[2024-06-18 14:46:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2533474304. Throughput: 0: 42795.1. Samples: 2533550880. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) +[2024-06-18 14:46:46,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 14:46:47,615][12883] Updated weights for policy 0, policy_version 154633 (0.0049) +[2024-06-18 14:46:51,346][12883] Updated weights for policy 0, policy_version 154643 (0.0031) +[2024-06-18 14:46:51,997][12645] Fps is (10 sec: 40956.8, 60 sec: 42869.3, 300 sec: 42875.7). Total num frames: 2533687296. Throughput: 0: 42821.1. Samples: 2533804520. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) +[2024-06-18 14:46:51,997][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 14:46:55,506][12883] Updated weights for policy 0, policy_version 154653 (0.0035) +[2024-06-18 14:46:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2533900288. Throughput: 0: 42662.7. Samples: 2534057720. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) +[2024-06-18 14:46:56,994][12645] Avg episode reward: [(0, '0.729')] +[2024-06-18 14:46:58,994][12883] Updated weights for policy 0, policy_version 154663 (0.0037) +[2024-06-18 14:47:01,994][12645] Fps is (10 sec: 42611.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2534113280. Throughput: 0: 42718.9. Samples: 2534186760. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) +[2024-06-18 14:47:01,994][12645] Avg episode reward: [(0, '0.751')] +[2024-06-18 14:47:03,147][12883] Updated weights for policy 0, policy_version 154673 (0.0028) +[2024-06-18 14:47:06,606][12883] Updated weights for policy 0, policy_version 154683 (0.0044) +[2024-06-18 14:47:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 2534342656. Throughput: 0: 42589.2. Samples: 2534439240. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) +[2024-06-18 14:47:06,994][12645] Avg episode reward: [(0, '0.779')] +[2024-06-18 14:47:10,805][12883] Updated weights for policy 0, policy_version 154693 (0.0034) +[2024-06-18 14:47:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2534539264. Throughput: 0: 42604.3. Samples: 2534694620. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) +[2024-06-18 14:47:11,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 14:47:14,224][12883] Updated weights for policy 0, policy_version 154703 (0.0038) +[2024-06-18 14:47:16,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2534752256. Throughput: 0: 42496.6. Samples: 2534822520. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) +[2024-06-18 14:47:16,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 14:47:18,291][12883] Updated weights for policy 0, policy_version 154713 (0.0024) +[2024-06-18 14:47:21,993][12883] Updated weights for policy 0, policy_version 154723 (0.0039) +[2024-06-18 14:47:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2534981632. Throughput: 0: 42590.8. Samples: 2535082240. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) +[2024-06-18 14:47:21,994][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 14:47:26,121][12883] Updated weights for policy 0, policy_version 154733 (0.0024) +[2024-06-18 14:47:26,994][12645] Fps is (10 sec: 40961.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2535161856. Throughput: 0: 42443.0. Samples: 2535331740. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) +[2024-06-18 14:47:26,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 14:47:29,889][12883] Updated weights for policy 0, policy_version 154743 (0.0041) +[2024-06-18 14:47:31,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42598.4, 300 sec: 42820.2). Total num frames: 2535391232. Throughput: 0: 42336.0. Samples: 2535456100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 14:47:31,996][12645] Avg episode reward: [(0, '0.635')] +[2024-06-18 14:47:34,120][12883] Updated weights for policy 0, policy_version 154753 (0.0032) +[2024-06-18 14:47:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2535604224. Throughput: 0: 42377.0. Samples: 2535711360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 14:47:36,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 14:47:37,487][12883] Updated weights for policy 0, policy_version 154763 (0.0033) +[2024-06-18 14:47:41,801][12883] Updated weights for policy 0, policy_version 154773 (0.0040) +[2024-06-18 14:47:41,994][12645] Fps is (10 sec: 40969.6, 60 sec: 42053.9, 300 sec: 42765.0). Total num frames: 2535800832. Throughput: 0: 42529.4. Samples: 2535971540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 14:47:41,994][12645] Avg episode reward: [(0, '0.518')] +[2024-06-18 14:47:44,964][12883] Updated weights for policy 0, policy_version 154783 (0.0041) +[2024-06-18 14:47:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 2536030208. Throughput: 0: 42474.5. 
Samples: 2536098120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 14:47:46,994][12645] Avg episode reward: [(0, '0.306')] +[2024-06-18 14:47:49,573][12883] Updated weights for policy 0, policy_version 154793 (0.0036) +[2024-06-18 14:47:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42600.5, 300 sec: 42876.1). Total num frames: 2536243200. Throughput: 0: 42643.2. Samples: 2536358180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 14:47:51,998][12645] Avg episode reward: [(0, '0.338')] +[2024-06-18 14:47:52,494][12883] Updated weights for policy 0, policy_version 154803 (0.0042) +[2024-06-18 14:47:56,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2536423424. Throughput: 0: 42724.2. Samples: 2536617200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 14:47:56,994][12645] Avg episode reward: [(0, '0.662')] +[2024-06-18 14:47:57,151][12883] Updated weights for policy 0, policy_version 154813 (0.0031) +[2024-06-18 14:48:00,254][12883] Updated weights for policy 0, policy_version 154823 (0.0037) +[2024-06-18 14:48:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2536685568. Throughput: 0: 42629.6. Samples: 2536740840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 14:48:01,994][12645] Avg episode reward: [(0, '0.412')] +[2024-06-18 14:48:04,702][12883] Updated weights for policy 0, policy_version 154833 (0.0034) +[2024-06-18 14:48:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 2536865792. Throughput: 0: 42576.0. Samples: 2536998160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 14:48:06,994][12645] Avg episode reward: [(0, '0.463')] +[2024-06-18 14:48:08,418][12883] Updated weights for policy 0, policy_version 154843 (0.0026) +[2024-06-18 14:48:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2537078784. 
Throughput: 0: 42700.4. Samples: 2537253260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 14:48:11,994][12645] Avg episode reward: [(0, '0.721')] +[2024-06-18 14:48:12,271][12862] Signal inference workers to stop experience collection... (37100 times) +[2024-06-18 14:48:12,320][12883] InferenceWorker_p0-w0: stopping experience collection (37100 times) +[2024-06-18 14:48:12,329][12862] Signal inference workers to resume experience collection... (37100 times) +[2024-06-18 14:48:12,340][12883] InferenceWorker_p0-w0: resuming experience collection (37100 times) +[2024-06-18 14:48:12,472][12883] Updated weights for policy 0, policy_version 154853 (0.0036) +[2024-06-18 14:48:15,887][12883] Updated weights for policy 0, policy_version 154863 (0.0031) +[2024-06-18 14:48:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2537324544. Throughput: 0: 42921.7. Samples: 2537387480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 14:48:16,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 14:48:20,424][12883] Updated weights for policy 0, policy_version 154873 (0.0028) +[2024-06-18 14:48:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 2537521152. Throughput: 0: 43038.2. Samples: 2537648080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 14:48:22,000][12645] Avg episode reward: [(0, '0.252')] +[2024-06-18 14:48:23,518][12883] Updated weights for policy 0, policy_version 154883 (0.0030) +[2024-06-18 14:48:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 2537717760. Throughput: 0: 42915.4. Samples: 2537902740. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) +[2024-06-18 14:48:26,994][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 14:48:28,131][12883] Updated weights for policy 0, policy_version 154893 (0.0037) +[2024-06-18 14:48:31,060][12883] Updated weights for policy 0, policy_version 154903 (0.0033) +[2024-06-18 14:48:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42873.1, 300 sec: 42876.6). Total num frames: 2537963520. Throughput: 0: 42952.6. Samples: 2538030980. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) +[2024-06-18 14:48:31,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 14:48:35,596][12883] Updated weights for policy 0, policy_version 154913 (0.0033) +[2024-06-18 14:48:36,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2538176512. Throughput: 0: 43025.4. Samples: 2538294320. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) +[2024-06-18 14:48:36,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 14:48:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000154919_2538192896.pth... +[2024-06-18 14:48:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000154292_2527920128.pth +[2024-06-18 14:48:38,610][12883] Updated weights for policy 0, policy_version 154923 (0.0029) +[2024-06-18 14:48:41,994][12645] Fps is (10 sec: 40956.5, 60 sec: 42870.8, 300 sec: 42764.9). Total num frames: 2538373120. Throughput: 0: 42779.6. Samples: 2538542320. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) +[2024-06-18 14:48:41,995][12645] Avg episode reward: [(0, '0.401')] +[2024-06-18 14:48:43,025][12883] Updated weights for policy 0, policy_version 154933 (0.0024) +[2024-06-18 14:48:46,386][12883] Updated weights for policy 0, policy_version 154943 (0.0034) +[2024-06-18 14:48:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2538602496. Throughput: 0: 42832.0. Samples: 2538668280. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) +[2024-06-18 14:48:46,994][12645] Avg episode reward: [(0, '0.438')] +[2024-06-18 14:48:51,039][12883] Updated weights for policy 0, policy_version 154953 (0.0041) +[2024-06-18 14:48:51,994][12645] Fps is (10 sec: 44240.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2538815488. Throughput: 0: 42976.5. Samples: 2538932100. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) +[2024-06-18 14:48:51,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 14:48:54,195][12883] Updated weights for policy 0, policy_version 154963 (0.0036) +[2024-06-18 14:48:56,997][12645] Fps is (10 sec: 42583.1, 60 sec: 43415.0, 300 sec: 42764.5). Total num frames: 2539028480. Throughput: 0: 42684.2. Samples: 2539174200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) +[2024-06-18 14:48:56,998][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 14:48:58,648][12883] Updated weights for policy 0, policy_version 154973 (0.0034) +[2024-06-18 14:49:01,808][12883] Updated weights for policy 0, policy_version 154983 (0.0034) +[2024-06-18 14:49:01,995][12645] Fps is (10 sec: 42592.0, 60 sec: 42597.3, 300 sec: 42764.8). Total num frames: 2539241472. Throughput: 0: 42608.0. Samples: 2539304900. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) +[2024-06-18 14:49:01,996][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 14:49:06,096][12883] Updated weights for policy 0, policy_version 154993 (0.0033) +[2024-06-18 14:49:06,994][12645] Fps is (10 sec: 42613.5, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 2539454464. Throughput: 0: 42691.6. Samples: 2539569200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) +[2024-06-18 14:49:06,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 14:49:09,492][12883] Updated weights for policy 0, policy_version 155003 (0.0042) +[2024-06-18 14:49:11,994][12645] Fps is (10 sec: 44243.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2539683840. Throughput: 0: 42509.0. Samples: 2539815640. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) +[2024-06-18 14:49:11,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 14:49:13,640][12883] Updated weights for policy 0, policy_version 155013 (0.0037) +[2024-06-18 14:49:15,318][12862] Signal inference workers to stop experience collection... (37150 times) +[2024-06-18 14:49:15,319][12862] Signal inference workers to resume experience collection... (37150 times) +[2024-06-18 14:49:15,342][12883] InferenceWorker_p0-w0: stopping experience collection (37150 times) +[2024-06-18 14:49:15,342][12883] InferenceWorker_p0-w0: resuming experience collection (37150 times) +[2024-06-18 14:49:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2539880448. Throughput: 0: 42684.4. Samples: 2539951780. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) +[2024-06-18 14:49:16,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 14:49:17,271][12883] Updated weights for policy 0, policy_version 155023 (0.0030) +[2024-06-18 14:49:21,247][12883] Updated weights for policy 0, policy_version 155033 (0.0027) +[2024-06-18 14:49:21,996][12645] Fps is (10 sec: 39312.7, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 2540077056. Throughput: 0: 42733.8. Samples: 2540217440. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) +[2024-06-18 14:49:21,996][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 14:49:24,750][12883] Updated weights for policy 0, policy_version 155043 (0.0032) +[2024-06-18 14:49:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 2540339200. Throughput: 0: 42650.9. Samples: 2540461580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) +[2024-06-18 14:49:26,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 14:49:28,937][12883] Updated weights for policy 0, policy_version 155053 (0.0036) +[2024-06-18 14:49:31,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 2540519424. 
Throughput: 0: 42869.8. Samples: 2540597420. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) +[2024-06-18 14:49:31,994][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 14:49:32,304][12883] Updated weights for policy 0, policy_version 155063 (0.0044) +[2024-06-18 14:49:36,427][12883] Updated weights for policy 0, policy_version 155073 (0.0037) +[2024-06-18 14:49:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42654.5). Total num frames: 2540716032. Throughput: 0: 42670.6. Samples: 2540852280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 14:49:36,994][12645] Avg episode reward: [(0, '0.686')] +[2024-06-18 14:49:39,788][12883] Updated weights for policy 0, policy_version 155083 (0.0037) +[2024-06-18 14:49:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43145.1, 300 sec: 42709.5). Total num frames: 2540961792. Throughput: 0: 42913.6. Samples: 2541105160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 14:49:41,994][12645] Avg episode reward: [(0, '0.658')] +[2024-06-18 14:49:44,007][12883] Updated weights for policy 0, policy_version 155093 (0.0047) +[2024-06-18 14:49:46,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 2541174784. Throughput: 0: 42938.2. Samples: 2541237060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 14:49:46,994][12645] Avg episode reward: [(0, '0.687')] +[2024-06-18 14:49:48,243][12883] Updated weights for policy 0, policy_version 155103 (0.0036) +[2024-06-18 14:49:51,943][12883] Updated weights for policy 0, policy_version 155113 (0.0037) +[2024-06-18 14:49:51,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 2541371392. Throughput: 0: 42605.4. Samples: 2541486540. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 14:49:51,996][12645] Avg episode reward: [(0, '0.652')] +[2024-06-18 14:49:55,774][12883] Updated weights for policy 0, policy_version 155123 (0.0034) +[2024-06-18 14:49:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42600.9, 300 sec: 42709.5). Total num frames: 2541584384. Throughput: 0: 42918.7. Samples: 2541746980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 14:49:56,994][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 14:49:59,750][12883] Updated weights for policy 0, policy_version 155133 (0.0033) +[2024-06-18 14:50:01,996][12645] Fps is (10 sec: 44236.9, 60 sec: 42870.9, 300 sec: 42764.7). Total num frames: 2541813760. Throughput: 0: 42831.6. Samples: 2541879300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 14:50:01,996][12645] Avg episode reward: [(0, '0.596')] +[2024-06-18 14:50:03,369][12883] Updated weights for policy 0, policy_version 155143 (0.0045) +[2024-06-18 14:50:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2541993984. Throughput: 0: 42396.7. Samples: 2542125200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 14:50:06,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 14:50:07,464][12883] Updated weights for policy 0, policy_version 155153 (0.0028) +[2024-06-18 14:50:10,966][12883] Updated weights for policy 0, policy_version 155163 (0.0032) +[2024-06-18 14:50:11,996][12645] Fps is (10 sec: 40959.8, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 2542223360. Throughput: 0: 42696.1. Samples: 2542383000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 14:50:11,996][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 14:50:15,155][12883] Updated weights for policy 0, policy_version 155173 (0.0036) +[2024-06-18 14:50:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2542436352. Throughput: 0: 42539.9. 
Samples: 2542511720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 14:50:16,995][12645] Avg episode reward: [(0, '0.320')] +[2024-06-18 14:50:18,989][12883] Updated weights for policy 0, policy_version 155183 (0.0035) +[2024-06-18 14:50:21,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2542649344. Throughput: 0: 42466.2. Samples: 2542763260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 14:50:21,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 14:50:22,691][12883] Updated weights for policy 0, policy_version 155193 (0.0037) +[2024-06-18 14:50:26,734][12883] Updated weights for policy 0, policy_version 155203 (0.0042) +[2024-06-18 14:50:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2542862336. Throughput: 0: 42518.1. Samples: 2543018480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 14:50:26,994][12645] Avg episode reward: [(0, '0.688')] +[2024-06-18 14:50:30,466][12883] Updated weights for policy 0, policy_version 155213 (0.0031) +[2024-06-18 14:50:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2543058944. Throughput: 0: 42357.9. Samples: 2543143160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 14:50:31,994][12645] Avg episode reward: [(0, '0.716')] +[2024-06-18 14:50:34,429][12883] Updated weights for policy 0, policy_version 155223 (0.0043) +[2024-06-18 14:50:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2543304704. Throughput: 0: 42440.8. Samples: 2543396280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) +[2024-06-18 14:50:36,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 14:50:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000155231_2543304704.pth... 
+[2024-06-18 14:50:37,057][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000154605_2533048320.pth +[2024-06-18 14:50:38,108][12883] Updated weights for policy 0, policy_version 155233 (0.0041) +[2024-06-18 14:50:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2543468544. Throughput: 0: 42299.1. Samples: 2543650440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 14:50:41,994][12645] Avg episode reward: [(0, '0.685')] +[2024-06-18 14:50:42,339][12883] Updated weights for policy 0, policy_version 155243 (0.0049) +[2024-06-18 14:50:45,793][12883] Updated weights for policy 0, policy_version 155253 (0.0033) +[2024-06-18 14:50:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 2543697920. Throughput: 0: 42058.5. Samples: 2543771840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 14:50:46,994][12645] Avg episode reward: [(0, '0.702')] +[2024-06-18 14:50:49,914][12883] Updated weights for policy 0, policy_version 155263 (0.0045) +[2024-06-18 14:50:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 2543927296. Throughput: 0: 42417.4. Samples: 2544033980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 14:50:51,994][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 14:50:53,611][12883] Updated weights for policy 0, policy_version 155273 (0.0026) +[2024-06-18 14:50:56,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42050.7, 300 sec: 42598.1). Total num frames: 2544107520. Throughput: 0: 42483.6. Samples: 2544294760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 14:50:56,996][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 14:50:57,349][12862] Signal inference workers to stop experience collection... 
(37200 times) +[2024-06-18 14:50:57,377][12883] InferenceWorker_p0-w0: stopping experience collection (37200 times) +[2024-06-18 14:50:57,398][12862] Signal inference workers to resume experience collection... (37200 times) +[2024-06-18 14:50:57,399][12883] InferenceWorker_p0-w0: resuming experience collection (37200 times) +[2024-06-18 14:50:57,550][12883] Updated weights for policy 0, policy_version 155283 (0.0037) +[2024-06-18 14:51:01,278][12883] Updated weights for policy 0, policy_version 155293 (0.0033) +[2024-06-18 14:51:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 2544353280. Throughput: 0: 42245.4. Samples: 2544412760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 14:51:01,994][12645] Avg episode reward: [(0, '0.527')] +[2024-06-18 14:51:05,105][12883] Updated weights for policy 0, policy_version 155303 (0.0028) +[2024-06-18 14:51:06,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2544566272. Throughput: 0: 42503.1. Samples: 2544675900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 14:51:06,994][12645] Avg episode reward: [(0, '0.678')] +[2024-06-18 14:51:08,730][12883] Updated weights for policy 0, policy_version 155313 (0.0036) +[2024-06-18 14:51:11,994][12645] Fps is (10 sec: 37682.8, 60 sec: 41780.7, 300 sec: 42487.3). Total num frames: 2544730112. Throughput: 0: 42381.3. Samples: 2544925640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 14:51:11,995][12645] Avg episode reward: [(0, '0.674')] +[2024-06-18 14:51:12,923][12883] Updated weights for policy 0, policy_version 155323 (0.0043) +[2024-06-18 14:51:16,412][12883] Updated weights for policy 0, policy_version 155333 (0.0047) +[2024-06-18 14:51:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2544992256. Throughput: 0: 42386.1. Samples: 2545050540. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 14:51:16,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 14:51:20,755][12883] Updated weights for policy 0, policy_version 155343 (0.0039) +[2024-06-18 14:51:21,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2545172480. Throughput: 0: 42427.1. Samples: 2545305500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 14:51:21,994][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 14:51:24,189][12883] Updated weights for policy 0, policy_version 155353 (0.0032) +[2024-06-18 14:51:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42543.2). Total num frames: 2545385472. Throughput: 0: 42447.0. Samples: 2545560560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 14:51:26,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 14:51:28,735][12883] Updated weights for policy 0, policy_version 155363 (0.0033) +[2024-06-18 14:51:31,693][12883] Updated weights for policy 0, policy_version 155373 (0.0036) +[2024-06-18 14:51:31,994][12645] Fps is (10 sec: 45872.4, 60 sec: 42871.1, 300 sec: 42653.9). Total num frames: 2545631232. Throughput: 0: 42653.3. Samples: 2545691260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 14:51:31,995][12645] Avg episode reward: [(0, '0.632')] +[2024-06-18 14:51:36,535][12883] Updated weights for policy 0, policy_version 155383 (0.0036) +[2024-06-18 14:51:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 42487.6). Total num frames: 2545811456. Throughput: 0: 42538.2. Samples: 2545948200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 14:51:36,994][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 14:51:39,304][12883] Updated weights for policy 0, policy_version 155393 (0.0036) +[2024-06-18 14:51:41,994][12645] Fps is (10 sec: 40962.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2546040832. Throughput: 0: 42308.4. 
Samples: 2546198540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) +[2024-06-18 14:51:41,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 14:51:44,104][12883] Updated weights for policy 0, policy_version 155403 (0.0024) +[2024-06-18 14:51:46,793][12883] Updated weights for policy 0, policy_version 155413 (0.0040) +[2024-06-18 14:51:46,994][12645] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42709.9). Total num frames: 2546286592. Throughput: 0: 42590.3. Samples: 2546329320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:51:47,000][12645] Avg episode reward: [(0, '0.699')] +[2024-06-18 14:51:51,805][12883] Updated weights for policy 0, policy_version 155423 (0.0039) +[2024-06-18 14:51:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2546450432. Throughput: 0: 42550.7. Samples: 2546590680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:51:51,994][12645] Avg episode reward: [(0, '0.696')] +[2024-06-18 14:51:54,377][12883] Updated weights for policy 0, policy_version 155433 (0.0037) +[2024-06-18 14:51:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 2546696192. Throughput: 0: 42510.4. Samples: 2546838600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:51:56,994][12645] Avg episode reward: [(0, '0.452')] +[2024-06-18 14:51:59,478][12883] Updated weights for policy 0, policy_version 155443 (0.0030) +[2024-06-18 14:52:01,994][12645] Fps is (10 sec: 47512.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2546925568. Throughput: 0: 42672.3. Samples: 2546970800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:52:01,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 14:52:02,434][12883] Updated weights for policy 0, policy_version 155453 (0.0033) +[2024-06-18 14:52:06,996][12645] Fps is (10 sec: 39312.6, 60 sec: 42050.7, 300 sec: 42542.6). Total num frames: 2547089408. 
Throughput: 0: 42752.9. Samples: 2547229480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:52:06,996][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 14:52:07,190][12883] Updated weights for policy 0, policy_version 155463 (0.0029) +[2024-06-18 14:52:09,981][12883] Updated weights for policy 0, policy_version 155473 (0.0031) +[2024-06-18 14:52:11,994][12645] Fps is (10 sec: 39322.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2547318784. Throughput: 0: 42600.6. Samples: 2547477580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:52:11,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 14:52:14,891][12883] Updated weights for policy 0, policy_version 155483 (0.0041) +[2024-06-18 14:52:16,742][12862] Signal inference workers to stop experience collection... (37250 times) +[2024-06-18 14:52:16,743][12862] Signal inference workers to resume experience collection... (37250 times) +[2024-06-18 14:52:16,786][12883] InferenceWorker_p0-w0: stopping experience collection (37250 times) +[2024-06-18 14:52:16,786][12883] InferenceWorker_p0-w0: resuming experience collection (37250 times) +[2024-06-18 14:52:16,994][12645] Fps is (10 sec: 47524.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2547564544. Throughput: 0: 42712.0. Samples: 2547613280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:52:16,994][12645] Avg episode reward: [(0, '0.564')] +[2024-06-18 14:52:17,542][12883] Updated weights for policy 0, policy_version 155493 (0.0036) +[2024-06-18 14:52:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2547728384. Throughput: 0: 42669.3. Samples: 2547868320. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:52:21,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 14:52:22,437][12883] Updated weights for policy 0, policy_version 155503 (0.0026) +[2024-06-18 14:52:25,477][12883] Updated weights for policy 0, policy_version 155513 (0.0032) +[2024-06-18 14:52:27,000][12645] Fps is (10 sec: 40934.3, 60 sec: 43140.1, 300 sec: 42653.4). Total num frames: 2547974144. Throughput: 0: 42428.2. Samples: 2548108080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:52:27,001][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 14:52:29,959][12883] Updated weights for policy 0, policy_version 155523 (0.0034) +[2024-06-18 14:52:31,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42871.9, 300 sec: 42709.5). Total num frames: 2548203520. Throughput: 0: 42670.3. Samples: 2548249480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:52:31,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 14:52:33,170][12883] Updated weights for policy 0, policy_version 155533 (0.0027) +[2024-06-18 14:52:36,994][12645] Fps is (10 sec: 40985.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2548383744. Throughput: 0: 42679.5. Samples: 2548511260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:52:36,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 14:52:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000155541_2548383744.pth... +[2024-06-18 14:52:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000154919_2538192896.pth +[2024-06-18 14:52:37,623][12883] Updated weights for policy 0, policy_version 155543 (0.0043) +[2024-06-18 14:52:40,922][12883] Updated weights for policy 0, policy_version 155553 (0.0031) +[2024-06-18 14:52:41,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2548613120. Throughput: 0: 42567.8. Samples: 2548754160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 14:52:41,995][12645] Avg episode reward: [(0, '0.590')] +[2024-06-18 14:52:45,273][12883] Updated weights for policy 0, policy_version 155563 (0.0035) +[2024-06-18 14:52:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2548826112. Throughput: 0: 42645.5. Samples: 2548889840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 14:52:46,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 14:52:48,746][12883] Updated weights for policy 0, policy_version 155573 (0.0037) +[2024-06-18 14:52:51,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2549006336. Throughput: 0: 42614.1. Samples: 2549147020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 14:52:51,994][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 14:52:52,803][12883] Updated weights for policy 0, policy_version 155583 (0.0039) +[2024-06-18 14:52:56,529][12883] Updated weights for policy 0, policy_version 155593 (0.0037) +[2024-06-18 14:52:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2549252096. Throughput: 0: 42538.7. Samples: 2549391820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 14:52:56,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 14:53:00,482][12883] Updated weights for policy 0, policy_version 155603 (0.0032) +[2024-06-18 14:53:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 2549465088. Throughput: 0: 42530.3. Samples: 2549527140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 14:53:01,994][12645] Avg episode reward: [(0, '0.450')] +[2024-06-18 14:53:04,414][12883] Updated weights for policy 0, policy_version 155613 (0.0030) +[2024-06-18 14:53:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2549645312. Throughput: 0: 42485.8. 
Samples: 2549780180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 14:53:06,994][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 14:53:08,532][12883] Updated weights for policy 0, policy_version 155623 (0.0036) +[2024-06-18 14:53:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2549874688. Throughput: 0: 42603.3. Samples: 2550024960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 14:53:11,994][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 14:53:12,031][12883] Updated weights for policy 0, policy_version 155633 (0.0045) +[2024-06-18 14:53:16,394][12883] Updated weights for policy 0, policy_version 155643 (0.0046) +[2024-06-18 14:53:16,996][12645] Fps is (10 sec: 45865.0, 60 sec: 42323.8, 300 sec: 42653.6). Total num frames: 2550104064. Throughput: 0: 42420.5. Samples: 2550158500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 14:53:16,996][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 14:53:19,574][12883] Updated weights for policy 0, policy_version 155653 (0.0030) +[2024-06-18 14:53:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2550267904. Throughput: 0: 42180.0. Samples: 2550409360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 14:53:21,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 14:53:24,047][12883] Updated weights for policy 0, policy_version 155663 (0.0025) +[2024-06-18 14:53:26,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42602.9, 300 sec: 42598.4). Total num frames: 2550530048. Throughput: 0: 42489.5. Samples: 2550666180. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 14:53:26,994][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 14:53:27,122][12883] Updated weights for policy 0, policy_version 155673 (0.0031) +[2024-06-18 14:53:31,652][12883] Updated weights for policy 0, policy_version 155683 (0.0040) +[2024-06-18 14:53:31,658][12862] Signal inference workers to stop experience collection... (37300 times) +[2024-06-18 14:53:31,659][12862] Signal inference workers to resume experience collection... (37300 times) +[2024-06-18 14:53:31,675][12883] InferenceWorker_p0-w0: stopping experience collection (37300 times) +[2024-06-18 14:53:31,675][12883] InferenceWorker_p0-w0: resuming experience collection (37300 times) +[2024-06-18 14:53:31,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2550743040. Throughput: 0: 42498.2. Samples: 2550802260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 14:53:31,994][12645] Avg episode reward: [(0, '0.724')] +[2024-06-18 14:53:34,783][12883] Updated weights for policy 0, policy_version 155693 (0.0032) +[2024-06-18 14:53:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42543.0). Total num frames: 2550923264. Throughput: 0: 42241.3. Samples: 2551047880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 14:53:36,994][12645] Avg episode reward: [(0, '0.635')] +[2024-06-18 14:53:39,224][12883] Updated weights for policy 0, policy_version 155703 (0.0042) +[2024-06-18 14:53:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2551136256. Throughput: 0: 42463.4. Samples: 2551302680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 14:53:41,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 14:53:42,718][12883] Updated weights for policy 0, policy_version 155713 (0.0049) +[2024-06-18 14:53:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2551365632. 
Throughput: 0: 42314.3. Samples: 2551431280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 14:53:46,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 14:53:46,995][12883] Updated weights for policy 0, policy_version 155723 (0.0033) +[2024-06-18 14:53:50,370][12883] Updated weights for policy 0, policy_version 155733 (0.0027) +[2024-06-18 14:53:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42487.8). Total num frames: 2551562240. Throughput: 0: 42317.8. Samples: 2551684480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 14:53:51,994][12645] Avg episode reward: [(0, '0.371')] +[2024-06-18 14:53:54,627][12883] Updated weights for policy 0, policy_version 155743 (0.0043) +[2024-06-18 14:53:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42543.1). Total num frames: 2551791616. Throughput: 0: 42713.3. Samples: 2551947060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 14:53:56,994][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 14:53:58,054][12883] Updated weights for policy 0, policy_version 155753 (0.0040) +[2024-06-18 14:54:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2551988224. Throughput: 0: 42680.0. Samples: 2552079000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 14:54:01,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 14:54:02,235][12883] Updated weights for policy 0, policy_version 155763 (0.0037) +[2024-06-18 14:54:05,775][12883] Updated weights for policy 0, policy_version 155773 (0.0025) +[2024-06-18 14:54:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2552217600. Throughput: 0: 42686.3. Samples: 2552330240. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 14:54:06,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 14:54:09,914][12883] Updated weights for policy 0, policy_version 155783 (0.0053) +[2024-06-18 14:54:11,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2552430592. Throughput: 0: 42653.7. Samples: 2552585600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 14:54:11,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 14:54:13,663][12883] Updated weights for policy 0, policy_version 155793 (0.0023) +[2024-06-18 14:54:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42053.9, 300 sec: 42543.2). Total num frames: 2552627200. Throughput: 0: 42414.3. Samples: 2552710900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 14:54:16,994][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 14:54:17,627][12883] Updated weights for policy 0, policy_version 155803 (0.0022) +[2024-06-18 14:54:21,272][12883] Updated weights for policy 0, policy_version 155813 (0.0032) +[2024-06-18 14:54:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42431.8). Total num frames: 2552856576. Throughput: 0: 42603.5. Samples: 2552965040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 14:54:21,994][12645] Avg episode reward: [(0, '0.614')] +[2024-06-18 14:54:25,103][12883] Updated weights for policy 0, policy_version 155823 (0.0029) +[2024-06-18 14:54:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2553069568. Throughput: 0: 42720.5. Samples: 2553225100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 14:54:26,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 14:54:28,855][12883] Updated weights for policy 0, policy_version 155833 (0.0044) +[2024-06-18 14:54:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 2553249792. Throughput: 0: 42701.6. 
Samples: 2553352860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 14:54:31,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 14:54:32,946][12883] Updated weights for policy 0, policy_version 155843 (0.0031) +[2024-06-18 14:54:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 2553479168. Throughput: 0: 42620.6. Samples: 2553602400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 14:54:36,994][12645] Avg episode reward: [(0, '0.702')] +[2024-06-18 14:54:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000155853_2553495552.pth... +[2024-06-18 14:54:37,022][12883] Updated weights for policy 0, policy_version 155853 (0.0033) +[2024-06-18 14:54:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000155231_2543304704.pth +[2024-06-18 14:54:40,946][12883] Updated weights for policy 0, policy_version 155863 (0.0027) +[2024-06-18 14:54:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2553708544. Throughput: 0: 42371.9. Samples: 2553853800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 14:54:41,994][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 14:54:45,024][12883] Updated weights for policy 0, policy_version 155873 (0.0046) +[2024-06-18 14:54:46,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42487.6). Total num frames: 2553905152. Throughput: 0: 42372.7. Samples: 2553985780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 14:54:46,994][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 14:54:48,544][12883] Updated weights for policy 0, policy_version 155883 (0.0031) +[2024-06-18 14:54:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2554134528. Throughput: 0: 42556.3. Samples: 2554245280. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) +[2024-06-18 14:54:51,994][12645] Avg episode reward: [(0, '0.608')] +[2024-06-18 14:54:52,658][12883] Updated weights for policy 0, policy_version 155893 (0.0039) +[2024-06-18 14:54:56,009][12883] Updated weights for policy 0, policy_version 155903 (0.0033) +[2024-06-18 14:54:56,618][12862] Signal inference workers to stop experience collection... (37350 times) +[2024-06-18 14:54:56,618][12862] Signal inference workers to resume experience collection... (37350 times) +[2024-06-18 14:54:56,662][12883] InferenceWorker_p0-w0: stopping experience collection (37350 times) +[2024-06-18 14:54:56,662][12883] InferenceWorker_p0-w0: resuming experience collection (37350 times) +[2024-06-18 14:54:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42487.6). Total num frames: 2554347520. Throughput: 0: 42507.1. Samples: 2554498420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:54:56,999][12645] Avg episode reward: [(0, '0.559')] +[2024-06-18 14:55:00,067][12883] Updated weights for policy 0, policy_version 155913 (0.0036) +[2024-06-18 14:55:02,000][12645] Fps is (10 sec: 40935.7, 60 sec: 42594.1, 300 sec: 42542.0). Total num frames: 2554544128. Throughput: 0: 42523.2. Samples: 2554624700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:55:02,000][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 14:55:04,195][12883] Updated weights for policy 0, policy_version 155923 (0.0043) +[2024-06-18 14:55:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 2554757120. Throughput: 0: 42486.7. Samples: 2554876940. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:55:06,994][12645] Avg episode reward: [(0, '0.627')] +[2024-06-18 14:55:07,747][12883] Updated weights for policy 0, policy_version 155933 (0.0025) +[2024-06-18 14:55:11,958][12883] Updated weights for policy 0, policy_version 155943 (0.0042) +[2024-06-18 14:55:11,994][12645] Fps is (10 sec: 42624.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2554970112. Throughput: 0: 42336.1. Samples: 2555130220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:55:11,994][12645] Avg episode reward: [(0, '0.579')] +[2024-06-18 14:55:15,331][12883] Updated weights for policy 0, policy_version 155953 (0.0029) +[2024-06-18 14:55:16,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2555183104. Throughput: 0: 42381.7. Samples: 2555260040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:55:16,994][12645] Avg episode reward: [(0, '0.547')] +[2024-06-18 14:55:19,381][12883] Updated weights for policy 0, policy_version 155963 (0.0037) +[2024-06-18 14:55:21,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 2555379712. Throughput: 0: 42439.8. Samples: 2555512200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:55:21,994][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 14:55:23,214][12883] Updated weights for policy 0, policy_version 155973 (0.0031) +[2024-06-18 14:55:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2555609088. Throughput: 0: 42576.0. Samples: 2555769720. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:55:26,994][12645] Avg episode reward: [(0, '0.543')] +[2024-06-18 14:55:27,237][12883] Updated weights for policy 0, policy_version 155983 (0.0032) +[2024-06-18 14:55:31,038][12883] Updated weights for policy 0, policy_version 155993 (0.0024) +[2024-06-18 14:55:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2555822080. Throughput: 0: 42436.1. Samples: 2555895400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:55:31,994][12645] Avg episode reward: [(0, '0.275')] +[2024-06-18 14:55:35,013][12883] Updated weights for policy 0, policy_version 156003 (0.0027) +[2024-06-18 14:55:36,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 2556018688. Throughput: 0: 42309.1. Samples: 2556149280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:55:36,996][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 14:55:38,719][12883] Updated weights for policy 0, policy_version 156013 (0.0029) +[2024-06-18 14:55:41,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 2556248064. Throughput: 0: 42284.6. Samples: 2556401320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:55:41,996][12645] Avg episode reward: [(0, '0.607')] +[2024-06-18 14:55:42,805][12883] Updated weights for policy 0, policy_version 156023 (0.0036) +[2024-06-18 14:55:46,324][12883] Updated weights for policy 0, policy_version 156033 (0.0039) +[2024-06-18 14:55:46,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2556461056. Throughput: 0: 42366.1. Samples: 2556530920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:55:46,994][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 14:55:50,445][12883] Updated weights for policy 0, policy_version 156043 (0.0032) +[2024-06-18 14:55:51,994][12645] Fps is (10 sec: 40968.9, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 2556657664. Throughput: 0: 42399.1. Samples: 2556784900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:55:51,994][12645] Avg episode reward: [(0, '0.571')] +[2024-06-18 14:55:53,954][12883] Updated weights for policy 0, policy_version 156053 (0.0035) +[2024-06-18 14:55:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2556887040. Throughput: 0: 42511.1. Samples: 2557043220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) +[2024-06-18 14:55:56,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 14:55:58,086][12883] Updated weights for policy 0, policy_version 156063 (0.0030) +[2024-06-18 14:56:01,674][12883] Updated weights for policy 0, policy_version 156073 (0.0047) +[2024-06-18 14:56:01,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42875.7, 300 sec: 42542.9). Total num frames: 2557116416. Throughput: 0: 42473.9. Samples: 2557171360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 14:56:01,994][12645] Avg episode reward: [(0, '0.527')] +[2024-06-18 14:56:06,074][12883] Updated weights for policy 0, policy_version 156083 (0.0030) +[2024-06-18 14:56:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2557296640. Throughput: 0: 42602.3. Samples: 2557429300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 14:56:06,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 14:56:09,244][12883] Updated weights for policy 0, policy_version 156093 (0.0033) +[2024-06-18 14:56:11,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42869.8, 300 sec: 42542.6). Total num frames: 2557542400. Throughput: 0: 42459.3. Samples: 2557680480. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 14:56:11,996][12645] Avg episode reward: [(0, '0.598')] +[2024-06-18 14:56:13,675][12883] Updated weights for policy 0, policy_version 156103 (0.0042) +[2024-06-18 14:56:16,930][12883] Updated weights for policy 0, policy_version 156113 (0.0026) +[2024-06-18 14:56:16,994][12645] Fps is (10 sec: 45874.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2557755392. Throughput: 0: 42509.7. Samples: 2557808340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 14:56:17,000][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 14:56:21,397][12883] Updated weights for policy 0, policy_version 156123 (0.0033) +[2024-06-18 14:56:21,994][12645] Fps is (10 sec: 37691.8, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 2557919232. Throughput: 0: 42512.9. Samples: 2558062260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 14:56:21,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 14:56:24,247][12862] Signal inference workers to stop experience collection... (37400 times) +[2024-06-18 14:56:24,247][12862] Signal inference workers to resume experience collection... (37400 times) +[2024-06-18 14:56:24,316][12883] InferenceWorker_p0-w0: stopping experience collection (37400 times) +[2024-06-18 14:56:24,320][12883] InferenceWorker_p0-w0: resuming experience collection (37400 times) +[2024-06-18 14:56:24,799][12883] Updated weights for policy 0, policy_version 156133 (0.0041) +[2024-06-18 14:56:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42487.4). Total num frames: 2558164992. Throughput: 0: 42618.9. Samples: 2558319080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 14:56:26,994][12645] Avg episode reward: [(0, '0.439')] +[2024-06-18 14:56:29,110][12883] Updated weights for policy 0, policy_version 156143 (0.0028) +[2024-06-18 14:56:31,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2558377984. 
Throughput: 0: 42695.1. Samples: 2558452200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 14:56:31,994][12645] Avg episode reward: [(0, '0.574')] +[2024-06-18 14:56:32,584][12883] Updated weights for policy 0, policy_version 156153 (0.0030) +[2024-06-18 14:56:36,708][12883] Updated weights for policy 0, policy_version 156163 (0.0033) +[2024-06-18 14:56:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 2558574592. Throughput: 0: 42660.2. Samples: 2558704600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 14:56:36,994][12645] Avg episode reward: [(0, '0.800')] +[2024-06-18 14:56:37,088][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000156164_2558590976.pth... +[2024-06-18 14:56:37,152][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000155541_2548383744.pth +[2024-06-18 14:56:40,347][12883] Updated weights for policy 0, policy_version 156173 (0.0035) +[2024-06-18 14:56:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42873.1, 300 sec: 42487.3). Total num frames: 2558820352. Throughput: 0: 42514.2. Samples: 2558956360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 14:56:41,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 14:56:44,431][12883] Updated weights for policy 0, policy_version 156183 (0.0027) +[2024-06-18 14:56:46,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 2559016960. Throughput: 0: 42655.2. Samples: 2559090940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 14:56:46,997][12645] Avg episode reward: [(0, '0.516')] +[2024-06-18 14:56:48,030][12883] Updated weights for policy 0, policy_version 156193 (0.0031) +[2024-06-18 14:56:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2559213568. Throughput: 0: 42559.1. Samples: 2559344460. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 14:56:51,994][12645] Avg episode reward: [(0, '0.273')] +[2024-06-18 14:56:52,155][12883] Updated weights for policy 0, policy_version 156203 (0.0038) +[2024-06-18 14:56:55,769][12883] Updated weights for policy 0, policy_version 156213 (0.0041) +[2024-06-18 14:56:56,996][12645] Fps is (10 sec: 44236.9, 60 sec: 42869.9, 300 sec: 42487.0). Total num frames: 2559459328. Throughput: 0: 42516.4. Samples: 2559593720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 14:56:56,996][12645] Avg episode reward: [(0, '0.279')] +[2024-06-18 14:56:59,833][12883] Updated weights for policy 0, policy_version 156223 (0.0031) +[2024-06-18 14:57:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 2559639552. Throughput: 0: 42556.2. Samples: 2559723360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 14:57:01,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 14:57:03,419][12883] Updated weights for policy 0, policy_version 156233 (0.0029) +[2024-06-18 14:57:06,996][12645] Fps is (10 sec: 39321.5, 60 sec: 42596.8, 300 sec: 42487.0). Total num frames: 2559852544. Throughput: 0: 42590.7. Samples: 2559978940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:57:06,996][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 14:57:07,350][12883] Updated weights for policy 0, policy_version 156243 (0.0032) +[2024-06-18 14:57:11,293][12883] Updated weights for policy 0, policy_version 156253 (0.0037) +[2024-06-18 14:57:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42326.9, 300 sec: 42431.8). Total num frames: 2560081920. Throughput: 0: 42426.8. Samples: 2560228280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:57:11,994][12645] Avg episode reward: [(0, '0.615')] +[2024-06-18 14:57:14,977][12883] Updated weights for policy 0, policy_version 156263 (0.0034) +[2024-06-18 14:57:16,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 2560278528. Throughput: 0: 42366.7. Samples: 2560358700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:57:16,994][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 14:57:18,945][12883] Updated weights for policy 0, policy_version 156273 (0.0036) +[2024-06-18 14:57:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42432.7). Total num frames: 2560491520. Throughput: 0: 42379.9. Samples: 2560611700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:57:21,994][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 14:57:23,267][12883] Updated weights for policy 0, policy_version 156283 (0.0027) +[2024-06-18 14:57:26,807][12883] Updated weights for policy 0, policy_version 156293 (0.0030) +[2024-06-18 14:57:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 2560704512. Throughput: 0: 42332.1. Samples: 2560861300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:57:26,994][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 14:57:30,759][12883] Updated weights for policy 0, policy_version 156303 (0.0037) +[2024-06-18 14:57:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2560917504. Throughput: 0: 42165.6. Samples: 2560988300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:57:31,994][12645] Avg episode reward: [(0, '0.538')] +[2024-06-18 14:57:34,541][12883] Updated weights for policy 0, policy_version 156313 (0.0036) +[2024-06-18 14:57:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 2561114112. Throughput: 0: 42114.7. Samples: 2561239620. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:57:36,997][12645] Avg episode reward: [(0, '0.616')] +[2024-06-18 14:57:38,480][12883] Updated weights for policy 0, policy_version 156323 (0.0033) +[2024-06-18 14:57:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 2561327104. Throughput: 0: 42208.3. Samples: 2561493000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:57:41,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 14:57:42,217][12883] Updated weights for policy 0, policy_version 156333 (0.0042) +[2024-06-18 14:57:46,241][12883] Updated weights for policy 0, policy_version 156343 (0.0025) +[2024-06-18 14:57:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42326.8, 300 sec: 42542.8). Total num frames: 2561556480. Throughput: 0: 42149.2. Samples: 2561620080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:57:46,994][12645] Avg episode reward: [(0, '0.304')] +[2024-06-18 14:57:48,759][12862] Signal inference workers to stop experience collection... (37450 times) +[2024-06-18 14:57:48,796][12883] InferenceWorker_p0-w0: stopping experience collection (37450 times) +[2024-06-18 14:57:48,811][12862] Signal inference workers to resume experience collection... (37450 times) +[2024-06-18 14:57:48,821][12883] InferenceWorker_p0-w0: resuming experience collection (37450 times) +[2024-06-18 14:57:49,734][12883] Updated weights for policy 0, policy_version 156353 (0.0032) +[2024-06-18 14:57:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2561769472. Throughput: 0: 42125.6. Samples: 2561874500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:57:51,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 14:57:54,038][12883] Updated weights for policy 0, policy_version 156363 (0.0027) +[2024-06-18 14:57:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42053.9, 300 sec: 42431.8). Total num frames: 2561982464. 
Throughput: 0: 42362.2. Samples: 2562134580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:57:56,994][12645] Avg episode reward: [(0, '0.325')] +[2024-06-18 14:57:57,191][12883] Updated weights for policy 0, policy_version 156373 (0.0042) +[2024-06-18 14:58:01,665][12883] Updated weights for policy 0, policy_version 156383 (0.0035) +[2024-06-18 14:58:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2562195456. Throughput: 0: 42354.6. Samples: 2562264660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:58:01,994][12645] Avg episode reward: [(0, '0.627')] +[2024-06-18 14:58:05,135][12883] Updated weights for policy 0, policy_version 156393 (0.0053) +[2024-06-18 14:58:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42326.9, 300 sec: 42431.8). Total num frames: 2562392064. Throughput: 0: 42282.2. Samples: 2562514400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:58:06,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 14:58:09,259][12883] Updated weights for policy 0, policy_version 156403 (0.0029) +[2024-06-18 14:58:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42432.1). Total num frames: 2562621440. Throughput: 0: 42438.5. Samples: 2562771040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:58:11,996][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 14:58:12,834][12883] Updated weights for policy 0, policy_version 156413 (0.0032) +[2024-06-18 14:58:16,921][12883] Updated weights for policy 0, policy_version 156423 (0.0033) +[2024-06-18 14:58:16,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2562834432. Throughput: 0: 42408.6. Samples: 2562896680. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:58:16,994][12645] Avg episode reward: [(0, '0.508')] +[2024-06-18 14:58:20,542][12883] Updated weights for policy 0, policy_version 156433 (0.0037) +[2024-06-18 14:58:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 2563031040. Throughput: 0: 42410.1. Samples: 2563148080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:58:21,995][12645] Avg episode reward: [(0, '0.679')] +[2024-06-18 14:58:24,923][12883] Updated weights for policy 0, policy_version 156443 (0.0048) +[2024-06-18 14:58:26,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 2563227648. Throughput: 0: 42592.4. Samples: 2563409660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:58:26,994][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 14:58:28,324][12883] Updated weights for policy 0, policy_version 156453 (0.0029) +[2024-06-18 14:58:31,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2563457024. Throughput: 0: 42484.6. Samples: 2563531880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:58:31,994][12645] Avg episode reward: [(0, '0.474')] +[2024-06-18 14:58:32,524][12883] Updated weights for policy 0, policy_version 156463 (0.0041) +[2024-06-18 14:58:36,100][12883] Updated weights for policy 0, policy_version 156473 (0.0026) +[2024-06-18 14:58:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2563670016. Throughput: 0: 42456.5. Samples: 2563785040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:58:36,994][12645] Avg episode reward: [(0, '0.545')] +[2024-06-18 14:58:37,078][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000156475_2563686400.pth... 
+[2024-06-18 14:58:37,135][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000155853_2553495552.pth +[2024-06-18 14:58:40,265][12883] Updated weights for policy 0, policy_version 156483 (0.0041) +[2024-06-18 14:58:41,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 2563883008. Throughput: 0: 42504.7. Samples: 2564047300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:58:41,994][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 14:58:43,580][12883] Updated weights for policy 0, policy_version 156493 (0.0035) +[2024-06-18 14:58:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2564096000. Throughput: 0: 42411.6. Samples: 2564173180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:58:46,994][12645] Avg episode reward: [(0, '0.396')] +[2024-06-18 14:58:47,783][12883] Updated weights for policy 0, policy_version 156503 (0.0037) +[2024-06-18 14:58:51,307][12883] Updated weights for policy 0, policy_version 156513 (0.0039) +[2024-06-18 14:58:51,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2564308992. Throughput: 0: 42589.4. Samples: 2564430920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:58:51,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 14:58:55,455][12883] Updated weights for policy 0, policy_version 156523 (0.0043) +[2024-06-18 14:58:56,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2564538368. Throughput: 0: 42515.1. Samples: 2564684220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:58:56,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 14:58:59,547][12883] Updated weights for policy 0, policy_version 156533 (0.0037) +[2024-06-18 14:59:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 2564718592. Throughput: 0: 42661.2. 
Samples: 2564816440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:59:01,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 14:59:02,990][12883] Updated weights for policy 0, policy_version 156543 (0.0037) +[2024-06-18 14:59:06,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42596.8, 300 sec: 42431.5). Total num frames: 2564947968. Throughput: 0: 42631.3. Samples: 2565066580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 14:59:06,997][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 14:59:07,121][12883] Updated weights for policy 0, policy_version 156553 (0.0030) +[2024-06-18 14:59:07,934][12862] Signal inference workers to stop experience collection... (37500 times) +[2024-06-18 14:59:07,935][12862] Signal inference workers to resume experience collection... (37500 times) +[2024-06-18 14:59:07,967][12883] InferenceWorker_p0-w0: stopping experience collection (37500 times) +[2024-06-18 14:59:07,967][12883] InferenceWorker_p0-w0: resuming experience collection (37500 times) +[2024-06-18 14:59:11,027][12883] Updated weights for policy 0, policy_version 156563 (0.0039) +[2024-06-18 14:59:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 2565177344. Throughput: 0: 42452.9. Samples: 2565320040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 14:59:11,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 14:59:15,678][12883] Updated weights for policy 0, policy_version 156574 (0.0031) +[2024-06-18 14:59:16,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 2565373952. Throughput: 0: 42680.3. Samples: 2565452500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 14:59:16,994][12645] Avg episode reward: [(0, '0.601')] +[2024-06-18 14:59:19,068][12883] Updated weights for policy 0, policy_version 156584 (0.0035) +[2024-06-18 14:59:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42431.8). 
Total num frames: 2565586944. Throughput: 0: 42626.7. Samples: 2565703240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 14:59:21,994][12645] Avg episode reward: [(0, '0.601')] +[2024-06-18 14:59:23,388][12883] Updated weights for policy 0, policy_version 156594 (0.0050) +[2024-06-18 14:59:26,798][12883] Updated weights for policy 0, policy_version 156604 (0.0036) +[2024-06-18 14:59:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2565816320. Throughput: 0: 42476.6. Samples: 2565958740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 14:59:26,994][12645] Avg episode reward: [(0, '0.492')] +[2024-06-18 14:59:31,134][12883] Updated weights for policy 0, policy_version 156614 (0.0038) +[2024-06-18 14:59:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2565996544. Throughput: 0: 42412.5. Samples: 2566081740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 14:59:31,994][12645] Avg episode reward: [(0, '0.386')] +[2024-06-18 14:59:34,476][12883] Updated weights for policy 0, policy_version 156624 (0.0032) +[2024-06-18 14:59:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2566225920. Throughput: 0: 42271.1. Samples: 2566333120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 14:59:36,994][12645] Avg episode reward: [(0, '0.318')] +[2024-06-18 14:59:38,815][12883] Updated weights for policy 0, policy_version 156634 (0.0028) +[2024-06-18 14:59:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 2566422528. Throughput: 0: 42412.1. Samples: 2566592760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 14:59:41,994][12645] Avg episode reward: [(0, '0.168')] +[2024-06-18 14:59:42,268][12883] Updated weights for policy 0, policy_version 156644 (0.0036) +[2024-06-18 14:59:46,420][12883] Updated weights for policy 0, policy_version 156654 (0.0038) +[2024-06-18 14:59:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 2566635520. Throughput: 0: 42153.0. Samples: 2566713320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 14:59:46,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 14:59:50,122][12883] Updated weights for policy 0, policy_version 156664 (0.0040) +[2024-06-18 14:59:51,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2566881280. Throughput: 0: 42256.3. Samples: 2566968020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 14:59:51,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 14:59:54,043][12883] Updated weights for policy 0, policy_version 156674 (0.0032) +[2024-06-18 14:59:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42432.6). Total num frames: 2567061504. Throughput: 0: 42424.8. Samples: 2567229160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 14:59:56,994][12645] Avg episode reward: [(0, '0.408')] +[2024-06-18 14:59:57,985][12883] Updated weights for policy 0, policy_version 156684 (0.0047) +[2024-06-18 15:00:01,649][12883] Updated weights for policy 0, policy_version 156694 (0.0032) +[2024-06-18 15:00:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2567274496. Throughput: 0: 42144.0. Samples: 2567348980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 15:00:01,995][12645] Avg episode reward: [(0, '0.345')] +[2024-06-18 15:00:05,640][12883] Updated weights for policy 0, policy_version 156704 (0.0041) +[2024-06-18 15:00:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 2567503872. Throughput: 0: 42485.3. Samples: 2567615080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 15:00:06,994][12645] Avg episode reward: [(0, '0.653')] +[2024-06-18 15:00:09,542][12883] Updated weights for policy 0, policy_version 156714 (0.0023) +[2024-06-18 15:00:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 2567700480. Throughput: 0: 42435.5. Samples: 2567868340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) +[2024-06-18 15:00:11,994][12645] Avg episode reward: [(0, '0.601')] +[2024-06-18 15:00:13,235][12883] Updated weights for policy 0, policy_version 156724 (0.0035) +[2024-06-18 15:00:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2567913472. Throughput: 0: 42382.1. Samples: 2567988940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 15:00:16,994][12645] Avg episode reward: [(0, '0.596')] +[2024-06-18 15:00:17,418][12883] Updated weights for policy 0, policy_version 156734 (0.0055) +[2024-06-18 15:00:20,936][12883] Updated weights for policy 0, policy_version 156744 (0.0042) +[2024-06-18 15:00:21,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2568159232. Throughput: 0: 42695.0. Samples: 2568254400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 15:00:21,994][12645] Avg episode reward: [(0, '0.671')] +[2024-06-18 15:00:24,946][12883] Updated weights for policy 0, policy_version 156754 (0.0049) +[2024-06-18 15:00:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 2568323072. Throughput: 0: 42669.7. 
Samples: 2568512900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 15:00:26,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 15:00:28,664][12883] Updated weights for policy 0, policy_version 156764 (0.0043) +[2024-06-18 15:00:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42487.6). Total num frames: 2568552448. Throughput: 0: 42616.3. Samples: 2568631060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 15:00:31,995][12645] Avg episode reward: [(0, '0.603')] +[2024-06-18 15:00:32,783][12883] Updated weights for policy 0, policy_version 156774 (0.0041) +[2024-06-18 15:00:36,083][12883] Updated weights for policy 0, policy_version 156784 (0.0044) +[2024-06-18 15:00:36,981][12862] Signal inference workers to stop experience collection... (37550 times) +[2024-06-18 15:00:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 2568781824. Throughput: 0: 42756.0. Samples: 2568892040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 15:00:36,994][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 15:00:37,026][12883] InferenceWorker_p0-w0: stopping experience collection (37550 times) +[2024-06-18 15:00:37,033][12862] Signal inference workers to resume experience collection... (37550 times) +[2024-06-18 15:00:37,043][12883] InferenceWorker_p0-w0: resuming experience collection (37550 times) +[2024-06-18 15:00:37,168][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000156788_2568814592.pth... +[2024-06-18 15:00:37,210][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000156164_2558590976.pth +[2024-06-18 15:00:40,363][12883] Updated weights for policy 0, policy_version 156794 (0.0043) +[2024-06-18 15:00:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 2568978432. Throughput: 0: 42726.3. Samples: 2569151840. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 15:00:41,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 15:00:43,869][12883] Updated weights for policy 0, policy_version 156804 (0.0027) +[2024-06-18 15:00:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2569207808. Throughput: 0: 42585.9. Samples: 2569265340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 15:00:46,994][12645] Avg episode reward: [(0, '0.710')] +[2024-06-18 15:00:47,945][12883] Updated weights for policy 0, policy_version 156814 (0.0032) +[2024-06-18 15:00:51,525][12883] Updated weights for policy 0, policy_version 156824 (0.0028) +[2024-06-18 15:00:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2569420800. Throughput: 0: 42576.8. Samples: 2569531040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 15:00:51,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 15:00:55,467][12883] Updated weights for policy 0, policy_version 156834 (0.0038) +[2024-06-18 15:00:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 2569601024. Throughput: 0: 42706.3. Samples: 2569790120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 15:00:56,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 15:00:59,429][12883] Updated weights for policy 0, policy_version 156844 (0.0022) +[2024-06-18 15:01:01,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 2569846784. Throughput: 0: 42739.7. Samples: 2569912220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 15:01:01,994][12645] Avg episode reward: [(0, '0.646')] +[2024-06-18 15:01:03,061][12883] Updated weights for policy 0, policy_version 156854 (0.0035) +[2024-06-18 15:01:06,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42376.6). Total num frames: 2570043392. Throughput: 0: 42447.2. 
Samples: 2570164520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 15:01:06,994][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 15:01:07,309][12883] Updated weights for policy 0, policy_version 156864 (0.0048) +[2024-06-18 15:01:10,762][12883] Updated weights for policy 0, policy_version 156874 (0.0034) +[2024-06-18 15:01:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 2570240000. Throughput: 0: 42404.9. Samples: 2570421120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 15:01:11,994][12645] Avg episode reward: [(0, '0.616')] +[2024-06-18 15:01:14,934][12883] Updated weights for policy 0, policy_version 156884 (0.0047) +[2024-06-18 15:01:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2570502144. Throughput: 0: 42647.7. Samples: 2570550200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) +[2024-06-18 15:01:16,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 15:01:18,452][12883] Updated weights for policy 0, policy_version 156894 (0.0031) +[2024-06-18 15:01:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 2570682368. Throughput: 0: 42647.6. Samples: 2570811180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 15:01:21,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 15:01:22,558][12883] Updated weights for policy 0, policy_version 156904 (0.0030) +[2024-06-18 15:01:26,174][12883] Updated weights for policy 0, policy_version 156914 (0.0033) +[2024-06-18 15:01:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2570895360. Throughput: 0: 42590.7. Samples: 2571068420. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 15:01:26,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 15:01:30,067][12883] Updated weights for policy 0, policy_version 156924 (0.0036) +[2024-06-18 15:01:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2571141120. Throughput: 0: 42904.9. Samples: 2571196060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 15:01:31,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 15:01:34,030][12883] Updated weights for policy 0, policy_version 156934 (0.0030) +[2024-06-18 15:01:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2571337728. Throughput: 0: 42885.0. Samples: 2571460860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 15:01:36,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 15:01:37,506][12883] Updated weights for policy 0, policy_version 156944 (0.0030) +[2024-06-18 15:01:41,801][12883] Updated weights for policy 0, policy_version 156954 (0.0035) +[2024-06-18 15:01:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42487.6). Total num frames: 2571550720. Throughput: 0: 42677.7. Samples: 2571710620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 15:01:41,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 15:01:43,308][12862] Signal inference workers to stop experience collection... (37600 times) +[2024-06-18 15:01:43,308][12862] Signal inference workers to resume experience collection... (37600 times) +[2024-06-18 15:01:43,322][12883] InferenceWorker_p0-w0: stopping experience collection (37600 times) +[2024-06-18 15:01:43,322][12883] InferenceWorker_p0-w0: resuming experience collection (37600 times) +[2024-06-18 15:01:45,227][12883] Updated weights for policy 0, policy_version 156964 (0.0040) +[2024-06-18 15:01:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2571780096. 
Throughput: 0: 42788.4. Samples: 2571837700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 15:01:46,994][12645] Avg episode reward: [(0, '0.700')] +[2024-06-18 15:01:49,332][12883] Updated weights for policy 0, policy_version 156974 (0.0033) +[2024-06-18 15:01:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42432.1). Total num frames: 2571976704. Throughput: 0: 42928.0. Samples: 2572096280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 15:01:51,994][12645] Avg episode reward: [(0, '0.580')] +[2024-06-18 15:01:52,832][12883] Updated weights for policy 0, policy_version 156984 (0.0042) +[2024-06-18 15:01:56,874][12883] Updated weights for policy 0, policy_version 156994 (0.0037) +[2024-06-18 15:01:56,994][12645] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 2572189696. Throughput: 0: 42875.5. Samples: 2572350520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 15:01:56,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 15:02:00,665][12883] Updated weights for policy 0, policy_version 157004 (0.0031) +[2024-06-18 15:02:01,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42869.8, 300 sec: 42598.4). Total num frames: 2572419072. Throughput: 0: 42866.3. Samples: 2572479280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 15:02:01,997][12645] Avg episode reward: [(0, '0.246')] +[2024-06-18 15:02:04,320][12883] Updated weights for policy 0, policy_version 157014 (0.0031) +[2024-06-18 15:02:06,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2572615680. Throughput: 0: 42811.1. Samples: 2572737680. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 15:02:06,995][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 15:02:08,544][12883] Updated weights for policy 0, policy_version 157024 (0.0033) +[2024-06-18 15:02:11,931][12883] Updated weights for policy 0, policy_version 157034 (0.0022) +[2024-06-18 15:02:11,994][12645] Fps is (10 sec: 42607.9, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 2572845056. Throughput: 0: 42807.5. Samples: 2572994760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 15:02:11,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 15:02:15,978][12883] Updated weights for policy 0, policy_version 157044 (0.0035) +[2024-06-18 15:02:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2573074432. Throughput: 0: 42943.1. Samples: 2573128500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 15:02:16,994][12645] Avg episode reward: [(0, '0.616')] +[2024-06-18 15:02:19,619][12883] Updated weights for policy 0, policy_version 157054 (0.0036) +[2024-06-18 15:02:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2573254656. Throughput: 0: 42693.4. Samples: 2573382060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) +[2024-06-18 15:02:21,994][12645] Avg episode reward: [(0, '0.680')] +[2024-06-18 15:02:23,417][12883] Updated weights for policy 0, policy_version 157064 (0.0039) +[2024-06-18 15:02:27,000][12645] Fps is (10 sec: 37659.8, 60 sec: 42594.0, 300 sec: 42486.4). Total num frames: 2573451264. Throughput: 0: 42854.1. Samples: 2573639320. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:02:27,001][12645] Avg episode reward: [(0, '0.727')] +[2024-06-18 15:02:27,485][12883] Updated weights for policy 0, policy_version 157074 (0.0036) +[2024-06-18 15:02:30,835][12883] Updated weights for policy 0, policy_version 157084 (0.0023) +[2024-06-18 15:02:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2573713408. Throughput: 0: 42903.2. Samples: 2573768340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:02:31,994][12645] Avg episode reward: [(0, '0.727')] +[2024-06-18 15:02:34,965][12883] Updated weights for policy 0, policy_version 157094 (0.0030) +[2024-06-18 15:02:36,994][12645] Fps is (10 sec: 45903.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2573910016. Throughput: 0: 42989.2. Samples: 2574030800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:02:36,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 15:02:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000157099_2573910016.pth... +[2024-06-18 15:02:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000156475_2563686400.pth +[2024-06-18 15:02:38,317][12883] Updated weights for policy 0, policy_version 157104 (0.0039) +[2024-06-18 15:02:41,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2574106624. Throughput: 0: 43126.9. Samples: 2574291220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:02:41,994][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 15:02:42,430][12883] Updated weights for policy 0, policy_version 157114 (0.0036) +[2024-06-18 15:02:45,900][12883] Updated weights for policy 0, policy_version 157124 (0.0037) +[2024-06-18 15:02:46,999][12645] Fps is (10 sec: 44212.8, 60 sec: 42867.5, 300 sec: 42653.1). Total num frames: 2574352384. Throughput: 0: 43078.6. Samples: 2574417960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:02:47,000][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 15:02:50,152][12883] Updated weights for policy 0, policy_version 157134 (0.0036) +[2024-06-18 15:02:51,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2574548992. Throughput: 0: 43039.9. Samples: 2574674480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:02:51,995][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 15:02:53,469][12883] Updated weights for policy 0, policy_version 157144 (0.0043) +[2024-06-18 15:02:57,000][12645] Fps is (10 sec: 40957.2, 60 sec: 42867.1, 300 sec: 42597.5). Total num frames: 2574761984. Throughput: 0: 43113.1. Samples: 2574935120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:02:57,000][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 15:02:58,136][12883] Updated weights for policy 0, policy_version 157154 (0.0030) +[2024-06-18 15:03:00,874][12862] Signal inference workers to stop experience collection... (37650 times) +[2024-06-18 15:03:00,874][12862] Signal inference workers to resume experience collection... (37650 times) +[2024-06-18 15:03:00,931][12883] InferenceWorker_p0-w0: stopping experience collection (37650 times) +[2024-06-18 15:03:00,931][12883] InferenceWorker_p0-w0: resuming experience collection (37650 times) +[2024-06-18 15:03:01,024][12883] Updated weights for policy 0, policy_version 157164 (0.0044) +[2024-06-18 15:03:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42873.0, 300 sec: 42709.5). Total num frames: 2574991360. Throughput: 0: 42951.9. Samples: 2575061340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:03:01,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 15:03:05,692][12883] Updated weights for policy 0, policy_version 157174 (0.0032) +[2024-06-18 15:03:06,994][12645] Fps is (10 sec: 42625.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2575187968. 
Throughput: 0: 43114.6. Samples: 2575322220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:03:06,994][12645] Avg episode reward: [(0, '0.732')] +[2024-06-18 15:03:08,627][12883] Updated weights for policy 0, policy_version 157184 (0.0033) +[2024-06-18 15:03:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2575400960. Throughput: 0: 43053.6. Samples: 2575576460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:03:11,994][12645] Avg episode reward: [(0, '0.696')] +[2024-06-18 15:03:13,247][12883] Updated weights for policy 0, policy_version 157194 (0.0026) +[2024-06-18 15:03:16,986][12883] Updated weights for policy 0, policy_version 157204 (0.0038) +[2024-06-18 15:03:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2575630336. Throughput: 0: 43005.6. Samples: 2575703600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:03:16,994][12645] Avg episode reward: [(0, '0.668')] +[2024-06-18 15:03:20,748][12883] Updated weights for policy 0, policy_version 157214 (0.0025) +[2024-06-18 15:03:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2575843328. Throughput: 0: 42981.1. Samples: 2575964940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:03:21,994][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 15:03:24,511][12883] Updated weights for policy 0, policy_version 157224 (0.0027) +[2024-06-18 15:03:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43422.2, 300 sec: 42709.5). Total num frames: 2576056320. Throughput: 0: 42969.8. Samples: 2576224860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:03:26,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 15:03:28,316][12883] Updated weights for policy 0, policy_version 157234 (0.0042) +[2024-06-18 15:03:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). 
Total num frames: 2576269312. Throughput: 0: 43035.6. Samples: 2576354320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 15:03:31,994][12645] Avg episode reward: [(0, '0.666')] +[2024-06-18 15:03:32,123][12883] Updated weights for policy 0, policy_version 157244 (0.0034) +[2024-06-18 15:03:36,274][12883] Updated weights for policy 0, policy_version 157254 (0.0046) +[2024-06-18 15:03:36,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42870.0, 300 sec: 42709.2). Total num frames: 2576482304. Throughput: 0: 42948.7. Samples: 2576607260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 15:03:36,996][12645] Avg episode reward: [(0, '0.598')] +[2024-06-18 15:03:39,803][12883] Updated weights for policy 0, policy_version 157264 (0.0033) +[2024-06-18 15:03:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2576711680. Throughput: 0: 42806.0. Samples: 2576861120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 15:03:41,994][12645] Avg episode reward: [(0, '0.747')] +[2024-06-18 15:03:43,857][12883] Updated weights for policy 0, policy_version 157274 (0.0031) +[2024-06-18 15:03:46,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42329.3, 300 sec: 42653.9). Total num frames: 2576891904. Throughput: 0: 42862.4. Samples: 2576990140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 15:03:46,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 15:03:47,585][12883] Updated weights for policy 0, policy_version 157284 (0.0037) +[2024-06-18 15:03:51,512][12883] Updated weights for policy 0, policy_version 157294 (0.0024) +[2024-06-18 15:03:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2577137664. Throughput: 0: 42749.7. Samples: 2577245960. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 15:03:51,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 15:03:55,185][12883] Updated weights for policy 0, policy_version 157304 (0.0045) +[2024-06-18 15:03:56,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43149.1, 300 sec: 42820.6). Total num frames: 2577350656. Throughput: 0: 42625.8. Samples: 2577494620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 15:03:56,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 15:03:59,379][12883] Updated weights for policy 0, policy_version 157314 (0.0032) +[2024-06-18 15:04:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 2577530880. Throughput: 0: 42700.0. Samples: 2577625100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 15:04:01,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 15:04:02,806][12883] Updated weights for policy 0, policy_version 157324 (0.0025) +[2024-06-18 15:04:06,813][12862] Signal inference workers to stop experience collection... (37700 times) +[2024-06-18 15:04:06,845][12883] InferenceWorker_p0-w0: stopping experience collection (37700 times) +[2024-06-18 15:04:06,879][12862] Signal inference workers to resume experience collection... (37700 times) +[2024-06-18 15:04:06,880][12883] InferenceWorker_p0-w0: resuming experience collection (37700 times) +[2024-06-18 15:04:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2577743872. Throughput: 0: 42664.0. Samples: 2577884820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 15:04:06,994][12645] Avg episode reward: [(0, '0.518')] +[2024-06-18 15:04:07,017][12883] Updated weights for policy 0, policy_version 157334 (0.0032) +[2024-06-18 15:04:10,632][12883] Updated weights for policy 0, policy_version 157344 (0.0025) +[2024-06-18 15:04:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2577989632. 
Throughput: 0: 42351.9. Samples: 2578130700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 15:04:11,994][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 15:04:14,860][12883] Updated weights for policy 0, policy_version 157354 (0.0035) +[2024-06-18 15:04:16,998][12645] Fps is (10 sec: 44216.3, 60 sec: 42595.2, 300 sec: 42708.8). Total num frames: 2578186240. Throughput: 0: 42487.2. Samples: 2578266440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 15:04:16,999][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 15:04:18,349][12883] Updated weights for policy 0, policy_version 157364 (0.0034) +[2024-06-18 15:04:21,995][12645] Fps is (10 sec: 39317.2, 60 sec: 42324.5, 300 sec: 42598.2). Total num frames: 2578382848. Throughput: 0: 42388.2. Samples: 2578514680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 15:04:21,995][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 15:04:22,626][12883] Updated weights for policy 0, policy_version 157374 (0.0050) +[2024-06-18 15:04:25,924][12883] Updated weights for policy 0, policy_version 157384 (0.0032) +[2024-06-18 15:04:26,994][12645] Fps is (10 sec: 42617.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2578612224. Throughput: 0: 42528.8. Samples: 2578774920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 15:04:26,994][12645] Avg episode reward: [(0, '0.579')] +[2024-06-18 15:04:29,988][12883] Updated weights for policy 0, policy_version 157394 (0.0028) +[2024-06-18 15:04:31,994][12645] Fps is (10 sec: 44241.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2578825216. Throughput: 0: 42634.1. Samples: 2578908680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) +[2024-06-18 15:04:31,994][12645] Avg episode reward: [(0, '0.722')] +[2024-06-18 15:04:33,513][12883] Updated weights for policy 0, policy_version 157404 (0.0038) +[2024-06-18 15:04:36,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42325.3, 300 sec: 42709.1). Total num frames: 2579021824. 
Throughput: 0: 42459.3. Samples: 2579156720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:04:36,997][12645] Avg episode reward: [(0, '0.679')] +[2024-06-18 15:04:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000157411_2579021824.pth... +[2024-06-18 15:04:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000156788_2568814592.pth +[2024-06-18 15:04:37,576][12883] Updated weights for policy 0, policy_version 157414 (0.0027) +[2024-06-18 15:04:41,449][12883] Updated weights for policy 0, policy_version 157424 (0.0035) +[2024-06-18 15:04:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2579267584. Throughput: 0: 42705.3. Samples: 2579416360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:04:41,994][12645] Avg episode reward: [(0, '0.728')] +[2024-06-18 15:04:45,245][12883] Updated weights for policy 0, policy_version 157434 (0.0043) +[2024-06-18 15:04:46,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2579464192. Throughput: 0: 42812.1. Samples: 2579551640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:04:46,994][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 15:04:49,068][12883] Updated weights for policy 0, policy_version 157444 (0.0031) +[2024-06-18 15:04:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2579677184. Throughput: 0: 42581.7. Samples: 2579801000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:04:51,995][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 15:04:52,759][12883] Updated weights for policy 0, policy_version 157454 (0.0033) +[2024-06-18 15:04:56,617][12883] Updated weights for policy 0, policy_version 157464 (0.0045) +[2024-06-18 15:04:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2579890176. Throughput: 0: 42814.3. 
Samples: 2580057340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:04:56,994][12645] Avg episode reward: [(0, '0.263')] +[2024-06-18 15:05:00,243][12883] Updated weights for policy 0, policy_version 157474 (0.0030) +[2024-06-18 15:05:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2580086784. Throughput: 0: 42711.9. Samples: 2580188280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:05:01,994][12645] Avg episode reward: [(0, '0.227')] +[2024-06-18 15:05:04,138][12883] Updated weights for policy 0, policy_version 157484 (0.0033) +[2024-06-18 15:05:06,994][12645] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 2580332544. Throughput: 0: 42792.0. Samples: 2580440280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:05:06,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 15:05:08,071][12883] Updated weights for policy 0, policy_version 157494 (0.0041) +[2024-06-18 15:05:11,754][12883] Updated weights for policy 0, policy_version 157504 (0.0032) +[2024-06-18 15:05:11,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2580545536. Throughput: 0: 42654.3. Samples: 2580694360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:05:11,994][12645] Avg episode reward: [(0, '0.695')] +[2024-06-18 15:05:15,632][12883] Updated weights for policy 0, policy_version 157514 (0.0027) +[2024-06-18 15:05:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42601.7, 300 sec: 42654.0). Total num frames: 2580742144. Throughput: 0: 42492.1. Samples: 2580820820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:05:16,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 15:05:19,584][12883] Updated weights for policy 0, policy_version 157524 (0.0034) +[2024-06-18 15:05:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43145.2, 300 sec: 42876.1). Total num frames: 2580971520. Throughput: 0: 42731.8. 
Samples: 2581079560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:05:21,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 15:05:23,439][12883] Updated weights for policy 0, policy_version 157534 (0.0030) +[2024-06-18 15:05:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2581168128. Throughput: 0: 42724.9. Samples: 2581338980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:05:26,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 15:05:27,164][12883] Updated weights for policy 0, policy_version 157544 (0.0038) +[2024-06-18 15:05:31,355][12883] Updated weights for policy 0, policy_version 157554 (0.0031) +[2024-06-18 15:05:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2581381120. Throughput: 0: 42420.8. Samples: 2581460580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:05:31,994][12645] Avg episode reward: [(0, '0.749')] +[2024-06-18 15:05:34,249][12862] Signal inference workers to stop experience collection... (37750 times) +[2024-06-18 15:05:34,249][12862] Signal inference workers to resume experience collection... (37750 times) +[2024-06-18 15:05:34,284][12883] InferenceWorker_p0-w0: stopping experience collection (37750 times) +[2024-06-18 15:05:34,285][12883] InferenceWorker_p0-w0: resuming experience collection (37750 times) +[2024-06-18 15:05:34,730][12883] Updated weights for policy 0, policy_version 157564 (0.0039) +[2024-06-18 15:05:36,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43419.2, 300 sec: 42876.1). Total num frames: 2581626880. Throughput: 0: 42629.8. Samples: 2581719340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 15:05:36,994][12645] Avg episode reward: [(0, '0.633')] +[2024-06-18 15:05:38,984][12883] Updated weights for policy 0, policy_version 157574 (0.0034) +[2024-06-18 15:05:41,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42709.5). 
Total num frames: 2581807104. Throughput: 0: 42657.8. Samples: 2581976940. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 15:05:41,994][12645] Avg episode reward: [(0, '0.574')] +[2024-06-18 15:05:42,613][12883] Updated weights for policy 0, policy_version 157584 (0.0052) +[2024-06-18 15:05:46,747][12883] Updated weights for policy 0, policy_version 157594 (0.0034) +[2024-06-18 15:05:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2582020096. Throughput: 0: 42486.2. Samples: 2582100160. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 15:05:47,000][12645] Avg episode reward: [(0, '0.288')] +[2024-06-18 15:05:50,325][12883] Updated weights for policy 0, policy_version 157604 (0.0031) +[2024-06-18 15:05:52,000][12645] Fps is (10 sec: 44208.7, 60 sec: 42867.1, 300 sec: 42875.2). Total num frames: 2582249472. Throughput: 0: 42545.8. Samples: 2582355100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 15:05:52,001][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 15:05:54,299][12883] Updated weights for policy 0, policy_version 157614 (0.0038) +[2024-06-18 15:05:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2582462464. Throughput: 0: 42678.7. Samples: 2582614900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 15:05:56,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 15:05:58,011][12883] Updated weights for policy 0, policy_version 157624 (0.0028) +[2024-06-18 15:06:01,968][12883] Updated weights for policy 0, policy_version 157634 (0.0028) +[2024-06-18 15:06:01,996][12645] Fps is (10 sec: 42615.4, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 2582675456. Throughput: 0: 42717.4. Samples: 2582743200. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 15:06:01,997][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 15:06:05,845][12883] Updated weights for policy 0, policy_version 157644 (0.0036) +[2024-06-18 15:06:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2582872064. Throughput: 0: 42602.4. Samples: 2582996660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 15:06:06,999][12645] Avg episode reward: [(0, '0.591')] +[2024-06-18 15:06:09,829][12883] Updated weights for policy 0, policy_version 157654 (0.0036) +[2024-06-18 15:06:11,999][12645] Fps is (10 sec: 40948.5, 60 sec: 42321.8, 300 sec: 42653.2). Total num frames: 2583085056. Throughput: 0: 42528.5. Samples: 2583252980. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 15:06:11,999][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 15:06:13,501][12883] Updated weights for policy 0, policy_version 157664 (0.0022) +[2024-06-18 15:06:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2583298048. Throughput: 0: 42639.6. Samples: 2583379360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 15:06:16,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 15:06:17,328][12883] Updated weights for policy 0, policy_version 157674 (0.0040) +[2024-06-18 15:06:21,088][12883] Updated weights for policy 0, policy_version 157684 (0.0037) +[2024-06-18 15:06:21,994][12645] Fps is (10 sec: 42620.2, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 2583511040. Throughput: 0: 42578.8. Samples: 2583635380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 15:06:21,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 15:06:25,045][12883] Updated weights for policy 0, policy_version 157694 (0.0027) +[2024-06-18 15:06:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2583724032. Throughput: 0: 42538.5. 
Samples: 2583891180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 15:06:26,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 15:06:28,854][12883] Updated weights for policy 0, policy_version 157704 (0.0046) +[2024-06-18 15:06:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2583937024. Throughput: 0: 42638.7. Samples: 2584018900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 15:06:31,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 15:06:32,582][12883] Updated weights for policy 0, policy_version 157714 (0.0033) +[2024-06-18 15:06:36,440][12883] Updated weights for policy 0, policy_version 157724 (0.0040) +[2024-06-18 15:06:36,999][12645] Fps is (10 sec: 44215.1, 60 sec: 42321.9, 300 sec: 42764.3). Total num frames: 2584166400. Throughput: 0: 42714.6. Samples: 2584277200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) +[2024-06-18 15:06:36,999][12645] Avg episode reward: [(0, '0.643')] +[2024-06-18 15:06:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000157725_2584166400.pth... +[2024-06-18 15:06:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000157099_2573910016.pth +[2024-06-18 15:06:40,968][12883] Updated weights for policy 0, policy_version 157734 (0.0036) +[2024-06-18 15:06:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2584363008. Throughput: 0: 42641.8. Samples: 2584533780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 15:06:41,994][12645] Avg episode reward: [(0, '0.544')] +[2024-06-18 15:06:43,971][12883] Updated weights for policy 0, policy_version 157744 (0.0038) +[2024-06-18 15:06:46,994][12645] Fps is (10 sec: 42619.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2584592384. Throughput: 0: 42504.4. Samples: 2584655800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 15:06:46,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 15:06:48,400][12883] Updated weights for policy 0, policy_version 157754 (0.0039) +[2024-06-18 15:06:49,508][12862] Signal inference workers to stop experience collection... (37800 times) +[2024-06-18 15:06:49,511][12862] Signal inference workers to resume experience collection... (37800 times) +[2024-06-18 15:06:49,558][12883] InferenceWorker_p0-w0: stopping experience collection (37800 times) +[2024-06-18 15:06:49,564][12883] InferenceWorker_p0-w0: resuming experience collection (37800 times) +[2024-06-18 15:06:51,538][12883] Updated weights for policy 0, policy_version 157764 (0.0042) +[2024-06-18 15:06:51,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42875.8, 300 sec: 42820.6). Total num frames: 2584821760. Throughput: 0: 42680.3. Samples: 2584917280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 15:06:51,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 15:06:55,866][12883] Updated weights for policy 0, policy_version 157774 (0.0030) +[2024-06-18 15:06:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 2585001984. Throughput: 0: 42773.7. Samples: 2585177580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 15:06:56,994][12645] Avg episode reward: [(0, '0.310')] +[2024-06-18 15:06:59,214][12883] Updated weights for policy 0, policy_version 157784 (0.0028) +[2024-06-18 15:07:01,996][12645] Fps is (10 sec: 42589.8, 60 sec: 42871.5, 300 sec: 42820.2). Total num frames: 2585247744. Throughput: 0: 42658.3. Samples: 2585299080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 15:07:01,997][12645] Avg episode reward: [(0, '0.556')] +[2024-06-18 15:07:03,747][12883] Updated weights for policy 0, policy_version 157794 (0.0030) +[2024-06-18 15:07:06,832][12883] Updated weights for policy 0, policy_version 157804 (0.0045) +[2024-06-18 15:07:06,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2585460736. Throughput: 0: 42736.4. Samples: 2585558520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 15:07:06,994][12645] Avg episode reward: [(0, '0.556')] +[2024-06-18 15:07:11,698][12883] Updated weights for policy 0, policy_version 157814 (0.0032) +[2024-06-18 15:07:11,996][12645] Fps is (10 sec: 39321.5, 60 sec: 42600.4, 300 sec: 42598.1). Total num frames: 2585640960. Throughput: 0: 42865.0. Samples: 2585820200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 15:07:11,996][12645] Avg episode reward: [(0, '0.843')] +[2024-06-18 15:07:14,534][12883] Updated weights for policy 0, policy_version 157824 (0.0031) +[2024-06-18 15:07:16,996][12645] Fps is (10 sec: 42588.9, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 2585886720. Throughput: 0: 42640.9. Samples: 2585937840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 15:07:16,997][12645] Avg episode reward: [(0, '0.843')] +[2024-06-18 15:07:19,592][12883] Updated weights for policy 0, policy_version 157834 (0.0030) +[2024-06-18 15:07:21,994][12645] Fps is (10 sec: 45885.3, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 2586099712. Throughput: 0: 42714.0. Samples: 2586199120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 15:07:21,994][12645] Avg episode reward: [(0, '0.644')] +[2024-06-18 15:07:22,386][12883] Updated weights for policy 0, policy_version 157844 (0.0032) +[2024-06-18 15:07:26,994][12645] Fps is (10 sec: 37691.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2586263552. Throughput: 0: 42796.0. 
Samples: 2586459600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 15:07:26,994][12645] Avg episode reward: [(0, '0.527')] +[2024-06-18 15:07:27,278][12883] Updated weights for policy 0, policy_version 157854 (0.0029) +[2024-06-18 15:07:29,902][12883] Updated weights for policy 0, policy_version 157864 (0.0028) +[2024-06-18 15:07:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2586525696. Throughput: 0: 42775.1. Samples: 2586580680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 15:07:31,994][12645] Avg episode reward: [(0, '0.264')] +[2024-06-18 15:07:34,734][12883] Updated weights for policy 0, policy_version 157874 (0.0035) +[2024-06-18 15:07:36,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42875.0, 300 sec: 42820.5). Total num frames: 2586738688. Throughput: 0: 42812.1. Samples: 2586843820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 15:07:36,994][12645] Avg episode reward: [(0, '0.645')] +[2024-06-18 15:07:37,942][12883] Updated weights for policy 0, policy_version 157884 (0.0032) +[2024-06-18 15:07:41,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42543.7). Total num frames: 2586902528. Throughput: 0: 42602.3. Samples: 2587094680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) +[2024-06-18 15:07:41,994][12645] Avg episode reward: [(0, '0.591')] +[2024-06-18 15:07:42,355][12883] Updated weights for policy 0, policy_version 157894 (0.0032) +[2024-06-18 15:07:45,522][12883] Updated weights for policy 0, policy_version 157904 (0.0035) +[2024-06-18 15:07:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2587164672. Throughput: 0: 42660.7. Samples: 2587218720. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 15:07:46,994][12645] Avg episode reward: [(0, '0.693')] +[2024-06-18 15:07:50,310][12883] Updated weights for policy 0, policy_version 157914 (0.0031) +[2024-06-18 15:07:51,994][12645] Fps is (10 sec: 45874.3, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 2587361280. Throughput: 0: 42738.6. Samples: 2587481760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 15:07:51,994][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 15:07:53,193][12883] Updated weights for policy 0, policy_version 157924 (0.0030) +[2024-06-18 15:07:56,994][12645] Fps is (10 sec: 37683.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2587541504. Throughput: 0: 42608.4. Samples: 2587737480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 15:07:56,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 15:07:57,863][12883] Updated weights for policy 0, policy_version 157934 (0.0032) +[2024-06-18 15:08:01,003][12883] Updated weights for policy 0, policy_version 157944 (0.0030) +[2024-06-18 15:08:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 2587803648. Throughput: 0: 42693.3. Samples: 2587858940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 15:08:01,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 15:08:05,433][12883] Updated weights for policy 0, policy_version 157954 (0.0031) +[2024-06-18 15:08:05,441][12862] Signal inference workers to stop experience collection... (37850 times) +[2024-06-18 15:08:05,441][12862] Signal inference workers to resume experience collection... (37850 times) +[2024-06-18 15:08:05,451][12883] InferenceWorker_p0-w0: stopping experience collection (37850 times) +[2024-06-18 15:08:05,463][12883] InferenceWorker_p0-w0: resuming experience collection (37850 times) +[2024-06-18 15:08:06,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2588000256. 
Throughput: 0: 42686.2. Samples: 2588120000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 15:08:06,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 15:08:08,531][12883] Updated weights for policy 0, policy_version 157964 (0.0031) +[2024-06-18 15:08:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2588196864. Throughput: 0: 42435.6. Samples: 2588369200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 15:08:11,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 15:08:12,907][12883] Updated weights for policy 0, policy_version 157974 (0.0032) +[2024-06-18 15:08:16,521][12883] Updated weights for policy 0, policy_version 157984 (0.0028) +[2024-06-18 15:08:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 2588442624. Throughput: 0: 42682.6. Samples: 2588501400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 15:08:16,994][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 15:08:20,453][12883] Updated weights for policy 0, policy_version 157994 (0.0039) +[2024-06-18 15:08:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2588639232. Throughput: 0: 42649.3. Samples: 2588763040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 15:08:21,995][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 15:08:23,989][12883] Updated weights for policy 0, policy_version 158004 (0.0031) +[2024-06-18 15:08:26,996][12645] Fps is (10 sec: 42588.9, 60 sec: 43416.0, 300 sec: 42709.1). Total num frames: 2588868608. Throughput: 0: 42720.4. Samples: 2589017200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 15:08:26,997][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 15:08:28,185][12883] Updated weights for policy 0, policy_version 158014 (0.0037) +[2024-06-18 15:08:31,619][12883] Updated weights for policy 0, policy_version 158024 (0.0035) +[2024-06-18 15:08:31,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 2589097984. Throughput: 0: 42947.6. Samples: 2589151360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 15:08:31,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 15:08:35,605][12883] Updated weights for policy 0, policy_version 158034 (0.0035) +[2024-06-18 15:08:36,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2589294592. Throughput: 0: 42884.1. Samples: 2589411540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 15:08:36,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 15:08:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158038_2589294592.pth... +[2024-06-18 15:08:37,049][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000157411_2579021824.pth +[2024-06-18 15:08:39,200][12883] Updated weights for policy 0, policy_version 158044 (0.0042) +[2024-06-18 15:08:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2589507584. Throughput: 0: 42825.8. Samples: 2589664640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 15:08:41,994][12645] Avg episode reward: [(0, '0.580')] +[2024-06-18 15:08:43,074][12883] Updated weights for policy 0, policy_version 158054 (0.0036) +[2024-06-18 15:08:46,712][12883] Updated weights for policy 0, policy_version 158064 (0.0033) +[2024-06-18 15:08:46,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2589736960. Throughput: 0: 43040.3. Samples: 2589795760. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) +[2024-06-18 15:08:46,994][12645] Avg episode reward: [(0, '0.572')] +[2024-06-18 15:08:50,469][12883] Updated weights for policy 0, policy_version 158074 (0.0042) +[2024-06-18 15:08:51,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2589933568. Throughput: 0: 42820.4. Samples: 2590046920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:08:51,995][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 15:08:54,390][12883] Updated weights for policy 0, policy_version 158084 (0.0041) +[2024-06-18 15:08:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2590146560. Throughput: 0: 43072.9. Samples: 2590307480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:08:56,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 15:08:58,050][12883] Updated weights for policy 0, policy_version 158094 (0.0032) +[2024-06-18 15:09:01,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2590343168. Throughput: 0: 42983.6. Samples: 2590435660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:09:01,994][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 15:09:02,344][12883] Updated weights for policy 0, policy_version 158104 (0.0032) +[2024-06-18 15:09:05,628][12883] Updated weights for policy 0, policy_version 158114 (0.0042) +[2024-06-18 15:09:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2590572544. Throughput: 0: 42642.6. Samples: 2590681960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:09:06,994][12645] Avg episode reward: [(0, '0.510')] +[2024-06-18 15:09:10,262][12883] Updated weights for policy 0, policy_version 158124 (0.0035) +[2024-06-18 15:09:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42654.6). Total num frames: 2590769152. Throughput: 0: 42711.0. 
Samples: 2590939100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:09:11,994][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 15:09:13,469][12883] Updated weights for policy 0, policy_version 158134 (0.0028) +[2024-06-18 15:09:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42709.6). Total num frames: 2590982144. Throughput: 0: 42515.5. Samples: 2591064560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:09:16,994][12645] Avg episode reward: [(0, '0.711')] +[2024-06-18 15:09:18,059][12883] Updated weights for policy 0, policy_version 158144 (0.0033) +[2024-06-18 15:09:21,042][12883] Updated weights for policy 0, policy_version 158154 (0.0031) +[2024-06-18 15:09:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2591211520. Throughput: 0: 42397.7. Samples: 2591319440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:09:21,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 15:09:25,831][12883] Updated weights for policy 0, policy_version 158164 (0.0036) +[2024-06-18 15:09:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42326.9, 300 sec: 42654.0). Total num frames: 2591408128. Throughput: 0: 42569.3. Samples: 2591580260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:09:26,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 15:09:28,807][12883] Updated weights for policy 0, policy_version 158174 (0.0041) +[2024-06-18 15:09:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.3). Total num frames: 2591637504. Throughput: 0: 42332.1. Samples: 2591700700. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:09:31,994][12645] Avg episode reward: [(0, '0.373')] +[2024-06-18 15:09:33,366][12883] Updated weights for policy 0, policy_version 158184 (0.0037) +[2024-06-18 15:09:36,515][12883] Updated weights for policy 0, policy_version 158194 (0.0038) +[2024-06-18 15:09:36,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2591866880. Throughput: 0: 42565.4. Samples: 2591962360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:09:36,994][12645] Avg episode reward: [(0, '0.586')] +[2024-06-18 15:09:39,068][12862] Signal inference workers to stop experience collection... (37900 times) +[2024-06-18 15:09:39,068][12862] Signal inference workers to resume experience collection... (37900 times) +[2024-06-18 15:09:39,118][12883] InferenceWorker_p0-w0: stopping experience collection (37900 times) +[2024-06-18 15:09:39,118][12883] InferenceWorker_p0-w0: resuming experience collection (37900 times) +[2024-06-18 15:09:41,080][12883] Updated weights for policy 0, policy_version 158204 (0.0045) +[2024-06-18 15:09:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2592047104. Throughput: 0: 42486.1. Samples: 2592219360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:09:41,994][12645] Avg episode reward: [(0, '0.494')] +[2024-06-18 15:09:44,211][12883] Updated weights for policy 0, policy_version 158214 (0.0027) +[2024-06-18 15:09:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2592276480. Throughput: 0: 42285.7. Samples: 2592338520. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:09:46,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 15:09:49,006][12883] Updated weights for policy 0, policy_version 158224 (0.0038) +[2024-06-18 15:09:51,755][12883] Updated weights for policy 0, policy_version 158234 (0.0032) +[2024-06-18 15:09:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2592505856. Throughput: 0: 42620.5. Samples: 2592599880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) +[2024-06-18 15:09:51,994][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 15:09:56,954][12883] Updated weights for policy 0, policy_version 158244 (0.0027) +[2024-06-18 15:09:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2592669696. Throughput: 0: 42633.3. Samples: 2592857600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 15:09:56,994][12645] Avg episode reward: [(0, '0.352')] +[2024-06-18 15:09:59,466][12883] Updated weights for policy 0, policy_version 158254 (0.0035) +[2024-06-18 15:10:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2592899072. Throughput: 0: 42351.1. Samples: 2592970360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 15:10:02,003][12645] Avg episode reward: [(0, '0.707')] +[2024-06-18 15:10:04,507][12883] Updated weights for policy 0, policy_version 158264 (0.0030) +[2024-06-18 15:10:06,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2593144832. Throughput: 0: 42737.2. Samples: 2593242620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 15:10:06,994][12645] Avg episode reward: [(0, '0.788')] +[2024-06-18 15:10:07,148][12883] Updated weights for policy 0, policy_version 158274 (0.0029) +[2024-06-18 15:10:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2593308672. Throughput: 0: 42463.5. 
Samples: 2593491120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 15:10:11,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 15:10:12,103][12883] Updated weights for policy 0, policy_version 158284 (0.0043) +[2024-06-18 15:10:15,509][12883] Updated weights for policy 0, policy_version 158294 (0.0035) +[2024-06-18 15:10:16,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2593554432. Throughput: 0: 42374.8. Samples: 2593607560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 15:10:16,994][12645] Avg episode reward: [(0, '0.485')] +[2024-06-18 15:10:19,754][12883] Updated weights for policy 0, policy_version 158304 (0.0028) +[2024-06-18 15:10:21,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2593767424. Throughput: 0: 42609.9. Samples: 2593879800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 15:10:21,994][12645] Avg episode reward: [(0, '0.485')] +[2024-06-18 15:10:23,069][12883] Updated weights for policy 0, policy_version 158314 (0.0049) +[2024-06-18 15:10:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2593947648. Throughput: 0: 42539.2. Samples: 2594133620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 15:10:26,994][12645] Avg episode reward: [(0, '0.266')] +[2024-06-18 15:10:27,460][12883] Updated weights for policy 0, policy_version 158324 (0.0041) +[2024-06-18 15:10:30,481][12883] Updated weights for policy 0, policy_version 158334 (0.0031) +[2024-06-18 15:10:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2594193408. Throughput: 0: 42478.7. Samples: 2594250060. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 15:10:31,994][12645] Avg episode reward: [(0, '0.682')] +[2024-06-18 15:10:35,111][12883] Updated weights for policy 0, policy_version 158344 (0.0043) +[2024-06-18 15:10:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2594390016. Throughput: 0: 42575.2. Samples: 2594515760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 15:10:36,994][12645] Avg episode reward: [(0, '0.635')] +[2024-06-18 15:10:37,064][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158350_2594406400.pth... +[2024-06-18 15:10:37,109][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000157725_2584166400.pth +[2024-06-18 15:10:38,285][12883] Updated weights for policy 0, policy_version 158354 (0.0027) +[2024-06-18 15:10:39,464][12862] Signal inference workers to stop experience collection... (37950 times) +[2024-06-18 15:10:39,464][12862] Signal inference workers to resume experience collection... (37950 times) +[2024-06-18 15:10:39,511][12883] InferenceWorker_p0-w0: stopping experience collection (37950 times) +[2024-06-18 15:10:39,512][12883] InferenceWorker_p0-w0: resuming experience collection (37950 times) +[2024-06-18 15:10:41,996][12645] Fps is (10 sec: 39313.0, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 2594586624. Throughput: 0: 42355.3. Samples: 2594763680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 15:10:41,996][12645] Avg episode reward: [(0, '0.654')] +[2024-06-18 15:10:43,181][12883] Updated weights for policy 0, policy_version 158364 (0.0034) +[2024-06-18 15:10:46,100][12883] Updated weights for policy 0, policy_version 158374 (0.0035) +[2024-06-18 15:10:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 2594832384. Throughput: 0: 42726.8. Samples: 2594893060. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 15:10:46,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 15:10:50,687][12883] Updated weights for policy 0, policy_version 158384 (0.0031) +[2024-06-18 15:10:51,994][12645] Fps is (10 sec: 42607.6, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 2595012608. Throughput: 0: 42402.2. Samples: 2595150720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 15:10:51,994][12645] Avg episode reward: [(0, '0.271')] +[2024-06-18 15:10:53,728][12883] Updated weights for policy 0, policy_version 158394 (0.0032) +[2024-06-18 15:10:56,994][12645] Fps is (10 sec: 39319.4, 60 sec: 42598.1, 300 sec: 42543.1). Total num frames: 2595225600. Throughput: 0: 42314.6. Samples: 2595395300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) +[2024-06-18 15:10:56,995][12645] Avg episode reward: [(0, '0.520')] +[2024-06-18 15:10:58,652][12883] Updated weights for policy 0, policy_version 158404 (0.0038) +[2024-06-18 15:11:01,500][12883] Updated weights for policy 0, policy_version 158414 (0.0031) +[2024-06-18 15:11:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2595471360. Throughput: 0: 42725.6. Samples: 2595530220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 15:11:01,994][12645] Avg episode reward: [(0, '0.309')] +[2024-06-18 15:11:06,402][12883] Updated weights for policy 0, policy_version 158424 (0.0033) +[2024-06-18 15:11:06,994][12645] Fps is (10 sec: 42600.6, 60 sec: 41779.3, 300 sec: 42599.1). Total num frames: 2595651584. Throughput: 0: 42333.7. Samples: 2595784820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 15:11:06,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 15:11:09,221][12883] Updated weights for policy 0, policy_version 158434 (0.0031) +[2024-06-18 15:11:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2595880960. Throughput: 0: 42238.6. 
Samples: 2596034360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 15:11:11,994][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 15:11:13,926][12883] Updated weights for policy 0, policy_version 158444 (0.0030) +[2024-06-18 15:11:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2596093952. Throughput: 0: 42652.5. Samples: 2596169420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 15:11:16,994][12645] Avg episode reward: [(0, '0.218')] +[2024-06-18 15:11:17,057][12883] Updated weights for policy 0, policy_version 158454 (0.0041) +[2024-06-18 15:11:21,638][12883] Updated weights for policy 0, policy_version 158464 (0.0040) +[2024-06-18 15:11:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2596290560. Throughput: 0: 42391.0. Samples: 2596423360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 15:11:21,994][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 15:11:24,984][12883] Updated weights for policy 0, policy_version 158474 (0.0041) +[2024-06-18 15:11:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2596519936. Throughput: 0: 42295.8. Samples: 2596666900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 15:11:26,994][12645] Avg episode reward: [(0, '0.695')] +[2024-06-18 15:11:29,379][12883] Updated weights for policy 0, policy_version 158484 (0.0024) +[2024-06-18 15:11:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42599.1). Total num frames: 2596732928. Throughput: 0: 42510.2. Samples: 2596806020. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 15:11:31,994][12645] Avg episode reward: [(0, '0.685')] +[2024-06-18 15:11:32,525][12883] Updated weights for policy 0, policy_version 158494 (0.0037) +[2024-06-18 15:11:36,938][12883] Updated weights for policy 0, policy_version 158504 (0.0027) +[2024-06-18 15:11:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2596929536. Throughput: 0: 42392.2. Samples: 2597058360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 15:11:36,994][12645] Avg episode reward: [(0, '0.737')] +[2024-06-18 15:11:40,431][12883] Updated weights for policy 0, policy_version 158514 (0.0043) +[2024-06-18 15:11:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 2597175296. Throughput: 0: 42391.6. Samples: 2597302900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 15:11:41,994][12645] Avg episode reward: [(0, '0.671')] +[2024-06-18 15:11:44,658][12883] Updated weights for policy 0, policy_version 158524 (0.0027) +[2024-06-18 15:11:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2597371904. Throughput: 0: 42416.1. Samples: 2597438940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 15:11:46,994][12645] Avg episode reward: [(0, '0.700')] +[2024-06-18 15:11:48,087][12883] Updated weights for policy 0, policy_version 158534 (0.0028) +[2024-06-18 15:11:51,994][12645] Fps is (10 sec: 36044.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2597535744. Throughput: 0: 42232.4. Samples: 2597685280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 15:11:51,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 15:11:52,337][12883] Updated weights for policy 0, policy_version 158544 (0.0032) +[2024-06-18 15:11:55,892][12883] Updated weights for policy 0, policy_version 158554 (0.0033) +[2024-06-18 15:11:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.8, 300 sec: 42487.6). Total num frames: 2597781504. Throughput: 0: 42312.1. Samples: 2597938400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) +[2024-06-18 15:11:56,994][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 15:12:00,349][12883] Updated weights for policy 0, policy_version 158564 (0.0041) +[2024-06-18 15:12:01,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2597994496. Throughput: 0: 42367.1. Samples: 2598075940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:12:01,994][12645] Avg episode reward: [(0, '0.592')] +[2024-06-18 15:12:03,601][12883] Updated weights for policy 0, policy_version 158574 (0.0033) +[2024-06-18 15:12:06,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42323.8, 300 sec: 42542.9). Total num frames: 2598191104. Throughput: 0: 42258.4. Samples: 2598325080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:12:06,996][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 15:12:07,903][12883] Updated weights for policy 0, policy_version 158584 (0.0046) +[2024-06-18 15:12:09,748][12862] Signal inference workers to stop experience collection... (38000 times) +[2024-06-18 15:12:09,748][12862] Signal inference workers to resume experience collection... 
(38000 times) +[2024-06-18 15:12:09,768][12883] InferenceWorker_p0-w0: stopping experience collection (38000 times) +[2024-06-18 15:12:09,769][12883] InferenceWorker_p0-w0: resuming experience collection (38000 times) +[2024-06-18 15:12:11,085][12883] Updated weights for policy 0, policy_version 158594 (0.0029) +[2024-06-18 15:12:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 2598436864. Throughput: 0: 42535.6. Samples: 2598581000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:12:11,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 15:12:15,700][12883] Updated weights for policy 0, policy_version 158604 (0.0040) +[2024-06-18 15:12:16,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2598633472. Throughput: 0: 42434.2. Samples: 2598715560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:12:16,994][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 15:12:18,718][12883] Updated weights for policy 0, policy_version 158614 (0.0028) +[2024-06-18 15:12:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2598830080. Throughput: 0: 42347.1. Samples: 2598963980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:12:21,994][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 15:12:23,191][12883] Updated weights for policy 0, policy_version 158624 (0.0039) +[2024-06-18 15:12:26,356][12883] Updated weights for policy 0, policy_version 158634 (0.0034) +[2024-06-18 15:12:26,998][12645] Fps is (10 sec: 44216.3, 60 sec: 42595.1, 300 sec: 42542.2). Total num frames: 2599075840. Throughput: 0: 42641.3. Samples: 2599221960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:12:26,999][12645] Avg episode reward: [(0, '0.590')] +[2024-06-18 15:12:30,897][12883] Updated weights for policy 0, policy_version 158644 (0.0048) +[2024-06-18 15:12:31,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 2599272448. Throughput: 0: 42499.9. Samples: 2599351440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:12:31,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 15:12:34,079][12883] Updated weights for policy 0, policy_version 158654 (0.0042) +[2024-06-18 15:12:36,994][12645] Fps is (10 sec: 39339.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2599469056. Throughput: 0: 42593.8. Samples: 2599602000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:12:36,994][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 15:12:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158659_2599469056.pth... +[2024-06-18 15:12:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158038_2589294592.pth +[2024-06-18 15:12:38,374][12883] Updated weights for policy 0, policy_version 158664 (0.0029) +[2024-06-18 15:12:41,965][12883] Updated weights for policy 0, policy_version 158674 (0.0037) +[2024-06-18 15:12:41,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2599714816. Throughput: 0: 42571.5. Samples: 2599854120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:12:41,994][12645] Avg episode reward: [(0, '0.723')] +[2024-06-18 15:12:45,936][12883] Updated weights for policy 0, policy_version 158684 (0.0040) +[2024-06-18 15:12:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2599911424. Throughput: 0: 42537.7. Samples: 2599990140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:12:46,994][12645] Avg episode reward: [(0, '0.803')] +[2024-06-18 15:12:49,555][12883] Updated weights for policy 0, policy_version 158694 (0.0034) +[2024-06-18 15:12:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2600108032. Throughput: 0: 42455.9. Samples: 2600235500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:12:51,994][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 15:12:53,904][12883] Updated weights for policy 0, policy_version 158704 (0.0034) +[2024-06-18 15:12:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2600353792. Throughput: 0: 42637.8. Samples: 2600499700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:12:56,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 15:12:57,087][12883] Updated weights for policy 0, policy_version 158714 (0.0034) +[2024-06-18 15:13:01,482][12883] Updated weights for policy 0, policy_version 158724 (0.0026) +[2024-06-18 15:13:01,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2600566784. Throughput: 0: 42531.1. Samples: 2600629460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:13:01,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 15:13:04,856][12883] Updated weights for policy 0, policy_version 158734 (0.0025) +[2024-06-18 15:13:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42599.9, 300 sec: 42542.8). Total num frames: 2600747008. Throughput: 0: 42652.7. Samples: 2600883360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 15:13:06,994][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 15:13:08,889][12883] Updated weights for policy 0, policy_version 158744 (0.0030) +[2024-06-18 15:13:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2600992768. Throughput: 0: 42679.6. 
Samples: 2601142340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 15:13:11,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 15:13:12,391][12883] Updated weights for policy 0, policy_version 158754 (0.0045) +[2024-06-18 15:13:16,554][12883] Updated weights for policy 0, policy_version 158764 (0.0037) +[2024-06-18 15:13:16,994][12645] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2601222144. Throughput: 0: 42847.7. Samples: 2601279580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 15:13:16,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 15:13:19,999][12883] Updated weights for policy 0, policy_version 158774 (0.0032) +[2024-06-18 15:13:22,000][12645] Fps is (10 sec: 39297.0, 60 sec: 42593.9, 300 sec: 42431.2). Total num frames: 2601385984. Throughput: 0: 42813.2. Samples: 2601528860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 15:13:22,000][12645] Avg episode reward: [(0, '0.366')] +[2024-06-18 15:13:24,277][12883] Updated weights for policy 0, policy_version 158784 (0.0048) +[2024-06-18 15:13:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42601.7, 300 sec: 42487.3). Total num frames: 2601631744. Throughput: 0: 42867.5. Samples: 2601783160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 15:13:26,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 15:13:27,783][12883] Updated weights for policy 0, policy_version 158794 (0.0035) +[2024-06-18 15:13:31,916][12883] Updated weights for policy 0, policy_version 158804 (0.0027) +[2024-06-18 15:13:31,994][12645] Fps is (10 sec: 45904.2, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 2601844736. Throughput: 0: 42923.7. Samples: 2601921700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 15:13:31,994][12645] Avg episode reward: [(0, '0.585')] +[2024-06-18 15:13:35,351][12883] Updated weights for policy 0, policy_version 158814 (0.0027) +[2024-06-18 15:13:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2602041344. Throughput: 0: 42882.2. Samples: 2602165200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 15:13:36,994][12645] Avg episode reward: [(0, '0.609')] +[2024-06-18 15:13:39,683][12883] Updated weights for policy 0, policy_version 158824 (0.0033) +[2024-06-18 15:13:41,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42596.9, 300 sec: 42487.0). Total num frames: 2602270720. Throughput: 0: 42768.3. Samples: 2602424360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 15:13:41,996][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 15:13:42,532][12862] Signal inference workers to stop experience collection... (38050 times) +[2024-06-18 15:13:42,533][12862] Signal inference workers to resume experience collection... (38050 times) +[2024-06-18 15:13:42,575][12883] InferenceWorker_p0-w0: stopping experience collection (38050 times) +[2024-06-18 15:13:42,575][12883] InferenceWorker_p0-w0: resuming experience collection (38050 times) +[2024-06-18 15:13:42,871][12883] Updated weights for policy 0, policy_version 158834 (0.0028) +[2024-06-18 15:13:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2602467328. Throughput: 0: 42881.8. Samples: 2602559140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 15:13:46,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 15:13:47,270][12883] Updated weights for policy 0, policy_version 158844 (0.0041) +[2024-06-18 15:13:50,737][12883] Updated weights for policy 0, policy_version 158854 (0.0044) +[2024-06-18 15:13:51,994][12645] Fps is (10 sec: 42607.1, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 2602696704. 
Throughput: 0: 42714.8. Samples: 2602805520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 15:13:51,994][12645] Avg episode reward: [(0, '0.645')] +[2024-06-18 15:13:55,357][12883] Updated weights for policy 0, policy_version 158864 (0.0040) +[2024-06-18 15:13:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2602909696. Throughput: 0: 42624.4. Samples: 2603060440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 15:13:56,994][12645] Avg episode reward: [(0, '0.767')] +[2024-06-18 15:13:58,440][12883] Updated weights for policy 0, policy_version 158874 (0.0021) +[2024-06-18 15:14:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 2603106304. Throughput: 0: 42532.1. Samples: 2603193520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 15:14:01,994][12645] Avg episode reward: [(0, '0.762')] +[2024-06-18 15:14:02,852][12883] Updated weights for policy 0, policy_version 158884 (0.0039) +[2024-06-18 15:14:06,133][12883] Updated weights for policy 0, policy_version 158894 (0.0037) +[2024-06-18 15:14:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2603335680. Throughput: 0: 42672.1. Samples: 2603448840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) +[2024-06-18 15:14:07,003][12645] Avg episode reward: [(0, '0.392')] +[2024-06-18 15:14:10,468][12883] Updated weights for policy 0, policy_version 158904 (0.0029) +[2024-06-18 15:14:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2603548672. Throughput: 0: 42627.1. Samples: 2603701380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 15:14:11,994][12645] Avg episode reward: [(0, '0.535')] +[2024-06-18 15:14:13,631][12883] Updated weights for policy 0, policy_version 158914 (0.0034) +[2024-06-18 15:14:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42487.3). 
Total num frames: 2603745280. Throughput: 0: 42538.2. Samples: 2603835920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 15:14:16,994][12645] Avg episode reward: [(0, '0.572')] +[2024-06-18 15:14:17,955][12883] Updated weights for policy 0, policy_version 158924 (0.0039) +[2024-06-18 15:14:21,312][12883] Updated weights for policy 0, policy_version 158934 (0.0029) +[2024-06-18 15:14:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43422.1, 300 sec: 42653.9). Total num frames: 2603991040. Throughput: 0: 42828.0. Samples: 2604092460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 15:14:21,994][12645] Avg episode reward: [(0, '0.733')] +[2024-06-18 15:14:25,924][12883] Updated weights for policy 0, policy_version 158944 (0.0026) +[2024-06-18 15:14:26,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2604204032. Throughput: 0: 42706.3. Samples: 2604346060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 15:14:26,994][12645] Avg episode reward: [(0, '0.569')] +[2024-06-18 15:14:29,001][12883] Updated weights for policy 0, policy_version 158954 (0.0041) +[2024-06-18 15:14:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2604384256. Throughput: 0: 42496.5. Samples: 2604471480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 15:14:31,994][12645] Avg episode reward: [(0, '0.569')] +[2024-06-18 15:14:33,567][12883] Updated weights for policy 0, policy_version 158964 (0.0045) +[2024-06-18 15:14:36,537][12883] Updated weights for policy 0, policy_version 158974 (0.0037) +[2024-06-18 15:14:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2604630016. Throughput: 0: 42680.0. Samples: 2604726120. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 15:14:36,994][12645] Avg episode reward: [(0, '0.406')] +[2024-06-18 15:14:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158974_2604630016.pth... +[2024-06-18 15:14:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158350_2594406400.pth +[2024-06-18 15:14:41,299][12883] Updated weights for policy 0, policy_version 158984 (0.0028) +[2024-06-18 15:14:41,994][12645] Fps is (10 sec: 44235.6, 60 sec: 42599.7, 300 sec: 42542.8). Total num frames: 2604826624. Throughput: 0: 42730.1. Samples: 2604983300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 15:14:41,994][12645] Avg episode reward: [(0, '0.294')] +[2024-06-18 15:14:44,155][12883] Updated weights for policy 0, policy_version 158994 (0.0031) +[2024-06-18 15:14:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2605023232. Throughput: 0: 42460.0. Samples: 2605104220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 15:14:46,994][12645] Avg episode reward: [(0, '0.670')] +[2024-06-18 15:14:48,934][12883] Updated weights for policy 0, policy_version 159004 (0.0045) +[2024-06-18 15:14:51,755][12883] Updated weights for policy 0, policy_version 159014 (0.0050) +[2024-06-18 15:14:51,996][12645] Fps is (10 sec: 45865.8, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 2605285376. Throughput: 0: 42549.5. Samples: 2605363660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 15:14:51,996][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 15:14:54,368][12862] Signal inference workers to stop experience collection... (38100 times) +[2024-06-18 15:14:54,372][12862] Signal inference workers to resume experience collection... 
(38100 times) +[2024-06-18 15:14:54,416][12883] InferenceWorker_p0-w0: stopping experience collection (38100 times) +[2024-06-18 15:14:54,420][12883] InferenceWorker_p0-w0: resuming experience collection (38100 times) +[2024-06-18 15:14:56,819][12883] Updated weights for policy 0, policy_version 159024 (0.0026) +[2024-06-18 15:14:56,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2605465600. Throughput: 0: 42781.2. Samples: 2605626540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 15:14:56,994][12645] Avg episode reward: [(0, '0.390')] +[2024-06-18 15:14:59,685][12883] Updated weights for policy 0, policy_version 159034 (0.0026) +[2024-06-18 15:15:01,994][12645] Fps is (10 sec: 37691.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2605662208. Throughput: 0: 42473.4. Samples: 2605747220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 15:15:01,994][12645] Avg episode reward: [(0, '0.612')] +[2024-06-18 15:15:04,530][12883] Updated weights for policy 0, policy_version 159044 (0.0026) +[2024-06-18 15:15:06,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2605924352. Throughput: 0: 42753.2. Samples: 2606016360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 15:15:06,994][12645] Avg episode reward: [(0, '0.621')] +[2024-06-18 15:15:07,587][12883] Updated weights for policy 0, policy_version 159054 (0.0030) +[2024-06-18 15:15:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 2606088192. Throughput: 0: 42745.7. Samples: 2606269620. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) +[2024-06-18 15:15:11,994][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 15:15:12,234][12883] Updated weights for policy 0, policy_version 159064 (0.0029) +[2024-06-18 15:15:14,992][12883] Updated weights for policy 0, policy_version 159074 (0.0033) +[2024-06-18 15:15:16,996][12645] Fps is (10 sec: 39313.0, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 2606317568. Throughput: 0: 42584.0. Samples: 2606387860. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) +[2024-06-18 15:15:16,997][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 15:15:19,718][12883] Updated weights for policy 0, policy_version 159084 (0.0046) +[2024-06-18 15:15:21,994][12645] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2606546944. Throughput: 0: 42838.4. Samples: 2606653840. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) +[2024-06-18 15:15:21,994][12645] Avg episode reward: [(0, '0.514')] +[2024-06-18 15:15:22,639][12883] Updated weights for policy 0, policy_version 159094 (0.0040) +[2024-06-18 15:15:26,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2606743552. Throughput: 0: 42958.8. Samples: 2606916440. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) +[2024-06-18 15:15:26,994][12645] Avg episode reward: [(0, '0.573')] +[2024-06-18 15:15:27,218][12883] Updated weights for policy 0, policy_version 159104 (0.0023) +[2024-06-18 15:15:29,958][12883] Updated weights for policy 0, policy_version 159114 (0.0025) +[2024-06-18 15:15:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2606972928. Throughput: 0: 42955.0. Samples: 2607037200. 
Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) +[2024-06-18 15:15:31,994][12645] Avg episode reward: [(0, '0.362')] +[2024-06-18 15:15:34,852][12883] Updated weights for policy 0, policy_version 159124 (0.0046) +[2024-06-18 15:15:36,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 2607202304. Throughput: 0: 43026.0. Samples: 2607299740. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) +[2024-06-18 15:15:36,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 15:15:37,896][12883] Updated weights for policy 0, policy_version 159134 (0.0033) +[2024-06-18 15:15:41,996][12645] Fps is (10 sec: 39313.1, 60 sec: 42323.9, 300 sec: 42487.0). Total num frames: 2607366144. Throughput: 0: 42937.5. Samples: 2607558820. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) +[2024-06-18 15:15:41,996][12645] Avg episode reward: [(0, '0.548')] +[2024-06-18 15:15:42,385][12883] Updated weights for policy 0, policy_version 159144 (0.0039) +[2024-06-18 15:15:45,744][12883] Updated weights for policy 0, policy_version 159154 (0.0043) +[2024-06-18 15:15:46,994][12645] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2607611904. Throughput: 0: 42887.0. Samples: 2607677140. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) +[2024-06-18 15:15:46,994][12645] Avg episode reward: [(0, '0.389')] +[2024-06-18 15:15:50,077][12883] Updated weights for policy 0, policy_version 159164 (0.0038) +[2024-06-18 15:15:51,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42053.8, 300 sec: 42654.0). Total num frames: 2607808512. Throughput: 0: 42625.0. Samples: 2607934480. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) +[2024-06-18 15:15:51,994][12645] Avg episode reward: [(0, '0.640')] +[2024-06-18 15:15:53,387][12883] Updated weights for policy 0, policy_version 159174 (0.0042) +[2024-06-18 15:15:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2608005120. Throughput: 0: 42673.0. 
Samples: 2608189900. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) +[2024-06-18 15:15:56,994][12645] Avg episode reward: [(0, '0.625')] +[2024-06-18 15:15:57,849][12883] Updated weights for policy 0, policy_version 159184 (0.0028) +[2024-06-18 15:16:00,972][12883] Updated weights for policy 0, policy_version 159194 (0.0037) +[2024-06-18 15:16:01,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2608267264. Throughput: 0: 42871.4. Samples: 2608316980. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) +[2024-06-18 15:16:01,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 15:16:05,775][12883] Updated weights for policy 0, policy_version 159204 (0.0032) +[2024-06-18 15:16:06,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2608447488. Throughput: 0: 42720.7. Samples: 2608576280. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) +[2024-06-18 15:16:06,994][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 15:16:08,818][12883] Updated weights for policy 0, policy_version 159214 (0.0036) +[2024-06-18 15:16:11,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2608644096. Throughput: 0: 42535.1. Samples: 2608830520. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) +[2024-06-18 15:16:11,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 15:16:13,548][12883] Updated weights for policy 0, policy_version 159224 (0.0036) +[2024-06-18 15:16:16,467][12883] Updated weights for policy 0, policy_version 159234 (0.0047) +[2024-06-18 15:16:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 2608906240. Throughput: 0: 42643.5. Samples: 2608956160. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) +[2024-06-18 15:16:16,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 15:16:21,042][12862] Signal inference workers to stop experience collection... 
(38150 times) +[2024-06-18 15:16:21,043][12862] Signal inference workers to resume experience collection... (38150 times) +[2024-06-18 15:16:21,072][12883] InferenceWorker_p0-w0: stopping experience collection (38150 times) +[2024-06-18 15:16:21,073][12883] InferenceWorker_p0-w0: resuming experience collection (38150 times) +[2024-06-18 15:16:21,180][12883] Updated weights for policy 0, policy_version 159244 (0.0037) +[2024-06-18 15:16:21,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2609086464. Throughput: 0: 42613.8. Samples: 2609217360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 15:16:21,994][12645] Avg episode reward: [(0, '0.568')] +[2024-06-18 15:16:24,625][12883] Updated weights for policy 0, policy_version 159254 (0.0036) +[2024-06-18 15:16:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2609299456. Throughput: 0: 42516.3. Samples: 2609471960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 15:16:26,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 15:16:29,082][12883] Updated weights for policy 0, policy_version 159264 (0.0037) +[2024-06-18 15:16:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 2609528832. Throughput: 0: 42640.8. Samples: 2609595980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 15:16:31,994][12645] Avg episode reward: [(0, '0.440')] +[2024-06-18 15:16:32,254][12883] Updated weights for policy 0, policy_version 159274 (0.0036) +[2024-06-18 15:16:36,818][12883] Updated weights for policy 0, policy_version 159284 (0.0028) +[2024-06-18 15:16:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 2609725440. Throughput: 0: 42588.8. Samples: 2609850980. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 15:16:36,994][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 15:16:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000159285_2609725440.pth... +[2024-06-18 15:16:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158659_2599469056.pth +[2024-06-18 15:16:39,698][12883] Updated weights for policy 0, policy_version 159294 (0.0027) +[2024-06-18 15:16:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 2609938432. Throughput: 0: 42586.7. Samples: 2610106300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 15:16:41,994][12645] Avg episode reward: [(0, '0.644')] +[2024-06-18 15:16:44,380][12883] Updated weights for policy 0, policy_version 159304 (0.0029) +[2024-06-18 15:16:46,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2610184192. Throughput: 0: 42722.2. Samples: 2610239480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 15:16:46,995][12645] Avg episode reward: [(0, '0.447')] +[2024-06-18 15:16:47,164][12883] Updated weights for policy 0, policy_version 159314 (0.0027) +[2024-06-18 15:16:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2610348032. Throughput: 0: 42575.1. Samples: 2610492160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 15:16:51,994][12645] Avg episode reward: [(0, '0.660')] +[2024-06-18 15:16:52,277][12883] Updated weights for policy 0, policy_version 159324 (0.0040) +[2024-06-18 15:16:55,083][12883] Updated weights for policy 0, policy_version 159334 (0.0048) +[2024-06-18 15:16:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2610593792. Throughput: 0: 42663.4. Samples: 2610750380. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 15:16:56,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 15:16:59,787][12883] Updated weights for policy 0, policy_version 159344 (0.0027) +[2024-06-18 15:17:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42765.3). Total num frames: 2610806784. Throughput: 0: 42759.6. Samples: 2610880340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 15:17:01,994][12645] Avg episode reward: [(0, '0.136')] +[2024-06-18 15:17:02,706][12883] Updated weights for policy 0, policy_version 159354 (0.0030) +[2024-06-18 15:17:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2611003392. Throughput: 0: 42586.2. Samples: 2611133740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 15:17:06,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 15:17:07,236][12883] Updated weights for policy 0, policy_version 159364 (0.0037) +[2024-06-18 15:17:10,442][12883] Updated weights for policy 0, policy_version 159374 (0.0034) +[2024-06-18 15:17:11,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2611232768. Throughput: 0: 42672.6. Samples: 2611392220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 15:17:11,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 15:17:14,859][12883] Updated weights for policy 0, policy_version 159384 (0.0028) +[2024-06-18 15:17:16,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2611445760. Throughput: 0: 42835.3. Samples: 2611523560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 15:17:16,994][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 15:17:17,989][12883] Updated weights for policy 0, policy_version 159394 (0.0063) +[2024-06-18 15:17:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42599.1). Total num frames: 2611642368. Throughput: 0: 42694.8. 
Samples: 2611772240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) +[2024-06-18 15:17:21,994][12645] Avg episode reward: [(0, '0.581')] +[2024-06-18 15:17:22,690][12883] Updated weights for policy 0, policy_version 159404 (0.0033) +[2024-06-18 15:17:25,669][12883] Updated weights for policy 0, policy_version 159414 (0.0036) +[2024-06-18 15:17:26,996][12645] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 2611871744. Throughput: 0: 42671.5. Samples: 2612026620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 15:17:26,997][12645] Avg episode reward: [(0, '0.562')] +[2024-06-18 15:17:30,304][12883] Updated weights for policy 0, policy_version 159424 (0.0035) +[2024-06-18 15:17:31,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2612101120. Throughput: 0: 42787.5. Samples: 2612164920. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 15:17:31,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 15:17:33,353][12883] Updated weights for policy 0, policy_version 159434 (0.0026) +[2024-06-18 15:17:36,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2612281344. Throughput: 0: 42611.0. Samples: 2612409660. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 15:17:36,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 15:17:37,850][12883] Updated weights for policy 0, policy_version 159444 (0.0027) +[2024-06-18 15:17:41,277][12883] Updated weights for policy 0, policy_version 159454 (0.0032) +[2024-06-18 15:17:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2612510720. Throughput: 0: 42475.3. Samples: 2612661760. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 15:17:41,994][12645] Avg episode reward: [(0, '0.499')] +[2024-06-18 15:17:44,892][12862] Signal inference workers to stop experience collection... 
(38200 times) +[2024-06-18 15:17:44,946][12862] Signal inference workers to resume experience collection... (38200 times) +[2024-06-18 15:17:44,946][12883] InferenceWorker_p0-w0: stopping experience collection (38200 times) +[2024-06-18 15:17:44,961][12883] InferenceWorker_p0-w0: resuming experience collection (38200 times) +[2024-06-18 15:17:45,665][12883] Updated weights for policy 0, policy_version 159464 (0.0043) +[2024-06-18 15:17:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2612723712. Throughput: 0: 42560.5. Samples: 2612795560. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 15:17:46,994][12645] Avg episode reward: [(0, '0.643')] +[2024-06-18 15:17:48,989][12883] Updated weights for policy 0, policy_version 159474 (0.0035) +[2024-06-18 15:17:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2612920320. Throughput: 0: 42614.7. Samples: 2613051400. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 15:17:51,994][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 15:17:53,605][12883] Updated weights for policy 0, policy_version 159484 (0.0033) +[2024-06-18 15:17:56,675][12883] Updated weights for policy 0, policy_version 159494 (0.0022) +[2024-06-18 15:17:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2613166080. Throughput: 0: 42485.7. Samples: 2613304080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 15:17:56,994][12645] Avg episode reward: [(0, '0.311')] +[2024-06-18 15:18:01,171][12883] Updated weights for policy 0, policy_version 159504 (0.0031) +[2024-06-18 15:18:01,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42596.9, 300 sec: 42764.7). Total num frames: 2613362688. Throughput: 0: 42554.3. Samples: 2613438600. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 15:18:01,996][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 15:18:04,332][12883] Updated weights for policy 0, policy_version 159514 (0.0045) +[2024-06-18 15:18:06,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2613559296. Throughput: 0: 42596.3. Samples: 2613689080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 15:18:06,994][12645] Avg episode reward: [(0, '0.477')] +[2024-06-18 15:18:08,776][12883] Updated weights for policy 0, policy_version 159524 (0.0038) +[2024-06-18 15:18:11,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2613788672. Throughput: 0: 42560.4. Samples: 2613941740. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 15:18:11,994][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 15:18:12,131][12883] Updated weights for policy 0, policy_version 159534 (0.0040) +[2024-06-18 15:18:16,198][12883] Updated weights for policy 0, policy_version 159544 (0.0033) +[2024-06-18 15:18:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42765.9). Total num frames: 2614001664. Throughput: 0: 42432.5. Samples: 2614074380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 15:18:16,994][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 15:18:19,941][12883] Updated weights for policy 0, policy_version 159554 (0.0031) +[2024-06-18 15:18:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2614214656. Throughput: 0: 42623.1. Samples: 2614327700. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 15:18:21,994][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 15:18:23,777][12883] Updated weights for policy 0, policy_version 159564 (0.0042) +[2024-06-18 15:18:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 2614411264. Throughput: 0: 42663.1. Samples: 2614581600. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) +[2024-06-18 15:18:26,994][12645] Avg episode reward: [(0, '0.580')] +[2024-06-18 15:18:27,645][12883] Updated weights for policy 0, policy_version 159574 (0.0034) +[2024-06-18 15:18:31,348][12883] Updated weights for policy 0, policy_version 159584 (0.0025) +[2024-06-18 15:18:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2614640640. Throughput: 0: 42601.8. Samples: 2614712640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 15:18:31,994][12645] Avg episode reward: [(0, '0.639')] +[2024-06-18 15:18:35,532][12883] Updated weights for policy 0, policy_version 159594 (0.0037) +[2024-06-18 15:18:36,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 2614837248. Throughput: 0: 42421.7. Samples: 2614960380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 15:18:36,994][12645] Avg episode reward: [(0, '0.673')] +[2024-06-18 15:18:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000159597_2614837248.pth... +[2024-06-18 15:18:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158974_2604630016.pth +[2024-06-18 15:18:39,113][12883] Updated weights for policy 0, policy_version 159604 (0.0034) +[2024-06-18 15:18:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2615066624. Throughput: 0: 42327.5. Samples: 2615208820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 15:18:41,994][12645] Avg episode reward: [(0, '0.583')] +[2024-06-18 15:18:43,254][12883] Updated weights for policy 0, policy_version 159614 (0.0035) +[2024-06-18 15:18:46,901][12883] Updated weights for policy 0, policy_version 159624 (0.0043) +[2024-06-18 15:18:46,996][12645] Fps is (10 sec: 44227.5, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 2615279616. Throughput: 0: 42353.3. Samples: 2615344500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 15:18:46,996][12645] Avg episode reward: [(0, '0.560')] +[2024-06-18 15:18:50,768][12883] Updated weights for policy 0, policy_version 159634 (0.0033) +[2024-06-18 15:18:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2615476224. Throughput: 0: 42385.9. Samples: 2615596440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 15:18:51,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 15:18:54,762][12883] Updated weights for policy 0, policy_version 159644 (0.0035) +[2024-06-18 15:18:56,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2615705600. Throughput: 0: 42393.8. Samples: 2615849460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 15:18:56,994][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 15:18:58,447][12883] Updated weights for policy 0, policy_version 159654 (0.0036) +[2024-06-18 15:19:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42053.8, 300 sec: 42542.9). Total num frames: 2615885824. Throughput: 0: 42344.8. Samples: 2615979900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 15:19:01,994][12645] Avg episode reward: [(0, '0.638')] +[2024-06-18 15:19:02,446][12883] Updated weights for policy 0, policy_version 159664 (0.0039) +[2024-06-18 15:19:06,307][12883] Updated weights for policy 0, policy_version 159674 (0.0042) +[2024-06-18 15:19:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2616115200. Throughput: 0: 42361.8. Samples: 2616233980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 15:19:06,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 15:19:09,257][12862] Signal inference workers to stop experience collection... (38250 times) +[2024-06-18 15:19:09,258][12862] Signal inference workers to resume experience collection... 
(38250 times) +[2024-06-18 15:19:09,268][12883] InferenceWorker_p0-w0: stopping experience collection (38250 times) +[2024-06-18 15:19:09,268][12883] InferenceWorker_p0-w0: resuming experience collection (38250 times) +[2024-06-18 15:19:10,367][12883] Updated weights for policy 0, policy_version 159684 (0.0032) +[2024-06-18 15:19:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2616344576. Throughput: 0: 42252.3. Samples: 2616482960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 15:19:11,994][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 15:19:14,095][12883] Updated weights for policy 0, policy_version 159694 (0.0027) +[2024-06-18 15:19:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2616524800. Throughput: 0: 42265.7. Samples: 2616614600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 15:19:16,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 15:19:17,871][12883] Updated weights for policy 0, policy_version 159704 (0.0028) +[2024-06-18 15:19:21,644][12883] Updated weights for policy 0, policy_version 159714 (0.0038) +[2024-06-18 15:19:21,998][12645] Fps is (10 sec: 42580.4, 60 sec: 42595.4, 300 sec: 42597.8). Total num frames: 2616770560. Throughput: 0: 42481.4. Samples: 2616872220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 15:19:21,998][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 15:19:25,611][12883] Updated weights for policy 0, policy_version 159724 (0.0028) +[2024-06-18 15:19:26,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2616983552. Throughput: 0: 42551.2. Samples: 2617123620. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) +[2024-06-18 15:19:26,994][12645] Avg episode reward: [(0, '0.619')] +[2024-06-18 15:19:29,455][12883] Updated weights for policy 0, policy_version 159734 (0.0039) +[2024-06-18 15:19:31,996][12645] Fps is (10 sec: 39329.6, 60 sec: 42050.7, 300 sec: 42487.0). Total num frames: 2617163776. Throughput: 0: 42335.6. Samples: 2617249600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 15:19:31,997][12645] Avg episode reward: [(0, '0.451')] +[2024-06-18 15:19:33,320][12883] Updated weights for policy 0, policy_version 159744 (0.0032) +[2024-06-18 15:19:36,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2617393152. Throughput: 0: 42445.7. Samples: 2617506500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 15:19:36,994][12645] Avg episode reward: [(0, '0.638')] +[2024-06-18 15:19:37,454][12883] Updated weights for policy 0, policy_version 159754 (0.0038) +[2024-06-18 15:19:41,149][12883] Updated weights for policy 0, policy_version 159764 (0.0040) +[2024-06-18 15:19:41,994][12645] Fps is (10 sec: 45885.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2617622528. Throughput: 0: 42392.5. Samples: 2617757120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 15:19:41,994][12645] Avg episode reward: [(0, '0.592')] +[2024-06-18 15:19:44,986][12883] Updated weights for policy 0, policy_version 159774 (0.0035) +[2024-06-18 15:19:46,994][12645] Fps is (10 sec: 39322.4, 60 sec: 41780.8, 300 sec: 42376.6). Total num frames: 2617786368. Throughput: 0: 42406.8. Samples: 2617888200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 15:19:46,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 15:19:48,754][12883] Updated weights for policy 0, policy_version 159784 (0.0039) +[2024-06-18 15:19:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2618032128. Throughput: 0: 42489.8. 
Samples: 2618146020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 15:19:51,994][12645] Avg episode reward: [(0, '0.404')] +[2024-06-18 15:19:52,500][12883] Updated weights for policy 0, policy_version 159794 (0.0023) +[2024-06-18 15:19:56,435][12883] Updated weights for policy 0, policy_version 159804 (0.0034) +[2024-06-18 15:19:56,994][12645] Fps is (10 sec: 47512.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2618261504. Throughput: 0: 42736.0. Samples: 2618406080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 15:19:56,994][12645] Avg episode reward: [(0, '0.262')] +[2024-06-18 15:20:00,093][12883] Updated weights for policy 0, policy_version 159814 (0.0045) +[2024-06-18 15:20:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2618458112. Throughput: 0: 42733.4. Samples: 2618537600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 15:20:01,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 15:20:04,302][12883] Updated weights for policy 0, policy_version 159824 (0.0036) +[2024-06-18 15:20:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2618671104. Throughput: 0: 42568.4. Samples: 2618787620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 15:20:06,994][12645] Avg episode reward: [(0, '0.399')] +[2024-06-18 15:20:07,664][12883] Updated weights for policy 0, policy_version 159834 (0.0030) +[2024-06-18 15:20:10,488][12862] Signal inference workers to stop experience collection... (38300 times) +[2024-06-18 15:20:10,541][12862] Signal inference workers to resume experience collection... 
(38300 times) +[2024-06-18 15:20:10,546][12883] InferenceWorker_p0-w0: stopping experience collection (38300 times) +[2024-06-18 15:20:10,568][12883] InferenceWorker_p0-w0: resuming experience collection (38300 times) +[2024-06-18 15:20:11,832][12883] Updated weights for policy 0, policy_version 159844 (0.0036) +[2024-06-18 15:20:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 2618900480. Throughput: 0: 42812.7. Samples: 2619050200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 15:20:11,994][12645] Avg episode reward: [(0, '0.552')] +[2024-06-18 15:20:15,191][12883] Updated weights for policy 0, policy_version 159854 (0.0032) +[2024-06-18 15:20:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2619097088. Throughput: 0: 42793.6. Samples: 2619175220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 15:20:16,994][12645] Avg episode reward: [(0, '0.600')] +[2024-06-18 15:20:19,389][12883] Updated weights for policy 0, policy_version 159864 (0.0027) +[2024-06-18 15:20:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42601.4, 300 sec: 42653.9). Total num frames: 2619326464. Throughput: 0: 42698.7. Samples: 2619427940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 15:20:21,994][12645] Avg episode reward: [(0, '0.597')] +[2024-06-18 15:20:22,837][12883] Updated weights for policy 0, policy_version 159874 (0.0032) +[2024-06-18 15:20:26,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2619523072. Throughput: 0: 43042.3. Samples: 2619694020. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 15:20:26,994][12645] Avg episode reward: [(0, '0.570')] +[2024-06-18 15:20:27,007][12883] Updated weights for policy 0, policy_version 159884 (0.0030) +[2024-06-18 15:20:30,783][12883] Updated weights for policy 0, policy_version 159894 (0.0033) +[2024-06-18 15:20:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42873.1, 300 sec: 42487.3). Total num frames: 2619736064. Throughput: 0: 42828.3. Samples: 2619815480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) +[2024-06-18 15:20:31,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 15:20:34,592][12883] Updated weights for policy 0, policy_version 159904 (0.0038) +[2024-06-18 15:20:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2619965440. Throughput: 0: 42694.7. Samples: 2620067280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 15:20:36,994][12645] Avg episode reward: [(0, '0.252')] +[2024-06-18 15:20:37,041][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000159911_2619981824.pth... +[2024-06-18 15:20:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000159285_2609725440.pth +[2024-06-18 15:20:38,851][12883] Updated weights for policy 0, policy_version 159914 (0.0026) +[2024-06-18 15:20:41,996][12645] Fps is (10 sec: 44227.9, 60 sec: 42597.0, 300 sec: 42598.1). Total num frames: 2620178432. Throughput: 0: 42775.0. Samples: 2620331040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 15:20:41,996][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 15:20:42,191][12883] Updated weights for policy 0, policy_version 159924 (0.0031) +[2024-06-18 15:20:46,583][12883] Updated weights for policy 0, policy_version 159934 (0.0027) +[2024-06-18 15:20:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2620375040. Throughput: 0: 42592.0. Samples: 2620454240. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 15:20:46,994][12645] Avg episode reward: [(0, '0.648')] +[2024-06-18 15:20:50,014][12883] Updated weights for policy 0, policy_version 159944 (0.0037) +[2024-06-18 15:20:51,994][12645] Fps is (10 sec: 40968.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2620588032. Throughput: 0: 42647.7. Samples: 2620706760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 15:20:51,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 15:20:54,123][12883] Updated weights for policy 0, policy_version 159954 (0.0034) +[2024-06-18 15:20:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2620801024. Throughput: 0: 42705.8. Samples: 2620971960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 15:20:56,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 15:20:57,574][12883] Updated weights for policy 0, policy_version 159964 (0.0031) +[2024-06-18 15:21:01,706][12883] Updated weights for policy 0, policy_version 159974 (0.0040) +[2024-06-18 15:21:01,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 2621030400. Throughput: 0: 42624.2. Samples: 2621093400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 15:21:01,996][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 15:21:05,641][12883] Updated weights for policy 0, policy_version 159984 (0.0040) +[2024-06-18 15:21:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2621243392. Throughput: 0: 42858.3. Samples: 2621356560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 15:21:06,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 15:21:09,273][12883] Updated weights for policy 0, policy_version 159994 (0.0043) +[2024-06-18 15:21:11,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2621440000. Throughput: 0: 42559.9. 
Samples: 2621609220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 15:21:11,994][12645] Avg episode reward: [(0, '0.564')] +[2024-06-18 15:21:13,181][12883] Updated weights for policy 0, policy_version 160004 (0.0053) +[2024-06-18 15:21:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2621652992. Throughput: 0: 42567.5. Samples: 2621731020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 15:21:16,994][12645] Avg episode reward: [(0, '0.608')] +[2024-06-18 15:21:17,306][12883] Updated weights for policy 0, policy_version 160014 (0.0041) +[2024-06-18 15:21:20,863][12883] Updated weights for policy 0, policy_version 160024 (0.0030) +[2024-06-18 15:21:21,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2621898752. Throughput: 0: 42766.6. Samples: 2621991780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 15:21:21,994][12645] Avg episode reward: [(0, '0.608')] +[2024-06-18 15:21:24,668][12883] Updated weights for policy 0, policy_version 160034 (0.0039) +[2024-06-18 15:21:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2622078976. Throughput: 0: 42732.1. Samples: 2622253900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 15:21:26,994][12645] Avg episode reward: [(0, '0.862')] +[2024-06-18 15:21:28,379][12883] Updated weights for policy 0, policy_version 160044 (0.0040) +[2024-06-18 15:21:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2622308352. Throughput: 0: 42718.2. Samples: 2622376560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 15:21:31,994][12645] Avg episode reward: [(0, '0.686')] +[2024-06-18 15:21:32,004][12862] Signal inference workers to stop experience collection... (38350 times) +[2024-06-18 15:21:32,005][12862] Signal inference workers to resume experience collection... 
(38350 times) +[2024-06-18 15:21:32,027][12883] InferenceWorker_p0-w0: stopping experience collection (38350 times) +[2024-06-18 15:21:32,027][12883] InferenceWorker_p0-w0: resuming experience collection (38350 times) +[2024-06-18 15:21:32,151][12883] Updated weights for policy 0, policy_version 160054 (0.0027) +[2024-06-18 15:21:36,203][12883] Updated weights for policy 0, policy_version 160064 (0.0046) +[2024-06-18 15:21:36,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2622537728. Throughput: 0: 42982.2. Samples: 2622640960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) +[2024-06-18 15:21:36,994][12645] Avg episode reward: [(0, '0.652')] +[2024-06-18 15:21:39,938][12883] Updated weights for policy 0, policy_version 160074 (0.0028) +[2024-06-18 15:21:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42326.7, 300 sec: 42487.3). Total num frames: 2622717952. Throughput: 0: 42882.6. Samples: 2622901680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 15:21:41,994][12645] Avg episode reward: [(0, '0.681')] +[2024-06-18 15:21:43,694][12883] Updated weights for policy 0, policy_version 160084 (0.0027) +[2024-06-18 15:21:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2622963712. Throughput: 0: 42902.1. Samples: 2623023900. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 15:21:46,994][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 15:21:47,706][12883] Updated weights for policy 0, policy_version 160094 (0.0040) +[2024-06-18 15:21:51,218][12883] Updated weights for policy 0, policy_version 160104 (0.0036) +[2024-06-18 15:21:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2623176704. Throughput: 0: 42827.0. Samples: 2623283780. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 15:21:51,994][12645] Avg episode reward: [(0, '0.675')] +[2024-06-18 15:21:55,525][12883] Updated weights for policy 0, policy_version 160114 (0.0029) +[2024-06-18 15:21:56,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2623356928. Throughput: 0: 42903.4. Samples: 2623539880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 15:21:56,994][12645] Avg episode reward: [(0, '0.649')] +[2024-06-18 15:21:58,847][12883] Updated weights for policy 0, policy_version 160124 (0.0027) +[2024-06-18 15:22:01,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42871.4, 300 sec: 42709.2). Total num frames: 2623602688. Throughput: 0: 42858.3. Samples: 2623659740. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 15:22:01,996][12645] Avg episode reward: [(0, '0.555')] +[2024-06-18 15:22:03,363][12883] Updated weights for policy 0, policy_version 160134 (0.0037) +[2024-06-18 15:22:06,530][12883] Updated weights for policy 0, policy_version 160144 (0.0039) +[2024-06-18 15:22:06,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2623815680. Throughput: 0: 42953.5. Samples: 2623924680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 15:22:06,994][12645] Avg episode reward: [(0, '0.555')] +[2024-06-18 15:22:11,249][12883] Updated weights for policy 0, policy_version 160154 (0.0041) +[2024-06-18 15:22:11,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2624012288. Throughput: 0: 42856.9. Samples: 2624182460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 15:22:11,994][12645] Avg episode reward: [(0, '0.597')] +[2024-06-18 15:22:14,313][12883] Updated weights for policy 0, policy_version 160164 (0.0042) +[2024-06-18 15:22:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2624241664. Throughput: 0: 42755.5. 
Samples: 2624300560. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 15:22:16,995][12645] Avg episode reward: [(0, '0.535')] +[2024-06-18 15:22:18,892][12883] Updated weights for policy 0, policy_version 160174 (0.0036) +[2024-06-18 15:22:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 2624421888. Throughput: 0: 42601.3. Samples: 2624558020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 15:22:21,994][12645] Avg episode reward: [(0, '0.478')] +[2024-06-18 15:22:22,248][12883] Updated weights for policy 0, policy_version 160184 (0.0039) +[2024-06-18 15:22:26,759][12883] Updated weights for policy 0, policy_version 160194 (0.0043) +[2024-06-18 15:22:26,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2624618496. Throughput: 0: 42546.7. Samples: 2624816280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 15:22:26,996][12645] Avg episode reward: [(0, '0.653')] +[2024-06-18 15:22:29,794][12883] Updated weights for policy 0, policy_version 160204 (0.0036) +[2024-06-18 15:22:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2624880640. Throughput: 0: 42569.4. Samples: 2624939520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 15:22:31,994][12645] Avg episode reward: [(0, '0.601')] +[2024-06-18 15:22:34,375][12883] Updated weights for policy 0, policy_version 160214 (0.0032) +[2024-06-18 15:22:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2625077248. Throughput: 0: 42433.3. Samples: 2625193280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 15:22:37,000][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 15:22:37,149][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000160223_2625093632.pth... 
+[2024-06-18 15:22:37,207][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000159597_2614837248.pth +[2024-06-18 15:22:37,525][12883] Updated weights for policy 0, policy_version 160224 (0.0038) +[2024-06-18 15:22:41,924][12883] Updated weights for policy 0, policy_version 160234 (0.0044) +[2024-06-18 15:22:41,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2625273856. Throughput: 0: 42494.3. Samples: 2625452120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) +[2024-06-18 15:22:41,994][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 15:22:45,092][12883] Updated weights for policy 0, policy_version 160244 (0.0031) +[2024-06-18 15:22:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2625503232. Throughput: 0: 42604.0. Samples: 2625576820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 15:22:46,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 15:22:49,468][12883] Updated weights for policy 0, policy_version 160254 (0.0036) +[2024-06-18 15:22:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2625716224. Throughput: 0: 42519.9. Samples: 2625838080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 15:22:51,994][12645] Avg episode reward: [(0, '0.597')] +[2024-06-18 15:22:52,790][12883] Updated weights for policy 0, policy_version 160264 (0.0031) +[2024-06-18 15:22:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 2625912832. Throughput: 0: 42335.1. Samples: 2626087540. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 15:22:56,994][12645] Avg episode reward: [(0, '0.683')] +[2024-06-18 15:22:57,486][12883] Updated weights for policy 0, policy_version 160274 (0.0027) +[2024-06-18 15:23:00,437][12883] Updated weights for policy 0, policy_version 160284 (0.0032) +[2024-06-18 15:23:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 2626158592. Throughput: 0: 42556.5. Samples: 2626215600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 15:23:01,994][12645] Avg episode reward: [(0, '0.683')] +[2024-06-18 15:23:05,203][12883] Updated weights for policy 0, policy_version 160294 (0.0033) +[2024-06-18 15:23:06,207][12862] Signal inference workers to stop experience collection... (38400 times) +[2024-06-18 15:23:06,207][12862] Signal inference workers to resume experience collection... (38400 times) +[2024-06-18 15:23:06,245][12883] InferenceWorker_p0-w0: stopping experience collection (38400 times) +[2024-06-18 15:23:06,245][12883] InferenceWorker_p0-w0: resuming experience collection (38400 times) +[2024-06-18 15:23:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2626355200. Throughput: 0: 42531.2. Samples: 2626471920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 15:23:06,994][12645] Avg episode reward: [(0, '0.581')] +[2024-06-18 15:23:08,393][12883] Updated weights for policy 0, policy_version 160304 (0.0035) +[2024-06-18 15:23:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2626568192. Throughput: 0: 42316.0. Samples: 2626720500. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 15:23:11,994][12645] Avg episode reward: [(0, '0.605')] +[2024-06-18 15:23:12,794][12883] Updated weights for policy 0, policy_version 160314 (0.0032) +[2024-06-18 15:23:16,070][12883] Updated weights for policy 0, policy_version 160324 (0.0033) +[2024-06-18 15:23:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2626781184. Throughput: 0: 42425.8. Samples: 2626848680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 15:23:16,994][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 15:23:20,203][12883] Updated weights for policy 0, policy_version 160334 (0.0029) +[2024-06-18 15:23:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2626961408. Throughput: 0: 42615.6. Samples: 2627110980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 15:23:21,994][12645] Avg episode reward: [(0, '0.605')] +[2024-06-18 15:23:23,810][12883] Updated weights for policy 0, policy_version 160344 (0.0033) +[2024-06-18 15:23:26,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 2627223552. Throughput: 0: 42320.4. Samples: 2627356540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 15:23:26,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 15:23:28,438][12883] Updated weights for policy 0, policy_version 160354 (0.0047) +[2024-06-18 15:23:31,670][12883] Updated weights for policy 0, policy_version 160364 (0.0023) +[2024-06-18 15:23:31,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 2627420160. Throughput: 0: 42573.4. Samples: 2627492720. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 15:23:31,996][12645] Avg episode reward: [(0, '0.654')] +[2024-06-18 15:23:36,044][12883] Updated weights for policy 0, policy_version 160374 (0.0030) +[2024-06-18 15:23:36,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2627600384. Throughput: 0: 42299.1. Samples: 2627741540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 15:23:36,994][12645] Avg episode reward: [(0, '0.623')] +[2024-06-18 15:23:39,299][12883] Updated weights for policy 0, policy_version 160384 (0.0035) +[2024-06-18 15:23:41,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 2627846144. Throughput: 0: 42283.6. Samples: 2627990300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 15:23:41,994][12645] Avg episode reward: [(0, '0.410')] +[2024-06-18 15:23:43,550][12883] Updated weights for policy 0, policy_version 160394 (0.0040) +[2024-06-18 15:23:46,913][12883] Updated weights for policy 0, policy_version 160404 (0.0029) +[2024-06-18 15:23:46,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2628059136. Throughput: 0: 42493.4. Samples: 2628127800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) +[2024-06-18 15:23:46,994][12645] Avg episode reward: [(0, '0.732')] +[2024-06-18 15:23:51,440][12883] Updated weights for policy 0, policy_version 160414 (0.0033) +[2024-06-18 15:23:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2628239360. Throughput: 0: 42307.0. Samples: 2628375740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 15:23:51,994][12645] Avg episode reward: [(0, '0.664')] +[2024-06-18 15:23:54,592][12883] Updated weights for policy 0, policy_version 160424 (0.0033) +[2024-06-18 15:23:56,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 2628485120. Throughput: 0: 42438.4. 
Samples: 2628630320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 15:23:56,996][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 15:23:59,043][12883] Updated weights for policy 0, policy_version 160434 (0.0034) +[2024-06-18 15:24:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2628698112. Throughput: 0: 42702.1. Samples: 2628770280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 15:24:01,994][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 15:24:02,159][12883] Updated weights for policy 0, policy_version 160444 (0.0034) +[2024-06-18 15:24:06,582][12883] Updated weights for policy 0, policy_version 160454 (0.0038) +[2024-06-18 15:24:06,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2628894720. Throughput: 0: 42438.7. Samples: 2629020720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 15:24:06,994][12645] Avg episode reward: [(0, '0.697')] +[2024-06-18 15:24:09,747][12883] Updated weights for policy 0, policy_version 160464 (0.0032) +[2024-06-18 15:24:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2629140480. Throughput: 0: 42582.2. Samples: 2629272740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 15:24:11,994][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 15:24:14,126][12883] Updated weights for policy 0, policy_version 160474 (0.0031) +[2024-06-18 15:24:16,086][12862] Signal inference workers to stop experience collection... (38450 times) +[2024-06-18 15:24:16,128][12883] InferenceWorker_p0-w0: stopping experience collection (38450 times) +[2024-06-18 15:24:16,198][12862] Signal inference workers to resume experience collection... (38450 times) +[2024-06-18 15:24:16,199][12883] InferenceWorker_p0-w0: resuming experience collection (38450 times) +[2024-06-18 15:24:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42599.0). 
Total num frames: 2629337088. Throughput: 0: 42451.8. Samples: 2629402960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 15:24:16,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 15:24:17,444][12883] Updated weights for policy 0, policy_version 160484 (0.0038) +[2024-06-18 15:24:21,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2629517312. Throughput: 0: 42550.7. Samples: 2629656320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 15:24:21,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 15:24:22,150][12883] Updated weights for policy 0, policy_version 160494 (0.0046) +[2024-06-18 15:24:25,276][12883] Updated weights for policy 0, policy_version 160504 (0.0037) +[2024-06-18 15:24:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 2629763072. Throughput: 0: 42526.3. Samples: 2629903980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 15:24:26,994][12645] Avg episode reward: [(0, '0.443')] +[2024-06-18 15:24:29,910][12883] Updated weights for policy 0, policy_version 160514 (0.0041) +[2024-06-18 15:24:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42053.8, 300 sec: 42542.9). Total num frames: 2629943296. Throughput: 0: 42557.3. Samples: 2630042880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 15:24:31,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 15:24:33,015][12883] Updated weights for policy 0, policy_version 160524 (0.0037) +[2024-06-18 15:24:36,994][12645] Fps is (10 sec: 37682.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2630139904. Throughput: 0: 42471.9. Samples: 2630286980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 15:24:36,995][12645] Avg episode reward: [(0, '0.547')] +[2024-06-18 15:24:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000160531_2630139904.pth... 
+[2024-06-18 15:24:37,090][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000159911_2619981824.pth +[2024-06-18 15:24:37,631][12883] Updated weights for policy 0, policy_version 160534 (0.0032) +[2024-06-18 15:24:40,840][12883] Updated weights for policy 0, policy_version 160544 (0.0036) +[2024-06-18 15:24:41,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2630418432. Throughput: 0: 42465.2. Samples: 2630541160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 15:24:41,994][12645] Avg episode reward: [(0, '0.572')] +[2024-06-18 15:24:45,539][12883] Updated weights for policy 0, policy_version 160554 (0.0037) +[2024-06-18 15:24:46,994][12645] Fps is (10 sec: 44237.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2630582272. Throughput: 0: 42453.9. Samples: 2630680700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 15:24:46,994][12645] Avg episode reward: [(0, '0.269')] +[2024-06-18 15:24:48,498][12883] Updated weights for policy 0, policy_version 160564 (0.0040) +[2024-06-18 15:24:51,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2630795264. Throughput: 0: 42355.0. Samples: 2630926700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) +[2024-06-18 15:24:51,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 15:24:53,339][12883] Updated weights for policy 0, policy_version 160574 (0.0031) +[2024-06-18 15:24:55,999][12883] Updated weights for policy 0, policy_version 160584 (0.0037) +[2024-06-18 15:24:56,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 2631041024. Throughput: 0: 42437.4. Samples: 2631182420. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 15:24:56,994][12645] Avg episode reward: [(0, '0.779')] +[2024-06-18 15:25:00,970][12883] Updated weights for policy 0, policy_version 160594 (0.0031) +[2024-06-18 15:25:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2631237632. Throughput: 0: 42517.4. Samples: 2631316240. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 15:25:01,994][12645] Avg episode reward: [(0, '0.771')] +[2024-06-18 15:25:04,037][12883] Updated weights for policy 0, policy_version 160604 (0.0035) +[2024-06-18 15:25:06,995][12645] Fps is (10 sec: 40953.8, 60 sec: 42597.3, 300 sec: 42542.7). Total num frames: 2631450624. Throughput: 0: 42486.2. Samples: 2631568260. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 15:25:06,996][12645] Avg episode reward: [(0, '0.775')] +[2024-06-18 15:25:08,493][12883] Updated weights for policy 0, policy_version 160614 (0.0030) +[2024-06-18 15:25:11,588][12883] Updated weights for policy 0, policy_version 160624 (0.0027) +[2024-06-18 15:25:11,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2631680000. Throughput: 0: 42577.2. Samples: 2631819960. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 15:25:11,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 15:25:16,170][12883] Updated weights for policy 0, policy_version 160634 (0.0046) +[2024-06-18 15:25:17,000][12645] Fps is (10 sec: 40940.5, 60 sec: 42047.9, 300 sec: 42486.4). Total num frames: 2631860224. Throughput: 0: 42479.9. Samples: 2631954740. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 15:25:17,000][12645] Avg episode reward: [(0, '0.556')] +[2024-06-18 15:25:19,369][12883] Updated weights for policy 0, policy_version 160644 (0.0046) +[2024-06-18 15:25:21,996][12645] Fps is (10 sec: 40951.1, 60 sec: 42869.9, 300 sec: 42598.1). Total num frames: 2632089600. Throughput: 0: 42474.0. Samples: 2632198400. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 15:25:21,996][12645] Avg episode reward: [(0, '0.804')] +[2024-06-18 15:25:23,693][12883] Updated weights for policy 0, policy_version 160654 (0.0037) +[2024-06-18 15:25:26,869][12883] Updated weights for policy 0, policy_version 160664 (0.0044) +[2024-06-18 15:25:26,994][12645] Fps is (10 sec: 45903.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2632318976. Throughput: 0: 42700.4. Samples: 2632462680. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 15:25:26,994][12645] Avg episode reward: [(0, '0.678')] +[2024-06-18 15:25:31,263][12883] Updated weights for policy 0, policy_version 160674 (0.0030) +[2024-06-18 15:25:31,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2632515584. Throughput: 0: 42435.0. Samples: 2632590280. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 15:25:31,996][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 15:25:33,297][12862] Signal inference workers to stop experience collection... (38500 times) +[2024-06-18 15:25:33,333][12883] InferenceWorker_p0-w0: stopping experience collection (38500 times) +[2024-06-18 15:25:33,356][12862] Signal inference workers to resume experience collection... (38500 times) +[2024-06-18 15:25:33,357][12883] InferenceWorker_p0-w0: resuming experience collection (38500 times) +[2024-06-18 15:25:34,637][12883] Updated weights for policy 0, policy_version 160684 (0.0037) +[2024-06-18 15:25:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43417.7, 300 sec: 42598.7). Total num frames: 2632744960. Throughput: 0: 42492.0. Samples: 2632838840. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 15:25:36,994][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 15:25:39,167][12883] Updated weights for policy 0, policy_version 160694 (0.0030) +[2024-06-18 15:25:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2632941568. 
Throughput: 0: 42610.6. Samples: 2633099900. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 15:25:41,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 15:25:42,391][12883] Updated weights for policy 0, policy_version 160704 (0.0029) +[2024-06-18 15:25:46,870][12883] Updated weights for policy 0, policy_version 160714 (0.0039) +[2024-06-18 15:25:46,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2633138176. Throughput: 0: 42421.8. Samples: 2633225220. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 15:25:46,994][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 15:25:50,131][12883] Updated weights for policy 0, policy_version 160724 (0.0028) +[2024-06-18 15:25:52,000][12645] Fps is (10 sec: 45846.8, 60 sec: 43413.1, 300 sec: 42708.6). Total num frames: 2633400320. Throughput: 0: 42537.8. Samples: 2633482660. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 15:25:52,000][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 15:25:54,448][12883] Updated weights for policy 0, policy_version 160734 (0.0033) +[2024-06-18 15:25:56,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 2633596928. Throughput: 0: 42777.8. Samples: 2633744960. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) +[2024-06-18 15:25:56,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 15:25:57,756][12883] Updated weights for policy 0, policy_version 160744 (0.0044) +[2024-06-18 15:26:01,994][12645] Fps is (10 sec: 37707.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2633777152. Throughput: 0: 42568.7. Samples: 2633870060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 15:26:01,994][12645] Avg episode reward: [(0, '0.646')] +[2024-06-18 15:26:02,115][12883] Updated weights for policy 0, policy_version 160754 (0.0036) +[2024-06-18 15:26:05,415][12883] Updated weights for policy 0, policy_version 160764 (0.0041) +[2024-06-18 15:26:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42872.6, 300 sec: 42653.9). Total num frames: 2634022912. Throughput: 0: 42763.0. Samples: 2634122640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 15:26:06,994][12645] Avg episode reward: [(0, '0.685')] +[2024-06-18 15:26:09,764][12883] Updated weights for policy 0, policy_version 160774 (0.0032) +[2024-06-18 15:26:11,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2634219520. Throughput: 0: 42770.7. Samples: 2634387360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 15:26:11,994][12645] Avg episode reward: [(0, '0.695')] +[2024-06-18 15:26:13,089][12883] Updated weights for policy 0, policy_version 160784 (0.0041) +[2024-06-18 15:26:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42876.0, 300 sec: 42487.3). Total num frames: 2634432512. Throughput: 0: 42654.3. Samples: 2634509720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 15:26:16,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 15:26:17,431][12883] Updated weights for policy 0, policy_version 160794 (0.0024) +[2024-06-18 15:26:20,799][12883] Updated weights for policy 0, policy_version 160804 (0.0031) +[2024-06-18 15:26:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42873.0, 300 sec: 42653.9). Total num frames: 2634661888. Throughput: 0: 42821.4. Samples: 2634765800. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 15:26:21,994][12645] Avg episode reward: [(0, '0.453')] +[2024-06-18 15:26:25,250][12883] Updated weights for policy 0, policy_version 160814 (0.0045) +[2024-06-18 15:26:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2634858496. Throughput: 0: 42803.7. Samples: 2635026060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 15:26:26,994][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 15:26:28,542][12883] Updated weights for policy 0, policy_version 160824 (0.0033) +[2024-06-18 15:26:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2635071488. Throughput: 0: 42680.3. Samples: 2635145840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 15:26:31,994][12645] Avg episode reward: [(0, '0.697')] +[2024-06-18 15:26:32,829][12883] Updated weights for policy 0, policy_version 160834 (0.0049) +[2024-06-18 15:26:36,214][12883] Updated weights for policy 0, policy_version 160844 (0.0040) +[2024-06-18 15:26:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2635284480. Throughput: 0: 42695.6. Samples: 2635403700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 15:26:36,994][12645] Avg episode reward: [(0, '0.666')] +[2024-06-18 15:26:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000160845_2635284480.pth... +[2024-06-18 15:26:37,098][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000160223_2625093632.pth +[2024-06-18 15:26:40,470][12883] Updated weights for policy 0, policy_version 160854 (0.0049) +[2024-06-18 15:26:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2635497472. Throughput: 0: 42471.5. Samples: 2635656180. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 15:26:41,994][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 15:26:44,003][12883] Updated weights for policy 0, policy_version 160864 (0.0029) +[2024-06-18 15:26:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2635710464. Throughput: 0: 42512.0. Samples: 2635783100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 15:26:46,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 15:26:47,994][12883] Updated weights for policy 0, policy_version 160874 (0.0025) +[2024-06-18 15:26:51,595][12883] Updated weights for policy 0, policy_version 160884 (0.0043) +[2024-06-18 15:26:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42329.6, 300 sec: 42653.9). Total num frames: 2635939840. Throughput: 0: 42720.8. Samples: 2636045080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 15:26:51,994][12645] Avg episode reward: [(0, '0.716')] +[2024-06-18 15:26:55,898][12883] Updated weights for policy 0, policy_version 160894 (0.0031) +[2024-06-18 15:26:56,996][12645] Fps is (10 sec: 42588.0, 60 sec: 42323.6, 300 sec: 42487.3). Total num frames: 2636136448. Throughput: 0: 42460.9. Samples: 2636298200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) +[2024-06-18 15:26:56,996][12645] Avg episode reward: [(0, '0.653')] +[2024-06-18 15:26:59,338][12883] Updated weights for policy 0, policy_version 160904 (0.0040) +[2024-06-18 15:27:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 2636365824. Throughput: 0: 42610.5. Samples: 2636427200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) +[2024-06-18 15:27:01,994][12645] Avg episode reward: [(0, '0.548')] +[2024-06-18 15:27:03,576][12883] Updated weights for policy 0, policy_version 160914 (0.0044) +[2024-06-18 15:27:06,994][12645] Fps is (10 sec: 42609.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2636562432. Throughput: 0: 42634.7. 
Samples: 2636684360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) +[2024-06-18 15:27:06,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 15:27:07,086][12883] Updated weights for policy 0, policy_version 160924 (0.0038) +[2024-06-18 15:27:07,089][12862] Signal inference workers to stop experience collection... (38550 times) +[2024-06-18 15:27:07,089][12862] Signal inference workers to resume experience collection... (38550 times) +[2024-06-18 15:27:07,124][12883] InferenceWorker_p0-w0: stopping experience collection (38550 times) +[2024-06-18 15:27:07,124][12883] InferenceWorker_p0-w0: resuming experience collection (38550 times) +[2024-06-18 15:27:11,023][12883] Updated weights for policy 0, policy_version 160934 (0.0042) +[2024-06-18 15:27:11,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2636791808. Throughput: 0: 42516.4. Samples: 2636939300. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) +[2024-06-18 15:27:11,994][12645] Avg episode reward: [(0, '0.355')] +[2024-06-18 15:27:14,660][12883] Updated weights for policy 0, policy_version 160944 (0.0051) +[2024-06-18 15:27:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2637004800. Throughput: 0: 42701.9. Samples: 2637067420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) +[2024-06-18 15:27:16,994][12645] Avg episode reward: [(0, '0.614')] +[2024-06-18 15:27:18,560][12883] Updated weights for policy 0, policy_version 160954 (0.0046) +[2024-06-18 15:27:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2637201408. Throughput: 0: 42589.5. Samples: 2637320220. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) +[2024-06-18 15:27:21,994][12645] Avg episode reward: [(0, '0.711')] +[2024-06-18 15:27:22,389][12883] Updated weights for policy 0, policy_version 160964 (0.0048) +[2024-06-18 15:27:26,559][12883] Updated weights for policy 0, policy_version 160974 (0.0034) +[2024-06-18 15:27:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2637430784. Throughput: 0: 42863.2. Samples: 2637585020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) +[2024-06-18 15:27:26,994][12645] Avg episode reward: [(0, '0.643')] +[2024-06-18 15:27:30,105][12883] Updated weights for policy 0, policy_version 160984 (0.0034) +[2024-06-18 15:27:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2637643776. Throughput: 0: 42846.7. Samples: 2637711200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) +[2024-06-18 15:27:31,994][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 15:27:34,062][12883] Updated weights for policy 0, policy_version 160994 (0.0037) +[2024-06-18 15:27:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2637856768. Throughput: 0: 42606.7. Samples: 2637962380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) +[2024-06-18 15:27:36,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 15:27:37,826][12883] Updated weights for policy 0, policy_version 161004 (0.0024) +[2024-06-18 15:27:41,636][12883] Updated weights for policy 0, policy_version 161014 (0.0035) +[2024-06-18 15:27:41,995][12645] Fps is (10 sec: 42592.9, 60 sec: 42870.6, 300 sec: 42598.2). Total num frames: 2638069760. Throughput: 0: 42870.5. Samples: 2638227320. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) +[2024-06-18 15:27:41,995][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 15:27:45,537][12883] Updated weights for policy 0, policy_version 161024 (0.0028) +[2024-06-18 15:27:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2638282752. Throughput: 0: 42752.1. Samples: 2638351040. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) +[2024-06-18 15:27:46,994][12645] Avg episode reward: [(0, '0.680')] +[2024-06-18 15:27:49,058][12883] Updated weights for policy 0, policy_version 161034 (0.0039) +[2024-06-18 15:27:51,994][12645] Fps is (10 sec: 42603.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2638495744. Throughput: 0: 42794.2. Samples: 2638610100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) +[2024-06-18 15:27:51,994][12645] Avg episode reward: [(0, '0.680')] +[2024-06-18 15:27:53,195][12883] Updated weights for policy 0, policy_version 161044 (0.0028) +[2024-06-18 15:27:56,629][12883] Updated weights for policy 0, policy_version 161054 (0.0034) +[2024-06-18 15:27:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43146.3, 300 sec: 42598.4). Total num frames: 2638725120. Throughput: 0: 42852.0. Samples: 2638867640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) +[2024-06-18 15:27:56,994][12645] Avg episode reward: [(0, '0.526')] +[2024-06-18 15:28:00,878][12883] Updated weights for policy 0, policy_version 161064 (0.0030) +[2024-06-18 15:28:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2638921728. Throughput: 0: 42872.0. Samples: 2638996660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) +[2024-06-18 15:28:01,994][12645] Avg episode reward: [(0, '0.463')] +[2024-06-18 15:28:04,041][12883] Updated weights for policy 0, policy_version 161074 (0.0028) +[2024-06-18 15:28:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2639151104. Throughput: 0: 43110.9. 
Samples: 2639260220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 15:28:06,994][12645] Avg episode reward: [(0, '0.630')] +[2024-06-18 15:28:08,418][12883] Updated weights for policy 0, policy_version 161084 (0.0035) +[2024-06-18 15:28:11,899][12883] Updated weights for policy 0, policy_version 161094 (0.0025) +[2024-06-18 15:28:11,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 2639364096. Throughput: 0: 42960.0. Samples: 2639518320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 15:28:11,997][12645] Avg episode reward: [(0, '0.381')] +[2024-06-18 15:28:16,167][12883] Updated weights for policy 0, policy_version 161104 (0.0031) +[2024-06-18 15:28:16,994][12645] Fps is (10 sec: 40961.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2639560704. Throughput: 0: 43006.6. Samples: 2639646500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 15:28:16,994][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 15:28:19,431][12883] Updated weights for policy 0, policy_version 161114 (0.0033) +[2024-06-18 15:28:21,994][12645] Fps is (10 sec: 44246.7, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 2639806464. Throughput: 0: 43266.3. Samples: 2639909360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 15:28:21,994][12645] Avg episode reward: [(0, '0.486')] +[2024-06-18 15:28:23,731][12883] Updated weights for policy 0, policy_version 161124 (0.0031) +[2024-06-18 15:28:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 2640003072. Throughput: 0: 43117.2. Samples: 2640167540. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 15:28:26,994][12645] Avg episode reward: [(0, '0.516')] +[2024-06-18 15:28:27,119][12883] Updated weights for policy 0, policy_version 161134 (0.0028) +[2024-06-18 15:28:31,141][12883] Updated weights for policy 0, policy_version 161144 (0.0043) +[2024-06-18 15:28:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2640216064. Throughput: 0: 43260.9. Samples: 2640297780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 15:28:31,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 15:28:34,490][12883] Updated weights for policy 0, policy_version 161154 (0.0044) +[2024-06-18 15:28:35,592][12862] Signal inference workers to stop experience collection... (38600 times) +[2024-06-18 15:28:35,593][12862] Signal inference workers to resume experience collection... (38600 times) +[2024-06-18 15:28:35,637][12883] InferenceWorker_p0-w0: stopping experience collection (38600 times) +[2024-06-18 15:28:35,637][12883] InferenceWorker_p0-w0: resuming experience collection (38600 times) +[2024-06-18 15:28:37,000][12645] Fps is (10 sec: 45846.3, 60 sec: 43413.1, 300 sec: 42764.1). Total num frames: 2640461824. Throughput: 0: 43245.9. Samples: 2640556440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 15:28:37,001][12645] Avg episode reward: [(0, '0.405')] +[2024-06-18 15:28:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000161161_2640461824.pth... +[2024-06-18 15:28:37,082][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000160531_2630139904.pth +[2024-06-18 15:28:38,696][12883] Updated weights for policy 0, policy_version 161164 (0.0045) +[2024-06-18 15:28:41,980][12883] Updated weights for policy 0, policy_version 161174 (0.0030) +[2024-06-18 15:28:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43418.4, 300 sec: 42765.0). Total num frames: 2640674816. Throughput: 0: 43249.7. 
Samples: 2640813880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 15:28:41,994][12645] Avg episode reward: [(0, '0.485')] +[2024-06-18 15:28:46,186][12883] Updated weights for policy 0, policy_version 161184 (0.0041) +[2024-06-18 15:28:47,000][12645] Fps is (10 sec: 39321.9, 60 sec: 42867.1, 300 sec: 42764.1). Total num frames: 2640855040. Throughput: 0: 43152.7. Samples: 2640938800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 15:28:47,000][12645] Avg episode reward: [(0, '0.568')] +[2024-06-18 15:28:49,714][12883] Updated weights for policy 0, policy_version 161194 (0.0033) +[2024-06-18 15:28:51,994][12645] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 2641084416. Throughput: 0: 43068.7. Samples: 2641198300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 15:28:51,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 15:28:54,010][12883] Updated weights for policy 0, policy_version 161204 (0.0029) +[2024-06-18 15:28:56,997][12645] Fps is (10 sec: 45886.5, 60 sec: 43141.8, 300 sec: 42764.5). Total num frames: 2641313792. Throughput: 0: 43113.7. Samples: 2641458500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 15:28:56,998][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 15:28:57,317][12883] Updated weights for policy 0, policy_version 161214 (0.0028) +[2024-06-18 15:29:01,394][12883] Updated weights for policy 0, policy_version 161224 (0.0037) +[2024-06-18 15:29:01,994][12645] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2641510400. Throughput: 0: 43113.2. Samples: 2641586600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 15:29:01,995][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 15:29:05,437][12883] Updated weights for policy 0, policy_version 161234 (0.0042) +[2024-06-18 15:29:06,994][12645] Fps is (10 sec: 42614.7, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 2641739776. 
Throughput: 0: 43083.2. Samples: 2641848100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) +[2024-06-18 15:29:06,994][12645] Avg episode reward: [(0, '0.618')] +[2024-06-18 15:29:09,199][12883] Updated weights for policy 0, policy_version 161244 (0.0033) +[2024-06-18 15:29:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2641936384. Throughput: 0: 42880.9. Samples: 2642097180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 15:29:11,994][12645] Avg episode reward: [(0, '0.431')] +[2024-06-18 15:29:12,981][12883] Updated weights for policy 0, policy_version 161254 (0.0027) +[2024-06-18 15:29:16,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2642132992. Throughput: 0: 42868.9. Samples: 2642226880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 15:29:16,994][12645] Avg episode reward: [(0, '0.430')] +[2024-06-18 15:29:17,211][12883] Updated weights for policy 0, policy_version 161264 (0.0048) +[2024-06-18 15:29:20,742][12883] Updated weights for policy 0, policy_version 161274 (0.0037) +[2024-06-18 15:29:21,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 2642362368. Throughput: 0: 42719.6. Samples: 2642478560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 15:29:21,994][12645] Avg episode reward: [(0, '0.585')] +[2024-06-18 15:29:24,706][12883] Updated weights for policy 0, policy_version 161284 (0.0038) +[2024-06-18 15:29:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2642575360. Throughput: 0: 42710.6. Samples: 2642735860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 15:29:26,994][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 15:29:28,317][12883] Updated weights for policy 0, policy_version 161294 (0.0026) +[2024-06-18 15:29:31,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42820.6). 
Total num frames: 2642771968. Throughput: 0: 42809.9. Samples: 2642864980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 15:29:31,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 15:29:32,455][12883] Updated weights for policy 0, policy_version 161304 (0.0028) +[2024-06-18 15:29:35,951][12883] Updated weights for policy 0, policy_version 161314 (0.0036) +[2024-06-18 15:29:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42329.6, 300 sec: 42653.9). Total num frames: 2643001344. Throughput: 0: 42673.5. Samples: 2643118620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 15:29:36,994][12645] Avg episode reward: [(0, '0.394')] +[2024-06-18 15:29:39,943][12883] Updated weights for policy 0, policy_version 161324 (0.0034) +[2024-06-18 15:29:41,993][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 2643197952. Throughput: 0: 42774.4. Samples: 2643383180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 15:29:41,994][12645] Avg episode reward: [(0, '0.530')] +[2024-06-18 15:29:43,361][12883] Updated weights for policy 0, policy_version 161334 (0.0038) +[2024-06-18 15:29:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42875.9, 300 sec: 42820.6). Total num frames: 2643427328. Throughput: 0: 42608.0. Samples: 2643503960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 15:29:46,994][12645] Avg episode reward: [(0, '0.581')] +[2024-06-18 15:29:47,908][12883] Updated weights for policy 0, policy_version 161344 (0.0031) +[2024-06-18 15:29:50,847][12883] Updated weights for policy 0, policy_version 161354 (0.0036) +[2024-06-18 15:29:51,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2643656704. Throughput: 0: 42383.6. Samples: 2643755360. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 15:29:51,994][12645] Avg episode reward: [(0, '0.385')] +[2024-06-18 15:29:55,535][12883] Updated weights for policy 0, policy_version 161364 (0.0035) +[2024-06-18 15:29:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42327.9, 300 sec: 42765.0). Total num frames: 2643853312. Throughput: 0: 42803.0. Samples: 2644023320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 15:29:56,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 15:29:58,705][12883] Updated weights for policy 0, policy_version 161374 (0.0027) +[2024-06-18 15:30:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42765.2). Total num frames: 2644066304. Throughput: 0: 42611.3. Samples: 2644144380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 15:30:01,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 15:30:03,066][12883] Updated weights for policy 0, policy_version 161384 (0.0030) +[2024-06-18 15:30:06,494][12883] Updated weights for policy 0, policy_version 161394 (0.0037) +[2024-06-18 15:30:06,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2644295680. Throughput: 0: 42774.4. Samples: 2644403400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 15:30:06,994][12645] Avg episode reward: [(0, '0.494')] +[2024-06-18 15:30:10,719][12883] Updated weights for policy 0, policy_version 161404 (0.0033) +[2024-06-18 15:30:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42821.5). Total num frames: 2644492288. Throughput: 0: 42765.0. Samples: 2644660280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) +[2024-06-18 15:30:11,994][12645] Avg episode reward: [(0, '0.652')] +[2024-06-18 15:30:14,147][12883] Updated weights for policy 0, policy_version 161414 (0.0043) +[2024-06-18 15:30:14,152][12862] Signal inference workers to stop experience collection... 
(38650 times) +[2024-06-18 15:30:14,153][12862] Signal inference workers to resume experience collection... (38650 times) +[2024-06-18 15:30:14,196][12883] InferenceWorker_p0-w0: stopping experience collection (38650 times) +[2024-06-18 15:30:14,196][12883] InferenceWorker_p0-w0: resuming experience collection (38650 times) +[2024-06-18 15:30:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42765.3). Total num frames: 2644705280. Throughput: 0: 42686.3. Samples: 2644785860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 15:30:16,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 15:30:18,191][12883] Updated weights for policy 0, policy_version 161424 (0.0030) +[2024-06-18 15:30:21,625][12883] Updated weights for policy 0, policy_version 161434 (0.0037) +[2024-06-18 15:30:21,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 2644951040. Throughput: 0: 42881.6. Samples: 2645048280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 15:30:21,994][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 15:30:25,918][12883] Updated weights for policy 0, policy_version 161444 (0.0031) +[2024-06-18 15:30:26,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2645131264. Throughput: 0: 42832.2. Samples: 2645310640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 15:30:26,994][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 15:30:29,200][12883] Updated weights for policy 0, policy_version 161454 (0.0039) +[2024-06-18 15:30:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2645344256. Throughput: 0: 42771.7. Samples: 2645428680. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 15:30:31,994][12645] Avg episode reward: [(0, '0.538')] +[2024-06-18 15:30:33,630][12883] Updated weights for policy 0, policy_version 161464 (0.0027) +[2024-06-18 15:30:36,859][12883] Updated weights for policy 0, policy_version 161474 (0.0024) +[2024-06-18 15:30:36,994][12645] Fps is (10 sec: 47514.5, 60 sec: 43417.8, 300 sec: 42931.6). Total num frames: 2645606400. Throughput: 0: 42979.9. Samples: 2645689460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 15:30:36,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 15:30:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000161475_2645606400.pth... +[2024-06-18 15:30:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000160845_2635284480.pth +[2024-06-18 15:30:41,383][12883] Updated weights for policy 0, policy_version 161484 (0.0033) +[2024-06-18 15:30:41,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 2645770240. Throughput: 0: 42769.4. Samples: 2645947940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 15:30:41,994][12645] Avg episode reward: [(0, '0.687')] +[2024-06-18 15:30:44,520][12883] Updated weights for policy 0, policy_version 161494 (0.0028) +[2024-06-18 15:30:46,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42654.8). Total num frames: 2645983232. Throughput: 0: 42760.8. Samples: 2646068620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 15:30:46,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 15:30:49,088][12883] Updated weights for policy 0, policy_version 161504 (0.0042) +[2024-06-18 15:30:51,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2646228992. Throughput: 0: 42925.0. Samples: 2646335020. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 15:30:51,994][12645] Avg episode reward: [(0, '0.638')] +[2024-06-18 15:30:52,127][12883] Updated weights for policy 0, policy_version 161514 (0.0035) +[2024-06-18 15:30:56,607][12883] Updated weights for policy 0, policy_version 161524 (0.0023) +[2024-06-18 15:30:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2646425600. Throughput: 0: 42992.8. Samples: 2646594960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 15:30:56,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 15:30:59,720][12883] Updated weights for policy 0, policy_version 161534 (0.0022) +[2024-06-18 15:31:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2646638592. Throughput: 0: 42923.9. Samples: 2646717440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 15:31:01,994][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 15:31:04,271][12883] Updated weights for policy 0, policy_version 161544 (0.0028) +[2024-06-18 15:31:06,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2646884352. Throughput: 0: 42836.0. Samples: 2646975900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 15:31:06,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 15:31:07,380][12883] Updated weights for policy 0, policy_version 161554 (0.0034) +[2024-06-18 15:31:11,947][12883] Updated weights for policy 0, policy_version 161564 (0.0028) +[2024-06-18 15:31:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2647064576. Throughput: 0: 42847.6. Samples: 2647238780. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 15:31:11,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 15:31:15,341][12883] Updated weights for policy 0, policy_version 161574 (0.0046) +[2024-06-18 15:31:16,996][12645] Fps is (10 sec: 39312.7, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2647277568. Throughput: 0: 42740.9. Samples: 2647352120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) +[2024-06-18 15:31:16,996][12645] Avg episode reward: [(0, '0.212')] +[2024-06-18 15:31:19,716][12883] Updated weights for policy 0, policy_version 161584 (0.0037) +[2024-06-18 15:31:21,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2647523328. Throughput: 0: 42805.7. Samples: 2647615720. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 15:31:21,994][12645] Avg episode reward: [(0, '0.514')] +[2024-06-18 15:31:22,925][12862] Signal inference workers to stop experience collection... (38700 times) +[2024-06-18 15:31:22,951][12883] InferenceWorker_p0-w0: stopping experience collection (38700 times) +[2024-06-18 15:31:23,035][12862] Signal inference workers to resume experience collection... (38700 times) +[2024-06-18 15:31:23,035][12883] InferenceWorker_p0-w0: resuming experience collection (38700 times) +[2024-06-18 15:31:23,037][12883] Updated weights for policy 0, policy_version 161594 (0.0027) +[2024-06-18 15:31:26,994][12645] Fps is (10 sec: 39330.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2647670784. Throughput: 0: 42710.3. Samples: 2647869900. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 15:31:26,994][12645] Avg episode reward: [(0, '0.626')] +[2024-06-18 15:31:27,431][12883] Updated weights for policy 0, policy_version 161604 (0.0033) +[2024-06-18 15:31:30,550][12883] Updated weights for policy 0, policy_version 161614 (0.0050) +[2024-06-18 15:31:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2647916544. 
Throughput: 0: 42627.6. Samples: 2647986860. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 15:31:31,994][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 15:31:35,138][12883] Updated weights for policy 0, policy_version 161624 (0.0037) +[2024-06-18 15:31:36,994][12645] Fps is (10 sec: 49152.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2648162304. Throughput: 0: 42629.3. Samples: 2648253340. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 15:31:36,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 15:31:38,153][12883] Updated weights for policy 0, policy_version 161634 (0.0026) +[2024-06-18 15:31:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2648326144. Throughput: 0: 42575.2. Samples: 2648510840. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 15:31:41,994][12645] Avg episode reward: [(0, '0.656')] +[2024-06-18 15:31:42,748][12883] Updated weights for policy 0, policy_version 161644 (0.0035) +[2024-06-18 15:31:46,081][12883] Updated weights for policy 0, policy_version 161654 (0.0040) +[2024-06-18 15:31:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2648555520. Throughput: 0: 42484.9. Samples: 2648629260. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 15:31:46,994][12645] Avg episode reward: [(0, '0.910')] +[2024-06-18 15:31:50,729][12883] Updated weights for policy 0, policy_version 161664 (0.0036) +[2024-06-18 15:31:51,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 2648801280. Throughput: 0: 42593.8. Samples: 2648892620. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 15:31:51,994][12645] Avg episode reward: [(0, '0.645')] +[2024-06-18 15:31:53,896][12883] Updated weights for policy 0, policy_version 161674 (0.0027) +[2024-06-18 15:31:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2648965120. 
Throughput: 0: 42380.5. Samples: 2649145900. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 15:31:56,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 15:31:58,463][12883] Updated weights for policy 0, policy_version 161684 (0.0024) +[2024-06-18 15:32:01,559][12883] Updated weights for policy 0, policy_version 161694 (0.0022) +[2024-06-18 15:32:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2649194496. Throughput: 0: 42411.9. Samples: 2649260560. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 15:32:01,994][12645] Avg episode reward: [(0, '0.324')] +[2024-06-18 15:32:06,034][12883] Updated weights for policy 0, policy_version 161704 (0.0032) +[2024-06-18 15:32:06,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2649423872. Throughput: 0: 42480.1. Samples: 2649527320. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 15:32:06,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 15:32:09,129][12883] Updated weights for policy 0, policy_version 161714 (0.0044) +[2024-06-18 15:32:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2649604096. Throughput: 0: 42594.3. Samples: 2649786640. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 15:32:11,994][12645] Avg episode reward: [(0, '0.429')] +[2024-06-18 15:32:13,745][12883] Updated weights for policy 0, policy_version 161724 (0.0035) +[2024-06-18 15:32:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42600.0, 300 sec: 42820.6). Total num frames: 2649833472. Throughput: 0: 42549.4. Samples: 2649901580. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 15:32:16,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 15:32:17,073][12883] Updated weights for policy 0, policy_version 161734 (0.0032) +[2024-06-18 15:32:21,415][12883] Updated weights for policy 0, policy_version 161744 (0.0040) +[2024-06-18 15:32:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2650046464. Throughput: 0: 42471.6. Samples: 2650164560. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) +[2024-06-18 15:32:21,994][12645] Avg episode reward: [(0, '0.498')] +[2024-06-18 15:32:24,583][12883] Updated weights for policy 0, policy_version 161754 (0.0031) +[2024-06-18 15:32:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2650226688. Throughput: 0: 42461.7. Samples: 2650421620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 15:32:26,994][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 15:32:29,223][12883] Updated weights for policy 0, policy_version 161764 (0.0031) +[2024-06-18 15:32:29,249][12862] Signal inference workers to stop experience collection... (38750 times) +[2024-06-18 15:32:29,250][12862] Signal inference workers to resume experience collection... (38750 times) +[2024-06-18 15:32:29,286][12883] InferenceWorker_p0-w0: stopping experience collection (38750 times) +[2024-06-18 15:32:29,286][12883] InferenceWorker_p0-w0: resuming experience collection (38750 times) +[2024-06-18 15:32:32,000][12645] Fps is (10 sec: 44208.6, 60 sec: 42866.9, 300 sec: 42819.7). Total num frames: 2650488832. Throughput: 0: 42448.3. Samples: 2650539700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 15:32:32,001][12645] Avg episode reward: [(0, '0.254')] +[2024-06-18 15:32:32,715][12883] Updated weights for policy 0, policy_version 161774 (0.0036) +[2024-06-18 15:32:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 42654.1). Total num frames: 2650652672. 
Throughput: 0: 42453.3. Samples: 2650803020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 15:32:36,994][12645] Avg episode reward: [(0, '0.556')] +[2024-06-18 15:32:37,012][12883] Updated weights for policy 0, policy_version 161784 (0.0046) +[2024-06-18 15:32:37,145][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000161785_2650685440.pth... +[2024-06-18 15:32:37,215][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000161161_2640461824.pth +[2024-06-18 15:32:40,290][12883] Updated weights for policy 0, policy_version 161794 (0.0034) +[2024-06-18 15:32:41,994][12645] Fps is (10 sec: 37706.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2650865664. Throughput: 0: 42365.7. Samples: 2651052360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 15:32:41,994][12645] Avg episode reward: [(0, '0.569')] +[2024-06-18 15:32:44,726][12883] Updated weights for policy 0, policy_version 161804 (0.0029) +[2024-06-18 15:32:46,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2651127808. Throughput: 0: 42626.1. Samples: 2651178740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 15:32:46,994][12645] Avg episode reward: [(0, '0.628')] +[2024-06-18 15:32:47,735][12883] Updated weights for policy 0, policy_version 161814 (0.0033) +[2024-06-18 15:32:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 42598.4). Total num frames: 2651291648. Throughput: 0: 42517.8. Samples: 2651440620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 15:32:51,994][12645] Avg episode reward: [(0, '0.628')] +[2024-06-18 15:32:52,254][12883] Updated weights for policy 0, policy_version 161824 (0.0035) +[2024-06-18 15:32:55,473][12883] Updated weights for policy 0, policy_version 161834 (0.0033) +[2024-06-18 15:32:56,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2651504640. Throughput: 0: 42254.1. 
Samples: 2651688080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 15:32:56,994][12645] Avg episode reward: [(0, '0.426')] +[2024-06-18 15:32:59,911][12883] Updated weights for policy 0, policy_version 161844 (0.0031) +[2024-06-18 15:33:01,996][12645] Fps is (10 sec: 47502.7, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2651766784. Throughput: 0: 42524.9. Samples: 2651815300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 15:33:01,997][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 15:33:03,312][12883] Updated weights for policy 0, policy_version 161854 (0.0047) +[2024-06-18 15:33:06,994][12645] Fps is (10 sec: 40960.7, 60 sec: 41506.2, 300 sec: 42543.2). Total num frames: 2651914240. Throughput: 0: 42351.6. Samples: 2652070380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 15:33:06,994][12645] Avg episode reward: [(0, '0.622')] +[2024-06-18 15:33:07,923][12883] Updated weights for policy 0, policy_version 161864 (0.0037) +[2024-06-18 15:33:11,076][12883] Updated weights for policy 0, policy_version 161874 (0.0032) +[2024-06-18 15:33:11,994][12645] Fps is (10 sec: 39330.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2652160000. Throughput: 0: 42092.5. Samples: 2652315780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 15:33:11,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 15:33:15,186][12862] Signal inference workers to stop experience collection... (38800 times) +[2024-06-18 15:33:15,233][12883] InferenceWorker_p0-w0: stopping experience collection (38800 times) +[2024-06-18 15:33:15,244][12862] Signal inference workers to resume experience collection... (38800 times) +[2024-06-18 15:33:15,251][12883] InferenceWorker_p0-w0: resuming experience collection (38800 times) +[2024-06-18 15:33:15,572][12883] Updated weights for policy 0, policy_version 161884 (0.0038) +[2024-06-18 15:33:16,996][12645] Fps is (10 sec: 49140.4, 60 sec: 42869.8, 300 sec: 42709.2). 
Total num frames: 2652405760. Throughput: 0: 42459.4. Samples: 2652450200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 15:33:16,997][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 15:33:18,568][12883] Updated weights for policy 0, policy_version 161894 (0.0026) +[2024-06-18 15:33:22,000][12645] Fps is (10 sec: 40934.2, 60 sec: 42047.8, 300 sec: 42597.5). Total num frames: 2652569600. Throughput: 0: 42303.9. Samples: 2652706960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 15:33:22,009][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 15:33:23,130][12883] Updated weights for policy 0, policy_version 161904 (0.0034) +[2024-06-18 15:33:26,994][12645] Fps is (10 sec: 37691.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2652782592. Throughput: 0: 42395.6. Samples: 2652960160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) +[2024-06-18 15:33:26,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 15:33:27,207][12883] Updated weights for policy 0, policy_version 161914 (0.0028) +[2024-06-18 15:33:30,672][12883] Updated weights for policy 0, policy_version 161924 (0.0034) +[2024-06-18 15:33:31,994][12645] Fps is (10 sec: 45903.5, 60 sec: 42329.7, 300 sec: 42599.3). Total num frames: 2653028352. Throughput: 0: 42719.1. Samples: 2653101100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 15:33:31,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 15:33:34,856][12883] Updated weights for policy 0, policy_version 161934 (0.0026) +[2024-06-18 15:33:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2653192192. Throughput: 0: 42541.8. Samples: 2653355000. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 15:33:36,994][12645] Avg episode reward: [(0, '0.640')] +[2024-06-18 15:33:38,277][12883] Updated weights for policy 0, policy_version 161944 (0.0024) +[2024-06-18 15:33:41,996][12645] Fps is (10 sec: 40951.3, 60 sec: 42870.0, 300 sec: 42654.5). Total num frames: 2653437952. Throughput: 0: 42528.2. Samples: 2653601940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 15:33:41,997][12645] Avg episode reward: [(0, '0.603')] +[2024-06-18 15:33:42,347][12883] Updated weights for policy 0, policy_version 161954 (0.0039) +[2024-06-18 15:33:45,846][12883] Updated weights for policy 0, policy_version 161964 (0.0026) +[2024-06-18 15:33:46,994][12645] Fps is (10 sec: 49151.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2653683712. Throughput: 0: 42920.8. Samples: 2653746640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 15:33:46,994][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 15:33:49,770][12883] Updated weights for policy 0, policy_version 161974 (0.0048) +[2024-06-18 15:33:51,994][12645] Fps is (10 sec: 39329.8, 60 sec: 42325.2, 300 sec: 42432.3). Total num frames: 2653831168. Throughput: 0: 42770.0. Samples: 2653995040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 15:33:51,994][12645] Avg episode reward: [(0, '0.333')] +[2024-06-18 15:33:53,499][12883] Updated weights for policy 0, policy_version 161984 (0.0038) +[2024-06-18 15:33:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2654093312. Throughput: 0: 42747.4. Samples: 2654239420. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 15:33:56,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 15:33:57,349][12883] Updated weights for policy 0, policy_version 161994 (0.0034) +[2024-06-18 15:34:01,365][12883] Updated weights for policy 0, policy_version 162004 (0.0040) +[2024-06-18 15:34:01,997][12645] Fps is (10 sec: 47496.3, 60 sec: 42324.3, 300 sec: 42597.8). Total num frames: 2654306304. Throughput: 0: 42911.4. Samples: 2654381280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 15:34:01,998][12645] Avg episode reward: [(0, '0.175')] +[2024-06-18 15:34:04,845][12883] Updated weights for policy 0, policy_version 162014 (0.0030) +[2024-06-18 15:34:06,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2654470144. Throughput: 0: 42635.2. Samples: 2654625280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 15:34:06,994][12645] Avg episode reward: [(0, '0.592')] +[2024-06-18 15:34:08,278][12862] Signal inference workers to stop experience collection... (38850 times) +[2024-06-18 15:34:08,317][12883] InferenceWorker_p0-w0: stopping experience collection (38850 times) +[2024-06-18 15:34:08,324][12862] Signal inference workers to resume experience collection... (38850 times) +[2024-06-18 15:34:08,333][12883] InferenceWorker_p0-w0: resuming experience collection (38850 times) +[2024-06-18 15:34:08,948][12883] Updated weights for policy 0, policy_version 162024 (0.0042) +[2024-06-18 15:34:11,994][12645] Fps is (10 sec: 44252.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2654748672. Throughput: 0: 42633.3. Samples: 2654878660. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 15:34:11,994][12645] Avg episode reward: [(0, '0.631')] +[2024-06-18 15:34:12,627][12883] Updated weights for policy 0, policy_version 162034 (0.0033) +[2024-06-18 15:34:16,419][12883] Updated weights for policy 0, policy_version 162044 (0.0034) +[2024-06-18 15:34:16,994][12645] Fps is (10 sec: 47514.0, 60 sec: 42326.9, 300 sec: 42654.0). Total num frames: 2654945280. Throughput: 0: 42633.4. Samples: 2655019600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 15:34:16,994][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 15:34:20,479][12883] Updated weights for policy 0, policy_version 162054 (0.0035) +[2024-06-18 15:34:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42602.8, 300 sec: 42542.9). Total num frames: 2655125504. Throughput: 0: 42423.4. Samples: 2655264060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 15:34:21,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 15:34:24,090][12883] Updated weights for policy 0, policy_version 162064 (0.0041) +[2024-06-18 15:34:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 2655371264. Throughput: 0: 42561.7. Samples: 2655517120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 15:34:26,994][12645] Avg episode reward: [(0, '0.517')] +[2024-06-18 15:34:28,315][12883] Updated weights for policy 0, policy_version 162074 (0.0041) +[2024-06-18 15:34:31,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 2655567872. Throughput: 0: 42321.4. Samples: 2655651100. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) +[2024-06-18 15:34:31,994][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 15:34:32,073][12883] Updated weights for policy 0, policy_version 162084 (0.0034) +[2024-06-18 15:34:35,975][12883] Updated weights for policy 0, policy_version 162094 (0.0026) +[2024-06-18 15:34:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2655764480. Throughput: 0: 42288.2. Samples: 2655898000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 15:34:36,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 15:34:37,074][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000162096_2655780864.pth... +[2024-06-18 15:34:37,133][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000161475_2645606400.pth +[2024-06-18 15:34:39,866][12883] Updated weights for policy 0, policy_version 162104 (0.0027) +[2024-06-18 15:34:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2655993856. Throughput: 0: 42567.7. Samples: 2656154960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 15:34:41,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 15:34:43,532][12883] Updated weights for policy 0, policy_version 162114 (0.0033) +[2024-06-18 15:34:46,996][12645] Fps is (10 sec: 44226.6, 60 sec: 42050.7, 300 sec: 42542.5). Total num frames: 2656206848. Throughput: 0: 42336.1. Samples: 2656286340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 15:34:46,996][12645] Avg episode reward: [(0, '0.476')] +[2024-06-18 15:34:47,591][12883] Updated weights for policy 0, policy_version 162124 (0.0026) +[2024-06-18 15:34:51,120][12883] Updated weights for policy 0, policy_version 162134 (0.0035) +[2024-06-18 15:34:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2656419840. Throughput: 0: 42461.3. Samples: 2656536040. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 15:34:51,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 15:34:55,336][12883] Updated weights for policy 0, policy_version 162144 (0.0035) +[2024-06-18 15:34:56,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2656632832. Throughput: 0: 42572.1. Samples: 2656794400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 15:34:56,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 15:34:58,680][12883] Updated weights for policy 0, policy_version 162154 (0.0026) +[2024-06-18 15:35:01,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42326.4, 300 sec: 42542.5). Total num frames: 2656845824. Throughput: 0: 42293.9. Samples: 2656922920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 15:35:01,996][12645] Avg episode reward: [(0, '0.329')] +[2024-06-18 15:35:03,070][12883] Updated weights for policy 0, policy_version 162164 (0.0029) +[2024-06-18 15:35:06,599][12883] Updated weights for policy 0, policy_version 162174 (0.0032) +[2024-06-18 15:35:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 2657075200. Throughput: 0: 42481.9. Samples: 2657175740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 15:35:06,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 15:35:10,871][12883] Updated weights for policy 0, policy_version 162184 (0.0027) +[2024-06-18 15:35:11,994][12645] Fps is (10 sec: 40969.2, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 2657255424. Throughput: 0: 42481.3. Samples: 2657428780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 15:35:11,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 15:35:14,235][12883] Updated weights for policy 0, policy_version 162194 (0.0023) +[2024-06-18 15:35:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2657484800. Throughput: 0: 42214.6. 
Samples: 2657550760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 15:35:16,994][12645] Avg episode reward: [(0, '0.777')] +[2024-06-18 15:35:18,844][12883] Updated weights for policy 0, policy_version 162204 (0.0042) +[2024-06-18 15:35:21,823][12883] Updated weights for policy 0, policy_version 162214 (0.0026) +[2024-06-18 15:35:21,994][12645] Fps is (10 sec: 47513.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2657730560. Throughput: 0: 42499.4. Samples: 2657810480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 15:35:21,994][12645] Avg episode reward: [(0, '0.631')] +[2024-06-18 15:35:26,404][12883] Updated weights for policy 0, policy_version 162224 (0.0048) +[2024-06-18 15:35:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2657910784. Throughput: 0: 42494.5. Samples: 2658067220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 15:35:26,994][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 15:35:29,900][12883] Updated weights for policy 0, policy_version 162234 (0.0037) +[2024-06-18 15:35:31,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 2658107392. Throughput: 0: 42379.5. Samples: 2658193320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 15:35:31,994][12645] Avg episode reward: [(0, '0.551')] +[2024-06-18 15:35:34,093][12883] Updated weights for policy 0, policy_version 162244 (0.0031) +[2024-06-18 15:35:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2658336768. Throughput: 0: 42460.4. Samples: 2658446760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) +[2024-06-18 15:35:36,994][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 15:35:37,385][12883] Updated weights for policy 0, policy_version 162254 (0.0034) +[2024-06-18 15:35:41,597][12862] Signal inference workers to stop experience collection... 
(38900 times) +[2024-06-18 15:35:41,598][12862] Signal inference workers to resume experience collection... (38900 times) +[2024-06-18 15:35:41,641][12883] InferenceWorker_p0-w0: stopping experience collection (38900 times) +[2024-06-18 15:35:41,641][12883] InferenceWorker_p0-w0: resuming experience collection (38900 times) +[2024-06-18 15:35:41,731][12883] Updated weights for policy 0, policy_version 162264 (0.0032) +[2024-06-18 15:35:41,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2658549760. Throughput: 0: 42504.0. Samples: 2658707080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 15:35:41,994][12645] Avg episode reward: [(0, '0.534')] +[2024-06-18 15:35:44,870][12883] Updated weights for policy 0, policy_version 162274 (0.0029) +[2024-06-18 15:35:46,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42327.0, 300 sec: 42431.8). Total num frames: 2658746368. Throughput: 0: 42330.2. Samples: 2658827680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 15:35:46,994][12645] Avg episode reward: [(0, '0.592')] +[2024-06-18 15:35:49,473][12883] Updated weights for policy 0, policy_version 162284 (0.0030) +[2024-06-18 15:35:51,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2658992128. Throughput: 0: 42628.5. Samples: 2659094020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 15:35:51,994][12645] Avg episode reward: [(0, '0.447')] +[2024-06-18 15:35:52,363][12883] Updated weights for policy 0, policy_version 162294 (0.0026) +[2024-06-18 15:35:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2659172352. Throughput: 0: 42633.3. Samples: 2659347280. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 15:35:56,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 15:35:57,254][12883] Updated weights for policy 0, policy_version 162304 (0.0049) +[2024-06-18 15:36:00,213][12883] Updated weights for policy 0, policy_version 162314 (0.0043) +[2024-06-18 15:36:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42599.9, 300 sec: 42431.8). Total num frames: 2659401728. Throughput: 0: 42608.8. Samples: 2659468160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 15:36:01,994][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 15:36:04,938][12883] Updated weights for policy 0, policy_version 162324 (0.0034) +[2024-06-18 15:36:06,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2659631104. Throughput: 0: 42810.3. Samples: 2659736940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 15:36:06,994][12645] Avg episode reward: [(0, '0.616')] +[2024-06-18 15:36:07,878][12883] Updated weights for policy 0, policy_version 162334 (0.0040) +[2024-06-18 15:36:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42487.6). Total num frames: 2659811328. Throughput: 0: 42823.0. Samples: 2659994260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 15:36:11,994][12645] Avg episode reward: [(0, '0.517')] +[2024-06-18 15:36:12,523][12883] Updated weights for policy 0, policy_version 162344 (0.0027) +[2024-06-18 15:36:15,433][12883] Updated weights for policy 0, policy_version 162354 (0.0026) +[2024-06-18 15:36:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2660040704. Throughput: 0: 42691.0. Samples: 2660114420. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 15:36:16,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 15:36:20,181][12883] Updated weights for policy 0, policy_version 162364 (0.0035) +[2024-06-18 15:36:21,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2660270080. Throughput: 0: 42840.9. Samples: 2660374600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 15:36:21,994][12645] Avg episode reward: [(0, '0.720')] +[2024-06-18 15:36:23,167][12883] Updated weights for policy 0, policy_version 162374 (0.0041) +[2024-06-18 15:36:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2660450304. Throughput: 0: 42540.8. Samples: 2660621420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 15:36:26,994][12645] Avg episode reward: [(0, '0.530')] +[2024-06-18 15:36:28,291][12883] Updated weights for policy 0, policy_version 162384 (0.0035) +[2024-06-18 15:36:30,750][12883] Updated weights for policy 0, policy_version 162394 (0.0033) +[2024-06-18 15:36:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 2660679680. Throughput: 0: 42661.2. Samples: 2660747440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 15:36:31,994][12645] Avg episode reward: [(0, '0.528')] +[2024-06-18 15:36:35,722][12883] Updated weights for policy 0, policy_version 162404 (0.0039) +[2024-06-18 15:36:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2660876288. Throughput: 0: 42526.9. Samples: 2661007740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) +[2024-06-18 15:36:36,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 15:36:37,053][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000162408_2660892672.pth... 
+[2024-06-18 15:36:37,124][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000161785_2650685440.pth +[2024-06-18 15:36:38,453][12883] Updated weights for policy 0, policy_version 162414 (0.0033) +[2024-06-18 15:36:41,995][12645] Fps is (10 sec: 40955.5, 60 sec: 42324.6, 300 sec: 42487.2). Total num frames: 2661089280. Throughput: 0: 42541.6. Samples: 2661261700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 15:36:41,995][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 15:36:43,528][12883] Updated weights for policy 0, policy_version 162424 (0.0051) +[2024-06-18 15:36:46,063][12883] Updated weights for policy 0, policy_version 162434 (0.0031) +[2024-06-18 15:36:46,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 2661335040. Throughput: 0: 42712.4. Samples: 2661390220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 15:36:46,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 15:36:51,162][12883] Updated weights for policy 0, policy_version 162444 (0.0028) +[2024-06-18 15:36:51,994][12645] Fps is (10 sec: 42603.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2661515264. Throughput: 0: 42452.5. Samples: 2661647300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 15:36:51,994][12645] Avg episode reward: [(0, '0.512')] +[2024-06-18 15:36:52,535][12862] Signal inference workers to stop experience collection... (38950 times) +[2024-06-18 15:36:52,587][12883] InferenceWorker_p0-w0: stopping experience collection (38950 times) +[2024-06-18 15:36:52,594][12862] Signal inference workers to resume experience collection... (38950 times) +[2024-06-18 15:36:52,601][12883] InferenceWorker_p0-w0: resuming experience collection (38950 times) +[2024-06-18 15:36:53,730][12883] Updated weights for policy 0, policy_version 162454 (0.0029) +[2024-06-18 15:36:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42487.3). 
Total num frames: 2661728256. Throughput: 0: 42315.2. Samples: 2661898440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 15:36:56,994][12645] Avg episode reward: [(0, '0.622')] +[2024-06-18 15:36:58,827][12883] Updated weights for policy 0, policy_version 162464 (0.0037) +[2024-06-18 15:37:01,478][12883] Updated weights for policy 0, policy_version 162474 (0.0036) +[2024-06-18 15:37:01,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2661974016. Throughput: 0: 42581.7. Samples: 2662030600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 15:37:01,994][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 15:37:06,536][12883] Updated weights for policy 0, policy_version 162484 (0.0035) +[2024-06-18 15:37:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2662154240. Throughput: 0: 42524.0. Samples: 2662288180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 15:37:06,994][12645] Avg episode reward: [(0, '0.308')] +[2024-06-18 15:37:09,578][12883] Updated weights for policy 0, policy_version 162494 (0.0034) +[2024-06-18 15:37:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2662383616. Throughput: 0: 42571.6. Samples: 2662537140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 15:37:11,994][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 15:37:14,532][12883] Updated weights for policy 0, policy_version 162504 (0.0036) +[2024-06-18 15:37:16,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2662612992. Throughput: 0: 42758.3. Samples: 2662671560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 15:37:16,994][12645] Avg episode reward: [(0, '0.594')] +[2024-06-18 15:37:17,241][12883] Updated weights for policy 0, policy_version 162514 (0.0032) +[2024-06-18 15:37:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 42542.9). 
Total num frames: 2662776832. Throughput: 0: 42611.3. Samples: 2662925240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 15:37:21,994][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 15:37:22,094][12883] Updated weights for policy 0, policy_version 162524 (0.0030) +[2024-06-18 15:37:25,022][12883] Updated weights for policy 0, policy_version 162534 (0.0030) +[2024-06-18 15:37:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.6, 300 sec: 42488.2). Total num frames: 2663022592. Throughput: 0: 42562.9. Samples: 2663176980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 15:37:26,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 15:37:29,834][12883] Updated weights for policy 0, policy_version 162544 (0.0041) +[2024-06-18 15:37:31,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2663235584. Throughput: 0: 42633.8. Samples: 2663308740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 15:37:31,994][12645] Avg episode reward: [(0, '0.370')] +[2024-06-18 15:37:32,848][12883] Updated weights for policy 0, policy_version 162554 (0.0033) +[2024-06-18 15:37:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2663415808. Throughput: 0: 42471.5. Samples: 2663558520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 15:37:36,994][12645] Avg episode reward: [(0, '0.451')] +[2024-06-18 15:37:37,383][12883] Updated weights for policy 0, policy_version 162564 (0.0036) +[2024-06-18 15:37:40,654][12883] Updated weights for policy 0, policy_version 162574 (0.0039) +[2024-06-18 15:37:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42872.2, 300 sec: 42487.3). Total num frames: 2663661568. Throughput: 0: 42302.2. Samples: 2663802040. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) +[2024-06-18 15:37:41,995][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 15:37:45,443][12883] Updated weights for policy 0, policy_version 162584 (0.0032) +[2024-06-18 15:37:46,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2663874560. Throughput: 0: 42466.7. Samples: 2663941600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:37:46,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 15:37:48,474][12883] Updated weights for policy 0, policy_version 162594 (0.0027) +[2024-06-18 15:37:51,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2664038400. Throughput: 0: 42170.7. Samples: 2664185860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:37:51,994][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 15:37:53,118][12883] Updated weights for policy 0, policy_version 162604 (0.0038) +[2024-06-18 15:37:56,395][12883] Updated weights for policy 0, policy_version 162614 (0.0031) +[2024-06-18 15:37:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42487.7). Total num frames: 2664300544. Throughput: 0: 42092.1. Samples: 2664431280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:37:56,994][12645] Avg episode reward: [(0, '0.538')] +[2024-06-18 15:38:00,886][12883] Updated weights for policy 0, policy_version 162624 (0.0033) +[2024-06-18 15:38:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2664480768. Throughput: 0: 42201.7. Samples: 2664570640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:38:01,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 15:38:03,923][12883] Updated weights for policy 0, policy_version 162634 (0.0034) +[2024-06-18 15:38:07,000][12645] Fps is (10 sec: 39296.9, 60 sec: 42320.9, 300 sec: 42486.4). Total num frames: 2664693760. Throughput: 0: 42054.1. 
Samples: 2664817940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:38:07,000][12645] Avg episode reward: [(0, '0.298')] +[2024-06-18 15:38:08,600][12883] Updated weights for policy 0, policy_version 162644 (0.0026) +[2024-06-18 15:38:11,670][12883] Updated weights for policy 0, policy_version 162654 (0.0045) +[2024-06-18 15:38:11,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 2664955904. Throughput: 0: 42162.1. Samples: 2665074280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:38:11,994][12645] Avg episode reward: [(0, '0.422')] +[2024-06-18 15:38:16,159][12883] Updated weights for policy 0, policy_version 162664 (0.0033) +[2024-06-18 15:38:16,994][12645] Fps is (10 sec: 42625.3, 60 sec: 41779.2, 300 sec: 42543.8). Total num frames: 2665119744. Throughput: 0: 42161.4. Samples: 2665206000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:38:16,994][12645] Avg episode reward: [(0, '0.597')] +[2024-06-18 15:38:18,697][12862] Signal inference workers to stop experience collection... (39000 times) +[2024-06-18 15:38:18,697][12862] Signal inference workers to resume experience collection... (39000 times) +[2024-06-18 15:38:18,730][12883] InferenceWorker_p0-w0: stopping experience collection (39000 times) +[2024-06-18 15:38:18,730][12883] InferenceWorker_p0-w0: resuming experience collection (39000 times) +[2024-06-18 15:38:19,383][12883] Updated weights for policy 0, policy_version 162674 (0.0040) +[2024-06-18 15:38:21,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2665332736. Throughput: 0: 42220.0. Samples: 2665458420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:38:21,994][12645] Avg episode reward: [(0, '0.785')] +[2024-06-18 15:38:23,822][12883] Updated weights for policy 0, policy_version 162684 (0.0029) +[2024-06-18 15:38:26,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42487.3). 
Total num frames: 2665562112. Throughput: 0: 42469.4. Samples: 2665713160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:38:26,994][12645] Avg episode reward: [(0, '0.785')] +[2024-06-18 15:38:27,024][12883] Updated weights for policy 0, policy_version 162694 (0.0034) +[2024-06-18 15:38:31,692][12883] Updated weights for policy 0, policy_version 162704 (0.0037) +[2024-06-18 15:38:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2665758720. Throughput: 0: 42340.1. Samples: 2665846900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:38:31,994][12645] Avg episode reward: [(0, '0.676')] +[2024-06-18 15:38:34,665][12883] Updated weights for policy 0, policy_version 162714 (0.0033) +[2024-06-18 15:38:36,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 2665988096. Throughput: 0: 42425.8. Samples: 2666095020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:38:36,994][12645] Avg episode reward: [(0, '0.615')] +[2024-06-18 15:38:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000162719_2665988096.pth... +[2024-06-18 15:38:37,055][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000162096_2655780864.pth +[2024-06-18 15:38:39,294][12883] Updated weights for policy 0, policy_version 162724 (0.0033) +[2024-06-18 15:38:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 2666201088. Throughput: 0: 42735.1. Samples: 2666354360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:38:41,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 15:38:42,666][12883] Updated weights for policy 0, policy_version 162734 (0.0026) +[2024-06-18 15:38:46,891][12883] Updated weights for policy 0, policy_version 162744 (0.0030) +[2024-06-18 15:38:46,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42052.2, 300 sec: 42598.4). 
Total num frames: 2666397696. Throughput: 0: 42335.8. Samples: 2666475760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:38:46,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 15:38:50,550][12883] Updated weights for policy 0, policy_version 162754 (0.0032) +[2024-06-18 15:38:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 2666643456. Throughput: 0: 42499.7. Samples: 2666730160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:38:51,994][12645] Avg episode reward: [(0, '0.598')] +[2024-06-18 15:38:54,607][12883] Updated weights for policy 0, policy_version 162764 (0.0042) +[2024-06-18 15:38:56,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 42432.3). Total num frames: 2666823680. Throughput: 0: 42550.3. Samples: 2666989040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:38:56,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 15:38:58,386][12883] Updated weights for policy 0, policy_version 162774 (0.0037) +[2024-06-18 15:39:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2667036672. Throughput: 0: 42313.7. Samples: 2667110120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:39:01,994][12645] Avg episode reward: [(0, '0.663')] +[2024-06-18 15:39:02,063][12883] Updated weights for policy 0, policy_version 162784 (0.0029) +[2024-06-18 15:39:06,083][12883] Updated weights for policy 0, policy_version 162794 (0.0029) +[2024-06-18 15:39:06,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43149.0, 300 sec: 42487.3). Total num frames: 2667282432. Throughput: 0: 42461.3. Samples: 2667369180. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:39:06,994][12645] Avg episode reward: [(0, '0.585')] +[2024-06-18 15:39:09,839][12883] Updated weights for policy 0, policy_version 162804 (0.0039) +[2024-06-18 15:39:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 2667462656. Throughput: 0: 42605.4. Samples: 2667630400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:39:11,995][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 15:39:13,605][12883] Updated weights for policy 0, policy_version 162814 (0.0042) +[2024-06-18 15:39:16,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2667659264. Throughput: 0: 42311.1. Samples: 2667750900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:39:16,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 15:39:17,385][12883] Updated weights for policy 0, policy_version 162824 (0.0042) +[2024-06-18 15:39:21,078][12883] Updated weights for policy 0, policy_version 162834 (0.0023) +[2024-06-18 15:39:21,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 2667921408. Throughput: 0: 42540.4. Samples: 2668009340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:39:21,994][12645] Avg episode reward: [(0, '0.622')] +[2024-06-18 15:39:25,211][12883] Updated weights for policy 0, policy_version 162844 (0.0037) +[2024-06-18 15:39:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2668101632. Throughput: 0: 42618.1. Samples: 2668272180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:39:26,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 15:39:28,951][12883] Updated weights for policy 0, policy_version 162854 (0.0022) +[2024-06-18 15:39:31,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 2668314624. Throughput: 0: 42519.4. 
Samples: 2668389220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:39:31,996][12645] Avg episode reward: [(0, '0.351')] +[2024-06-18 15:39:32,923][12883] Updated weights for policy 0, policy_version 162864 (0.0036) +[2024-06-18 15:39:36,594][12883] Updated weights for policy 0, policy_version 162874 (0.0037) +[2024-06-18 15:39:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2668544000. Throughput: 0: 42775.1. Samples: 2668655040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:39:36,994][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 15:39:37,384][12862] Signal inference workers to stop experience collection... (39050 times) +[2024-06-18 15:39:37,423][12883] InferenceWorker_p0-w0: stopping experience collection (39050 times) +[2024-06-18 15:39:37,442][12862] Signal inference workers to resume experience collection... (39050 times) +[2024-06-18 15:39:37,444][12883] InferenceWorker_p0-w0: resuming experience collection (39050 times) +[2024-06-18 15:39:40,475][12883] Updated weights for policy 0, policy_version 162884 (0.0039) +[2024-06-18 15:39:41,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 2668740608. Throughput: 0: 42642.6. Samples: 2668907960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:39:41,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 15:39:44,150][12883] Updated weights for policy 0, policy_version 162894 (0.0042) +[2024-06-18 15:39:46,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42596.9, 300 sec: 42487.0). Total num frames: 2668953600. Throughput: 0: 42701.4. Samples: 2669031780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:39:46,996][12645] Avg episode reward: [(0, '0.600')] +[2024-06-18 15:39:48,219][12883] Updated weights for policy 0, policy_version 162904 (0.0040) +[2024-06-18 15:39:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42487.3). 
Total num frames: 2669166592. Throughput: 0: 42838.1. Samples: 2669296900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) +[2024-06-18 15:39:51,994][12645] Avg episode reward: [(0, '0.600')] +[2024-06-18 15:39:52,041][12883] Updated weights for policy 0, policy_version 162914 (0.0035) +[2024-06-18 15:39:55,738][12883] Updated weights for policy 0, policy_version 162924 (0.0032) +[2024-06-18 15:39:56,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 2669379584. Throughput: 0: 42740.5. Samples: 2669553720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 15:39:56,994][12645] Avg episode reward: [(0, '0.603')] +[2024-06-18 15:39:59,567][12883] Updated weights for policy 0, policy_version 162934 (0.0039) +[2024-06-18 15:40:01,996][12645] Fps is (10 sec: 44227.3, 60 sec: 42869.9, 300 sec: 42487.0). Total num frames: 2669608960. Throughput: 0: 42848.0. Samples: 2669679160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 15:40:01,997][12645] Avg episode reward: [(0, '0.488')] +[2024-06-18 15:40:03,336][12883] Updated weights for policy 0, policy_version 162944 (0.0030) +[2024-06-18 15:40:06,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2669821952. Throughput: 0: 42905.7. Samples: 2669940100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 15:40:06,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 15:40:07,130][12883] Updated weights for policy 0, policy_version 162954 (0.0042) +[2024-06-18 15:40:10,908][12883] Updated weights for policy 0, policy_version 162964 (0.0035) +[2024-06-18 15:40:11,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2670034944. Throughput: 0: 42765.9. Samples: 2670196640. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 15:40:11,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 15:40:14,920][12883] Updated weights for policy 0, policy_version 162974 (0.0035) +[2024-06-18 15:40:16,993][12645] Fps is (10 sec: 42599.5, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 2670247936. Throughput: 0: 42977.8. Samples: 2670323120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 15:40:16,994][12645] Avg episode reward: [(0, '0.204')] +[2024-06-18 15:40:18,501][12883] Updated weights for policy 0, policy_version 162984 (0.0041) +[2024-06-18 15:40:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2670460928. Throughput: 0: 42810.6. Samples: 2670581520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 15:40:21,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 15:40:22,693][12883] Updated weights for policy 0, policy_version 162994 (0.0043) +[2024-06-18 15:40:26,010][12883] Updated weights for policy 0, policy_version 163004 (0.0028) +[2024-06-18 15:40:26,994][12645] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2670690304. Throughput: 0: 42919.9. Samples: 2670839360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 15:40:26,994][12645] Avg episode reward: [(0, '0.596')] +[2024-06-18 15:40:30,466][12883] Updated weights for policy 0, policy_version 163014 (0.0036) +[2024-06-18 15:40:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43146.1, 300 sec: 42598.4). Total num frames: 2670903296. Throughput: 0: 43039.5. Samples: 2670968460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 15:40:31,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 15:40:34,046][12883] Updated weights for policy 0, policy_version 163024 (0.0041) +[2024-06-18 15:40:36,996][12645] Fps is (10 sec: 40951.1, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 2671099904. Throughput: 0: 42694.8. 
Samples: 2671218260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 15:40:36,996][12645] Avg episode reward: [(0, '0.665')] +[2024-06-18 15:40:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163031_2671099904.pth... +[2024-06-18 15:40:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000162408_2660892672.pth +[2024-06-18 15:40:38,106][12883] Updated weights for policy 0, policy_version 163034 (0.0039) +[2024-06-18 15:40:41,675][12883] Updated weights for policy 0, policy_version 163044 (0.0034) +[2024-06-18 15:40:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2671329280. Throughput: 0: 42725.8. Samples: 2671476380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 15:40:41,994][12645] Avg episode reward: [(0, '0.603')] +[2024-06-18 15:40:45,841][12883] Updated weights for policy 0, policy_version 163054 (0.0037) +[2024-06-18 15:40:46,994][12645] Fps is (10 sec: 44247.2, 60 sec: 43146.2, 300 sec: 42542.9). Total num frames: 2671542272. Throughput: 0: 42827.1. Samples: 2671606280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 15:40:46,994][12645] Avg episode reward: [(0, '0.603')] +[2024-06-18 15:40:49,264][12883] Updated weights for policy 0, policy_version 163064 (0.0040) +[2024-06-18 15:40:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 2671755264. Throughput: 0: 42686.9. Samples: 2671861000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 15:40:51,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 15:40:53,318][12883] Updated weights for policy 0, policy_version 163074 (0.0043) +[2024-06-18 15:40:56,866][12883] Updated weights for policy 0, policy_version 163084 (0.0033) +[2024-06-18 15:40:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2671968256. Throughput: 0: 42802.7. Samples: 2672122760. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) +[2024-06-18 15:40:56,994][12645] Avg episode reward: [(0, '0.407')] +[2024-06-18 15:41:01,033][12883] Updated weights for policy 0, policy_version 163094 (0.0033) +[2024-06-18 15:41:02,000][12645] Fps is (10 sec: 42571.3, 60 sec: 42868.6, 300 sec: 42542.0). Total num frames: 2672181248. Throughput: 0: 42695.3. Samples: 2672244680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:41:02,001][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 15:41:04,461][12862] Signal inference workers to stop experience collection... (39100 times) +[2024-06-18 15:41:04,495][12883] InferenceWorker_p0-w0: stopping experience collection (39100 times) +[2024-06-18 15:41:04,519][12862] Signal inference workers to resume experience collection... (39100 times) +[2024-06-18 15:41:04,520][12883] InferenceWorker_p0-w0: resuming experience collection (39100 times) +[2024-06-18 15:41:04,658][12883] Updated weights for policy 0, policy_version 163104 (0.0035) +[2024-06-18 15:41:06,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2672394240. Throughput: 0: 42728.3. Samples: 2672504300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:41:06,994][12645] Avg episode reward: [(0, '0.828')] +[2024-06-18 15:41:08,557][12883] Updated weights for policy 0, policy_version 163114 (0.0041) +[2024-06-18 15:41:11,994][12645] Fps is (10 sec: 39346.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2672574464. Throughput: 0: 42728.0. Samples: 2672762120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:41:11,994][12645] Avg episode reward: [(0, '0.648')] +[2024-06-18 15:41:12,354][12883] Updated weights for policy 0, policy_version 163124 (0.0034) +[2024-06-18 15:41:16,133][12883] Updated weights for policy 0, policy_version 163134 (0.0039) +[2024-06-18 15:41:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 2672820224. 
Throughput: 0: 42694.6. Samples: 2672889720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:41:16,994][12645] Avg episode reward: [(0, '0.787')] +[2024-06-18 15:41:20,257][12883] Updated weights for policy 0, policy_version 163144 (0.0040) +[2024-06-18 15:41:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2673016832. Throughput: 0: 42810.4. Samples: 2673144640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:41:21,994][12645] Avg episode reward: [(0, '0.803')] +[2024-06-18 15:41:23,756][12883] Updated weights for policy 0, policy_version 163154 (0.0042) +[2024-06-18 15:41:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2673229824. Throughput: 0: 42799.9. Samples: 2673402380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:41:26,994][12645] Avg episode reward: [(0, '0.473')] +[2024-06-18 15:41:27,774][12883] Updated weights for policy 0, policy_version 163164 (0.0033) +[2024-06-18 15:41:31,355][12883] Updated weights for policy 0, policy_version 163174 (0.0026) +[2024-06-18 15:41:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2673459200. Throughput: 0: 42657.6. Samples: 2673525880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:41:31,994][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 15:41:35,659][12883] Updated weights for policy 0, policy_version 163184 (0.0040) +[2024-06-18 15:41:36,994][12645] Fps is (10 sec: 45875.9, 60 sec: 43146.2, 300 sec: 42709.7). Total num frames: 2673688576. Throughput: 0: 42914.6. Samples: 2673792160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:41:36,994][12645] Avg episode reward: [(0, '0.602')] +[2024-06-18 15:41:39,352][12883] Updated weights for policy 0, policy_version 163194 (0.0026) +[2024-06-18 15:41:41,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2673885184. 
Throughput: 0: 42811.6. Samples: 2674049280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:41:41,994][12645] Avg episode reward: [(0, '0.602')] +[2024-06-18 15:41:43,076][12883] Updated weights for policy 0, policy_version 163204 (0.0036) +[2024-06-18 15:41:46,808][12883] Updated weights for policy 0, policy_version 163214 (0.0033) +[2024-06-18 15:41:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2674114560. Throughput: 0: 42841.9. Samples: 2674172300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:41:46,994][12645] Avg episode reward: [(0, '0.444')] +[2024-06-18 15:41:50,580][12883] Updated weights for policy 0, policy_version 163224 (0.0043) +[2024-06-18 15:41:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2674327552. Throughput: 0: 42885.8. Samples: 2674434160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:41:51,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 15:41:54,441][12883] Updated weights for policy 0, policy_version 163234 (0.0031) +[2024-06-18 15:41:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2674540544. Throughput: 0: 43043.5. Samples: 2674699080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:41:56,994][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 15:41:58,073][12883] Updated weights for policy 0, policy_version 163244 (0.0032) +[2024-06-18 15:42:01,868][12883] Updated weights for policy 0, policy_version 163254 (0.0036) +[2024-06-18 15:42:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 2674753536. Throughput: 0: 42906.4. Samples: 2674820500. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:42:01,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 15:42:05,564][12883] Updated weights for policy 0, policy_version 163264 (0.0026) +[2024-06-18 15:42:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2674966528. Throughput: 0: 42992.1. Samples: 2675079280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:42:06,994][12645] Avg episode reward: [(0, '0.577')] +[2024-06-18 15:42:09,439][12883] Updated weights for policy 0, policy_version 163274 (0.0036) +[2024-06-18 15:42:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43417.7, 300 sec: 42598.4). Total num frames: 2675179520. Throughput: 0: 43041.0. Samples: 2675339220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:42:11,994][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 15:42:13,139][12883] Updated weights for policy 0, policy_version 163284 (0.0025) +[2024-06-18 15:42:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2675392512. Throughput: 0: 43156.1. Samples: 2675467900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:42:16,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 15:42:17,343][12883] Updated weights for policy 0, policy_version 163294 (0.0033) +[2024-06-18 15:42:21,056][12883] Updated weights for policy 0, policy_version 163304 (0.0024) +[2024-06-18 15:42:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2675605504. Throughput: 0: 42913.2. Samples: 2675723260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:42:21,994][12645] Avg episode reward: [(0, '0.648')] +[2024-06-18 15:42:24,946][12883] Updated weights for policy 0, policy_version 163314 (0.0023) +[2024-06-18 15:42:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2675818496. Throughput: 0: 42806.5. Samples: 2675975580. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:42:26,994][12645] Avg episode reward: [(0, '0.530')] +[2024-06-18 15:42:29,055][12883] Updated weights for policy 0, policy_version 163324 (0.0038) +[2024-06-18 15:42:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2676047872. Throughput: 0: 42941.8. Samples: 2676104680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:42:31,994][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 15:42:32,534][12862] Signal inference workers to stop experience collection... (39150 times) +[2024-06-18 15:42:32,565][12883] InferenceWorker_p0-w0: stopping experience collection (39150 times) +[2024-06-18 15:42:32,585][12862] Signal inference workers to resume experience collection... (39150 times) +[2024-06-18 15:42:32,599][12883] InferenceWorker_p0-w0: resuming experience collection (39150 times) +[2024-06-18 15:42:32,606][12883] Updated weights for policy 0, policy_version 163334 (0.0041) +[2024-06-18 15:42:36,661][12883] Updated weights for policy 0, policy_version 163344 (0.0033) +[2024-06-18 15:42:36,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2676260864. Throughput: 0: 43029.8. Samples: 2676370500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:42:36,994][12645] Avg episode reward: [(0, '0.417')] +[2024-06-18 15:42:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163346_2676260864.pth... +[2024-06-18 15:42:37,054][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000162719_2665988096.pth +[2024-06-18 15:42:40,231][12883] Updated weights for policy 0, policy_version 163354 (0.0033) +[2024-06-18 15:42:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2676457472. Throughput: 0: 42726.4. Samples: 2676621760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:42:41,994][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 15:42:44,239][12883] Updated weights for policy 0, policy_version 163364 (0.0028) +[2024-06-18 15:42:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2676670464. Throughput: 0: 42796.0. Samples: 2676746320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:42:46,994][12645] Avg episode reward: [(0, '0.612')] +[2024-06-18 15:42:47,719][12883] Updated weights for policy 0, policy_version 163374 (0.0039) +[2024-06-18 15:42:51,841][12883] Updated weights for policy 0, policy_version 163384 (0.0031) +[2024-06-18 15:42:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2676883456. Throughput: 0: 42805.0. Samples: 2677005500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:42:51,994][12645] Avg episode reward: [(0, '0.393')] +[2024-06-18 15:42:55,582][12883] Updated weights for policy 0, policy_version 163394 (0.0038) +[2024-06-18 15:42:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2677112832. Throughput: 0: 42654.5. Samples: 2677258680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:42:56,994][12645] Avg episode reward: [(0, '0.533')] +[2024-06-18 15:42:59,339][12883] Updated weights for policy 0, policy_version 163404 (0.0042) +[2024-06-18 15:43:01,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42765.9). Total num frames: 2677309440. Throughput: 0: 42627.5. Samples: 2677386140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:43:01,994][12645] Avg episode reward: [(0, '0.305')] +[2024-06-18 15:43:03,389][12883] Updated weights for policy 0, policy_version 163414 (0.0037) +[2024-06-18 15:43:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2677522432. Throughput: 0: 42797.3. Samples: 2677649140. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) +[2024-06-18 15:43:06,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 15:43:07,133][12883] Updated weights for policy 0, policy_version 163424 (0.0035) +[2024-06-18 15:43:10,997][12883] Updated weights for policy 0, policy_version 163434 (0.0032) +[2024-06-18 15:43:11,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2677735424. Throughput: 0: 42734.4. Samples: 2677898620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 15:43:11,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 15:43:14,689][12883] Updated weights for policy 0, policy_version 163444 (0.0035) +[2024-06-18 15:43:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2677964800. Throughput: 0: 42758.2. Samples: 2678028800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 15:43:16,994][12645] Avg episode reward: [(0, '0.347')] +[2024-06-18 15:43:18,687][12883] Updated weights for policy 0, policy_version 163454 (0.0027) +[2024-06-18 15:43:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2678177792. Throughput: 0: 42641.0. Samples: 2678289340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 15:43:21,994][12645] Avg episode reward: [(0, '0.414')] +[2024-06-18 15:43:22,438][12883] Updated weights for policy 0, policy_version 163464 (0.0035) +[2024-06-18 15:43:26,178][12883] Updated weights for policy 0, policy_version 163474 (0.0032) +[2024-06-18 15:43:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2678390784. Throughput: 0: 42699.1. Samples: 2678543220. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 15:43:26,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 15:43:30,072][12883] Updated weights for policy 0, policy_version 163484 (0.0043) +[2024-06-18 15:43:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2678603776. Throughput: 0: 42813.8. Samples: 2678672940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 15:43:31,994][12645] Avg episode reward: [(0, '0.307')] +[2024-06-18 15:43:33,747][12883] Updated weights for policy 0, policy_version 163494 (0.0039) +[2024-06-18 15:43:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2678800384. Throughput: 0: 42826.1. Samples: 2678932680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 15:43:36,994][12645] Avg episode reward: [(0, '0.546')] +[2024-06-18 15:43:37,629][12883] Updated weights for policy 0, policy_version 163504 (0.0030) +[2024-06-18 15:43:41,658][12883] Updated weights for policy 0, policy_version 163514 (0.0029) +[2024-06-18 15:43:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2679029760. Throughput: 0: 42953.0. Samples: 2679191560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 15:43:41,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 15:43:45,368][12883] Updated weights for policy 0, policy_version 163524 (0.0026) +[2024-06-18 15:43:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2679242752. Throughput: 0: 42933.9. Samples: 2679318160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 15:43:46,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 15:43:49,311][12883] Updated weights for policy 0, policy_version 163534 (0.0033) +[2024-06-18 15:43:51,993][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2679455744. Throughput: 0: 42727.3. 
Samples: 2679571860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 15:43:51,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 15:43:53,078][12883] Updated weights for policy 0, policy_version 163544 (0.0025) +[2024-06-18 15:43:56,908][12883] Updated weights for policy 0, policy_version 163554 (0.0040) +[2024-06-18 15:43:56,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42596.9, 300 sec: 42820.2). Total num frames: 2679668736. Throughput: 0: 42988.4. Samples: 2679833200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 15:43:56,997][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 15:44:00,903][12883] Updated weights for policy 0, policy_version 163564 (0.0035) +[2024-06-18 15:44:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2679881728. Throughput: 0: 42851.7. Samples: 2679957120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 15:44:01,994][12645] Avg episode reward: [(0, '0.398')] +[2024-06-18 15:44:04,473][12883] Updated weights for policy 0, policy_version 163574 (0.0036) +[2024-06-18 15:44:06,994][12645] Fps is (10 sec: 44246.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2680111104. Throughput: 0: 42881.6. Samples: 2680219020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 15:44:06,994][12645] Avg episode reward: [(0, '0.384')] +[2024-06-18 15:44:08,537][12883] Updated weights for policy 0, policy_version 163584 (0.0037) +[2024-06-18 15:44:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2680307712. Throughput: 0: 42942.7. Samples: 2680475640. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) +[2024-06-18 15:44:11,994][12645] Avg episode reward: [(0, '0.630')] +[2024-06-18 15:44:12,027][12883] Updated weights for policy 0, policy_version 163594 (0.0035) +[2024-06-18 15:44:16,200][12883] Updated weights for policy 0, policy_version 163604 (0.0032) +[2024-06-18 15:44:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2680520704. Throughput: 0: 42840.8. Samples: 2680600780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:44:16,994][12645] Avg episode reward: [(0, '0.630')] +[2024-06-18 15:44:19,543][12883] Updated weights for policy 0, policy_version 163614 (0.0035) +[2024-06-18 15:44:21,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2680766464. Throughput: 0: 42836.9. Samples: 2680860340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:44:21,994][12645] Avg episode reward: [(0, '0.601')] +[2024-06-18 15:44:23,661][12883] Updated weights for policy 0, policy_version 163624 (0.0039) +[2024-06-18 15:44:26,640][12862] Signal inference workers to stop experience collection... (39200 times) +[2024-06-18 15:44:26,640][12862] Signal inference workers to resume experience collection... (39200 times) +[2024-06-18 15:44:26,655][12883] InferenceWorker_p0-w0: stopping experience collection (39200 times) +[2024-06-18 15:44:26,664][12883] InferenceWorker_p0-w0: resuming experience collection (39200 times) +[2024-06-18 15:44:26,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42869.8, 300 sec: 42876.1). Total num frames: 2680963072. Throughput: 0: 42785.8. Samples: 2681117020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:44:26,996][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 15:44:27,125][12883] Updated weights for policy 0, policy_version 163634 (0.0035) +[2024-06-18 15:44:31,312][12883] Updated weights for policy 0, policy_version 163644 (0.0042) +[2024-06-18 15:44:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2681159680. Throughput: 0: 42636.5. Samples: 2681236800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:44:31,994][12645] Avg episode reward: [(0, '0.516')] +[2024-06-18 15:44:34,700][12883] Updated weights for policy 0, policy_version 163654 (0.0033) +[2024-06-18 15:44:36,994][12645] Fps is (10 sec: 44246.6, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2681405440. Throughput: 0: 42663.4. Samples: 2681491720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:44:36,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 15:44:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163660_2681405440.pth... +[2024-06-18 15:44:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163031_2671099904.pth +[2024-06-18 15:44:39,091][12883] Updated weights for policy 0, policy_version 163664 (0.0047) +[2024-06-18 15:44:41,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42876.4). Total num frames: 2681602048. Throughput: 0: 42730.9. Samples: 2681756000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:44:42,000][12645] Avg episode reward: [(0, '0.638')] +[2024-06-18 15:44:42,734][12883] Updated weights for policy 0, policy_version 163674 (0.0036) +[2024-06-18 15:44:46,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2681782272. Throughput: 0: 42700.0. Samples: 2681878620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:44:46,994][12645] Avg episode reward: [(0, '0.571')] +[2024-06-18 15:44:47,022][12883] Updated weights for policy 0, policy_version 163684 (0.0024) +[2024-06-18 15:44:50,238][12883] Updated weights for policy 0, policy_version 163694 (0.0034) +[2024-06-18 15:44:51,996][12645] Fps is (10 sec: 44227.6, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 2682044416. Throughput: 0: 42601.5. Samples: 2682136180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:44:51,997][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 15:44:54,666][12883] Updated weights for policy 0, policy_version 163704 (0.0043) +[2024-06-18 15:44:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42600.0, 300 sec: 42765.3). Total num frames: 2682224640. Throughput: 0: 42712.4. Samples: 2682397700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:44:56,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 15:44:57,882][12883] Updated weights for policy 0, policy_version 163714 (0.0042) +[2024-06-18 15:45:01,994][12645] Fps is (10 sec: 37691.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2682421248. Throughput: 0: 42651.2. Samples: 2682520080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:45:01,998][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 15:45:02,245][12883] Updated weights for policy 0, policy_version 163724 (0.0033) +[2024-06-18 15:45:05,443][12883] Updated weights for policy 0, policy_version 163734 (0.0028) +[2024-06-18 15:45:06,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2682683392. Throughput: 0: 42592.5. Samples: 2682777000. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:45:06,994][12645] Avg episode reward: [(0, '0.552')] +[2024-06-18 15:45:09,793][12883] Updated weights for policy 0, policy_version 163744 (0.0036) +[2024-06-18 15:45:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2682863616. Throughput: 0: 42796.8. Samples: 2683042780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:45:11,994][12645] Avg episode reward: [(0, '0.851')] +[2024-06-18 15:45:13,057][12883] Updated weights for policy 0, policy_version 163754 (0.0030) +[2024-06-18 15:45:17,000][12645] Fps is (10 sec: 37659.6, 60 sec: 42320.9, 300 sec: 42708.6). Total num frames: 2683060224. Throughput: 0: 42785.5. Samples: 2683162420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 15:45:17,000][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 15:45:17,396][12883] Updated weights for policy 0, policy_version 163764 (0.0028) +[2024-06-18 15:45:20,818][12883] Updated weights for policy 0, policy_version 163774 (0.0035) +[2024-06-18 15:45:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2683322368. Throughput: 0: 42741.4. Samples: 2683415080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 15:45:21,994][12645] Avg episode reward: [(0, '0.527')] +[2024-06-18 15:45:25,436][12883] Updated weights for policy 0, policy_version 163784 (0.0024) +[2024-06-18 15:45:26,994][12645] Fps is (10 sec: 42625.0, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 2683486208. Throughput: 0: 42642.8. Samples: 2683674920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 15:45:26,994][12645] Avg episode reward: [(0, '0.216')] +[2024-06-18 15:45:28,590][12883] Updated weights for policy 0, policy_version 163794 (0.0037) +[2024-06-18 15:45:30,880][12862] Signal inference workers to stop experience collection... 
(39250 times) +[2024-06-18 15:45:30,937][12883] InferenceWorker_p0-w0: stopping experience collection (39250 times) +[2024-06-18 15:45:30,937][12862] Signal inference workers to resume experience collection... (39250 times) +[2024-06-18 15:45:30,963][12883] InferenceWorker_p0-w0: resuming experience collection (39250 times) +[2024-06-18 15:45:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 2683715584. Throughput: 0: 42555.0. Samples: 2683793600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 15:45:31,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 15:45:33,006][12883] Updated weights for policy 0, policy_version 163804 (0.0035) +[2024-06-18 15:45:36,459][12883] Updated weights for policy 0, policy_version 163814 (0.0033) +[2024-06-18 15:45:36,994][12645] Fps is (10 sec: 47513.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2683961344. Throughput: 0: 42639.3. Samples: 2684054860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 15:45:36,994][12645] Avg episode reward: [(0, '0.449')] +[2024-06-18 15:45:40,808][12883] Updated weights for policy 0, policy_version 163824 (0.0032) +[2024-06-18 15:45:41,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 2684125184. Throughput: 0: 42415.2. Samples: 2684306380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 15:45:41,994][12645] Avg episode reward: [(0, '0.466')] +[2024-06-18 15:45:44,129][12883] Updated weights for policy 0, policy_version 163834 (0.0028) +[2024-06-18 15:45:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2684370944. Throughput: 0: 42378.1. Samples: 2684427100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 15:45:46,994][12645] Avg episode reward: [(0, '0.348')] +[2024-06-18 15:45:48,479][12883] Updated weights for policy 0, policy_version 163844 (0.0037) +[2024-06-18 15:45:51,945][12883] Updated weights for policy 0, policy_version 163854 (0.0034) +[2024-06-18 15:45:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 2684583936. Throughput: 0: 42477.8. Samples: 2684688500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 15:45:51,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 15:45:56,171][12883] Updated weights for policy 0, policy_version 163864 (0.0035) +[2024-06-18 15:45:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42710.4). Total num frames: 2684780544. Throughput: 0: 42238.1. Samples: 2684943500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 15:45:56,994][12645] Avg episode reward: [(0, '0.515')] +[2024-06-18 15:45:59,736][12883] Updated weights for policy 0, policy_version 163874 (0.0039) +[2024-06-18 15:46:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2685009920. Throughput: 0: 42436.6. Samples: 2685071800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 15:46:01,994][12645] Avg episode reward: [(0, '0.673')] +[2024-06-18 15:46:03,659][12883] Updated weights for policy 0, policy_version 163884 (0.0035) +[2024-06-18 15:46:06,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 2685206528. Throughput: 0: 42641.3. Samples: 2685333940. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 15:46:06,994][12645] Avg episode reward: [(0, '0.484')] +[2024-06-18 15:46:07,407][12883] Updated weights for policy 0, policy_version 163894 (0.0037) +[2024-06-18 15:46:11,373][12883] Updated weights for policy 0, policy_version 163904 (0.0025) +[2024-06-18 15:46:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2685435904. Throughput: 0: 42417.8. Samples: 2685583720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 15:46:11,994][12645] Avg episode reward: [(0, '0.573')] +[2024-06-18 15:46:14,939][12883] Updated weights for policy 0, policy_version 163914 (0.0033) +[2024-06-18 15:46:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43149.0, 300 sec: 42820.6). Total num frames: 2685648896. Throughput: 0: 42695.1. Samples: 2685714880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 15:46:16,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 15:46:18,889][12883] Updated weights for policy 0, policy_version 163924 (0.0021) +[2024-06-18 15:46:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2685845504. Throughput: 0: 42514.4. Samples: 2685968000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) +[2024-06-18 15:46:21,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 15:46:22,617][12883] Updated weights for policy 0, policy_version 163934 (0.0042) +[2024-06-18 15:46:26,479][12883] Updated weights for policy 0, policy_version 163944 (0.0030) +[2024-06-18 15:46:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2686058496. Throughput: 0: 42598.2. Samples: 2686223300. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 15:46:26,994][12645] Avg episode reward: [(0, '0.375')] +[2024-06-18 15:46:30,271][12883] Updated weights for policy 0, policy_version 163954 (0.0028) +[2024-06-18 15:46:31,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2686287872. Throughput: 0: 42822.7. Samples: 2686354120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 15:46:31,994][12645] Avg episode reward: [(0, '0.665')] +[2024-06-18 15:46:34,515][12883] Updated weights for policy 0, policy_version 163964 (0.0035) +[2024-06-18 15:46:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2686500864. Throughput: 0: 42629.7. Samples: 2686606840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 15:46:36,994][12645] Avg episode reward: [(0, '0.596')] +[2024-06-18 15:46:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163971_2686500864.pth... +[2024-06-18 15:46:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163346_2676260864.pth +[2024-06-18 15:46:37,979][12883] Updated weights for policy 0, policy_version 163974 (0.0031) +[2024-06-18 15:46:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2686697472. Throughput: 0: 42602.0. Samples: 2686860580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 15:46:41,994][12645] Avg episode reward: [(0, '0.278')] +[2024-06-18 15:46:42,184][12883] Updated weights for policy 0, policy_version 163984 (0.0030) +[2024-06-18 15:46:45,739][12883] Updated weights for policy 0, policy_version 163994 (0.0030) +[2024-06-18 15:46:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2686910464. Throughput: 0: 42527.0. Samples: 2686985520. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 15:46:46,994][12645] Avg episode reward: [(0, '0.600')] +[2024-06-18 15:46:50,051][12883] Updated weights for policy 0, policy_version 164004 (0.0030) +[2024-06-18 15:46:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2687123456. Throughput: 0: 42422.1. Samples: 2687242940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 15:46:51,994][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 15:46:53,344][12883] Updated weights for policy 0, policy_version 164014 (0.0043) +[2024-06-18 15:46:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2687336448. Throughput: 0: 42556.0. Samples: 2687498740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 15:46:56,994][12645] Avg episode reward: [(0, '0.357')] +[2024-06-18 15:46:57,664][12883] Updated weights for policy 0, policy_version 164024 (0.0040) +[2024-06-18 15:47:01,054][12883] Updated weights for policy 0, policy_version 164034 (0.0040) +[2024-06-18 15:47:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2687549440. Throughput: 0: 42336.8. Samples: 2687620040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 15:47:01,994][12645] Avg episode reward: [(0, '0.445')] +[2024-06-18 15:47:05,252][12883] Updated weights for policy 0, policy_version 164044 (0.0037) +[2024-06-18 15:47:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2687762432. Throughput: 0: 42410.1. Samples: 2687876460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 15:47:06,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 15:47:08,020][12862] Signal inference workers to stop experience collection... 
(39300 times) +[2024-06-18 15:47:08,071][12883] InferenceWorker_p0-w0: stopping experience collection (39300 times) +[2024-06-18 15:47:08,135][12862] Signal inference workers to resume experience collection... (39300 times) +[2024-06-18 15:47:08,135][12883] InferenceWorker_p0-w0: resuming experience collection (39300 times) +[2024-06-18 15:47:08,801][12883] Updated weights for policy 0, policy_version 164054 (0.0028) +[2024-06-18 15:47:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2687975424. Throughput: 0: 42567.9. Samples: 2688138860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 15:47:11,994][12645] Avg episode reward: [(0, '0.442')] +[2024-06-18 15:47:13,397][12883] Updated weights for policy 0, policy_version 164064 (0.0034) +[2024-06-18 15:47:16,470][12883] Updated weights for policy 0, policy_version 164074 (0.0040) +[2024-06-18 15:47:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2688204800. Throughput: 0: 42452.6. Samples: 2688264480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 15:47:16,994][12645] Avg episode reward: [(0, '0.454')] +[2024-06-18 15:47:20,985][12883] Updated weights for policy 0, policy_version 164084 (0.0034) +[2024-06-18 15:47:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2688417792. Throughput: 0: 42616.0. Samples: 2688524560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 15:47:21,994][12645] Avg episode reward: [(0, '0.602')] +[2024-06-18 15:47:24,431][12883] Updated weights for policy 0, policy_version 164094 (0.0048) +[2024-06-18 15:47:26,995][12645] Fps is (10 sec: 42591.9, 60 sec: 42870.3, 300 sec: 42653.7). Total num frames: 2688630784. Throughput: 0: 42632.3. Samples: 2688779100. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) +[2024-06-18 15:47:27,000][12645] Avg episode reward: [(0, '0.688')] +[2024-06-18 15:47:28,586][12883] Updated weights for policy 0, policy_version 164104 (0.0044) +[2024-06-18 15:47:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2688843776. Throughput: 0: 42569.4. Samples: 2688901140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:47:31,994][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 15:47:31,991][12883] Updated weights for policy 0, policy_version 164114 (0.0030) +[2024-06-18 15:47:36,099][12883] Updated weights for policy 0, policy_version 164124 (0.0029) +[2024-06-18 15:47:36,994][12645] Fps is (10 sec: 40966.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2689040384. Throughput: 0: 42630.3. Samples: 2689161300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:47:36,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 15:47:39,686][12883] Updated weights for policy 0, policy_version 164134 (0.0038) +[2024-06-18 15:47:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2689253376. Throughput: 0: 42579.0. Samples: 2689414800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:47:41,994][12645] Avg episode reward: [(0, '0.517')] +[2024-06-18 15:47:43,751][12883] Updated weights for policy 0, policy_version 164144 (0.0029) +[2024-06-18 15:47:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2689466368. Throughput: 0: 42710.7. Samples: 2689542020. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:47:47,000][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 15:47:47,929][12883] Updated weights for policy 0, policy_version 164154 (0.0037) +[2024-06-18 15:47:51,454][12883] Updated weights for policy 0, policy_version 164164 (0.0026) +[2024-06-18 15:47:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2689679360. Throughput: 0: 42663.5. Samples: 2689796320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:47:51,994][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 15:47:55,706][12883] Updated weights for policy 0, policy_version 164174 (0.0030) +[2024-06-18 15:47:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2689875968. Throughput: 0: 42548.0. Samples: 2690053520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:47:56,994][12645] Avg episode reward: [(0, '0.799')] +[2024-06-18 15:47:59,248][12883] Updated weights for policy 0, policy_version 164184 (0.0037) +[2024-06-18 15:48:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2690105344. Throughput: 0: 42467.6. Samples: 2690175520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:48:01,994][12645] Avg episode reward: [(0, '0.786')] +[2024-06-18 15:48:03,270][12883] Updated weights for policy 0, policy_version 164194 (0.0023) +[2024-06-18 15:48:06,825][12883] Updated weights for policy 0, policy_version 164204 (0.0028) +[2024-06-18 15:48:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2690318336. Throughput: 0: 42524.6. Samples: 2690438160. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:48:06,994][12645] Avg episode reward: [(0, '0.574')] +[2024-06-18 15:48:10,832][12883] Updated weights for policy 0, policy_version 164214 (0.0040) +[2024-06-18 15:48:11,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 2690514944. Throughput: 0: 42623.8. Samples: 2690697200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:48:11,996][12645] Avg episode reward: [(0, '0.732')] +[2024-06-18 15:48:14,467][12883] Updated weights for policy 0, policy_version 164224 (0.0036) +[2024-06-18 15:48:16,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2690760704. Throughput: 0: 42540.9. Samples: 2690815480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:48:16,994][12645] Avg episode reward: [(0, '0.810')] +[2024-06-18 15:48:18,827][12883] Updated weights for policy 0, policy_version 164234 (0.0024) +[2024-06-18 15:48:21,990][12883] Updated weights for policy 0, policy_version 164244 (0.0032) +[2024-06-18 15:48:21,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2690973696. Throughput: 0: 42688.9. Samples: 2691082300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:48:21,994][12645] Avg episode reward: [(0, '0.814')] +[2024-06-18 15:48:26,492][12883] Updated weights for policy 0, policy_version 164254 (0.0032) +[2024-06-18 15:48:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42053.3, 300 sec: 42542.8). Total num frames: 2691153920. Throughput: 0: 42850.7. Samples: 2691343080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:48:26,994][12645] Avg episode reward: [(0, '0.818')] +[2024-06-18 15:48:29,553][12883] Updated weights for policy 0, policy_version 164264 (0.0044) +[2024-06-18 15:48:31,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2691432448. Throughput: 0: 42739.1. 
Samples: 2691465280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:48:31,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 15:48:33,912][12883] Updated weights for policy 0, policy_version 164274 (0.0028) +[2024-06-18 15:48:36,906][12862] Signal inference workers to stop experience collection... (39350 times) +[2024-06-18 15:48:36,952][12883] InferenceWorker_p0-w0: stopping experience collection (39350 times) +[2024-06-18 15:48:36,960][12862] Signal inference workers to resume experience collection... (39350 times) +[2024-06-18 15:48:36,966][12883] InferenceWorker_p0-w0: resuming experience collection (39350 times) +[2024-06-18 15:48:36,994][12645] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2691596288. Throughput: 0: 42870.4. Samples: 2691725480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:48:36,994][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 15:48:37,088][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000164283_2691612672.pth... +[2024-06-18 15:48:37,156][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163660_2681405440.pth +[2024-06-18 15:48:37,307][12883] Updated weights for policy 0, policy_version 164284 (0.0029) +[2024-06-18 15:48:41,386][12883] Updated weights for policy 0, policy_version 164294 (0.0031) +[2024-06-18 15:48:41,994][12645] Fps is (10 sec: 36044.5, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 2691792896. Throughput: 0: 42811.4. Samples: 2691980040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:48:41,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 15:48:44,826][12883] Updated weights for policy 0, policy_version 164304 (0.0023) +[2024-06-18 15:48:46,994][12645] Fps is (10 sec: 45874.1, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 2692055040. Throughput: 0: 42989.6. Samples: 2692110060. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:48:46,994][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 15:48:48,872][12883] Updated weights for policy 0, policy_version 164314 (0.0033) +[2024-06-18 15:48:51,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42598.7). Total num frames: 2692235264. Throughput: 0: 42999.5. Samples: 2692373140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:48:51,994][12645] Avg episode reward: [(0, '0.502')] +[2024-06-18 15:48:52,478][12883] Updated weights for policy 0, policy_version 164324 (0.0027) +[2024-06-18 15:48:56,544][12883] Updated weights for policy 0, policy_version 164334 (0.0034) +[2024-06-18 15:48:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2692448256. Throughput: 0: 42787.4. Samples: 2692622540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:48:56,994][12645] Avg episode reward: [(0, '0.341')] +[2024-06-18 15:49:00,061][12883] Updated weights for policy 0, policy_version 164344 (0.0042) +[2024-06-18 15:49:01,996][12645] Fps is (10 sec: 45864.8, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 2692694016. Throughput: 0: 43000.1. Samples: 2692750580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:49:01,996][12645] Avg episode reward: [(0, '0.289')] +[2024-06-18 15:49:04,202][12883] Updated weights for policy 0, policy_version 164354 (0.0031) +[2024-06-18 15:49:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2692890624. Throughput: 0: 43040.1. Samples: 2693019100. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:49:06,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 15:49:07,520][12883] Updated weights for policy 0, policy_version 164364 (0.0033) +[2024-06-18 15:49:11,912][12883] Updated weights for policy 0, policy_version 164374 (0.0040) +[2024-06-18 15:49:11,994][12645] Fps is (10 sec: 40969.0, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 2693103616. Throughput: 0: 42672.9. Samples: 2693263360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:49:11,994][12645] Avg episode reward: [(0, '0.620')] +[2024-06-18 15:49:15,268][12883] Updated weights for policy 0, policy_version 164384 (0.0044) +[2024-06-18 15:49:16,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2693332992. Throughput: 0: 42832.4. Samples: 2693392740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:49:16,994][12645] Avg episode reward: [(0, '0.750')] +[2024-06-18 15:49:19,514][12883] Updated weights for policy 0, policy_version 164394 (0.0032) +[2024-06-18 15:49:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 2693513216. Throughput: 0: 42925.7. Samples: 2693657140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:49:21,994][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 15:49:23,195][12883] Updated weights for policy 0, policy_version 164404 (0.0034) +[2024-06-18 15:49:26,994][12645] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2693742592. Throughput: 0: 42746.4. Samples: 2693903620. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:49:26,994][12645] Avg episode reward: [(0, '0.520')] +[2024-06-18 15:49:27,708][12883] Updated weights for policy 0, policy_version 164414 (0.0026) +[2024-06-18 15:49:30,658][12883] Updated weights for policy 0, policy_version 164424 (0.0037) +[2024-06-18 15:49:31,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2693971968. Throughput: 0: 42747.3. Samples: 2694033680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:49:31,994][12645] Avg episode reward: [(0, '0.646')] +[2024-06-18 15:49:35,380][12883] Updated weights for policy 0, policy_version 164434 (0.0034) +[2024-06-18 15:49:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2694168576. Throughput: 0: 42675.1. Samples: 2694293520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:49:36,994][12645] Avg episode reward: [(0, '0.621')] +[2024-06-18 15:49:38,278][12883] Updated weights for policy 0, policy_version 164444 (0.0038) +[2024-06-18 15:49:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2694381568. Throughput: 0: 42723.1. Samples: 2694545080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 15:49:41,994][12645] Avg episode reward: [(0, '0.674')] +[2024-06-18 15:49:42,877][12883] Updated weights for policy 0, policy_version 164454 (0.0040) +[2024-06-18 15:49:44,209][12862] Signal inference workers to stop experience collection... (39400 times) +[2024-06-18 15:49:44,209][12862] Signal inference workers to resume experience collection... 
(39400 times) +[2024-06-18 15:49:44,253][12883] InferenceWorker_p0-w0: stopping experience collection (39400 times) +[2024-06-18 15:49:44,254][12883] InferenceWorker_p0-w0: resuming experience collection (39400 times) +[2024-06-18 15:49:46,238][12883] Updated weights for policy 0, policy_version 164464 (0.0037) +[2024-06-18 15:49:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42598.7). Total num frames: 2694610944. Throughput: 0: 42791.9. Samples: 2694676120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 15:49:46,994][12645] Avg episode reward: [(0, '0.312')] +[2024-06-18 15:49:50,470][12883] Updated weights for policy 0, policy_version 164474 (0.0027) +[2024-06-18 15:49:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2694791168. Throughput: 0: 42528.8. Samples: 2694932900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 15:49:51,994][12645] Avg episode reward: [(0, '0.317')] +[2024-06-18 15:49:53,691][12883] Updated weights for policy 0, policy_version 164484 (0.0029) +[2024-06-18 15:49:56,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 2695020544. Throughput: 0: 42794.8. Samples: 2695189220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 15:49:56,996][12645] Avg episode reward: [(0, '0.342')] +[2024-06-18 15:49:58,294][12883] Updated weights for policy 0, policy_version 164494 (0.0037) +[2024-06-18 15:50:01,435][12883] Updated weights for policy 0, policy_version 164504 (0.0035) +[2024-06-18 15:50:01,996][12645] Fps is (10 sec: 45865.1, 60 sec: 42598.4, 300 sec: 42598.1). Total num frames: 2695249920. Throughput: 0: 42851.8. Samples: 2695321160. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 15:50:01,997][12645] Avg episode reward: [(0, '0.378')] +[2024-06-18 15:50:06,358][12883] Updated weights for policy 0, policy_version 164514 (0.0038) +[2024-06-18 15:50:06,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2695430144. Throughput: 0: 42579.0. Samples: 2695573200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 15:50:06,994][12645] Avg episode reward: [(0, '0.485')] +[2024-06-18 15:50:09,314][12883] Updated weights for policy 0, policy_version 164524 (0.0047) +[2024-06-18 15:50:12,000][12645] Fps is (10 sec: 42581.4, 60 sec: 42867.1, 300 sec: 42765.0). Total num frames: 2695675904. Throughput: 0: 42648.3. Samples: 2695823060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 15:50:12,000][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 15:50:13,826][12883] Updated weights for policy 0, policy_version 164534 (0.0030) +[2024-06-18 15:50:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2695872512. Throughput: 0: 42790.2. Samples: 2695959240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 15:50:16,994][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 15:50:17,076][12883] Updated weights for policy 0, policy_version 164544 (0.0046) +[2024-06-18 15:50:21,319][12883] Updated weights for policy 0, policy_version 164554 (0.0031) +[2024-06-18 15:50:21,994][12645] Fps is (10 sec: 39345.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2696069120. Throughput: 0: 42560.8. Samples: 2696208760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 15:50:21,994][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 15:50:24,533][12883] Updated weights for policy 0, policy_version 164564 (0.0027) +[2024-06-18 15:50:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2696314880. Throughput: 0: 42563.6. 
Samples: 2696460440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 15:50:26,994][12645] Avg episode reward: [(0, '0.598')] +[2024-06-18 15:50:29,316][12883] Updated weights for policy 0, policy_version 164574 (0.0043) +[2024-06-18 15:50:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2696511488. Throughput: 0: 42613.7. Samples: 2696593740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 15:50:31,994][12645] Avg episode reward: [(0, '0.530')] +[2024-06-18 15:50:32,229][12883] Updated weights for policy 0, policy_version 164584 (0.0042) +[2024-06-18 15:50:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2696708096. Throughput: 0: 42492.9. Samples: 2696845080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 15:50:36,994][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 15:50:36,999][12883] Updated weights for policy 0, policy_version 164594 (0.0036) +[2024-06-18 15:50:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000164594_2696708096.pth... +[2024-06-18 15:50:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163971_2686500864.pth +[2024-06-18 15:50:39,857][12883] Updated weights for policy 0, policy_version 164604 (0.0054) +[2024-06-18 15:50:42,000][12645] Fps is (10 sec: 42571.7, 60 sec: 42593.9, 300 sec: 42597.5). Total num frames: 2696937472. Throughput: 0: 42474.8. Samples: 2697100760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) +[2024-06-18 15:50:42,001][12645] Avg episode reward: [(0, '0.473')] +[2024-06-18 15:50:44,389][12883] Updated weights for policy 0, policy_version 164614 (0.0030) +[2024-06-18 15:50:46,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2697166848. Throughput: 0: 42367.4. Samples: 2697227600. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 15:50:46,994][12645] Avg episode reward: [(0, '0.368')] +[2024-06-18 15:50:47,888][12883] Updated weights for policy 0, policy_version 164624 (0.0041) +[2024-06-18 15:50:51,777][12883] Updated weights for policy 0, policy_version 164634 (0.0035) +[2024-06-18 15:50:51,994][12645] Fps is (10 sec: 42625.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2697363456. Throughput: 0: 42522.3. Samples: 2697486700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 15:50:51,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 15:50:55,484][12883] Updated weights for policy 0, policy_version 164644 (0.0040) +[2024-06-18 15:50:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 2697592832. Throughput: 0: 42638.4. Samples: 2697741520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 15:50:56,994][12645] Avg episode reward: [(0, '0.322')] +[2024-06-18 15:50:58,008][12862] Signal inference workers to stop experience collection... (39450 times) +[2024-06-18 15:50:58,008][12862] Signal inference workers to resume experience collection... (39450 times) +[2024-06-18 15:50:58,058][12883] InferenceWorker_p0-w0: stopping experience collection (39450 times) +[2024-06-18 15:50:58,058][12883] InferenceWorker_p0-w0: resuming experience collection (39450 times) +[2024-06-18 15:50:59,329][12883] Updated weights for policy 0, policy_version 164654 (0.0031) +[2024-06-18 15:51:01,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42598.4, 300 sec: 42709.2). Total num frames: 2697805824. Throughput: 0: 42508.1. Samples: 2697872200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 15:51:01,996][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 15:51:03,152][12883] Updated weights for policy 0, policy_version 164664 (0.0033) +[2024-06-18 15:51:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2698002432. 
Throughput: 0: 42736.1. Samples: 2698131880. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 15:51:06,994][12645] Avg episode reward: [(0, '0.741')] +[2024-06-18 15:51:07,112][12883] Updated weights for policy 0, policy_version 164674 (0.0038) +[2024-06-18 15:51:10,869][12883] Updated weights for policy 0, policy_version 164684 (0.0038) +[2024-06-18 15:51:11,994][12645] Fps is (10 sec: 44246.9, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 2698248192. Throughput: 0: 42632.5. Samples: 2698378900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 15:51:11,994][12645] Avg episode reward: [(0, '0.607')] +[2024-06-18 15:51:15,017][12883] Updated weights for policy 0, policy_version 164694 (0.0023) +[2024-06-18 15:51:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2698444800. Throughput: 0: 42710.2. Samples: 2698515700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 15:51:16,994][12645] Avg episode reward: [(0, '0.472')] +[2024-06-18 15:51:18,289][12883] Updated weights for policy 0, policy_version 164704 (0.0023) +[2024-06-18 15:51:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2698625024. Throughput: 0: 42908.6. Samples: 2698775960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 15:51:21,994][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 15:51:22,681][12883] Updated weights for policy 0, policy_version 164714 (0.0030) +[2024-06-18 15:51:25,773][12883] Updated weights for policy 0, policy_version 164724 (0.0042) +[2024-06-18 15:51:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2698887168. Throughput: 0: 42806.4. Samples: 2699026780. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 15:51:26,994][12645] Avg episode reward: [(0, '0.557')] +[2024-06-18 15:51:30,093][12883] Updated weights for policy 0, policy_version 164734 (0.0028) +[2024-06-18 15:51:31,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2699083776. Throughput: 0: 43009.9. Samples: 2699163040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 15:51:31,994][12645] Avg episode reward: [(0, '0.589')] +[2024-06-18 15:51:33,200][12883] Updated weights for policy 0, policy_version 164744 (0.0022) +[2024-06-18 15:51:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2699296768. Throughput: 0: 42983.6. Samples: 2699420960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 15:51:36,994][12645] Avg episode reward: [(0, '0.714')] +[2024-06-18 15:51:37,478][12883] Updated weights for policy 0, policy_version 164754 (0.0032) +[2024-06-18 15:51:40,862][12883] Updated weights for policy 0, policy_version 164764 (0.0029) +[2024-06-18 15:51:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43149.1, 300 sec: 42765.0). Total num frames: 2699526144. Throughput: 0: 42885.8. Samples: 2699671380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 15:51:41,994][12645] Avg episode reward: [(0, '0.755')] +[2024-06-18 15:51:45,237][12883] Updated weights for policy 0, policy_version 164774 (0.0040) +[2024-06-18 15:51:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2699739136. Throughput: 0: 43067.9. Samples: 2699810160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) +[2024-06-18 15:51:46,994][12645] Avg episode reward: [(0, '0.818')] +[2024-06-18 15:51:48,688][12883] Updated weights for policy 0, policy_version 164784 (0.0042) +[2024-06-18 15:51:51,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2699919360. Throughput: 0: 42768.3. 
Samples: 2700056460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 15:51:51,994][12645] Avg episode reward: [(0, '0.379')] +[2024-06-18 15:51:52,885][12883] Updated weights for policy 0, policy_version 164794 (0.0025) +[2024-06-18 15:51:56,338][12883] Updated weights for policy 0, policy_version 164804 (0.0035) +[2024-06-18 15:51:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2700165120. Throughput: 0: 42902.1. Samples: 2700309500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 15:51:56,999][12645] Avg episode reward: [(0, '0.364')] +[2024-06-18 15:52:00,346][12883] Updated weights for policy 0, policy_version 164814 (0.0029) +[2024-06-18 15:52:01,994][12645] Fps is (10 sec: 47514.7, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 2700394496. Throughput: 0: 43083.7. Samples: 2700454460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 15:52:01,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 15:52:04,142][12883] Updated weights for policy 0, policy_version 164824 (0.0042) +[2024-06-18 15:52:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2700574720. Throughput: 0: 42877.3. Samples: 2700705440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 15:52:06,994][12645] Avg episode reward: [(0, '0.521')] +[2024-06-18 15:52:08,253][12883] Updated weights for policy 0, policy_version 164834 (0.0037) +[2024-06-18 15:52:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2700787712. Throughput: 0: 43072.0. Samples: 2700965020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 15:52:11,994][12645] Avg episode reward: [(0, '0.540')] +[2024-06-18 15:52:12,011][12883] Updated weights for policy 0, policy_version 164844 (0.0042) +[2024-06-18 15:52:15,768][12883] Updated weights for policy 0, policy_version 164854 (0.0035) +[2024-06-18 15:52:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2701033472. Throughput: 0: 42931.6. Samples: 2701094960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 15:52:16,994][12645] Avg episode reward: [(0, '0.561')] +[2024-06-18 15:52:17,288][12862] Signal inference workers to stop experience collection... (39500 times) +[2024-06-18 15:52:17,321][12883] InferenceWorker_p0-w0: stopping experience collection (39500 times) +[2024-06-18 15:52:17,344][12862] Signal inference workers to resume experience collection... (39500 times) +[2024-06-18 15:52:17,348][12883] InferenceWorker_p0-w0: resuming experience collection (39500 times) +[2024-06-18 15:52:19,543][12883] Updated weights for policy 0, policy_version 164864 (0.0034) +[2024-06-18 15:52:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42654.2). Total num frames: 2701213696. Throughput: 0: 42986.6. Samples: 2701355360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 15:52:21,994][12645] Avg episode reward: [(0, '0.592')] +[2024-06-18 15:52:23,083][12883] Updated weights for policy 0, policy_version 164874 (0.0041) +[2024-06-18 15:52:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2701459456. Throughput: 0: 43036.5. Samples: 2701608020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 15:52:26,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 15:52:26,997][12883] Updated weights for policy 0, policy_version 164884 (0.0040) +[2024-06-18 15:52:30,599][12883] Updated weights for policy 0, policy_version 164894 (0.0027) +[2024-06-18 15:52:31,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 2701672448. Throughput: 0: 42952.8. Samples: 2701743040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 15:52:31,995][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 15:52:34,464][12883] Updated weights for policy 0, policy_version 164904 (0.0029) +[2024-06-18 15:52:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2701869056. Throughput: 0: 43231.3. Samples: 2702001860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 15:52:36,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 15:52:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000164910_2701885440.pth... +[2024-06-18 15:52:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000164283_2691612672.pth +[2024-06-18 15:52:38,435][12883] Updated weights for policy 0, policy_version 164914 (0.0026) +[2024-06-18 15:52:41,971][12883] Updated weights for policy 0, policy_version 164924 (0.0025) +[2024-06-18 15:52:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2702114816. Throughput: 0: 43205.8. Samples: 2702253760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 15:52:41,994][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 15:52:46,039][12883] Updated weights for policy 0, policy_version 164934 (0.0023) +[2024-06-18 15:52:46,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2702311424. Throughput: 0: 42942.5. Samples: 2702386880. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 15:52:46,994][12645] Avg episode reward: [(0, '0.572')] +[2024-06-18 15:52:49,496][12883] Updated weights for policy 0, policy_version 164944 (0.0031) +[2024-06-18 15:52:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 2702524416. Throughput: 0: 43059.1. Samples: 2702643100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 15:52:51,994][12645] Avg episode reward: [(0, '0.602')] +[2024-06-18 15:52:53,486][12883] Updated weights for policy 0, policy_version 164954 (0.0033) +[2024-06-18 15:52:56,994][12645] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2702753792. Throughput: 0: 42914.7. Samples: 2702896180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 15:52:56,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 15:52:57,091][12883] Updated weights for policy 0, policy_version 164964 (0.0032) +[2024-06-18 15:53:01,335][12883] Updated weights for policy 0, policy_version 164974 (0.0032) +[2024-06-18 15:53:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2702950400. Throughput: 0: 42922.1. Samples: 2703026460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 15:53:01,994][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 15:53:04,607][12883] Updated weights for policy 0, policy_version 164984 (0.0037) +[2024-06-18 15:53:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 2703163392. Throughput: 0: 42865.4. Samples: 2703284300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 15:53:06,994][12645] Avg episode reward: [(0, '0.525')] +[2024-06-18 15:53:08,911][12883] Updated weights for policy 0, policy_version 164994 (0.0027) +[2024-06-18 15:53:11,996][12645] Fps is (10 sec: 45864.2, 60 sec: 43688.9, 300 sec: 42875.7). Total num frames: 2703409152. Throughput: 0: 42947.4. 
Samples: 2703540760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 15:53:11,997][12645] Avg episode reward: [(0, '0.607')] +[2024-06-18 15:53:12,526][12883] Updated weights for policy 0, policy_version 165004 (0.0032) +[2024-06-18 15:53:16,653][12883] Updated weights for policy 0, policy_version 165014 (0.0038) +[2024-06-18 15:53:16,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2703605760. Throughput: 0: 42844.0. Samples: 2703671020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 15:53:16,994][12645] Avg episode reward: [(0, '0.518')] +[2024-06-18 15:53:19,949][12883] Updated weights for policy 0, policy_version 165024 (0.0040) +[2024-06-18 15:53:21,994][12645] Fps is (10 sec: 40969.0, 60 sec: 43417.4, 300 sec: 42931.6). Total num frames: 2703818752. Throughput: 0: 42823.3. Samples: 2703928920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 15:53:21,995][12645] Avg episode reward: [(0, '0.539')] +[2024-06-18 15:53:24,414][12883] Updated weights for policy 0, policy_version 165034 (0.0029) +[2024-06-18 15:53:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2704048128. Throughput: 0: 42942.3. Samples: 2704186160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 15:53:26,994][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 15:53:27,619][12883] Updated weights for policy 0, policy_version 165044 (0.0030) +[2024-06-18 15:53:31,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2704228352. Throughput: 0: 42712.5. Samples: 2704308940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 15:53:31,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 15:53:32,239][12883] Updated weights for policy 0, policy_version 165054 (0.0047) +[2024-06-18 15:53:33,736][12862] Signal inference workers to stop experience collection... 
(39550 times) +[2024-06-18 15:53:33,790][12883] InferenceWorker_p0-w0: stopping experience collection (39550 times) +[2024-06-18 15:53:33,854][12862] Signal inference workers to resume experience collection... (39550 times) +[2024-06-18 15:53:33,854][12883] InferenceWorker_p0-w0: resuming experience collection (39550 times) +[2024-06-18 15:53:35,176][12883] Updated weights for policy 0, policy_version 165064 (0.0034) +[2024-06-18 15:53:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2704441344. Throughput: 0: 42697.9. Samples: 2704564500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 15:53:36,994][12645] Avg episode reward: [(0, '0.664')] +[2024-06-18 15:53:39,795][12883] Updated weights for policy 0, policy_version 165074 (0.0036) +[2024-06-18 15:53:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2704654336. Throughput: 0: 42897.2. Samples: 2704826560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 15:53:41,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 15:53:43,283][12883] Updated weights for policy 0, policy_version 165084 (0.0035) +[2024-06-18 15:53:46,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2704867328. Throughput: 0: 42823.5. Samples: 2704953520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 15:53:46,994][12645] Avg episode reward: [(0, '0.328')] +[2024-06-18 15:53:47,460][12883] Updated weights for policy 0, policy_version 165094 (0.0044) +[2024-06-18 15:53:50,908][12883] Updated weights for policy 0, policy_version 165104 (0.0036) +[2024-06-18 15:53:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2705096704. Throughput: 0: 42681.7. Samples: 2705204980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 15:53:51,994][12645] Avg episode reward: [(0, '0.481')] +[2024-06-18 15:53:54,866][12883] Updated weights for policy 0, policy_version 165114 (0.0032) +[2024-06-18 15:53:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 2705293312. Throughput: 0: 42825.8. Samples: 2705467820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 15:53:56,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 15:53:58,373][12883] Updated weights for policy 0, policy_version 165124 (0.0023) +[2024-06-18 15:54:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2705506304. Throughput: 0: 42675.3. Samples: 2705591400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 15:54:01,994][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 15:54:02,452][12883] Updated weights for policy 0, policy_version 165134 (0.0030) +[2024-06-18 15:54:06,095][12883] Updated weights for policy 0, policy_version 165144 (0.0032) +[2024-06-18 15:54:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2705735680. Throughput: 0: 42701.6. Samples: 2705850480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 15:54:06,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 15:54:10,402][12883] Updated weights for policy 0, policy_version 165154 (0.0033) +[2024-06-18 15:54:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41780.9, 300 sec: 42654.0). Total num frames: 2705915904. Throughput: 0: 42719.1. Samples: 2706108520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 15:54:11,994][12645] Avg episode reward: [(0, '0.353')] +[2024-06-18 15:54:13,696][12883] Updated weights for policy 0, policy_version 165164 (0.0029) +[2024-06-18 15:54:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2706161664. Throughput: 0: 42787.1. 
Samples: 2706234360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 15:54:16,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 15:54:17,954][12883] Updated weights for policy 0, policy_version 165174 (0.0040) +[2024-06-18 15:54:21,274][12883] Updated weights for policy 0, policy_version 165184 (0.0029) +[2024-06-18 15:54:21,994][12645] Fps is (10 sec: 47512.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2706391040. Throughput: 0: 42922.9. Samples: 2706496040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 15:54:21,994][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 15:54:25,625][12883] Updated weights for policy 0, policy_version 165194 (0.0045) +[2024-06-18 15:54:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2706571264. Throughput: 0: 42792.9. Samples: 2706752240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 15:54:26,994][12645] Avg episode reward: [(0, '0.335')] +[2024-06-18 15:54:28,977][12883] Updated weights for policy 0, policy_version 165204 (0.0034) +[2024-06-18 15:54:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2706817024. Throughput: 0: 42829.4. Samples: 2706880840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 15:54:31,994][12645] Avg episode reward: [(0, '0.510')] +[2024-06-18 15:54:33,248][12883] Updated weights for policy 0, policy_version 165214 (0.0037) +[2024-06-18 15:54:36,659][12883] Updated weights for policy 0, policy_version 165224 (0.0028) +[2024-06-18 15:54:36,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2707030016. Throughput: 0: 43014.7. Samples: 2707140640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 15:54:36,994][12645] Avg episode reward: [(0, '0.690')] +[2024-06-18 15:54:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000165225_2707046400.pth... 
+[2024-06-18 15:54:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000164594_2696708096.pth +[2024-06-18 15:54:40,852][12883] Updated weights for policy 0, policy_version 165234 (0.0033) +[2024-06-18 15:54:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2707210240. Throughput: 0: 42975.9. Samples: 2707401740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 15:54:41,994][12645] Avg episode reward: [(0, '0.514')] +[2024-06-18 15:54:44,457][12883] Updated weights for policy 0, policy_version 165244 (0.0045) +[2024-06-18 15:54:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2707456000. Throughput: 0: 42953.7. Samples: 2707524320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 15:54:46,994][12645] Avg episode reward: [(0, '0.639')] +[2024-06-18 15:54:48,649][12883] Updated weights for policy 0, policy_version 165254 (0.0044) +[2024-06-18 15:54:51,576][12862] Signal inference workers to stop experience collection... (39600 times) +[2024-06-18 15:54:51,576][12862] Signal inference workers to resume experience collection... (39600 times) +[2024-06-18 15:54:51,616][12883] InferenceWorker_p0-w0: stopping experience collection (39600 times) +[2024-06-18 15:54:51,616][12883] InferenceWorker_p0-w0: resuming experience collection (39600 times) +[2024-06-18 15:54:51,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 2707668992. Throughput: 0: 42886.1. Samples: 2707780360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 15:54:51,994][12645] Avg episode reward: [(0, '0.542')] +[2024-06-18 15:54:52,064][12883] Updated weights for policy 0, policy_version 165264 (0.0037) +[2024-06-18 15:54:56,276][12883] Updated weights for policy 0, policy_version 165274 (0.0027) +[2024-06-18 15:54:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.3). 
Total num frames: 2707865600. Throughput: 0: 42858.6. Samples: 2708037160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) +[2024-06-18 15:54:56,994][12645] Avg episode reward: [(0, '0.272')] +[2024-06-18 15:55:00,088][12883] Updated weights for policy 0, policy_version 165284 (0.0043) +[2024-06-18 15:55:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 2708111360. Throughput: 0: 42938.7. Samples: 2708166600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:55:01,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 15:55:03,724][12883] Updated weights for policy 0, policy_version 165294 (0.0036) +[2024-06-18 15:55:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 2708307968. Throughput: 0: 42968.0. Samples: 2708429600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:55:06,994][12645] Avg episode reward: [(0, '0.186')] +[2024-06-18 15:55:07,455][12883] Updated weights for policy 0, policy_version 165304 (0.0039) +[2024-06-18 15:55:11,226][12883] Updated weights for policy 0, policy_version 165314 (0.0036) +[2024-06-18 15:55:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2708520960. Throughput: 0: 42983.6. Samples: 2708686500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:55:11,994][12645] Avg episode reward: [(0, '0.343')] +[2024-06-18 15:55:15,127][12883] Updated weights for policy 0, policy_version 165324 (0.0032) +[2024-06-18 15:55:16,994][12645] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2708750336. Throughput: 0: 42982.8. Samples: 2708815060. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:55:16,994][12645] Avg episode reward: [(0, '0.463')] +[2024-06-18 15:55:18,823][12883] Updated weights for policy 0, policy_version 165334 (0.0032) +[2024-06-18 15:55:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2708946944. Throughput: 0: 42988.0. Samples: 2709075100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:55:21,994][12645] Avg episode reward: [(0, '0.718')] +[2024-06-18 15:55:22,675][12883] Updated weights for policy 0, policy_version 165344 (0.0028) +[2024-06-18 15:55:26,387][12883] Updated weights for policy 0, policy_version 165354 (0.0026) +[2024-06-18 15:55:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2709176320. Throughput: 0: 42808.5. Samples: 2709328120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:55:26,994][12645] Avg episode reward: [(0, '0.739')] +[2024-06-18 15:55:30,311][12883] Updated weights for policy 0, policy_version 165364 (0.0029) +[2024-06-18 15:55:31,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42869.9, 300 sec: 42986.9). Total num frames: 2709389312. Throughput: 0: 42928.5. Samples: 2709456200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:55:31,997][12645] Avg episode reward: [(0, '0.518')] +[2024-06-18 15:55:33,874][12883] Updated weights for policy 0, policy_version 165374 (0.0039) +[2024-06-18 15:55:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42932.5). Total num frames: 2709602304. Throughput: 0: 43088.9. Samples: 2709719360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:55:36,994][12645] Avg episode reward: [(0, '0.618')] +[2024-06-18 15:55:38,229][12883] Updated weights for policy 0, policy_version 165384 (0.0037) +[2024-06-18 15:55:41,994][12645] Fps is (10 sec: 40969.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2709798912. Throughput: 0: 42940.9. 
Samples: 2709969500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:55:41,994][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 15:55:42,016][12883] Updated weights for policy 0, policy_version 165394 (0.0048) +[2024-06-18 15:55:46,005][12883] Updated weights for policy 0, policy_version 165404 (0.0043) +[2024-06-18 15:55:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2710028288. Throughput: 0: 42968.9. Samples: 2710100200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:55:46,994][12645] Avg episode reward: [(0, '0.602')] +[2024-06-18 15:55:49,678][12883] Updated weights for policy 0, policy_version 165414 (0.0036) +[2024-06-18 15:55:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2710224896. Throughput: 0: 42909.4. Samples: 2710360520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:55:51,994][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 15:55:53,654][12883] Updated weights for policy 0, policy_version 165424 (0.0034) +[2024-06-18 15:55:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 2710454272. Throughput: 0: 42722.8. Samples: 2710609020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:55:56,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 15:55:57,503][12883] Updated weights for policy 0, policy_version 165434 (0.0042) +[2024-06-18 15:56:01,353][12883] Updated weights for policy 0, policy_version 165444 (0.0043) +[2024-06-18 15:56:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2710667264. Throughput: 0: 42868.4. Samples: 2710744140. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) +[2024-06-18 15:56:01,994][12645] Avg episode reward: [(0, '0.503')] +[2024-06-18 15:56:05,007][12883] Updated weights for policy 0, policy_version 165454 (0.0026) +[2024-06-18 15:56:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2710863872. Throughput: 0: 42751.1. Samples: 2710998900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 15:56:06,994][12645] Avg episode reward: [(0, '0.456')] +[2024-06-18 15:56:09,042][12883] Updated weights for policy 0, policy_version 165464 (0.0032) +[2024-06-18 15:56:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2711109632. Throughput: 0: 42677.8. Samples: 2711248620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 15:56:11,994][12645] Avg episode reward: [(0, '0.640')] +[2024-06-18 15:56:12,595][12883] Updated weights for policy 0, policy_version 165474 (0.0039) +[2024-06-18 15:56:16,633][12883] Updated weights for policy 0, policy_version 165484 (0.0033) +[2024-06-18 15:56:16,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 2711322624. Throughput: 0: 42782.1. Samples: 2711381300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 15:56:16,994][12645] Avg episode reward: [(0, '0.587')] +[2024-06-18 15:56:17,532][12862] Signal inference workers to stop experience collection... (39650 times) +[2024-06-18 15:56:17,571][12883] InferenceWorker_p0-w0: stopping experience collection (39650 times) +[2024-06-18 15:56:17,591][12862] Signal inference workers to resume experience collection... (39650 times) +[2024-06-18 15:56:17,593][12883] InferenceWorker_p0-w0: resuming experience collection (39650 times) +[2024-06-18 15:56:20,738][12883] Updated weights for policy 0, policy_version 165494 (0.0038) +[2024-06-18 15:56:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2711502848. 
Throughput: 0: 42602.4. Samples: 2711636460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 15:56:21,994][12645] Avg episode reward: [(0, '0.708')] +[2024-06-18 15:56:24,205][12883] Updated weights for policy 0, policy_version 165504 (0.0027) +[2024-06-18 15:56:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2711732224. Throughput: 0: 42539.0. Samples: 2711883760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 15:56:26,994][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 15:56:28,357][12883] Updated weights for policy 0, policy_version 165514 (0.0035) +[2024-06-18 15:56:31,757][12883] Updated weights for policy 0, policy_version 165524 (0.0033) +[2024-06-18 15:56:31,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 2711961600. Throughput: 0: 42644.0. Samples: 2712019180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 15:56:31,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 15:56:35,871][12883] Updated weights for policy 0, policy_version 165534 (0.0036) +[2024-06-18 15:56:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2712141824. Throughput: 0: 42579.0. Samples: 2712276580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 15:56:36,994][12645] Avg episode reward: [(0, '0.606')] +[2024-06-18 15:56:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000165536_2712141824.pth... +[2024-06-18 15:56:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000164910_2701885440.pth +[2024-06-18 15:56:39,508][12883] Updated weights for policy 0, policy_version 165544 (0.0038) +[2024-06-18 15:56:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2712387584. Throughput: 0: 42608.9. Samples: 2712526420. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 15:56:41,994][12645] Avg episode reward: [(0, '0.738')] +[2024-06-18 15:56:43,401][12883] Updated weights for policy 0, policy_version 165554 (0.0031) +[2024-06-18 15:56:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 2712584192. Throughput: 0: 42642.6. Samples: 2712663060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 15:56:46,994][12645] Avg episode reward: [(0, '0.729')] +[2024-06-18 15:56:47,129][12883] Updated weights for policy 0, policy_version 165564 (0.0042) +[2024-06-18 15:56:51,003][12883] Updated weights for policy 0, policy_version 165574 (0.0029) +[2024-06-18 15:56:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2712780800. Throughput: 0: 42645.4. Samples: 2712917940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 15:56:51,994][12645] Avg episode reward: [(0, '0.509')] +[2024-06-18 15:56:54,765][12883] Updated weights for policy 0, policy_version 165584 (0.0028) +[2024-06-18 15:56:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2713026560. Throughput: 0: 42570.6. Samples: 2713164300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 15:56:56,994][12645] Avg episode reward: [(0, '0.688')] +[2024-06-18 15:56:58,591][12883] Updated weights for policy 0, policy_version 165594 (0.0042) +[2024-06-18 15:57:01,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42598.2, 300 sec: 42876.1). Total num frames: 2713223168. Throughput: 0: 42695.9. Samples: 2713302620. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 15:57:01,995][12645] Avg episode reward: [(0, '0.411')] +[2024-06-18 15:57:02,444][12883] Updated weights for policy 0, policy_version 165604 (0.0041) +[2024-06-18 15:57:06,232][12883] Updated weights for policy 0, policy_version 165614 (0.0031) +[2024-06-18 15:57:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2713436160. Throughput: 0: 42844.4. Samples: 2713564460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) +[2024-06-18 15:57:06,994][12645] Avg episode reward: [(0, '0.547')] +[2024-06-18 15:57:10,153][12883] Updated weights for policy 0, policy_version 165624 (0.0026) +[2024-06-18 15:57:10,671][12862] Signal inference workers to stop experience collection... (39700 times) +[2024-06-18 15:57:10,672][12862] Signal inference workers to resume experience collection... (39700 times) +[2024-06-18 15:57:10,724][12883] InferenceWorker_p0-w0: stopping experience collection (39700 times) +[2024-06-18 15:57:10,724][12883] InferenceWorker_p0-w0: resuming experience collection (39700 times) +[2024-06-18 15:57:11,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2713681920. Throughput: 0: 42740.9. Samples: 2713807100. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) +[2024-06-18 15:57:11,994][12645] Avg episode reward: [(0, '0.537')] +[2024-06-18 15:57:14,082][12883] Updated weights for policy 0, policy_version 165634 (0.0040) +[2024-06-18 15:57:16,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 2713829376. Throughput: 0: 42662.2. Samples: 2713938980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) +[2024-06-18 15:57:16,994][12645] Avg episode reward: [(0, '0.663')] +[2024-06-18 15:57:17,776][12883] Updated weights for policy 0, policy_version 165644 (0.0029) +[2024-06-18 15:57:21,994][12645] Fps is (10 sec: 37684.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2714058752. 
Throughput: 0: 42649.1. Samples: 2714195780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) +[2024-06-18 15:57:21,994][12645] Avg episode reward: [(0, '0.459')] +[2024-06-18 15:57:22,027][12883] Updated weights for policy 0, policy_version 165654 (0.0029) +[2024-06-18 15:57:25,448][12883] Updated weights for policy 0, policy_version 165664 (0.0036) +[2024-06-18 15:57:26,994][12645] Fps is (10 sec: 49151.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2714320896. Throughput: 0: 42698.6. Samples: 2714447860. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) +[2024-06-18 15:57:26,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 15:57:29,363][12883] Updated weights for policy 0, policy_version 165674 (0.0024) +[2024-06-18 15:57:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 2714484736. Throughput: 0: 42654.3. Samples: 2714582500. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) +[2024-06-18 15:57:31,994][12645] Avg episode reward: [(0, '0.349')] +[2024-06-18 15:57:32,877][12883] Updated weights for policy 0, policy_version 165684 (0.0048) +[2024-06-18 15:57:36,904][12883] Updated weights for policy 0, policy_version 165694 (0.0039) +[2024-06-18 15:57:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2714730496. Throughput: 0: 42775.0. Samples: 2714842820. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) +[2024-06-18 15:57:36,996][12645] Avg episode reward: [(0, '0.363')] +[2024-06-18 15:57:40,619][12883] Updated weights for policy 0, policy_version 165704 (0.0029) +[2024-06-18 15:57:41,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2714959872. Throughput: 0: 42890.2. Samples: 2715094360. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) +[2024-06-18 15:57:41,994][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 15:57:44,584][12883] Updated weights for policy 0, policy_version 165714 (0.0028) +[2024-06-18 15:57:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2715140096. Throughput: 0: 42806.4. Samples: 2715228900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) +[2024-06-18 15:57:46,994][12645] Avg episode reward: [(0, '0.520')] +[2024-06-18 15:57:48,081][12883] Updated weights for policy 0, policy_version 165724 (0.0035) +[2024-06-18 15:57:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2715369472. Throughput: 0: 42640.8. Samples: 2715483300. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) +[2024-06-18 15:57:51,994][12645] Avg episode reward: [(0, '0.585')] +[2024-06-18 15:57:52,092][12883] Updated weights for policy 0, policy_version 165734 (0.0031) +[2024-06-18 15:57:55,579][12883] Updated weights for policy 0, policy_version 165744 (0.0033) +[2024-06-18 15:57:56,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2715598848. Throughput: 0: 42902.7. Samples: 2715737720. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) +[2024-06-18 15:57:56,996][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 15:57:59,825][12883] Updated weights for policy 0, policy_version 165754 (0.0036) +[2024-06-18 15:58:01,998][12645] Fps is (10 sec: 40943.2, 60 sec: 42595.6, 300 sec: 42764.4). Total num frames: 2715779072. Throughput: 0: 42914.7. Samples: 2715870320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) +[2024-06-18 15:58:02,004][12645] Avg episode reward: [(0, '0.733')] +[2024-06-18 15:58:03,224][12862] Signal inference workers to stop experience collection... 
(39750 times) +[2024-06-18 15:58:03,270][12883] InferenceWorker_p0-w0: stopping experience collection (39750 times) +[2024-06-18 15:58:03,277][12862] Signal inference workers to resume experience collection... (39750 times) +[2024-06-18 15:58:03,291][12883] InferenceWorker_p0-w0: resuming experience collection (39750 times) +[2024-06-18 15:58:03,407][12883] Updated weights for policy 0, policy_version 165764 (0.0030) +[2024-06-18 15:58:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 2716008448. Throughput: 0: 42826.0. Samples: 2716122960. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) +[2024-06-18 15:58:06,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 15:58:07,457][12883] Updated weights for policy 0, policy_version 165774 (0.0033) +[2024-06-18 15:58:11,445][12883] Updated weights for policy 0, policy_version 165784 (0.0029) +[2024-06-18 15:58:11,994][12645] Fps is (10 sec: 44255.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2716221440. Throughput: 0: 42853.0. Samples: 2716376240. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) +[2024-06-18 15:58:11,994][12645] Avg episode reward: [(0, '0.647')] +[2024-06-18 15:58:15,055][12883] Updated weights for policy 0, policy_version 165794 (0.0036) +[2024-06-18 15:58:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2716418048. Throughput: 0: 42838.2. Samples: 2716510220. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 15:58:16,994][12645] Avg episode reward: [(0, '0.647')] +[2024-06-18 15:58:18,855][12883] Updated weights for policy 0, policy_version 165804 (0.0036) +[2024-06-18 15:58:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2716647424. Throughput: 0: 42696.9. Samples: 2716764180. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 15:58:21,994][12645] Avg episode reward: [(0, '0.601')] +[2024-06-18 15:58:22,748][12883] Updated weights for policy 0, policy_version 165814 (0.0031) +[2024-06-18 15:58:26,372][12883] Updated weights for policy 0, policy_version 165824 (0.0040) +[2024-06-18 15:58:26,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42596.8, 300 sec: 42875.8). Total num frames: 2716876800. Throughput: 0: 42776.1. Samples: 2717019380. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 15:58:26,996][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 15:58:30,479][12883] Updated weights for policy 0, policy_version 165834 (0.0035) +[2024-06-18 15:58:31,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2717057024. Throughput: 0: 42660.5. Samples: 2717148620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 15:58:31,994][12645] Avg episode reward: [(0, '0.475')] +[2024-06-18 15:58:34,421][12883] Updated weights for policy 0, policy_version 165844 (0.0036) +[2024-06-18 15:58:36,994][12645] Fps is (10 sec: 42608.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2717302784. Throughput: 0: 42784.1. Samples: 2717408580. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 15:58:36,994][12645] Avg episode reward: [(0, '0.221')] +[2024-06-18 15:58:37,071][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000165852_2717319168.pth... +[2024-06-18 15:58:37,147][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000165225_2707046400.pth +[2024-06-18 15:58:38,515][12883] Updated weights for policy 0, policy_version 165854 (0.0024) +[2024-06-18 15:58:41,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2717499392. Throughput: 0: 42600.0. Samples: 2717654720. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 15:58:41,994][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 15:58:42,263][12883] Updated weights for policy 0, policy_version 165864 (0.0033) +[2024-06-18 15:58:46,050][12883] Updated weights for policy 0, policy_version 165874 (0.0030) +[2024-06-18 15:58:47,000][12645] Fps is (10 sec: 40933.8, 60 sec: 42867.0, 300 sec: 42764.1). Total num frames: 2717712384. Throughput: 0: 42475.3. Samples: 2717781800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 15:58:47,001][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 15:58:49,919][12883] Updated weights for policy 0, policy_version 165884 (0.0028) +[2024-06-18 15:58:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2717941760. Throughput: 0: 42580.5. Samples: 2718039080. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 15:58:51,994][12645] Avg episode reward: [(0, '0.501')] +[2024-06-18 15:58:53,662][12883] Updated weights for policy 0, policy_version 165894 (0.0036) +[2024-06-18 15:58:56,994][12645] Fps is (10 sec: 44264.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2718154752. Throughput: 0: 42704.8. Samples: 2718297960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 15:58:56,994][12645] Avg episode reward: [(0, '0.458')] +[2024-06-18 15:58:57,496][12883] Updated weights for policy 0, policy_version 165904 (0.0035) +[2024-06-18 15:59:01,319][12883] Updated weights for policy 0, policy_version 165914 (0.0035) +[2024-06-18 15:59:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42874.5, 300 sec: 42765.0). Total num frames: 2718351360. Throughput: 0: 42490.7. Samples: 2718422300. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 15:59:01,994][12645] Avg episode reward: [(0, '0.665')] +[2024-06-18 15:59:05,394][12883] Updated weights for policy 0, policy_version 165924 (0.0044) +[2024-06-18 15:59:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2718580736. Throughput: 0: 42637.8. Samples: 2718682880. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 15:59:06,994][12645] Avg episode reward: [(0, '0.759')] +[2024-06-18 15:59:08,829][12883] Updated weights for policy 0, policy_version 165934 (0.0041) +[2024-06-18 15:59:11,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2718793728. Throughput: 0: 42607.9. Samples: 2718936640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 15:59:11,994][12645] Avg episode reward: [(0, '0.645')] +[2024-06-18 15:59:12,928][12883] Updated weights for policy 0, policy_version 165944 (0.0030) +[2024-06-18 15:59:16,578][12883] Updated weights for policy 0, policy_version 165954 (0.0027) +[2024-06-18 15:59:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2719006720. Throughput: 0: 42571.4. Samples: 2719064340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) +[2024-06-18 15:59:16,994][12645] Avg episode reward: [(0, '0.479')] +[2024-06-18 15:59:20,636][12883] Updated weights for policy 0, policy_version 165964 (0.0046) +[2024-06-18 15:59:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2719203328. Throughput: 0: 42595.1. Samples: 2719325360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:59:21,994][12645] Avg episode reward: [(0, '0.259')] +[2024-06-18 15:59:24,217][12883] Updated weights for policy 0, policy_version 165974 (0.0037) +[2024-06-18 15:59:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 2719432704. Throughput: 0: 42762.2. 
Samples: 2719579020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:59:26,994][12645] Avg episode reward: [(0, '0.448')] +[2024-06-18 15:59:28,687][12883] Updated weights for policy 0, policy_version 165984 (0.0035) +[2024-06-18 15:59:29,612][12862] Signal inference workers to stop experience collection... (39800 times) +[2024-06-18 15:59:29,612][12862] Signal inference workers to resume experience collection... (39800 times) +[2024-06-18 15:59:29,659][12883] InferenceWorker_p0-w0: stopping experience collection (39800 times) +[2024-06-18 15:59:29,660][12883] InferenceWorker_p0-w0: resuming experience collection (39800 times) +[2024-06-18 15:59:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2719629312. Throughput: 0: 42864.2. Samples: 2719710420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:59:31,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 15:59:32,149][12883] Updated weights for policy 0, policy_version 165994 (0.0029) +[2024-06-18 15:59:36,214][12883] Updated weights for policy 0, policy_version 166004 (0.0027) +[2024-06-18 15:59:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2719842304. Throughput: 0: 42695.6. Samples: 2719960380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:59:36,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 15:59:39,791][12883] Updated weights for policy 0, policy_version 166014 (0.0025) +[2024-06-18 15:59:41,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2720071680. Throughput: 0: 42795.6. Samples: 2720223760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:59:41,994][12645] Avg episode reward: [(0, '0.505')] +[2024-06-18 15:59:43,694][12883] Updated weights for policy 0, policy_version 166024 (0.0028) +[2024-06-18 15:59:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42602.9, 300 sec: 42709.5). 
Total num frames: 2720268288. Throughput: 0: 42889.3. Samples: 2720352320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:59:46,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 15:59:47,313][12883] Updated weights for policy 0, policy_version 166034 (0.0040) +[2024-06-18 15:59:51,326][12883] Updated weights for policy 0, policy_version 166044 (0.0041) +[2024-06-18 15:59:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2720497664. Throughput: 0: 42830.7. Samples: 2720610260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:59:51,994][12645] Avg episode reward: [(0, '0.563')] +[2024-06-18 15:59:55,083][12883] Updated weights for policy 0, policy_version 166054 (0.0026) +[2024-06-18 15:59:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2720710656. Throughput: 0: 42851.7. Samples: 2720864960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 15:59:56,994][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 15:59:58,960][12883] Updated weights for policy 0, policy_version 166064 (0.0041) +[2024-06-18 16:00:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2720907264. Throughput: 0: 42920.2. Samples: 2720995740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:00:01,994][12645] Avg episode reward: [(0, '0.290')] +[2024-06-18 16:00:02,868][12883] Updated weights for policy 0, policy_version 166074 (0.0022) +[2024-06-18 16:00:06,971][12883] Updated weights for policy 0, policy_version 166084 (0.0030) +[2024-06-18 16:00:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2721120256. Throughput: 0: 42874.6. Samples: 2721254720. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:00:06,994][12645] Avg episode reward: [(0, '0.418')] +[2024-06-18 16:00:10,352][12883] Updated weights for policy 0, policy_version 166094 (0.0034) +[2024-06-18 16:00:11,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2721366016. Throughput: 0: 42899.2. Samples: 2721509480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:00:11,994][12645] Avg episode reward: [(0, '0.487')] +[2024-06-18 16:00:14,552][12883] Updated weights for policy 0, policy_version 166104 (0.0042) +[2024-06-18 16:00:16,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2721562624. Throughput: 0: 42909.2. Samples: 2721641340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:00:16,994][12645] Avg episode reward: [(0, '0.632')] +[2024-06-18 16:00:18,063][12883] Updated weights for policy 0, policy_version 166114 (0.0039) +[2024-06-18 16:00:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2721759232. Throughput: 0: 43000.1. Samples: 2721895380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:00:21,994][12645] Avg episode reward: [(0, '0.671')] +[2024-06-18 16:00:22,039][12883] Updated weights for policy 0, policy_version 166124 (0.0028) +[2024-06-18 16:00:25,597][12883] Updated weights for policy 0, policy_version 166134 (0.0047) +[2024-06-18 16:00:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 2722004992. Throughput: 0: 42921.2. Samples: 2722155220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:00:26,999][12645] Avg episode reward: [(0, '0.666')] +[2024-06-18 16:00:29,541][12883] Updated weights for policy 0, policy_version 166144 (0.0033) +[2024-06-18 16:00:31,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2722201600. Throughput: 0: 42923.1. 
Samples: 2722283860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:00:31,994][12645] Avg episode reward: [(0, '0.462')] +[2024-06-18 16:00:33,026][12883] Updated weights for policy 0, policy_version 166154 (0.0031) +[2024-06-18 16:00:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2722414592. Throughput: 0: 42840.4. Samples: 2722538080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:00:36,994][12645] Avg episode reward: [(0, '0.391')] +[2024-06-18 16:00:37,091][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000166164_2722430976.pth... +[2024-06-18 16:00:37,095][12883] Updated weights for policy 0, policy_version 166164 (0.0040) +[2024-06-18 16:00:37,144][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000165536_2712141824.pth +[2024-06-18 16:00:40,607][12883] Updated weights for policy 0, policy_version 166174 (0.0039) +[2024-06-18 16:00:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2722643968. Throughput: 0: 42971.1. Samples: 2722798660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:00:41,994][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 16:00:42,347][12862] Signal inference workers to stop experience collection... (39850 times) +[2024-06-18 16:00:42,397][12883] InferenceWorker_p0-w0: stopping experience collection (39850 times) +[2024-06-18 16:00:42,462][12862] Signal inference workers to resume experience collection... (39850 times) +[2024-06-18 16:00:42,462][12883] InferenceWorker_p0-w0: resuming experience collection (39850 times) +[2024-06-18 16:00:44,632][12883] Updated weights for policy 0, policy_version 166184 (0.0035) +[2024-06-18 16:00:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2722840576. Throughput: 0: 42943.5. Samples: 2722928200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:00:46,994][12645] Avg episode reward: [(0, '0.316')] +[2024-06-18 16:00:48,084][12883] Updated weights for policy 0, policy_version 166194 (0.0036) +[2024-06-18 16:00:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2723053568. Throughput: 0: 42646.7. Samples: 2723173820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:00:51,994][12645] Avg episode reward: [(0, '0.500')] +[2024-06-18 16:00:52,658][12883] Updated weights for policy 0, policy_version 166204 (0.0038) +[2024-06-18 16:00:56,054][12883] Updated weights for policy 0, policy_version 166214 (0.0026) +[2024-06-18 16:00:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2723282944. Throughput: 0: 42705.7. Samples: 2723431240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:00:56,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 16:01:00,474][12883] Updated weights for policy 0, policy_version 166224 (0.0026) +[2024-06-18 16:01:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2723495936. Throughput: 0: 42863.0. Samples: 2723570160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:01:01,994][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 16:01:03,514][12883] Updated weights for policy 0, policy_version 166234 (0.0040) +[2024-06-18 16:01:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2723692544. Throughput: 0: 42777.6. Samples: 2723820380. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:01:06,994][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 16:01:08,088][12883] Updated weights for policy 0, policy_version 166244 (0.0034) +[2024-06-18 16:01:11,069][12883] Updated weights for policy 0, policy_version 166254 (0.0047) +[2024-06-18 16:01:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2723921920. Throughput: 0: 42652.4. Samples: 2724074580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:01:11,994][12645] Avg episode reward: [(0, '0.624')] +[2024-06-18 16:01:15,697][12883] Updated weights for policy 0, policy_version 166264 (0.0036) +[2024-06-18 16:01:16,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 2724134912. Throughput: 0: 42716.5. Samples: 2724206100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:01:17,000][12645] Avg episode reward: [(0, '0.467')] +[2024-06-18 16:01:18,719][12883] Updated weights for policy 0, policy_version 166274 (0.0026) +[2024-06-18 16:01:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2724331520. Throughput: 0: 42625.4. Samples: 2724456220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:01:21,994][12645] Avg episode reward: [(0, '0.428')] +[2024-06-18 16:01:23,354][12883] Updated weights for policy 0, policy_version 166284 (0.0038) +[2024-06-18 16:01:26,515][12883] Updated weights for policy 0, policy_version 166294 (0.0027) +[2024-06-18 16:01:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2724577280. Throughput: 0: 42533.6. Samples: 2724712680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) +[2024-06-18 16:01:26,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 16:01:30,964][12883] Updated weights for policy 0, policy_version 166304 (0.0031) +[2024-06-18 16:01:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2724757504. Throughput: 0: 42570.3. Samples: 2724843860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 16:01:31,994][12645] Avg episode reward: [(0, '0.545')] +[2024-06-18 16:01:34,232][12883] Updated weights for policy 0, policy_version 166314 (0.0026) +[2024-06-18 16:01:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2724986880. Throughput: 0: 42718.1. Samples: 2725096140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 16:01:36,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 16:01:38,951][12883] Updated weights for policy 0, policy_version 166324 (0.0041) +[2024-06-18 16:01:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2725199872. Throughput: 0: 42728.5. Samples: 2725354020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 16:01:41,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 16:01:42,002][12883] Updated weights for policy 0, policy_version 166334 (0.0037) +[2024-06-18 16:01:46,695][12883] Updated weights for policy 0, policy_version 166344 (0.0031) +[2024-06-18 16:01:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2725396480. Throughput: 0: 42489.7. Samples: 2725482200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 16:01:46,994][12645] Avg episode reward: [(0, '0.584')] +[2024-06-18 16:01:50,007][12883] Updated weights for policy 0, policy_version 166354 (0.0035) +[2024-06-18 16:01:51,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2725625856. Throughput: 0: 42635.2. 
Samples: 2725738960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 16:01:51,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 16:01:54,385][12883] Updated weights for policy 0, policy_version 166364 (0.0042) +[2024-06-18 16:01:56,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2725855232. Throughput: 0: 42631.9. Samples: 2725993020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 16:01:56,994][12645] Avg episode reward: [(0, '0.723')] +[2024-06-18 16:01:57,750][12883] Updated weights for policy 0, policy_version 166374 (0.0045) +[2024-06-18 16:02:01,893][12883] Updated weights for policy 0, policy_version 166384 (0.0027) +[2024-06-18 16:02:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 2726035456. Throughput: 0: 42451.9. Samples: 2726116440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 16:02:01,994][12645] Avg episode reward: [(0, '0.421')] +[2024-06-18 16:02:05,454][12862] Signal inference workers to stop experience collection... (39900 times) +[2024-06-18 16:02:05,500][12883] InferenceWorker_p0-w0: stopping experience collection (39900 times) +[2024-06-18 16:02:05,504][12862] Signal inference workers to resume experience collection... (39900 times) +[2024-06-18 16:02:05,512][12883] InferenceWorker_p0-w0: resuming experience collection (39900 times) +[2024-06-18 16:02:05,516][12883] Updated weights for policy 0, policy_version 166394 (0.0036) +[2024-06-18 16:02:06,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2726264832. Throughput: 0: 42696.4. Samples: 2726377560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 16:02:06,994][12645] Avg episode reward: [(0, '0.334')] +[2024-06-18 16:02:09,499][12883] Updated weights for policy 0, policy_version 166404 (0.0028) +[2024-06-18 16:02:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). 
Total num frames: 2726477824. Throughput: 0: 42620.5. Samples: 2726630600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 16:02:11,994][12645] Avg episode reward: [(0, '0.721')] +[2024-06-18 16:02:13,135][12883] Updated weights for policy 0, policy_version 166414 (0.0032) +[2024-06-18 16:02:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2726658048. Throughput: 0: 42462.6. Samples: 2726754680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 16:02:16,994][12645] Avg episode reward: [(0, '0.425')] +[2024-06-18 16:02:17,400][12883] Updated weights for policy 0, policy_version 166424 (0.0034) +[2024-06-18 16:02:20,734][12883] Updated weights for policy 0, policy_version 166434 (0.0035) +[2024-06-18 16:02:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2726920192. Throughput: 0: 42502.7. Samples: 2727008760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 16:02:21,994][12645] Avg episode reward: [(0, '0.330')] +[2024-06-18 16:02:25,154][12883] Updated weights for policy 0, policy_version 166444 (0.0030) +[2024-06-18 16:02:26,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2727116800. Throughput: 0: 42391.0. Samples: 2727261620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 16:02:26,994][12645] Avg episode reward: [(0, '0.480')] +[2024-06-18 16:02:28,432][12883] Updated weights for policy 0, policy_version 166454 (0.0041) +[2024-06-18 16:02:31,994][12645] Fps is (10 sec: 37682.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2727297024. Throughput: 0: 42294.9. Samples: 2727385480. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) +[2024-06-18 16:02:31,994][12645] Avg episode reward: [(0, '0.480')] +[2024-06-18 16:02:33,037][12883] Updated weights for policy 0, policy_version 166464 (0.0034) +[2024-06-18 16:02:36,053][12883] Updated weights for policy 0, policy_version 166474 (0.0037) +[2024-06-18 16:02:36,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2727542784. Throughput: 0: 42409.5. Samples: 2727647380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 16:02:36,994][12645] Avg episode reward: [(0, '0.573')] +[2024-06-18 16:02:37,133][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000166477_2727559168.pth... +[2024-06-18 16:02:37,190][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000165852_2717319168.pth +[2024-06-18 16:02:40,611][12883] Updated weights for policy 0, policy_version 166484 (0.0042) +[2024-06-18 16:02:41,994][12645] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2727755776. Throughput: 0: 42394.4. Samples: 2727900760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 16:02:41,994][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 16:02:44,220][12883] Updated weights for policy 0, policy_version 166494 (0.0031) +[2024-06-18 16:02:46,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2727952384. Throughput: 0: 42418.7. Samples: 2728025280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 16:02:46,997][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 16:02:48,269][12883] Updated weights for policy 0, policy_version 166504 (0.0037) +[2024-06-18 16:02:51,658][12883] Updated weights for policy 0, policy_version 166514 (0.0021) +[2024-06-18 16:02:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2728181760. Throughput: 0: 42367.0. Samples: 2728284080. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 16:02:51,994][12645] Avg episode reward: [(0, '0.339')] +[2024-06-18 16:02:55,847][12883] Updated weights for policy 0, policy_version 166524 (0.0039) +[2024-06-18 16:02:57,000][12645] Fps is (10 sec: 40936.3, 60 sec: 41775.2, 300 sec: 42653.7). Total num frames: 2728361984. Throughput: 0: 42501.2. Samples: 2728543400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 16:02:57,000][12645] Avg episode reward: [(0, '0.553')] +[2024-06-18 16:02:59,053][12883] Updated weights for policy 0, policy_version 166534 (0.0033) +[2024-06-18 16:03:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2728591360. Throughput: 0: 42434.2. Samples: 2728664220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 16:03:01,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 16:03:03,562][12883] Updated weights for policy 0, policy_version 166544 (0.0036) +[2024-06-18 16:03:06,980][12883] Updated weights for policy 0, policy_version 166554 (0.0033) +[2024-06-18 16:03:06,994][12645] Fps is (10 sec: 45902.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2728820736. Throughput: 0: 42450.7. Samples: 2728919040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 16:03:06,994][12645] Avg episode reward: [(0, '0.463')] +[2024-06-18 16:03:11,593][12883] Updated weights for policy 0, policy_version 166564 (0.0037) +[2024-06-18 16:03:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 2729000960. Throughput: 0: 42579.2. Samples: 2729177680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 16:03:11,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 16:03:14,633][12883] Updated weights for policy 0, policy_version 166574 (0.0028) +[2024-06-18 16:03:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2729213952. Throughput: 0: 42514.5. 
Samples: 2729298620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 16:03:16,994][12645] Avg episode reward: [(0, '0.723')] +[2024-06-18 16:03:19,183][12883] Updated weights for policy 0, policy_version 166584 (0.0034) +[2024-06-18 16:03:21,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 2729459712. Throughput: 0: 42464.3. Samples: 2729558280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 16:03:21,994][12645] Avg episode reward: [(0, '0.590')] +[2024-06-18 16:03:22,513][12883] Updated weights for policy 0, policy_version 166594 (0.0023) +[2024-06-18 16:03:25,927][12862] Signal inference workers to stop experience collection... (39950 times) +[2024-06-18 16:03:25,973][12883] InferenceWorker_p0-w0: stopping experience collection (39950 times) +[2024-06-18 16:03:25,984][12862] Signal inference workers to resume experience collection... (39950 times) +[2024-06-18 16:03:26,000][12883] InferenceWorker_p0-w0: resuming experience collection (39950 times) +[2024-06-18 16:03:26,654][12883] Updated weights for policy 0, policy_version 166604 (0.0035) +[2024-06-18 16:03:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2729656320. Throughput: 0: 42645.3. Samples: 2729819800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 16:03:26,994][12645] Avg episode reward: [(0, '0.409')] +[2024-06-18 16:03:29,980][12883] Updated weights for policy 0, policy_version 166614 (0.0031) +[2024-06-18 16:03:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2729869312. Throughput: 0: 42641.8. Samples: 2729944160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 16:03:31,995][12645] Avg episode reward: [(0, '0.424')] +[2024-06-18 16:03:34,272][12883] Updated weights for policy 0, policy_version 166624 (0.0037) +[2024-06-18 16:03:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). 
Total num frames: 2730098688. Throughput: 0: 42603.7. Samples: 2730201240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) +[2024-06-18 16:03:36,994][12645] Avg episode reward: [(0, '0.748')] +[2024-06-18 16:03:37,538][12883] Updated weights for policy 0, policy_version 166634 (0.0026) +[2024-06-18 16:03:41,892][12883] Updated weights for policy 0, policy_version 166644 (0.0039) +[2024-06-18 16:03:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42654.9). Total num frames: 2730295296. Throughput: 0: 42653.6. Samples: 2730462560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 16:03:41,994][12645] Avg episode reward: [(0, '0.690')] +[2024-06-18 16:03:45,084][12883] Updated weights for policy 0, policy_version 166654 (0.0035) +[2024-06-18 16:03:46,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2730508288. Throughput: 0: 42663.1. Samples: 2730584060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 16:03:46,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 16:03:49,744][12883] Updated weights for policy 0, policy_version 166664 (0.0042) +[2024-06-18 16:03:51,999][12645] Fps is (10 sec: 44213.9, 60 sec: 42594.8, 300 sec: 42653.2). Total num frames: 2730737664. Throughput: 0: 42713.8. Samples: 2730841380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 16:03:51,999][12645] Avg episode reward: [(0, '0.614')] +[2024-06-18 16:03:52,804][12883] Updated weights for policy 0, policy_version 166674 (0.0036) +[2024-06-18 16:03:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42875.6, 300 sec: 42653.9). Total num frames: 2730934272. Throughput: 0: 42840.8. Samples: 2731105520. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 16:03:56,994][12645] Avg episode reward: [(0, '0.718')] +[2024-06-18 16:03:57,163][12883] Updated weights for policy 0, policy_version 166684 (0.0042) +[2024-06-18 16:04:00,340][12883] Updated weights for policy 0, policy_version 166694 (0.0031) +[2024-06-18 16:04:01,994][12645] Fps is (10 sec: 40981.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2731147264. Throughput: 0: 42881.3. Samples: 2731228280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 16:04:01,994][12645] Avg episode reward: [(0, '0.526')] +[2024-06-18 16:04:04,616][12883] Updated weights for policy 0, policy_version 166704 (0.0052) +[2024-06-18 16:04:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2731393024. Throughput: 0: 42930.6. Samples: 2731490160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 16:04:06,994][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 16:04:07,945][12883] Updated weights for policy 0, policy_version 166714 (0.0042) +[2024-06-18 16:04:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2731573248. Throughput: 0: 42889.3. Samples: 2731749820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 16:04:11,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 16:04:12,598][12883] Updated weights for policy 0, policy_version 166724 (0.0043) +[2024-06-18 16:04:15,528][12883] Updated weights for policy 0, policy_version 166734 (0.0034) +[2024-06-18 16:04:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2731802624. Throughput: 0: 42832.0. Samples: 2731871600. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 16:04:16,994][12645] Avg episode reward: [(0, '0.277')] +[2024-06-18 16:04:20,201][12883] Updated weights for policy 0, policy_version 166744 (0.0037) +[2024-06-18 16:04:21,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2732032000. Throughput: 0: 42976.8. Samples: 2732135200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 16:04:21,994][12645] Avg episode reward: [(0, '0.327')] +[2024-06-18 16:04:22,997][12883] Updated weights for policy 0, policy_version 166754 (0.0029) +[2024-06-18 16:04:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2732228608. Throughput: 0: 42956.4. Samples: 2732395600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 16:04:26,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 16:04:27,844][12883] Updated weights for policy 0, policy_version 166764 (0.0036) +[2024-06-18 16:04:30,493][12883] Updated weights for policy 0, policy_version 166774 (0.0030) +[2024-06-18 16:04:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2732457984. Throughput: 0: 42955.2. Samples: 2732517040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 16:04:31,994][12645] Avg episode reward: [(0, '0.665')] +[2024-06-18 16:04:35,942][12883] Updated weights for policy 0, policy_version 166784 (0.0037) +[2024-06-18 16:04:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2732670976. Throughput: 0: 43155.6. Samples: 2732783160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 16:04:36,994][12645] Avg episode reward: [(0, '0.549')] +[2024-06-18 16:04:37,101][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000166790_2732687360.pth... 
+[2024-06-18 16:04:37,158][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000166164_2722430976.pth +[2024-06-18 16:04:38,429][12883] Updated weights for policy 0, policy_version 166794 (0.0044) +[2024-06-18 16:04:41,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2732851200. Throughput: 0: 42921.0. Samples: 2733036960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) +[2024-06-18 16:04:41,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 16:04:42,131][12862] Signal inference workers to stop experience collection... (40000 times) +[2024-06-18 16:04:42,186][12883] InferenceWorker_p0-w0: stopping experience collection (40000 times) +[2024-06-18 16:04:42,192][12862] Signal inference workers to resume experience collection... (40000 times) +[2024-06-18 16:04:42,208][12883] InferenceWorker_p0-w0: resuming experience collection (40000 times) +[2024-06-18 16:04:43,396][12883] Updated weights for policy 0, policy_version 166804 (0.0029) +[2024-06-18 16:04:46,260][12883] Updated weights for policy 0, policy_version 166814 (0.0045) +[2024-06-18 16:04:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2733096960. Throughput: 0: 42842.1. Samples: 2733156180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) +[2024-06-18 16:04:46,994][12645] Avg episode reward: [(0, '0.354')] +[2024-06-18 16:04:51,122][12883] Updated weights for policy 0, policy_version 166824 (0.0038) +[2024-06-18 16:04:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42602.0, 300 sec: 42653.9). Total num frames: 2733293568. Throughput: 0: 42881.8. Samples: 2733419840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) +[2024-06-18 16:04:51,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 16:04:53,939][12883] Updated weights for policy 0, policy_version 166834 (0.0037) +[2024-06-18 16:04:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). 
Total num frames: 2733506560. Throughput: 0: 42789.7. Samples: 2733675360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) +[2024-06-18 16:04:56,994][12645] Avg episode reward: [(0, '0.374')] +[2024-06-18 16:04:58,627][12883] Updated weights for policy 0, policy_version 166844 (0.0028) +[2024-06-18 16:05:01,690][12883] Updated weights for policy 0, policy_version 166854 (0.0024) +[2024-06-18 16:05:01,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 2733752320. Throughput: 0: 42969.8. Samples: 2733805240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) +[2024-06-18 16:05:02,000][12645] Avg episode reward: [(0, '0.403')] +[2024-06-18 16:05:06,088][12883] Updated weights for policy 0, policy_version 166864 (0.0028) +[2024-06-18 16:05:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2733932544. Throughput: 0: 42857.4. Samples: 2734063780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) +[2024-06-18 16:05:06,994][12645] Avg episode reward: [(0, '0.482')] +[2024-06-18 16:05:09,161][12883] Updated weights for policy 0, policy_version 166874 (0.0033) +[2024-06-18 16:05:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2734161920. Throughput: 0: 42833.8. Samples: 2734323120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) +[2024-06-18 16:05:11,994][12645] Avg episode reward: [(0, '0.602')] +[2024-06-18 16:05:13,714][12883] Updated weights for policy 0, policy_version 166884 (0.0036) +[2024-06-18 16:05:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2734374912. Throughput: 0: 42979.9. Samples: 2734451140. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) +[2024-06-18 16:05:16,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 16:05:17,277][12883] Updated weights for policy 0, policy_version 166894 (0.0038) +[2024-06-18 16:05:21,207][12883] Updated weights for policy 0, policy_version 166904 (0.0045) +[2024-06-18 16:05:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2734587904. Throughput: 0: 42866.6. Samples: 2734712160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) +[2024-06-18 16:05:21,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 16:05:24,861][12883] Updated weights for policy 0, policy_version 166914 (0.0044) +[2024-06-18 16:05:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2734800896. Throughput: 0: 42875.9. Samples: 2734966380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) +[2024-06-18 16:05:26,996][12645] Avg episode reward: [(0, '0.257')] +[2024-06-18 16:05:28,821][12883] Updated weights for policy 0, policy_version 166924 (0.0040) +[2024-06-18 16:05:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2735013888. Throughput: 0: 43055.1. Samples: 2735093660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) +[2024-06-18 16:05:31,994][12645] Avg episode reward: [(0, '0.732')] +[2024-06-18 16:05:33,090][12883] Updated weights for policy 0, policy_version 166934 (0.0040) +[2024-06-18 16:05:36,563][12883] Updated weights for policy 0, policy_version 166944 (0.0040) +[2024-06-18 16:05:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2735226880. Throughput: 0: 42784.0. Samples: 2735345120. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) +[2024-06-18 16:05:36,994][12645] Avg episode reward: [(0, '0.369')] +[2024-06-18 16:05:40,762][12883] Updated weights for policy 0, policy_version 166954 (0.0030) +[2024-06-18 16:05:41,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.3, 300 sec: 42709.4). Total num frames: 2735439872. Throughput: 0: 42761.6. Samples: 2735599640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) +[2024-06-18 16:05:41,995][12645] Avg episode reward: [(0, '0.226')] +[2024-06-18 16:05:44,254][12883] Updated weights for policy 0, policy_version 166964 (0.0044) +[2024-06-18 16:05:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2735652864. Throughput: 0: 42668.1. Samples: 2735725300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) +[2024-06-18 16:05:46,994][12645] Avg episode reward: [(0, '0.292')] +[2024-06-18 16:05:48,287][12883] Updated weights for policy 0, policy_version 166974 (0.0037) +[2024-06-18 16:05:51,873][12883] Updated weights for policy 0, policy_version 166984 (0.0032) +[2024-06-18 16:05:51,994][12645] Fps is (10 sec: 42599.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2735865856. Throughput: 0: 42539.5. Samples: 2735978060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 16:05:51,994][12645] Avg episode reward: [(0, '0.518')] +[2024-06-18 16:05:56,304][12883] Updated weights for policy 0, policy_version 166994 (0.0035) +[2024-06-18 16:05:56,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2736062464. Throughput: 0: 42572.7. Samples: 2736238900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 16:05:56,994][12645] Avg episode reward: [(0, '0.537')] +[2024-06-18 16:05:59,592][12883] Updated weights for policy 0, policy_version 167004 (0.0047) +[2024-06-18 16:06:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2736308224. Throughput: 0: 42487.1. 
Samples: 2736363060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 16:06:01,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 16:06:03,931][12883] Updated weights for policy 0, policy_version 167014 (0.0032) +[2024-06-18 16:06:06,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2736488448. Throughput: 0: 42312.5. Samples: 2736616220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 16:06:06,994][12645] Avg episode reward: [(0, '0.517')] +[2024-06-18 16:06:07,521][12883] Updated weights for policy 0, policy_version 167024 (0.0044) +[2024-06-18 16:06:11,671][12883] Updated weights for policy 0, policy_version 167034 (0.0034) +[2024-06-18 16:06:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2736717824. Throughput: 0: 42484.5. Samples: 2736878180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 16:06:11,994][12645] Avg episode reward: [(0, '0.599')] +[2024-06-18 16:06:13,180][12862] Signal inference workers to stop experience collection... (40050 times) +[2024-06-18 16:06:13,180][12862] Signal inference workers to resume experience collection... (40050 times) +[2024-06-18 16:06:13,202][12883] InferenceWorker_p0-w0: stopping experience collection (40050 times) +[2024-06-18 16:06:13,203][12883] InferenceWorker_p0-w0: resuming experience collection (40050 times) +[2024-06-18 16:06:14,985][12883] Updated weights for policy 0, policy_version 167044 (0.0037) +[2024-06-18 16:06:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2736914432. Throughput: 0: 42377.9. Samples: 2737000660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 16:06:16,994][12645] Avg episode reward: [(0, '0.578')] +[2024-06-18 16:06:19,095][12883] Updated weights for policy 0, policy_version 167054 (0.0023) +[2024-06-18 16:06:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). 
Total num frames: 2737143808. Throughput: 0: 42525.8. Samples: 2737258780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 16:06:21,994][12645] Avg episode reward: [(0, '0.646')] +[2024-06-18 16:06:22,824][12883] Updated weights for policy 0, policy_version 167064 (0.0055) +[2024-06-18 16:06:26,580][12883] Updated weights for policy 0, policy_version 167074 (0.0028) +[2024-06-18 16:06:26,996][12645] Fps is (10 sec: 44226.6, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 2737356800. Throughput: 0: 42589.7. Samples: 2737516260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 16:06:26,996][12645] Avg episode reward: [(0, '0.507')] +[2024-06-18 16:06:30,358][12883] Updated weights for policy 0, policy_version 167084 (0.0023) +[2024-06-18 16:06:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2737553408. Throughput: 0: 42621.7. Samples: 2737643280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 16:06:31,994][12645] Avg episode reward: [(0, '0.365')] +[2024-06-18 16:06:34,165][12883] Updated weights for policy 0, policy_version 167094 (0.0036) +[2024-06-18 16:06:36,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2737782784. Throughput: 0: 42757.3. Samples: 2737902140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 16:06:36,994][12645] Avg episode reward: [(0, '0.517')] +[2024-06-18 16:06:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000167101_2737782784.pth... +[2024-06-18 16:06:37,088][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000166477_2727559168.pth +[2024-06-18 16:06:37,976][12883] Updated weights for policy 0, policy_version 167104 (0.0032) +[2024-06-18 16:06:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 2737979392. Throughput: 0: 42500.5. Samples: 2738151420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 16:06:41,994][12645] Avg episode reward: [(0, '0.455')] +[2024-06-18 16:06:42,234][12883] Updated weights for policy 0, policy_version 167114 (0.0036) +[2024-06-18 16:06:45,566][12883] Updated weights for policy 0, policy_version 167124 (0.0036) +[2024-06-18 16:06:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2738192384. Throughput: 0: 42584.9. Samples: 2738279380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 16:06:46,994][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 16:06:50,022][12883] Updated weights for policy 0, policy_version 167134 (0.0032) +[2024-06-18 16:06:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2738405376. Throughput: 0: 42608.8. Samples: 2738533620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) +[2024-06-18 16:06:51,994][12645] Avg episode reward: [(0, '0.593')] +[2024-06-18 16:06:53,169][12883] Updated weights for policy 0, policy_version 167144 (0.0028) +[2024-06-18 16:06:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2738618368. Throughput: 0: 42482.1. Samples: 2738789880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:06:56,994][12645] Avg episode reward: [(0, '0.514')] +[2024-06-18 16:06:57,551][12883] Updated weights for policy 0, policy_version 167154 (0.0034) +[2024-06-18 16:07:00,793][12883] Updated weights for policy 0, policy_version 167164 (0.0031) +[2024-06-18 16:07:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2738847744. Throughput: 0: 42725.8. Samples: 2738923320. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:07:01,994][12645] Avg episode reward: [(0, '0.490')] +[2024-06-18 16:07:05,102][12883] Updated weights for policy 0, policy_version 167174 (0.0034) +[2024-06-18 16:07:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2739060736. Throughput: 0: 42651.5. Samples: 2739178100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:07:06,996][12645] Avg episode reward: [(0, '0.419')] +[2024-06-18 16:07:08,361][12883] Updated weights for policy 0, policy_version 167184 (0.0038) +[2024-06-18 16:07:11,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2739273728. Throughput: 0: 42639.8. Samples: 2739434960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:07:11,994][12645] Avg episode reward: [(0, '0.434')] +[2024-06-18 16:07:12,965][12883] Updated weights for policy 0, policy_version 167194 (0.0037) +[2024-06-18 16:07:16,107][12883] Updated weights for policy 0, policy_version 167204 (0.0026) +[2024-06-18 16:07:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 2739486720. Throughput: 0: 42728.3. Samples: 2739566060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:07:16,994][12645] Avg episode reward: [(0, '0.332')] +[2024-06-18 16:07:20,413][12883] Updated weights for policy 0, policy_version 167214 (0.0037) +[2024-06-18 16:07:21,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2739699712. Throughput: 0: 42583.6. Samples: 2739818400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:07:21,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 16:07:23,779][12883] Updated weights for policy 0, policy_version 167224 (0.0028) +[2024-06-18 16:07:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42599.9, 300 sec: 42765.0). Total num frames: 2739912704. Throughput: 0: 42834.6. 
Samples: 2740078980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:07:26,994][12645] Avg episode reward: [(0, '0.457')] +[2024-06-18 16:07:27,993][12883] Updated weights for policy 0, policy_version 167234 (0.0050) +[2024-06-18 16:07:31,429][12883] Updated weights for policy 0, policy_version 167244 (0.0038) +[2024-06-18 16:07:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2740142080. Throughput: 0: 42749.7. Samples: 2740203120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:07:31,994][12645] Avg episode reward: [(0, '0.471')] +[2024-06-18 16:07:35,608][12883] Updated weights for policy 0, policy_version 167254 (0.0038) +[2024-06-18 16:07:36,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2740355072. Throughput: 0: 42905.9. Samples: 2740464380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:07:36,994][12645] Avg episode reward: [(0, '0.356')] +[2024-06-18 16:07:38,263][12862] Signal inference workers to stop experience collection... (40100 times) +[2024-06-18 16:07:38,293][12883] InferenceWorker_p0-w0: stopping experience collection (40100 times) +[2024-06-18 16:07:38,319][12862] Signal inference workers to resume experience collection... (40100 times) +[2024-06-18 16:07:38,320][12883] InferenceWorker_p0-w0: resuming experience collection (40100 times) +[2024-06-18 16:07:39,231][12883] Updated weights for policy 0, policy_version 167264 (0.0027) +[2024-06-18 16:07:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2740551680. Throughput: 0: 42829.0. Samples: 2740717180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:07:41,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 16:07:43,235][12883] Updated weights for policy 0, policy_version 167274 (0.0036) +[2024-06-18 16:07:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42653.9). 
Total num frames: 2740764672. Throughput: 0: 42611.0. Samples: 2740840820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:07:46,994][12645] Avg episode reward: [(0, '0.367')] +[2024-06-18 16:07:47,155][12883] Updated weights for policy 0, policy_version 167284 (0.0034) +[2024-06-18 16:07:50,785][12883] Updated weights for policy 0, policy_version 167294 (0.0031) +[2024-06-18 16:07:51,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42876.9). Total num frames: 2741010432. Throughput: 0: 42878.7. Samples: 2741107640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:07:51,994][12645] Avg episode reward: [(0, '0.293')] +[2024-06-18 16:07:54,727][12883] Updated weights for policy 0, policy_version 167304 (0.0036) +[2024-06-18 16:07:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2741207040. Throughput: 0: 42947.1. Samples: 2741367580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:07:56,996][12645] Avg episode reward: [(0, '0.603')] +[2024-06-18 16:07:58,208][12883] Updated weights for policy 0, policy_version 167314 (0.0042) +[2024-06-18 16:08:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2741420032. Throughput: 0: 42715.7. Samples: 2741488260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 16:08:01,994][12645] Avg episode reward: [(0, '0.681')] +[2024-06-18 16:08:02,108][12883] Updated weights for policy 0, policy_version 167324 (0.0048) +[2024-06-18 16:08:05,727][12883] Updated weights for policy 0, policy_version 167334 (0.0049) +[2024-06-18 16:08:06,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2741665792. Throughput: 0: 43020.0. Samples: 2741754300. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 16:08:06,994][12645] Avg episode reward: [(0, '0.530')] +[2024-06-18 16:08:09,869][12883] Updated weights for policy 0, policy_version 167344 (0.0036) +[2024-06-18 16:08:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2741829632. Throughput: 0: 43056.5. Samples: 2742016520. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 16:08:11,994][12645] Avg episode reward: [(0, '0.682')] +[2024-06-18 16:08:13,260][12883] Updated weights for policy 0, policy_version 167354 (0.0032) +[2024-06-18 16:08:16,994][12645] Fps is (10 sec: 39320.7, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 2742059008. Throughput: 0: 43011.9. Samples: 2742138660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 16:08:16,995][12645] Avg episode reward: [(0, '0.510')] +[2024-06-18 16:08:17,469][12883] Updated weights for policy 0, policy_version 167364 (0.0033) +[2024-06-18 16:08:20,834][12883] Updated weights for policy 0, policy_version 167374 (0.0035) +[2024-06-18 16:08:21,994][12645] Fps is (10 sec: 49151.4, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 2742321152. Throughput: 0: 43120.7. Samples: 2742404820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 16:08:21,994][12645] Avg episode reward: [(0, '0.568')] +[2024-06-18 16:08:25,243][12883] Updated weights for policy 0, policy_version 167384 (0.0043) +[2024-06-18 16:08:26,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2742484992. Throughput: 0: 43184.0. Samples: 2742660460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 16:08:26,994][12645] Avg episode reward: [(0, '0.497')] +[2024-06-18 16:08:28,478][12883] Updated weights for policy 0, policy_version 167394 (0.0029) +[2024-06-18 16:08:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2742714368. Throughput: 0: 43126.2. Samples: 2742781500. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 16:08:31,994][12645] Avg episode reward: [(0, '0.416')] +[2024-06-18 16:08:33,181][12883] Updated weights for policy 0, policy_version 167404 (0.0041) +[2024-06-18 16:08:36,095][12883] Updated weights for policy 0, policy_version 167414 (0.0023) +[2024-06-18 16:08:36,994][12645] Fps is (10 sec: 47513.6, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 2742960128. Throughput: 0: 43175.6. Samples: 2743050540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 16:08:36,994][12645] Avg episode reward: [(0, '0.617')] +[2024-06-18 16:08:37,065][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000167418_2742976512.pth... +[2024-06-18 16:08:37,121][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000166790_2732687360.pth +[2024-06-18 16:08:40,640][12883] Updated weights for policy 0, policy_version 167424 (0.0038) +[2024-06-18 16:08:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2743107584. Throughput: 0: 43189.9. Samples: 2743311120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 16:08:41,994][12645] Avg episode reward: [(0, '0.463')] +[2024-06-18 16:08:43,840][12862] Signal inference workers to stop experience collection... (40150 times) +[2024-06-18 16:08:43,840][12862] Signal inference workers to resume experience collection... (40150 times) +[2024-06-18 16:08:43,845][12883] Updated weights for policy 0, policy_version 167434 (0.0037) +[2024-06-18 16:08:43,867][12883] InferenceWorker_p0-w0: stopping experience collection (40150 times) +[2024-06-18 16:08:43,867][12883] InferenceWorker_p0-w0: resuming experience collection (40150 times) +[2024-06-18 16:08:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42765.8). Total num frames: 2743353344. Throughput: 0: 43144.9. Samples: 2743429780. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 16:08:47,003][12645] Avg episode reward: [(0, '0.496')] +[2024-06-18 16:08:48,277][12883] Updated weights for policy 0, policy_version 167444 (0.0037) +[2024-06-18 16:08:51,412][12883] Updated weights for policy 0, policy_version 167454 (0.0028) +[2024-06-18 16:08:51,994][12645] Fps is (10 sec: 49152.4, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2743599104. Throughput: 0: 43080.6. Samples: 2743692920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 16:08:51,994][12645] Avg episode reward: [(0, '0.360')] +[2024-06-18 16:08:55,794][12883] Updated weights for policy 0, policy_version 167464 (0.0034) +[2024-06-18 16:08:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2743762944. Throughput: 0: 42894.3. Samples: 2743946760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 16:08:56,994][12645] Avg episode reward: [(0, '0.469')] +[2024-06-18 16:08:59,348][12883] Updated weights for policy 0, policy_version 167474 (0.0040) +[2024-06-18 16:09:01,996][12645] Fps is (10 sec: 40950.3, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 2744008704. Throughput: 0: 42958.1. Samples: 2744071860. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) +[2024-06-18 16:09:01,997][12645] Avg episode reward: [(0, '0.541')] +[2024-06-18 16:09:03,383][12883] Updated weights for policy 0, policy_version 167484 (0.0031) +[2024-06-18 16:09:06,932][12883] Updated weights for policy 0, policy_version 167494 (0.0041) +[2024-06-18 16:09:06,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2744221696. Throughput: 0: 42834.0. Samples: 2744332340. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 16:09:06,994][12645] Avg episode reward: [(0, '0.565')] +[2024-06-18 16:09:10,997][12883] Updated weights for policy 0, policy_version 167504 (0.0031) +[2024-06-18 16:09:11,994][12645] Fps is (10 sec: 40969.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2744418304. Throughput: 0: 42929.8. Samples: 2744592300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 16:09:11,994][12645] Avg episode reward: [(0, '0.656')] +[2024-06-18 16:09:14,508][12883] Updated weights for policy 0, policy_version 167514 (0.0042) +[2024-06-18 16:09:16,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 2744664064. Throughput: 0: 43007.1. Samples: 2744716820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 16:09:16,994][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 16:09:18,902][12883] Updated weights for policy 0, policy_version 167524 (0.0038) +[2024-06-18 16:09:21,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42323.8, 300 sec: 42820.2). Total num frames: 2744860672. Throughput: 0: 42675.7. Samples: 2744971040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 16:09:21,996][12645] Avg episode reward: [(0, '0.427')] +[2024-06-18 16:09:22,591][12883] Updated weights for policy 0, policy_version 167534 (0.0036) +[2024-06-18 16:09:26,553][12883] Updated weights for policy 0, policy_version 167544 (0.0036) +[2024-06-18 16:09:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 2745057280. Throughput: 0: 42669.2. Samples: 2745231240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 16:09:26,994][12645] Avg episode reward: [(0, '0.415')] +[2024-06-18 16:09:30,222][12883] Updated weights for policy 0, policy_version 167554 (0.0043) +[2024-06-18 16:09:31,994][12645] Fps is (10 sec: 44246.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2745303040. Throughput: 0: 42864.4. 
Samples: 2745358680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 16:09:31,994][12645] Avg episode reward: [(0, '0.531')] +[2024-06-18 16:09:34,255][12883] Updated weights for policy 0, policy_version 167564 (0.0038) +[2024-06-18 16:09:36,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2745499648. Throughput: 0: 42770.5. Samples: 2745617600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 16:09:36,994][12645] Avg episode reward: [(0, '0.607')] +[2024-06-18 16:09:37,677][12883] Updated weights for policy 0, policy_version 167574 (0.0032) +[2024-06-18 16:09:41,899][12883] Updated weights for policy 0, policy_version 167584 (0.0027) +[2024-06-18 16:09:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2745696256. Throughput: 0: 42955.9. Samples: 2745879780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 16:09:41,994][12645] Avg episode reward: [(0, '0.592')] +[2024-06-18 16:09:45,210][12883] Updated weights for policy 0, policy_version 167594 (0.0042) +[2024-06-18 16:09:46,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 2745958400. Throughput: 0: 42948.3. Samples: 2746004440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 16:09:46,994][12645] Avg episode reward: [(0, '0.732')] +[2024-06-18 16:09:49,509][12883] Updated weights for policy 0, policy_version 167604 (0.0038) +[2024-06-18 16:09:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 2746138624. Throughput: 0: 42923.9. Samples: 2746263920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 16:09:51,994][12645] Avg episode reward: [(0, '0.709')] +[2024-06-18 16:09:53,198][12883] Updated weights for policy 0, policy_version 167614 (0.0032) +[2024-06-18 16:09:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2746351616. 
Throughput: 0: 42830.6. Samples: 2746519680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 16:09:56,994][12645] Avg episode reward: [(0, '0.680')] +[2024-06-18 16:09:56,998][12883] Updated weights for policy 0, policy_version 167624 (0.0029) +[2024-06-18 16:10:00,745][12883] Updated weights for policy 0, policy_version 167634 (0.0040) +[2024-06-18 16:10:01,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 2746597376. Throughput: 0: 42953.8. Samples: 2746649740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) +[2024-06-18 16:10:01,994][12645] Avg episode reward: [(0, '0.680')] +[2024-06-18 16:10:04,572][12883] Updated weights for policy 0, policy_version 167644 (0.0043) +[2024-06-18 16:10:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2746793984. Throughput: 0: 42991.4. Samples: 2746905560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 16:10:06,994][12645] Avg episode reward: [(0, '0.552')] +[2024-06-18 16:10:08,297][12883] Updated weights for policy 0, policy_version 167654 (0.0026) +[2024-06-18 16:10:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2746990592. Throughput: 0: 42840.6. Samples: 2747159060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 16:10:11,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 16:10:12,148][12883] Updated weights for policy 0, policy_version 167664 (0.0036) +[2024-06-18 16:10:15,747][12862] Signal inference workers to stop experience collection... (40200 times) +[2024-06-18 16:10:15,752][12862] Signal inference workers to resume experience collection... 
(40200 times) +[2024-06-18 16:10:15,776][12883] InferenceWorker_p0-w0: stopping experience collection (40200 times) +[2024-06-18 16:10:15,776][12883] InferenceWorker_p0-w0: resuming experience collection (40200 times) +[2024-06-18 16:10:15,897][12883] Updated weights for policy 0, policy_version 167674 (0.0019) +[2024-06-18 16:10:16,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2747219968. Throughput: 0: 42836.6. Samples: 2747286320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 16:10:16,994][12645] Avg episode reward: [(0, '0.655')] +[2024-06-18 16:10:19,733][12883] Updated weights for policy 0, policy_version 167684 (0.0033) +[2024-06-18 16:10:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42873.0, 300 sec: 42820.5). Total num frames: 2747432960. Throughput: 0: 42827.1. Samples: 2747544820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 16:10:21,994][12645] Avg episode reward: [(0, '0.625')] +[2024-06-18 16:10:23,421][12883] Updated weights for policy 0, policy_version 167694 (0.0034) +[2024-06-18 16:10:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 2747645952. Throughput: 0: 42712.0. Samples: 2747801820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 16:10:26,994][12645] Avg episode reward: [(0, '0.595')] +[2024-06-18 16:10:27,724][12883] Updated weights for policy 0, policy_version 167704 (0.0039) +[2024-06-18 16:10:31,224][12883] Updated weights for policy 0, policy_version 167714 (0.0028) +[2024-06-18 16:10:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2747875328. Throughput: 0: 42731.6. Samples: 2747927360. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 16:10:31,994][12645] Avg episode reward: [(0, '0.582')] +[2024-06-18 16:10:35,345][12883] Updated weights for policy 0, policy_version 167724 (0.0037) +[2024-06-18 16:10:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2748071936. Throughput: 0: 42784.0. Samples: 2748189200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 16:10:36,994][12645] Avg episode reward: [(0, '0.344')] +[2024-06-18 16:10:37,075][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000167730_2748088320.pth... +[2024-06-18 16:10:37,139][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000167101_2737782784.pth +[2024-06-18 16:10:38,827][12883] Updated weights for policy 0, policy_version 167734 (0.0035) +[2024-06-18 16:10:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2748284928. Throughput: 0: 42716.5. Samples: 2748441920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 16:10:41,994][12645] Avg episode reward: [(0, '0.651')] +[2024-06-18 16:10:42,951][12883] Updated weights for policy 0, policy_version 167744 (0.0037) +[2024-06-18 16:10:46,739][12883] Updated weights for policy 0, policy_version 167754 (0.0036) +[2024-06-18 16:10:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2748497920. Throughput: 0: 42617.8. Samples: 2748567540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 16:10:46,994][12645] Avg episode reward: [(0, '0.522')] +[2024-06-18 16:10:50,466][12883] Updated weights for policy 0, policy_version 167764 (0.0032) +[2024-06-18 16:10:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2748727296. Throughput: 0: 42767.2. Samples: 2748830080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 16:10:51,994][12645] Avg episode reward: [(0, '0.220')] +[2024-06-18 16:10:54,550][12883] Updated weights for policy 0, policy_version 167774 (0.0033) +[2024-06-18 16:10:56,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2748923904. Throughput: 0: 42827.4. Samples: 2749086300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 16:10:56,994][12645] Avg episode reward: [(0, '0.579')] +[2024-06-18 16:10:58,043][12883] Updated weights for policy 0, policy_version 167784 (0.0038) +[2024-06-18 16:11:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2749136896. Throughput: 0: 42731.6. Samples: 2749209240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 16:11:01,994][12645] Avg episode reward: [(0, '0.718')] +[2024-06-18 16:11:01,999][12883] Updated weights for policy 0, policy_version 167794 (0.0029) +[2024-06-18 16:11:05,390][12883] Updated weights for policy 0, policy_version 167804 (0.0041) +[2024-06-18 16:11:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2749349888. Throughput: 0: 42831.1. Samples: 2749472220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) +[2024-06-18 16:11:06,994][12645] Avg episode reward: [(0, '0.551')] +[2024-06-18 16:11:09,431][12883] Updated weights for policy 0, policy_version 167814 (0.0033) +[2024-06-18 16:11:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2749562880. Throughput: 0: 42856.8. Samples: 2749730380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 16:11:11,994][12645] Avg episode reward: [(0, '0.552')] +[2024-06-18 16:11:13,140][12883] Updated weights for policy 0, policy_version 167824 (0.0038) +[2024-06-18 16:11:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 2749775872. Throughput: 0: 42975.9. 
Samples: 2749861280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 16:11:16,994][12645] Avg episode reward: [(0, '0.523')] +[2024-06-18 16:11:17,050][12883] Updated weights for policy 0, policy_version 167834 (0.0043) +[2024-06-18 16:11:20,788][12883] Updated weights for policy 0, policy_version 167844 (0.0040) +[2024-06-18 16:11:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 2750005248. Throughput: 0: 42774.6. Samples: 2750114060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 16:11:21,994][12645] Avg episode reward: [(0, '0.301')] +[2024-06-18 16:11:25,140][12883] Updated weights for policy 0, policy_version 167854 (0.0030) +[2024-06-18 16:11:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2750218240. Throughput: 0: 42716.8. Samples: 2750364180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 16:11:26,994][12645] Avg episode reward: [(0, '0.644')] +[2024-06-18 16:11:28,394][12883] Updated weights for policy 0, policy_version 167864 (0.0029) +[2024-06-18 16:11:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2750414848. Throughput: 0: 42807.1. Samples: 2750493860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 16:11:31,994][12645] Avg episode reward: [(0, '0.431')] +[2024-06-18 16:11:32,898][12883] Updated weights for policy 0, policy_version 167874 (0.0034) +[2024-06-18 16:11:36,003][12883] Updated weights for policy 0, policy_version 167884 (0.0045) +[2024-06-18 16:11:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2750644224. Throughput: 0: 42713.7. Samples: 2750752200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 16:11:36,994][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 16:11:37,853][12862] Signal inference workers to stop experience collection... 
(40250 times) +[2024-06-18 16:11:37,853][12862] Signal inference workers to resume experience collection... (40250 times) +[2024-06-18 16:11:37,900][12883] InferenceWorker_p0-w0: stopping experience collection (40250 times) +[2024-06-18 16:11:37,900][12883] InferenceWorker_p0-w0: resuming experience collection (40250 times) +[2024-06-18 16:11:40,506][12883] Updated weights for policy 0, policy_version 167894 (0.0032) +[2024-06-18 16:11:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 2750857216. Throughput: 0: 42722.9. Samples: 2751008820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 16:11:41,994][12645] Avg episode reward: [(0, '0.550')] +[2024-06-18 16:11:44,111][12883] Updated weights for policy 0, policy_version 167904 (0.0026) +[2024-06-18 16:11:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2751053824. Throughput: 0: 42835.4. Samples: 2751136840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 16:11:46,994][12645] Avg episode reward: [(0, '0.609')] +[2024-06-18 16:11:48,095][12883] Updated weights for policy 0, policy_version 167914 (0.0027) +[2024-06-18 16:11:51,658][12883] Updated weights for policy 0, policy_version 167924 (0.0034) +[2024-06-18 16:11:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 2751283200. Throughput: 0: 42750.4. Samples: 2751395980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 16:11:51,994][12645] Avg episode reward: [(0, '0.453')] +[2024-06-18 16:11:55,703][12883] Updated weights for policy 0, policy_version 167934 (0.0031) +[2024-06-18 16:11:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2751496192. Throughput: 0: 42834.6. Samples: 2751657940. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 16:11:56,994][12645] Avg episode reward: [(0, '0.489')] +[2024-06-18 16:11:59,274][12883] Updated weights for policy 0, policy_version 167944 (0.0046) +[2024-06-18 16:12:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2751709184. Throughput: 0: 42675.7. Samples: 2751781680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 16:12:02,003][12645] Avg episode reward: [(0, '0.506')] +[2024-06-18 16:12:03,232][12883] Updated weights for policy 0, policy_version 167954 (0.0028) +[2024-06-18 16:12:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2751905792. Throughput: 0: 42732.0. Samples: 2752037000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 16:12:06,994][12645] Avg episode reward: [(0, '0.470')] +[2024-06-18 16:12:07,161][12883] Updated weights for policy 0, policy_version 167964 (0.0041) +[2024-06-18 16:12:10,974][12883] Updated weights for policy 0, policy_version 167974 (0.0036) +[2024-06-18 16:12:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2752118784. Throughput: 0: 42993.9. Samples: 2752298900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 16:12:11,994][12645] Avg episode reward: [(0, '0.585')] +[2024-06-18 16:12:14,810][12883] Updated weights for policy 0, policy_version 167984 (0.0025) +[2024-06-18 16:12:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2752348160. Throughput: 0: 42922.2. Samples: 2752425360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) +[2024-06-18 16:12:16,994][12645] Avg episode reward: [(0, '0.340')] +[2024-06-18 16:12:18,719][12883] Updated weights for policy 0, policy_version 167994 (0.0025) +[2024-06-18 16:12:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2752561152. Throughput: 0: 42779.2. 
Samples: 2752677260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 16:12:21,994][12645] Avg episode reward: [(0, '0.491')] +[2024-06-18 16:12:22,157][12883] Updated weights for policy 0, policy_version 168004 (0.0038) +[2024-06-18 16:12:26,191][12883] Updated weights for policy 0, policy_version 168014 (0.0023) +[2024-06-18 16:12:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2752774144. Throughput: 0: 42927.0. Samples: 2752940540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 16:12:26,994][12645] Avg episode reward: [(0, '0.749')] +[2024-06-18 16:12:30,056][12883] Updated weights for policy 0, policy_version 168024 (0.0040) +[2024-06-18 16:12:31,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2753003520. Throughput: 0: 42930.7. Samples: 2753068720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 16:12:31,994][12645] Avg episode reward: [(0, '0.688')] +[2024-06-18 16:12:33,642][12883] Updated weights for policy 0, policy_version 168034 (0.0037) +[2024-06-18 16:12:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2753216512. Throughput: 0: 42963.9. Samples: 2753329360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 16:12:36,994][12645] Avg episode reward: [(0, '0.615')] +[2024-06-18 16:12:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000168043_2753216512.pth... +[2024-06-18 16:12:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000167418_2742976512.pth +[2024-06-18 16:12:37,507][12883] Updated weights for policy 0, policy_version 168044 (0.0043) +[2024-06-18 16:12:41,751][12883] Updated weights for policy 0, policy_version 168054 (0.0034) +[2024-06-18 16:12:41,995][12645] Fps is (10 sec: 40952.8, 60 sec: 42597.1, 300 sec: 42875.8). Total num frames: 2753413120. Throughput: 0: 42941.0. Samples: 2753590360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 16:12:41,996][12645] Avg episode reward: [(0, '0.532')] +[2024-06-18 16:12:45,155][12883] Updated weights for policy 0, policy_version 168064 (0.0043) +[2024-06-18 16:12:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2753658880. Throughput: 0: 42974.6. Samples: 2753715540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 16:12:46,999][12645] Avg episode reward: [(0, '0.683')] +[2024-06-18 16:12:49,248][12883] Updated weights for policy 0, policy_version 168074 (0.0043) +[2024-06-18 16:12:50,924][12862] Signal inference workers to stop experience collection... (40300 times) +[2024-06-18 16:12:50,924][12862] Signal inference workers to resume experience collection... (40300 times) +[2024-06-18 16:12:50,937][12883] InferenceWorker_p0-w0: stopping experience collection (40300 times) +[2024-06-18 16:12:50,938][12883] InferenceWorker_p0-w0: resuming experience collection (40300 times) +[2024-06-18 16:12:51,994][12645] Fps is (10 sec: 42606.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2753839104. Throughput: 0: 42885.0. Samples: 2753966820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 16:12:51,994][12645] Avg episode reward: [(0, '0.608')] +[2024-06-18 16:12:52,632][12883] Updated weights for policy 0, policy_version 168084 (0.0027) +[2024-06-18 16:12:56,855][12883] Updated weights for policy 0, policy_version 168094 (0.0035) +[2024-06-18 16:12:56,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2754052096. Throughput: 0: 42835.0. Samples: 2754226480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 16:12:56,994][12645] Avg episode reward: [(0, '0.640')] +[2024-06-18 16:13:00,173][12883] Updated weights for policy 0, policy_version 168104 (0.0035) +[2024-06-18 16:13:02,000][12645] Fps is (10 sec: 44209.0, 60 sec: 42867.1, 300 sec: 42764.1). Total num frames: 2754281472. 
Throughput: 0: 42831.0. Samples: 2754353020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 16:13:02,000][12645] Avg episode reward: [(0, '0.528')] +[2024-06-18 16:13:04,395][12883] Updated weights for policy 0, policy_version 168114 (0.0030) +[2024-06-18 16:13:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2754494464. Throughput: 0: 42841.2. Samples: 2754605120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 16:13:06,994][12645] Avg episode reward: [(0, '0.527')] +[2024-06-18 16:13:08,139][12883] Updated weights for policy 0, policy_version 168124 (0.0029) +[2024-06-18 16:13:11,994][12645] Fps is (10 sec: 40985.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2754691072. Throughput: 0: 42778.8. Samples: 2754865580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 16:13:11,994][12645] Avg episode reward: [(0, '0.558')] +[2024-06-18 16:13:12,451][12883] Updated weights for policy 0, policy_version 168134 (0.0036) +[2024-06-18 16:13:15,541][12883] Updated weights for policy 0, policy_version 168144 (0.0036) +[2024-06-18 16:13:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2754920448. Throughput: 0: 42820.5. Samples: 2754995640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 16:13:16,994][12645] Avg episode reward: [(0, '0.483')] +[2024-06-18 16:13:20,003][12883] Updated weights for policy 0, policy_version 168154 (0.0028) +[2024-06-18 16:13:21,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 2755133440. Throughput: 0: 42786.2. Samples: 2755254740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) +[2024-06-18 16:13:21,994][12645] Avg episode reward: [(0, '0.446')] +[2024-06-18 16:13:22,936][12883] Updated weights for policy 0, policy_version 168164 (0.0034) +[2024-06-18 16:13:26,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). 
Total num frames: 2755346432. Throughput: 0: 42648.3. Samples: 2755509460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 16:13:26,994][12645] Avg episode reward: [(0, '0.558')] +[2024-06-18 16:13:27,617][12883] Updated weights for policy 0, policy_version 168174 (0.0028) +[2024-06-18 16:13:30,439][12883] Updated weights for policy 0, policy_version 168184 (0.0027) +[2024-06-18 16:13:31,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2755559424. Throughput: 0: 42757.1. Samples: 2755639600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 16:13:31,994][12645] Avg episode reward: [(0, '0.576')] +[2024-06-18 16:13:35,061][12883] Updated weights for policy 0, policy_version 168194 (0.0038) +[2024-06-18 16:13:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2755788800. Throughput: 0: 42949.7. Samples: 2755899560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 16:13:36,994][12645] Avg episode reward: [(0, '0.737')] +[2024-06-18 16:13:38,033][12883] Updated weights for policy 0, policy_version 168204 (0.0028) +[2024-06-18 16:13:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42599.7, 300 sec: 42765.0). Total num frames: 2755969024. Throughput: 0: 42872.1. Samples: 2756155720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 16:13:41,994][12645] Avg episode reward: [(0, '0.747')] +[2024-06-18 16:13:42,770][12883] Updated weights for policy 0, policy_version 168214 (0.0036) +[2024-06-18 16:13:45,877][12883] Updated weights for policy 0, policy_version 168224 (0.0030) +[2024-06-18 16:13:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2756214784. Throughput: 0: 42803.6. Samples: 2756278920. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 16:13:46,994][12645] Avg episode reward: [(0, '0.566')] +[2024-06-18 16:13:50,457][12883] Updated weights for policy 0, policy_version 168234 (0.0055) +[2024-06-18 16:13:51,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2756427776. Throughput: 0: 42957.9. Samples: 2756538220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 16:13:51,994][12645] Avg episode reward: [(0, '0.420')] +[2024-06-18 16:13:53,568][12883] Updated weights for policy 0, policy_version 168244 (0.0032) +[2024-06-18 16:13:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 2756624384. Throughput: 0: 42860.8. Samples: 2756794320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 16:13:56,994][12645] Avg episode reward: [(0, '0.388')] +[2024-06-18 16:13:58,286][12883] Updated weights for policy 0, policy_version 168254 (0.0036) +[2024-06-18 16:14:01,358][12883] Updated weights for policy 0, policy_version 168264 (0.0045) +[2024-06-18 16:14:01,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43149.0, 300 sec: 42876.1). Total num frames: 2756870144. Throughput: 0: 42678.6. Samples: 2756916180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 16:14:01,994][12645] Avg episode reward: [(0, '0.432')] +[2024-06-18 16:14:05,804][12883] Updated weights for policy 0, policy_version 168274 (0.0033) +[2024-06-18 16:14:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2757066752. Throughput: 0: 42837.4. Samples: 2757182420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 16:14:06,994][12645] Avg episode reward: [(0, '0.437')] +[2024-06-18 16:14:08,854][12883] Updated weights for policy 0, policy_version 168284 (0.0043) +[2024-06-18 16:14:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2757263360. Throughput: 0: 42755.6. 
Samples: 2757433460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 16:14:11,994][12645] Avg episode reward: [(0, '0.644')] +[2024-06-18 16:14:13,388][12883] Updated weights for policy 0, policy_version 168294 (0.0030) +[2024-06-18 16:14:16,523][12883] Updated weights for policy 0, policy_version 168304 (0.0039) +[2024-06-18 16:14:16,996][12645] Fps is (10 sec: 44226.9, 60 sec: 43142.9, 300 sec: 42876.1). Total num frames: 2757509120. Throughput: 0: 42704.0. Samples: 2757561380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 16:14:16,996][12645] Avg episode reward: [(0, '0.461')] +[2024-06-18 16:14:20,843][12883] Updated weights for policy 0, policy_version 168314 (0.0028) +[2024-06-18 16:14:21,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2757722112. Throughput: 0: 42887.1. Samples: 2757829480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 16:14:21,994][12645] Avg episode reward: [(0, '0.346')] +[2024-06-18 16:14:24,143][12883] Updated weights for policy 0, policy_version 168324 (0.0037) +[2024-06-18 16:14:26,994][12645] Fps is (10 sec: 37692.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2757885952. Throughput: 0: 42887.6. Samples: 2758085660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) +[2024-06-18 16:14:26,994][12645] Avg episode reward: [(0, '0.326')] +[2024-06-18 16:14:28,273][12862] Signal inference workers to stop experience collection... (40350 times) +[2024-06-18 16:14:28,273][12862] Signal inference workers to resume experience collection... 
(40350 times) +[2024-06-18 16:14:28,315][12883] InferenceWorker_p0-w0: stopping experience collection (40350 times) +[2024-06-18 16:14:28,315][12883] InferenceWorker_p0-w0: resuming experience collection (40350 times) +[2024-06-18 16:14:28,411][12883] Updated weights for policy 0, policy_version 168334 (0.0042) +[2024-06-18 16:14:31,838][12883] Updated weights for policy 0, policy_version 168344 (0.0043) +[2024-06-18 16:14:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2758148096. Throughput: 0: 42858.8. Samples: 2758207560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 16:14:31,994][12645] Avg episode reward: [(0, '0.641')] +[2024-06-18 16:14:35,908][12883] Updated weights for policy 0, policy_version 168354 (0.0032) +[2024-06-18 16:14:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 2758328320. Throughput: 0: 42816.8. Samples: 2758464980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 16:14:36,994][12645] Avg episode reward: [(0, '0.682')] +[2024-06-18 16:14:37,060][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000168356_2758344704.pth... +[2024-06-18 16:14:37,108][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000167730_2748088320.pth +[2024-06-18 16:14:39,456][12883] Updated weights for policy 0, policy_version 168364 (0.0041) +[2024-06-18 16:14:41,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2758541312. Throughput: 0: 42702.2. Samples: 2758715920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 16:14:41,994][12645] Avg episode reward: [(0, '0.765')] +[2024-06-18 16:14:43,663][12883] Updated weights for policy 0, policy_version 168374 (0.0030) +[2024-06-18 16:14:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2758754304. Throughput: 0: 42956.5. Samples: 2758849220. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 16:14:46,994][12645] Avg episode reward: [(0, '0.395')] +[2024-06-18 16:14:47,485][12883] Updated weights for policy 0, policy_version 168384 (0.0028) +[2024-06-18 16:14:51,118][12883] Updated weights for policy 0, policy_version 168394 (0.0032) +[2024-06-18 16:14:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 2758983680. Throughput: 0: 42624.0. Samples: 2759100500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 16:14:51,994][12645] Avg episode reward: [(0, '0.382')] +[2024-06-18 16:14:55,230][12883] Updated weights for policy 0, policy_version 168404 (0.0026) +[2024-06-18 16:14:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2759196672. Throughput: 0: 42715.5. Samples: 2759355660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 16:14:56,998][12645] Avg episode reward: [(0, '0.602')] +[2024-06-18 16:14:59,130][12883] Updated weights for policy 0, policy_version 168414 (0.0028) +[2024-06-18 16:15:01,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 2759409664. Throughput: 0: 42892.0. Samples: 2759491520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 16:15:01,997][12645] Avg episode reward: [(0, '0.493')] +[2024-06-18 16:15:02,713][12883] Updated weights for policy 0, policy_version 168424 (0.0042) +[2024-06-18 16:15:06,636][12883] Updated weights for policy 0, policy_version 168434 (0.0032) +[2024-06-18 16:15:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2759622656. Throughput: 0: 42600.4. Samples: 2759746500. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 16:15:06,994][12645] Avg episode reward: [(0, '0.495')] +[2024-06-18 16:15:10,315][12883] Updated weights for policy 0, policy_version 168444 (0.0027) +[2024-06-18 16:15:11,994][12645] Fps is (10 sec: 44246.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2759852032. Throughput: 0: 42583.5. Samples: 2760001920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 16:15:11,994][12645] Avg episode reward: [(0, '0.610')] +[2024-06-18 16:15:14,110][12883] Updated weights for policy 0, policy_version 168454 (0.0036) +[2024-06-18 16:15:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 2760048640. Throughput: 0: 42794.2. Samples: 2760133300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 16:15:16,994][12645] Avg episode reward: [(0, '0.647')] +[2024-06-18 16:15:17,879][12883] Updated weights for policy 0, policy_version 168464 (0.0038) +[2024-06-18 16:15:21,827][12883] Updated weights for policy 0, policy_version 168474 (0.0028) +[2024-06-18 16:15:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2760278016. Throughput: 0: 42724.5. Samples: 2760387580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 16:15:21,999][12645] Avg episode reward: [(0, '0.604')] +[2024-06-18 16:15:25,542][12883] Updated weights for policy 0, policy_version 168484 (0.0025) +[2024-06-18 16:15:26,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2760474624. Throughput: 0: 42920.4. Samples: 2760647340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) +[2024-06-18 16:15:26,994][12645] Avg episode reward: [(0, '0.769')] +[2024-06-18 16:15:29,354][12883] Updated weights for policy 0, policy_version 168494 (0.0028) +[2024-06-18 16:15:31,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2760687616. Throughput: 0: 42793.0. Samples: 2760774900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:15:31,994][12645] Avg episode reward: [(0, '0.702')] +[2024-06-18 16:15:33,431][12883] Updated weights for policy 0, policy_version 168504 (0.0040) +[2024-06-18 16:15:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2760916992. Throughput: 0: 42896.4. Samples: 2761030840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:15:37,003][12645] Avg episode reward: [(0, '0.423')] +[2024-06-18 16:15:37,251][12883] Updated weights for policy 0, policy_version 168514 (0.0030) +[2024-06-18 16:15:40,901][12883] Updated weights for policy 0, policy_version 168524 (0.0036) +[2024-06-18 16:15:41,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2761129984. Throughput: 0: 43016.5. Samples: 2761291400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:15:41,994][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 16:15:44,733][12883] Updated weights for policy 0, policy_version 168534 (0.0036) +[2024-06-18 16:15:46,995][12645] Fps is (10 sec: 42594.4, 60 sec: 43143.7, 300 sec: 42764.9). Total num frames: 2761342976. Throughput: 0: 42830.9. Samples: 2761418860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:15:46,995][12645] Avg episode reward: [(0, '0.519')] +[2024-06-18 16:15:48,552][12883] Updated weights for policy 0, policy_version 168544 (0.0028) +[2024-06-18 16:15:51,999][12645] Fps is (10 sec: 44213.1, 60 sec: 43140.7, 300 sec: 42875.3). Total num frames: 2761572352. Throughput: 0: 42810.9. Samples: 2761673220. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:15:51,999][12645] Avg episode reward: [(0, '0.711')] +[2024-06-18 16:15:52,185][12883] Updated weights for policy 0, policy_version 168554 (0.0032) +[2024-06-18 16:15:56,196][12883] Updated weights for policy 0, policy_version 168564 (0.0032) +[2024-06-18 16:15:56,994][12645] Fps is (10 sec: 44241.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2761785344. Throughput: 0: 43069.8. Samples: 2761940060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:15:56,994][12645] Avg episode reward: [(0, '0.602')] +[2024-06-18 16:15:59,742][12883] Updated weights for policy 0, policy_version 168574 (0.0032) +[2024-06-18 16:16:01,994][12645] Fps is (10 sec: 40982.3, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 2761981952. Throughput: 0: 42995.6. Samples: 2762068100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:16:01,994][12645] Avg episode reward: [(0, '0.402')] +[2024-06-18 16:16:03,899][12883] Updated weights for policy 0, policy_version 168584 (0.0036) +[2024-06-18 16:16:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 2762227712. Throughput: 0: 43018.2. Samples: 2762323400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:16:06,994][12645] Avg episode reward: [(0, '0.387')] +[2024-06-18 16:16:07,283][12883] Updated weights for policy 0, policy_version 168594 (0.0027) +[2024-06-18 16:16:08,210][12862] Signal inference workers to stop experience collection... (40400 times) +[2024-06-18 16:16:08,244][12883] InferenceWorker_p0-w0: stopping experience collection (40400 times) +[2024-06-18 16:16:08,321][12862] Signal inference workers to resume experience collection... 
(40400 times) +[2024-06-18 16:16:08,321][12883] InferenceWorker_p0-w0: resuming experience collection (40400 times) +[2024-06-18 16:16:11,784][12883] Updated weights for policy 0, policy_version 168604 (0.0037) +[2024-06-18 16:16:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2762424320. Throughput: 0: 43146.4. Samples: 2762588920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:16:11,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 16:16:14,873][12883] Updated weights for policy 0, policy_version 168614 (0.0039) +[2024-06-18 16:16:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2762637312. Throughput: 0: 43006.1. Samples: 2762710180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:16:16,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 16:16:19,338][12883] Updated weights for policy 0, policy_version 168624 (0.0040) +[2024-06-18 16:16:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 2762883072. Throughput: 0: 42967.3. Samples: 2762964360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:16:21,994][12645] Avg episode reward: [(0, '0.686')] +[2024-06-18 16:16:22,902][12883] Updated weights for policy 0, policy_version 168634 (0.0034) +[2024-06-18 16:16:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2763046912. Throughput: 0: 43246.7. Samples: 2763237500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:16:26,994][12645] Avg episode reward: [(0, '0.618')] +[2024-06-18 16:16:27,015][12883] Updated weights for policy 0, policy_version 168644 (0.0033) +[2024-06-18 16:16:30,294][12883] Updated weights for policy 0, policy_version 168654 (0.0031) +[2024-06-18 16:16:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2763292672. Throughput: 0: 43126.7. 
Samples: 2763359520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) +[2024-06-18 16:16:31,994][12645] Avg episode reward: [(0, '0.511')] +[2024-06-18 16:16:34,745][12883] Updated weights for policy 0, policy_version 168664 (0.0037) +[2024-06-18 16:16:36,994][12645] Fps is (10 sec: 47513.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2763522048. Throughput: 0: 43280.6. Samples: 2763620620. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) +[2024-06-18 16:16:36,994][12645] Avg episode reward: [(0, '0.575')] +[2024-06-18 16:16:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000168672_2763522048.pth... +[2024-06-18 16:16:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000168043_2753216512.pth +[2024-06-18 16:16:38,075][12883] Updated weights for policy 0, policy_version 168674 (0.0035) +[2024-06-18 16:16:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2763702272. Throughput: 0: 43135.1. Samples: 2763881140. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) +[2024-06-18 16:16:41,994][12645] Avg episode reward: [(0, '0.323')] +[2024-06-18 16:16:42,193][12883] Updated weights for policy 0, policy_version 168684 (0.0040) +[2024-06-18 16:16:45,630][12883] Updated weights for policy 0, policy_version 168694 (0.0039) +[2024-06-18 16:16:46,996][12645] Fps is (10 sec: 42589.4, 60 sec: 43416.7, 300 sec: 42931.3). Total num frames: 2763948032. Throughput: 0: 43017.3. Samples: 2764003980. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) +[2024-06-18 16:16:46,997][12645] Avg episode reward: [(0, '0.337')] +[2024-06-18 16:16:49,697][12883] Updated weights for policy 0, policy_version 168704 (0.0044) +[2024-06-18 16:16:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42875.4, 300 sec: 42876.1). Total num frames: 2764144640. Throughput: 0: 43096.5. Samples: 2764262740. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) +[2024-06-18 16:16:51,994][12645] Avg episode reward: [(0, '0.372')] +[2024-06-18 16:16:53,258][12883] Updated weights for policy 0, policy_version 168714 (0.0027) +[2024-06-18 16:16:56,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2764357632. Throughput: 0: 42938.1. Samples: 2764521140. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) +[2024-06-18 16:16:56,994][12645] Avg episode reward: [(0, '0.433')] +[2024-06-18 16:16:57,567][12883] Updated weights for policy 0, policy_version 168724 (0.0041) +[2024-06-18 16:17:00,869][12883] Updated weights for policy 0, policy_version 168734 (0.0036) +[2024-06-18 16:17:01,996][12645] Fps is (10 sec: 44226.7, 60 sec: 43415.9, 300 sec: 42986.9). Total num frames: 2764587008. Throughput: 0: 42957.5. Samples: 2764643360. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) +[2024-06-18 16:17:01,997][12645] Avg episode reward: [(0, '0.638')] +[2024-06-18 16:17:05,065][12883] Updated weights for policy 0, policy_version 168744 (0.0039) +[2024-06-18 16:17:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2764800000. Throughput: 0: 43143.9. Samples: 2764905840. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) +[2024-06-18 16:17:06,994][12645] Avg episode reward: [(0, '0.588')] +[2024-06-18 16:17:08,276][12883] Updated weights for policy 0, policy_version 168754 (0.0032) +[2024-06-18 16:17:11,994][12645] Fps is (10 sec: 39330.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2764980224. Throughput: 0: 42722.7. Samples: 2765160020. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) +[2024-06-18 16:17:11,994][12645] Avg episode reward: [(0, '0.561')] +[2024-06-18 16:17:12,532][12883] Updated weights for policy 0, policy_version 168764 (0.0036) +[2024-06-18 16:17:15,842][12883] Updated weights for policy 0, policy_version 168774 (0.0050) +[2024-06-18 16:17:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2765225984. Throughput: 0: 42742.2. Samples: 2765282920. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) +[2024-06-18 16:17:16,994][12645] Avg episode reward: [(0, '0.740')] +[2024-06-18 16:17:20,832][12883] Updated weights for policy 0, policy_version 168784 (0.0037) +[2024-06-18 16:17:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2765422592. Throughput: 0: 42778.8. Samples: 2765545660. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) +[2024-06-18 16:17:21,994][12645] Avg episode reward: [(0, '0.568')] +[2024-06-18 16:17:23,450][12883] Updated weights for policy 0, policy_version 168794 (0.0039) +[2024-06-18 16:17:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2765619200. Throughput: 0: 42625.8. Samples: 2765799300. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) +[2024-06-18 16:17:26,994][12645] Avg episode reward: [(0, '0.441')] +[2024-06-18 16:17:28,449][12883] Updated weights for policy 0, policy_version 168804 (0.0030) +[2024-06-18 16:17:28,515][12862] Signal inference workers to stop experience collection... (40450 times) +[2024-06-18 16:17:28,562][12883] InferenceWorker_p0-w0: stopping experience collection (40450 times) +[2024-06-18 16:17:28,636][12862] Signal inference workers to resume experience collection... 
(40450 times) +[2024-06-18 16:17:28,637][12883] InferenceWorker_p0-w0: resuming experience collection (40450 times) +[2024-06-18 16:17:31,048][12883] Updated weights for policy 0, policy_version 168814 (0.0033) +[2024-06-18 16:17:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2765848576. Throughput: 0: 42693.8. Samples: 2765925100. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) +[2024-06-18 16:17:31,994][12645] Avg episode reward: [(0, '0.729')] +[2024-06-18 16:17:36,016][12883] Updated weights for policy 0, policy_version 168824 (0.0032) +[2024-06-18 16:17:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.4, 300 sec: 42820.8). Total num frames: 2766045184. Throughput: 0: 42667.6. Samples: 2766182780. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) +[2024-06-18 16:17:36,994][12645] Avg episode reward: [(0, '0.468')] +[2024-06-18 16:17:38,894][12883] Updated weights for policy 0, policy_version 168834 (0.0044) +[2024-06-18 16:17:41,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2766274560. Throughput: 0: 42448.8. Samples: 2766431340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 16:17:41,994][12645] Avg episode reward: [(0, '0.513')] +[2024-06-18 16:17:44,136][12883] Updated weights for policy 0, policy_version 168844 (0.0030) +[2024-06-18 16:17:46,688][12883] Updated weights for policy 0, policy_version 168854 (0.0031) +[2024-06-18 16:17:46,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42600.0, 300 sec: 42931.6). Total num frames: 2766503936. Throughput: 0: 42624.8. Samples: 2766561380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 16:17:46,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 16:17:51,749][12883] Updated weights for policy 0, policy_version 168864 (0.0040) +[2024-06-18 16:17:51,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2766684160. Throughput: 0: 42411.2. 
Samples: 2766814340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 16:17:51,994][12645] Avg episode reward: [(0, '0.383')] +[2024-06-18 16:17:54,571][12883] Updated weights for policy 0, policy_version 168874 (0.0031) +[2024-06-18 16:17:56,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42325.5, 300 sec: 42765.9). Total num frames: 2766897152. Throughput: 0: 42326.3. Samples: 2767064700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 16:17:56,994][12645] Avg episode reward: [(0, '0.554')] +[2024-06-18 16:17:59,285][12883] Updated weights for policy 0, policy_version 168884 (0.0035) +[2024-06-18 16:18:01,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 2767142912. Throughput: 0: 42450.8. Samples: 2767193200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 16:18:01,994][12645] Avg episode reward: [(0, '0.810')] +[2024-06-18 16:18:02,094][12883] Updated weights for policy 0, policy_version 168894 (0.0033) +[2024-06-18 16:18:06,859][12883] Updated weights for policy 0, policy_version 168904 (0.0028) +[2024-06-18 16:18:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2767339520. Throughput: 0: 42406.2. Samples: 2767453940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 16:18:06,994][12645] Avg episode reward: [(0, '0.712')] +[2024-06-18 16:18:09,684][12883] Updated weights for policy 0, policy_version 168914 (0.0036) +[2024-06-18 16:18:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2767552512. Throughput: 0: 42296.9. Samples: 2767702660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 16:18:11,994][12645] Avg episode reward: [(0, '0.626')] +[2024-06-18 16:18:14,416][12883] Updated weights for policy 0, policy_version 168924 (0.0045) +[2024-06-18 16:18:17,000][12645] Fps is (10 sec: 45846.6, 60 sec: 42867.0, 300 sec: 42930.7). Total num frames: 2767798272. 
Throughput: 0: 42490.0. Samples: 2767837420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 16:18:17,000][12645] Avg episode reward: [(0, '0.536')] +[2024-06-18 16:18:17,494][12883] Updated weights for policy 0, policy_version 168934 (0.0032) +[2024-06-18 16:18:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2767962112. Throughput: 0: 42429.3. Samples: 2768092100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 16:18:21,994][12645] Avg episode reward: [(0, '0.436')] +[2024-06-18 16:18:22,142][12883] Updated weights for policy 0, policy_version 168944 (0.0029) +[2024-06-18 16:18:25,419][12883] Updated weights for policy 0, policy_version 168954 (0.0031) +[2024-06-18 16:18:26,994][12645] Fps is (10 sec: 40986.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2768207872. Throughput: 0: 42434.9. Samples: 2768340900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 16:18:26,994][12645] Avg episode reward: [(0, '0.460')] +[2024-06-18 16:18:30,148][12883] Updated weights for policy 0, policy_version 168964 (0.0030) +[2024-06-18 16:18:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2768404480. Throughput: 0: 42600.9. Samples: 2768478420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) +[2024-06-18 16:18:31,994][12645] Avg episode reward: [(0, '0.612')]