| Comments |
| 3/14/2012 10:41:21 AM |
Bob Denny |
Closing, collected into SCHEDULER-816 (see link) |
| 3/8/2012 11:00:53 AM |
Bob Denny |

|
| 3/8/2012 11:00:39 AM |
Bob Denny |

|
| 3/8/2012 10:49:35 AM |
Bob Denny |
More problems. The whole approach of disconnecting and reconnecting weather is a loser. The 3.5 BETA is experiencing fatal weather-related disconnects and "server gone" errors. This has to be reworked!
Reopened all of the related issues on this. |
| 1/22/2012 4:52:39 PM |
Bob Denny |
|
| 1/22/2012 4:40:31 PM |
Bob Denny |
After all of this, and doing a lot of napkin engineering, I conclude:
Weather input to ACP, after transitioning from safe to unsafe, must stay unsafe for at least 5 minutes.
This will go into the docs. No changes are needed to 3.4, as that already protects from the other scenarios. |
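A minimal sketch of the 5-minute rule above, for the docs. It assumes the weather source can be wrapped by a filter that sees each raw safe/unsafe reading with a timestamp in seconds. The class name `UnsafeHoldFilter` and its methods are hypothetical, not part of ACP or Scheduler.

```python
class UnsafeHoldFilter:
    """Keeps the output unsafe for a minimum time after a safe-to-unsafe transition."""

    def __init__(self, hold_seconds=300.0):
        self.hold_seconds = hold_seconds   # 5 minutes, per the rule above
        self._last_raw_safe = True         # assume safe until told otherwise
        self._unsafe_since = None          # time of latest safe-to-unsafe transition

    def update(self, raw_safe, now):
        """Feed one raw reading taken at time `now` (seconds); return filtered safe flag."""
        if self._last_raw_safe and not raw_safe:
            self._unsafe_since = now       # start (or restart) the hold timer
        self._last_raw_safe = raw_safe
        if not raw_safe:
            return False                   # raw unsafe is always reported unsafe
        if self._unsafe_since is not None and now - self._unsafe_since < self.hold_seconds:
            return False                   # raw is safe again, but still inside the hold
        return True
```

With this, a 2-second unsafe blip at t=10 is reported as unsafe until t=310, which gives the weather script and the Dispatcher time to see it.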
| 1/22/2012 1:39:27 PM |
Bob Denny |
An attempt to move weather monitoring into a separate thread with its own polling loop failed. ACP cannot be called from a thread other than the one on which the ACP objects were created. This cost 6 hours yesterday. |
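For reference, the usual way around a thread-affinity restriction like this is to keep every call on the owning thread and have other threads hand it requests through a queue. The sketch below shows only that general pattern in Python. `OwnerThread` is a hypothetical name, and whether ACP's objects would tolerate this arrangement is an assumption that was not tested here.

```python
import queue
import threading
from concurrent.futures import Future


class OwnerThread:
    """Runs every submitted call on the one thread that owns the objects."""

    def __init__(self):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Objects with thread affinity would be created here, on this thread.
        while True:
            item = self._requests.get()
            if item is None:               # sentinel from close()
                break
            func, future = item
            try:
                future.set_result(func())
            except Exception as exc:       # hand the error back to the caller
                future.set_exception(exc)

    def call(self, func):
        """Run func() on the owner thread and block until it returns."""
        future = Future()
        self._requests.put((func, future))
        return future.result()

    def close(self):
        self._requests.put(None)
        self._thread.join()
```

A polling thread would then use `owner.call(lambda: ...)` for each weather read instead of touching the objects directly.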
| 1/21/2012 10:19:53 AM |
Bob Denny |
OK, I see the problem. A very brief WX Unsafe. Dispatcher waits through the WX Safety script. By the time the WX Safety script completes, WX is good. Dispatcher fails to recognize the WX interrupt. The scope is parked and disconnected. Now the Dispatcher goes for the next Plan and BOOM, it encounters a "spontaneously" disconnected scope. Wow.
The keys to the city are in absolutely positively being able to sense a weather interrupt and the fact that the Obs is in standby. With this, the startup process will run before the Dispatcher tries to do more work in safe weather. |
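A rough sketch of that idea: latch any unsafe reading so a brief WX Unsafe is not lost while the Dispatcher is blocked, then check the latch and the standby state before picking the next Plan. All names here (`WeatherInterruptLatch`, `next_action`, the action strings) are hypothetical and only illustrate the decision order.

```python
class WeatherInterruptLatch:
    """Remembers that weather went unsafe, even if it is safe again when checked."""

    def __init__(self):
        self.tripped = False

    def record(self, raw_safe):
        """Call on every weather poll."""
        if not raw_safe:
            self.tripped = True

    def clear(self):
        """Call once the startup process has completed."""
        self.tripped = False


def next_action(latch, weather_safe_now, obs_in_standby):
    """Decide what the dispatcher should do before touching the scope."""
    if not weather_safe_now:
        return "wait"        # still unsafe, do nothing
    if latch.tripped or obs_in_standby:
        return "startup"     # weather interrupted us; reconnect before more work
    return "dispatch"        # nothing happened, pick the next Plan
```

In the scenario above, the latch would be tripped during the WX Safety script, so the next pass returns "startup" instead of going straight to plan selection against a disconnected scope.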
| 1/21/2012 9:17:26 AM |
Bob Denny |
At Peter's, the standard Weather script is used. It simply calls Telescope.Park(). Via internal ACP logic it parks, disconnects the scope(!), then closes the dome. |
| 1/20/2012 3:57:41 PM |
Bob Denny |
Tested the situation here, it is discovering the weather unsafe condition!
20-Jan-2012 22:53:33.1: Acquire data for Observation test nebula...
20-Jan-2012 22:53:33.1: Send Observation test nebula to ACP Sequencer
20-Jan-2012 22:53:43.2: Sequencer is now active
20-Jan-2012 22:55:54.6: Sequencer is no longer active
20-Jan-2012 22:55:57.6: Post-job status check done (stat=Running)
20-Jan-2012 22:55:57.6: ACP ABORT: Acquisition process was interrupted by weather unsafe.
20-Jan-2012 22:55:57.6: Data acquisition failed for Observation test nebula.
20-Jan-2012 22:55:57.6: (Interrupted by weather unsafe condition)
20-Jan-2012 22:55:57.6: (Plan test nebula-Green-1 will be resubmitted in its entirety.
20-Jan-2012 22:55:57.6: (ACP reported Weather became unsafe.)
20-Jan-2012 22:55:57.7: ConnectWeather(), already connected.
20-Jan-2012 22:55:57.7: -- Weather Unsafe --
20-Jan-2012 22:56:13.8: ConnectWeather(), already connected.
|
| 1/20/2012 3:26:48 PM |
Bob Denny |
Example of the first problem: a weather interrupt during acquisition. Control came back to Scheduler, which failed 0.2 sec later in Efficiency() trying to get the slew time. Working on this one first.
In ACP itself:
00:11:58 (starting exposure)
00:35:20 AAG_ACPWeatherFeed (v7.01) - UNSAFE
00:35:22 **WEATHER ALERT: The weather has become unsafe!
ACP console log closed 15-Dec-2011 00:35:23 UTC
And in the Scheduler log:
15-Dec-2011 00:09:50.4: Acquire data for Observation ngc 598...
15-Dec-2011 00:09:50.4: Send Observation ngc 598 to ACP Sequencer
15-Dec-2011 00:40:14.7: ACP ERROR: Failed to complete acquisition process (see ACP run log)
15-Dec-2011 00:40:14.7: Data acquisition failed for Observation ngc 598.
15-Dec-2011 00:40:14.7: (ACP reported See the ACP run log file)
15-Dec-2011 00:40:14.9: **EXCEPTION IN SCHEDULER:
15-Dec-2011 00:40:14.9: The telescope is not connected.
15-Dec-2011 00:40:14.9: Traceback:
at ACP.TelescopeClass.get_RightAscension()
at DC3.Scheduler.ACPSequencer.SlewTime(Double RA, Double Dec)
at DC3.Scheduler.Engine.Efficiency(Plan P, DateTime t)
at DC3.Scheduler.Engine.SelectBestPlan(ArrayList PList)
at DC3.Scheduler.Engine.DoSchedulePass()
at DC3.Scheduler.Engine.Run()
Weird, this did not pick up that it was a Weather failure! It should have explicitly said "Interrupted by weather unsafe condition". WxFail was false in the call to UpdatePlanAndObservation(). This comes from looking at the FailureReason set by AcquireScheduler.
Look at the timing - Scheduler missed the end of AcquireScheduler and continued to wait through the ACP Weather script (another 5 min.), which disconnects the telescope. |
| 1/20/2012 2:08:36 PM |
Bob Denny |
At a minimum, need to make a weather disconnect turn into a simple weather UNSAFE as far as the scheduler is concerned. Make this a mail message event. |
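One way to picture this: wrap the weather read so that any failure to reach the weather server is reported as plain UNSAFE, with a hook for the notification. This is only a sketch. `weather_safe_failsafe`, `read_safe`, and `on_disconnect` are hypothetical names, and the real read and mail mechanisms are not shown.

```python
def weather_safe_failsafe(read_safe, on_disconnect=None):
    """Return the weather safe flag, treating any read failure as unsafe.

    read_safe:     callable returning True/False, may raise if disconnected
    on_disconnect: optional callable given the exception (e.g. to queue a mail event)
    """
    try:
        return bool(read_safe())
    except Exception as exc:
        # A lost weather connection is handled like bad weather, not a crash.
        if on_disconnect is not None:
            on_disconnect(exc)
        return False
```

The point of the design is that the scheduler loop only ever sees safe or unsafe, never an exception from the weather object.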
| 1/20/2012 2:04:57 PM |
Bob Denny |
The listing below is from Perez' system when I stopped the dispatcher. It resulted in not only a dead Scheduler but also a "Dear Bill Gates" popup! It looks like an extremely unlucky coincidence of me stopping the dispatcher just as it was time for observatory shutdown. |
| 1/20/2012 2:03:01 PM |
Bob Denny |
As you can see, weather was disconnected and immediately thereafter the Scheduler polled weather!
20-Jan-2012 12:40:11.4: Dispatcher stopped at 20-Jan-2012 12:40:11 UTC
20-Jan-2012 12:40:11.4: **DISPATCHER INTERRUPT LEVEL 1 RECEIVED**
20-Jan-2012 12:40:17.5: Weather disconnected.
20-Jan-2012 12:40:17.5: -- Observatory Shutdown --
20-Jan-2012 12:40:17.5: **EXCEPTION IN SCHEDULER:
20-Jan-2012 12:40:17.5: Automation error
Object is not connected to server
20-Jan-2012 12:40:17.5: Traceback:
at ACP.WeatherClass.get_Safe()
at DC3.Scheduler.ACPSequencer.FilteredWeatherSafe()
at DC3.Scheduler.ACPSequencer.get_WeatherSafe()
at DC3.Scheduler.Engine.PollWeather()
at DC3.Scheduler.Engine.Run()
20-Jan-2012 12:40:17.5: Start ACP Sequencer's ShutdownObs script
20-Jan-2012 12:40:17.9: Run statistics:
20-Jan-2012 12:40:17.9: Observations: 238
20-Jan-2012 12:40:17.9: Considered: 238
20-Jan-2012 12:40:17.9: Completed: 224
20-Jan-2012 12:40:17.9: Skipped: 14
20-Jan-2012 12:40:17.9: Never Eligible: 0
20-Jan-2012 12:40:17.9: Failed: 0
20-Jan-2012 12:40:17.9: Shutter-Open efficiency: 38.39%
20-Jan-2012 12:40:17.9: Overall Efficiency: 49.76%
20-Jan-2012 12:40:17.9: Release ACP sequencer
Log closed at Fri, Jan 20 2012 12:40:19 UTC (actual time)
|
| 1/16/2012 5:39:39 PM |
Bob Denny |
Also see this Comm Center post by Todd Benko. He has a point! This will require some really careful thought!!! |
| 12/27/2011 9:14:33 AM |
Bob Denny |
Also see this Comm Center thread. While we're at it, we need to deal with an independent closure of the dome by the weather station (AAG or Boltwood) via (e.g.) the DDW controller's hardware safety input. If weather is unsafe, and the dome is "closing" this may be correct. Peter's error in this thread was "unexpected state". |
| 12/26/2011 5:50:03 PM |
Bob Denny |
He has posted many more Comm Center articles with variations on this. Need to re-examine all weather timing issues. There are holes. |