From: Eric Gauthier (elg@bu.edu)
Date: Mon Dec 08 2003 - 11:34:54 CST
Buzz,
To give you an example, we ran some tests with DhcpdManager_pm.pl, which
uses the same code as Files_Remote.pm to make the client-server connections.
When I run test_DhcpdManager_pm.pl from the command line with only 1 test in
it, the script takes 3 minutes to fail:
# time ../test_DhcpdManager_pm.pl
couldn't connect to server at ../test_DhcpdManager_pm.pl line 27.
real 3m9.089s
user 0m0.060s
sys 0m0.000s
(line 27 is: die "couldn't connect to server" unless $conn;)
Our end users will not wait 3 minutes for a page to be displayed, and
we really don't want them waiting that long just to get a failure message.
Again, this situation only seems to occur when the server is up (ping checks
ok) but the daemon is unreachable for some reason (e.g. it's not running, it's
filtered out, etc.). We haven't run into this in practice yet, but we have
seen it in our failure tests/simulations. To get a better response time, we
modified Files_Remote.pm (version 2.0 - which I believe is the latest) to
use a timeout variable, loaded from Variables.pm, to stop a connection
attempt sooner. Below, I've included the diff between the standard version
2.0 file and the one we are using. The changes are those around line 578
and also line 92, which includes the timeout value from Variables.pm. The
other differences are just commented out error messages to change what the
end user sees on their web page (i.e. one error message and not three).
92c92
< use Variables qw ( $KeyName $Key $DHCPDMANAGER $DEBUG $PING $DHCPDMANAGER_TIMEOUT );
---
> use Variables qw ( $KeyName $Key $DHCPDMANAGER $DEBUG $PING );
472c472
< # admin_error("failed to connect to server '$ServerIP' (via dhcpdm): $! \n","Network") unless $conn;
---
> admin_error("failed to connect to server '$ServerIP' (via dhcpdm): $! \n","Network") unless $conn;
510c510
< # admin_error ("Unable to contact $ServerIP for host query","Network") unless defined $mac;
---
> admin_error ("Unable to contact $ServerIP for host query","Network") unless defined $mac;
540c540
< $pingcmd = "$PING -c 1 -w 1 -n -q $ip > /dev/null 2>/dev/null";
---
> $pingcmd = "$PING -c 1 -w 1 -n -q $ip > /dev/null";
578,582c578
< my $conn;
< eval {
< local $SIG{ALRM} = sub { die undef; } ;
< alarm $DHCPDMANAGER_TIMEOUT;
< $conn = DhcpdManager->new(
---
> my $conn = DhcpdManager->new(
587,588d582
< alarm 0;
< };
591,592c585,586
< # admin_error("Could not open connection to dhcpdmanager at $server", "Network")
< # unless $no_errors;
---
> admin_error("Could not open connection to dhcpdmanager at $server", "Network")
> unless $no_errors;

You'll also need to modify Variables.pm to supply the timeout value: add
$DHCPDMANAGER_TIMEOUT to the export list and add a line somewhere with:

our $DHCPDMANAGER_TIMEOUT = "15";

where "15" is your timeout in seconds.
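For anyone who wants to try the alarm approach outside of NetReg, here is a minimal standalone sketch of the same eval/alarm pattern. It is not the Files_Remote.pm code: the with_timeout helper and the simulated slow connection are hypothetical names made up for illustration, and the local $DHCPDMANAGER_TIMEOUT only stands in for the value that would come from Variables.pm.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the value that Variables.pm would export (assumption for this sketch).
our $DHCPDMANAGER_TIMEOUT = 1;

# Hypothetical helper: run $code, giving up after $timeout seconds.
# Returns the code's return value, or undef if it timed out or died.
sub with_timeout {
    my ($timeout, $code) = @_;
    my $result;
    my $ok = eval {
        # The handler dies, which unwinds out of the blocking call into the eval.
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $timeout;
        $result = $code->();
        alarm 0;    # cancel the pending alarm on success
        1;
    };
    alarm 0;        # make sure no alarm stays armed if $code died
    return $ok ? $result : undef;
}

# Simulated connection attempt that blocks longer than the timeout.
# In NetReg this would be the DhcpdManager->new(...) call instead.
my $conn = with_timeout($DHCPDMANAGER_TIMEOUT, sub {
    select(undef, undef, undef, 5);    # block for up to 5 seconds
    return "connected";
});

print defined $conn ? "got: $conn\n" : "couldn't connect to server (timed out)\n";
```

The second "alarm 0" outside the eval is there so that a failure inside the protected call cannot leave an alarm pending that would later kill the calling script.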
Has anyone else seen this problem? If we went with these modifications, does anyone see any other issues that we might be causing?
Thanks!
Eric Gauthier
Network Engineer
617-353-8218 ~^~ elg@bu.edu
Boston University - Office of IT
-----Original Message-----
From: Buzz [mailto:buzz@oska.com]
Sent: Sunday, December 07, 2003 7:47 PM
To: netreg@southwestern.edu; "Eric Gauthier"
Subject: Re: NetReg: NetReg v2 hang
Eric, the dhcpdmanager is a VERY integral part of the Netreg2 system, and without it running on each remote dhcpd server, a lot of things won't work as expected. I'm not surprised that the webpages "go slow", as most pages perform queries to the dhcpdmanager "on the fly". You should be able to easily see if the dhcpdmanager has problems on the "server overview page", which does a number of different checks (ping, manager, dhcpd server checks) to make sure all is running as it should be.
The only suggestions I can make are to ensure that you are running the CVS version, and if the problem still occurs, then please feel free to submit a patch. :-)
David B.
----- Original Message -----
From: "Eric Gauthier" <elg@bu.edu>
To: <netreg@southwestern.edu>
Sent: Saturday, December 06, 2003 8:59 AM
Subject: NetReg: NetReg v2 hang

> Hello,
>
> I did a quick (though honestly not thorough) look through the archives and I
> couldn't see any notes about this. We're running a pair of DHCP servers in
> "failover" mode and wanted to put the NetReg version 2 DhcpdManager client
> onto both of them. In our testing, we found that if the dhcpdmanager on
> either DHCP server died but the system was still pingable, then the NetReg
> pages and routines would just hang. It looks like this is related to the
> lack of a timeout in the DhcpdManager->new call. Has anyone seen this?
>
> We built a temporary fix using alarms into Files_Remote.pm's
> connect_dhcpdmanager subroutine, but I wanted to see if there was an
> official patch/fix/upgrade or if someone could point out some installation
> mishap on our part that might be causing this.
>
> Thanks!
>
> Eric Gauthier
> Network Engineer
> 617-353-8218 ~^~ elg@bu.edu
> Boston University - Office of IT
>
> **********************************************************************
> To unsubscribe from this list, send an e-mail message to
> majordomo@southwestern.edu containing a single line with the words:
> unsubscribe netreg
> Send requests for assistance to: owner-netreg@southwestern.edu
> **********************************************************************
This archive was generated by hypermail 2.1.4 : Thu Aug 12 2004 - 12:01:42 CDT