RE: NetReg: NetReg v2 hang


From: Eric Gauthier (elg@bu.edu)
Date: Tue Dec 09 2003 - 10:50:19 CST


Buzz,

You can probably ignore the fact that we commented out all those
lines; in general, they're a good idea. We've made a lot of customizations
to NetReg for our local environment, including using an existing direct
OMAPI interface from our NetReg v1 days for the final add/delete step. The
only time we use the dhcpd_manager client is to look up the MAC address
(a lookup that NetReg uses, as do other programs our security team has
built) or to scan through the leases file. So the only time we see these
error messages is when the manager fails during the MAC lookup. Before we
commented out the error messages, we were seeing the following printed on
the web page:

        - Could not open connection to dhcpdmanager at X
          (from sub connect_dhcpdmanager)
        - Failed to connect to server X
          (from sub get_host_info_remote)
        - Unable to contact server X
          (from sub get_host_info_remote_all)
        - Unable to determine your MAC address
          (from a BU specific subroutine - "get_MAC").

Instead of all four messages, we just wanted the last one to appear, so we
commented out the first three. Again, in our case, I think the only time
connect_dhcpdmanager is called is from within our get_MAC routine. For most
other people this is not true, so the individual error messages really are
necessary.
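
A minimal sketch of that single-message behaviour, assuming a $no_errors-style
flag like the one that guards admin_error in connect_dhcpdmanager. The
subroutines below (connect_manager, lookup_mac, and the stub admin_error) are
hypothetical stand-ins, not the real NetReg or BU code, and the connection
failure is simulated:

```perl
use strict;
use warnings;

my @shown;    # messages that would end up on the web page

# Hypothetical stand-in for NetReg's admin_error: just records the message.
sub admin_error {
    my ($msg) = @_;
    push @shown, $msg;
    return;
}

# Hypothetical low-level connect that always fails, to exercise the error path.
sub connect_manager {
    my ($server, $no_errors) = @_;
    my $conn;    # stays undef: simulated connection failure
    admin_error("Could not open connection to dhcpdmanager at $server")
        unless $conn || $no_errors;
    return $conn;
}

# Hypothetical caller in the spirit of get_MAC: it silences the lower-level
# message and reports a single, user-facing one instead.
sub lookup_mac {
    my ($server) = @_;
    my $conn = connect_manager($server, 1);
    unless ($conn) {
        admin_error("Unable to determine your MAC address");
        return;
    }
    return "00:00:00:00:00:00";    # placeholder value, never reached here
}

lookup_mac("X");
print "$_\n" for @shown;
```

Passing a flag down keeps the detailed messages available to callers that want
them, which commenting the calls out does not.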

Sorry about the confusion :(

Eric Gauthier
  Network Engineer
  617-353-8218 ~^~ elg@bu.edu
Boston University - Office of IT

-----Original Message-----
From: Buzz [mailto:buzz@oska.com]
Sent: Monday, December 08, 2003 7:35 PM
To: Eric Gauthier
Cc: netreg@southwestern.edu
Subject: Re: NetReg: NetReg v2 hang

Hi Eric,
    I like the addition of the alarm/signal to give up after X seconds, and
I've added that to the CVS code base, but why did you comment out all the
"admin_error" function calls? They are there so that users get told when
there is a problem. I haven't changed that part as of yet.

David.
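
For reference, the alarm/signal approach can be sketched in standalone Perl
roughly as follows. This is an illustration under assumptions, not the code
committed to CVS: with_timeout and $TIMEOUT are hypothetical names, and a
five-second select stands in for a DhcpdManager->new call that hangs.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the $DHCPDMANAGER_TIMEOUT setting, in seconds.
my $TIMEOUT = 1;

# Run $code, giving up after $timeout seconds.
# Returns the code's result, or undef if it timed out or died.
sub with_timeout {
    my ($timeout, $code) = @_;
    my $result;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $timeout;
        $result = $code->();
        alarm 0;    # cancel the pending alarm on success
        1;
    };
    alarm 0;        # also cancel it if $code died for some other reason
    return $ok ? $result : undef;
}

# Simulated hanging connect: blocks for 5 seconds, longer than the timeout.
my $conn = with_timeout($TIMEOUT, sub {
    select(undef, undef, undef, 5);
    return "connected";
});

print defined $conn ? "got: $conn\n" : "gave up after ${TIMEOUT}s\n";
```

The second alarm 0, outside the eval, makes sure a pending alarm cannot fire
later if the wrapped code dies for a reason other than the timeout.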

----- Original Message -----
From: "Eric Gauthier" <elg@bu.edu>
To: <buzz@oska.com>
Cc: <netreg@southwestern.edu>
Sent: Tuesday, December 09, 2003 3:34 AM
Subject: RE: NetReg: NetReg v2 hang

> Buzz,
>
> To give you an example, we ran some tests with DhcpdManager_pm.pl, which
> uses the same code as Files_Remote.pm to make the client-server
> connections. When I run test_DhcpdManager_pm.pl from the command line with
> only 1 test in it, the script takes 3 minutes to fail:
> # time ../test_DhcpdManager_pm.pl
> couldn't connect to server at ../test_DhcpdManager_pm.pl line 27.
>
> real 3m9.089s
> user 0m0.060s
> sys 0m0.000s
> (line 27 is "die "couldn't connect to server" unless $conn;")
> Our end users will not wait 3 minutes for a page to be displayed, and
> we really don't want them waiting that long just to get a failure message.
>
> Again, this situation only seems to occur when the server is up (ping
> checks OK) but the daemon is unreachable for some reason (e.g. it's not
> running, it's filtered out, etc.). We haven't run into this in practice yet,
> but we have seen it in our failure tests/simulations. To get a better
> response time, we modified Files_Remote.pm (version 2.0, which I believe is
> the latest) to use a timeout variable, loaded from Variables.pm, to stop a
> connection attempt sooner. Below, I've included the diff between the
> standard version 2.0 file and the one we are using. The changes are those
> around line 578 and also line 92, which imports the timeout value from
> Variables.pm. The other differences are just commented-out error messages,
> to change what the end user sees on their web page (i.e. one error message
> and not three).
>
> 92c92
> < use Variables qw ( $KeyName $Key $DHCPDMANAGER $DEBUG $PING $DHCPDMANAGER_TIMEOUT );
> ---
> > use Variables qw ( $KeyName $Key $DHCPDMANAGER $DEBUG $PING );
> 472c472
> < # admin_error("failed to connect to server '$ServerIP' (via dhcpdm): $! \n","Network") unless $conn;
> ---
> > admin_error("failed to connect to server '$ServerIP' (via dhcpdm): $! \n","Network") unless $conn;
> 510c510
> < # admin_error ("Unable to contact $ServerIP for host query","Network") unless defined $mac;
> ---
> > admin_error ("Unable to contact $ServerIP for host query","Network") unless defined $mac;
> 540c540
> < $pingcmd = "$PING -c 1 -w 1 -n -q $ip > /dev/null 2>/dev/null";
> ---
> > $pingcmd = "$PING -c 1 -w 1 -n -q $ip > /dev/null";
> 578,582c578
> < my $conn;
> < eval {
> < local $SIG{ALRM} = sub { die undef; } ;
> < alarm $DHCPDMANAGER_TIMEOUT;
> < $conn = DhcpdManager->new(
> ---
> > my $conn = DhcpdManager->new(
> 587,588d582
> < alarm 0;
> < };
> 591,592c585,586
> < # admin_error("Could not open connection to dhcpdmanager at $server", "Network")
> < # unless $no_errors;
> ---
> > admin_error("Could not open connection to dhcpdmanager at $server", "Network")
> > unless $no_errors;
>
> You'll also need to modify Variables.pm to provide the timeout value:
> add $DHCPDMANAGER_TIMEOUT to the export list and add a line somewhere
> with:
> our $DHCPDMANAGER_TIMEOUT = "15";
> Where "15" is your timeout in seconds.
>
> Has anyone else seen this problem? If we went with these modifications,
> does anyone see any other issues that we might be causing?
>
> Thanks!
>
> Eric Gauthier
> Network Engineer
> 617-353-8218 ~^~ elg@bu.edu
> Boston University - Office of IT
>
>
>
> -----Original Message-----
> From: Buzz [mailto:buzz@oska.com]
> Sent: Sunday, December 07, 2003 7:47 PM
> To: netreg@southwestern.edu; "Eric Gauthier"
> Subject: Re: NetReg: NetReg v2 hang
>
>
> Eric,
> the dhcpdmanager is a VERY integral part of the Netreg2 system, and
> without it running on each remote dhcpd server, a lot of things won't work
> as expected. I'm not surprised that the webpages "go slow", as most pages
> perform queries to the dhcpdmanager "on the fly". You should be able to
> easily see if the dhcpdmanager has problems on the "server overview page",
> which does a number of different checks (ping, manager, dhcpd server checks)
> to make sure all is running as it should be.
>
> The only suggestions I can make are to ensure that you are running the CVS
> version, and if the problem still occurs, then please feel free to submit a
> patch. :-)
>
> David B.
>
> ----- Original Message -----
> From: "Eric Gauthier" <elg@bu.edu>
> To: <netreg@southwestern.edu>
> Sent: Saturday, December 06, 2003 8:59 AM
> Subject: NetReg: NetReg v2 hang
>
>
> > Hello,
> >
> > I did a quick (though honestly not thorough) look through the archives and
> > I couldn't see any notes about this. We're running a pair of DHCP servers
> > in "failover" mode and wanted to put the NetReg version 2 DhcpdManager
> > client onto both of them. In our testing, we found that if the dhcpdmanager
> > on either DHCP server died but the system was still pingable, then the
> > NetReg pages and routines would just hang. It looks like this is related to
> > the lack of a timeout in the DhcpdManager->new call. Has anyone seen this?
> >
> > We built a temporary fix using alarms into Files_Remote.pm's
> > connect_dhcpdmanager subroutine, but I wanted to see if there was an
> > official patch/fix/upgrade or if someone could point out some installation
> > mishap on our part that might be causing this.
> >
> > Thanks!
> >
> > Eric Gauthier
> > Network Engineer
> > 617-353-8218 ~^~ elg@bu.edu
> > Boston University - Office of IT
> >
> > **********************************************************************
> > To unsubscribe from this list, send an e-mail message to
> > majordomo@southwestern.edu containing a single line with the words:
> > unsubscribe netreg
> > Send requests for assistance to: owner-netreg@southwestern.edu
> > **********************************************************************




This archive was generated by hypermail 2.1.4 : Thu Aug 12 2004 - 12:01:42 CDT