Hostname not found


Dan Langille
NOTE: this message is not about resolving the permissions error.  It is about [what appears to be] a DNS issue.

Please help me understand which hostname is not being resolved in these messages.  My checks with dig to verify DNS have found no errors.

I have checked:

Client:                 crey-fd
Read Storage:           "bacula-sd-01-file" 
Write Storage:          "tape01" (From Job resource)

I've checked the Bacula configuration files, looked for the Address in each resource, and run 'dig +short' on the hostname.

Then, with the IP address from the previous dig, I do 'dig +short -x'.  I get the original hostname.

Ideas please?



First email:
###
10-Feb 15:17 bacula-dir JobId 256313: Warning: FileSet MD5 digest not found.
10-Feb 15:17 bacula-dir JobId 256313: The following 1 JobId was chosen to be copied: 255998
10-Feb 15:17 bacula-dir JobId 256313: Copying using JobId=255998 Job=slocum_jail_snapshots.2017-02-05_03.04.00_46
10-Feb 15:17 bacula-dir JobId 256313: Bootstrap records written to /usr/local/bacula/working/bacula-dir.restore.27.bsr
10-Feb 15:17 bacula-dir JobId 256313: Start Copying JobId 256313, Job=CopyToTape-Full-LTO4.2017-02-10_15.17.52_17
10-Feb 15:17 bacula-dir JobId 256313: Using Device "vDrive-0" to read.
10-Feb 15:17 bacula-sd-01-sd JobId 256313: Ready to read from volume "FullAuto-1369" on file device "vDrive-0" (/usr/local/bacula/volumes).
10-Feb 15:17 bacula-sd-01-sd JobId 256313: Forward spacing Volume "FullAuto-1369" to file:block 0:214.
10-Feb 15:18 bacula-sd-01-sd JobId 256313: Error: bsock.c:453 Wrote 3814 bytes to client:Hostname not found:9103, but only 0 accepted.
10-Feb 15:18 bacula-sd-01-sd JobId 256313: Fatal error: read.c:284 Error sending to File daemon. ERR=Broken pipe
10-Feb 15:18 bacula-sd-01-sd JobId 256313: Elapsed time=00:00:08, Transfer rate=14.46 K Bytes/second
10-Feb 15:18 bacula-sd-01-sd JobId 256313: Error: bsock.c:375 Socket has errors=1 on call to client:Hostname not found:9103
10-Feb 15:18 bacula-sd-01-sd JobId 256313: Fatal error: fd_cmds.c:142 Read data not accepted
10-Feb 15:18 bacula-sd-01-sd JobId 256313: Error: bsock.c:375 Socket has errors=1 on call to client:Hostname not found:9103
10-Feb 15:18 bacula-dir JobId 256313: Error: Bacula bacula-dir 7.4.4 (28Sep16):
 Build OS:               amd64-portbld-freebsd11.0 freebsd 11.0-RELEASE-p6
 Prev Backup JobId:      255998
 Prev Backup Job:        slocum_jail_snapshots.2017-02-05_03.04.00_46
 New Backup JobId:       256314
 Current JobId:          256313
 Current Job:            CopyToTape-Full-LTO4.2017-02-10_15.17.52_17
 Backup Level:           Full
 Client:                 crey-fd
 FileSet:                "EmptyCopyToTape" 2011-02-20 20:53:31
 Read Pool:              "FullFile" (From Job resource)
 Read Storage:           "bacula-sd-01-file" (From Pool resource)
 Write Pool:             "FullsLTO4" (From Job resource)
 Write Storage:          "tape01" (From Job resource)
 Catalog:                "MyCatalog" (From Client resource)
 Start time:             10-Feb-2017 15:17:55
 End time:               10-Feb-2017 15:18:05
 Elapsed time:           10 secs
 Priority:               430
 SD Files Written:       9
 SD Bytes Written:       115,699 (115.6 KB)
 Rate:                   11.6 KB/s
 Volume name(s):         
 Volume Session Id:      98
 Volume Session Time:    1486562541
 Last Volume Bytes:      0 (0 B)
 SD Errors:              3
 SD termination status:  Error
 Termination:            *** Copying Error ***
###

Second email, which corresponds to the above copy job:
###
10-Feb 15:17 bacula-dir JobId 256314: Using Device "LTO_0" to write.
10-Feb 15:17 tape01-sd JobId 256314: 3991 Bad autochanger "loaded? drive 0" command: ERR=Child exited with code 1.
Results=cannot open SCSI device '/dev/pass3' - Permission denied

10-Feb 15:17 tape01-sd JobId 256314: 3991 Bad autochanger "loaded? drive 0" command: ERR=Child exited with code 1.
Results=cannot open SCSI device '/dev/pass3' - Permission denied

10-Feb 15:18 tape01-sd JobId 256314: 3991 Bad autochanger "loaded? drive 0" command: ERR=Child exited with code 1.
Results=cannot open SCSI device '/dev/pass3' - Permission denied

10-Feb 15:18 tape01-sd JobId 256314: 3991 Bad autochanger "loaded? drive 0" command: ERR=Child exited with code 1.
Results=cannot open SCSI device '/dev/pass3' - Permission denied

10-Feb 15:18 tape01-sd JobId 256314: 3991 Bad autochanger "loaded? drive 0" command: ERR=Child exited with code 1.
Results=cannot open SCSI device '/dev/pass3' - Permission denied

10-Feb 15:18 tape01-sd JobId 256314: 3991 Bad autochanger "loaded? drive 0" command: ERR=Child exited with code 1.
Results=cannot open SCSI device '/dev/pass3' - Permission denied

10-Feb 15:18 tape01-sd JobId 256314: 3304 Issuing autochanger "load slot 10, drive 0" command for vol 000013L4.
10-Feb 15:18 tape01-sd JobId 256314: Fatal error: 3992 Bad autochanger "load slot 10, drive 0": ERR=Child exited with code 1.
Results=cannot open SCSI device '/dev/pass3' - Permission denied
###

-- 
Dan Langille - BSDCan / PGCon




------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
Bacula-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/bacula-users

Re: Hostname not found

Mike Fröhner
What does your client config look like for crey-fd? It must be something like:

Client {
   Name = crey-fd
   Address = <hostname>
   ...
}

If you specify a hostname for Address this hostname must be resolvable.
Did you check this?

Mike


Re: Hostname not found

Kern Sibbald
In reply to this post by Dan Langille
Hello Dan,

In your case, when the SD opens a socket, it looks up the hostname, and for some reason it did not find it. Are you 100% sure that the DNS resolver is working on your SD machine?  What do you have set for SDCallsClient?  Perhaps for some particular SDCallsClient setting, the SD does not have everything it needs to fill in the error message.
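
One way to test the resolver question on the SD machine itself is to ask the operating system's resolver rather than dig. dig queries DNS servers directly, whereas a daemon normally goes through the system resolver, which may also consult /etc/hosts and the name-service switch. The following is only a minimal sketch under that assumption; the function name `resolve_via_system` and the injectable `lookup` parameter are hypothetical and are not part of Bacula.

```python
import socket


def resolve_via_system(hostname, port, lookup=socket.getaddrinfo):
    """Return the sorted, de-duplicated addresses the system resolver gives.

    An empty list means the lookup failed on the machine running this code.
    `lookup` is injectable only so the function can be exercised offline.
    """
    try:
        results = lookup(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The resolver could not map the name to any address.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the address string.
    return sorted({sockaddr[0] for *_, sockaddr in results})


# Example call, to be run on the SD host (9102 is the default File daemon port):
#   resolve_via_system("crey.int.unixathome.org", 9102)
```

An empty result on the SD host, while dig succeeds on the same host, would point at the local resolver configuration rather than at the DNS zone.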

Best regards,
Kern


Re: Hostname not found

Dan Langille
In reply to this post by Mike Fröhner
On Feb 10, 2017, at 11:24 AM, Mike Fröhner <[hidden email]> wrote:

> What does your client config look like for crey-fd? It must be something like:
>
> Client {
>   Name = crey-fd
>   Address = <hostname>
>   ...
> }
>
> If you specify a hostname for Address, this hostname must be resolvable.

Yes:

Client {
  Name           = crey-fd
  Address        = crey.int.unixathome.org


> Did you check this?

Yes, that works:

$ dig +short crey.int.unixathome.org
10.55.0.10
$ dig +short -x 10.55.0.10
crey.int.unixathome.org.
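
The same forward and reverse check can be repeated through the system resolver instead of dig, which is useful because dig talks to DNS servers directly and so skips /etc/hosts and the name-service switch. This is a hedged sketch, not a Bacula tool: `round_trip` and its `forward`/`reverse` parameters are hypothetical names, and it only covers IPv4.

```python
import socket


def round_trip(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve hostname to an IPv4 address, then map that address back to a name.

    Returns (address, reverse_name); a value is None if that step failed.
    `forward` and `reverse` are injectable only so the logic can be tested offline.
    """
    try:
        address = forward(hostname)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return None, None
    try:
        # gethostbyaddr returns (primary_name, alias_list, address_list).
        reverse_name = reverse(address)[0]
    except OSError:
        return address, None
    return address, reverse_name


# Example call, to be run on each host involved in the job:
#   round_trip("crey.int.unixathome.org")
```

Running it on the Director host and on each Storage daemon host shows whether every machine sees the same answers that dig reports.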



Re: Hostname not found

Mike Fröhner
If bacula-dir and bacula-sd run on different hosts: did you also verify that
this works on the bacula-sd host, and not just on the bacula-dir host?

On 02/10/2017 06:14 PM, Dan Langille wrote:

>> On Feb 10, 2017, at 11:24 AM, Mike Fröhner <[hidden email]
>> <mailto:[hidden email]>> wrote:
>>
>> How do you client config look like for crey-fd? It must be something like:
>>
>> Client {
>>   Name = crey-fd
>>   Address = <hostname>
>>   ...
>> }
>>
>> If you specify a hostname for Address this hostname must be resolvable.
>
> Yes:
>
> Client {
>   Name           = crey-fd
>   Address        = crey.int.unixathome.org <http://crey.int.unixathome.org>
>
>
>> Did you check this?
>
> Yes, that works:
>
> $ dig +short crey.int.unixathome.org <http://crey.int.unixathome.org>
> 10.55.0.10
> $ dig +short -x 10.55.0.10
> crey.int.unixathome.org <http://crey.int.unixathome.org>.
> $
>
>>
>> Mike
>>
>> On 02/10/2017 05:02 PM, Dan Langille wrote:
>>> NOTE: this message is not about resolving the permissions error.  It is
>>> about [what appears to be] a DNS issue.
>>>
>>> Please help me understand which hostname is not being resolved in these
>>> messages.  My checks with dig to verify DNS have found no errors.
>>>
>>> I have checked:
>>>
>>> Client:                 crey-fd
>>> Read Storage:           "bacula-sd-01-file"
>>> Write Storage:          "tape01" (From Job resource)
>>>
>>> I've checked the bacula configuration files, looked for the Address, and
>>> run 'dig +short' on the hostname
>>>
>>> Then, with the IP address from the previous dig, I do 'dig +short -x'.
>>> I get the original hostname.
>>>
>>> Ideas please?
>>>
>>>
>>>
>>> First email:
>>> ###
>>> 10-Feb 15:17 bacula-dir JobId 256313: Warning: FileSet MD5 digest not
>>> found.
>>> 10-Feb 15:17 bacula-dir JobId 256313: The following 1 JobId was chosen
>>> to be copied: 255998
>>> 10-Feb 15:17 bacula-dir JobId 256313: Copying using JobId=255998
>>> Job=slocum_jail_snapshots.2017-02-05_03.04.00_46
>>> 10-Feb 15:17 bacula-dir JobId 256313: Bootstrap records written to
>>> /usr/local/bacula/working/bacula-dir.restore.27.bsr
>>> 10-Feb 15:17 bacula-dir JobId 256313: Start Copying JobId 256313,
>>> Job=CopyToTape-Full-LTO4.2017-02-10_15.17.52_17
>>> 10-Feb 15:17 bacula-dir JobId 256313: Using Device "vDrive-0" to read.
>>> 10-Feb 15:17 bacula-sd-01-sd JobId 256313: Ready to read from volume
>>> "FullAuto-1369" on file device "vDrive-0" (/usr/local/bacula/volumes).
>>> 10-Feb 15:17 bacula-sd-01-sd JobId 256313: Forward spacing Volume
>>> "FullAuto-1369" to file:block 0:214.
>>> 10-Feb 15:18 bacula-sd-01-sd JobId 256313: Error: bsock.c:453 Wrote 3814
>>> bytes to client:Hostname not found:9103, but only 0 accepted.
>>> 10-Feb 15:18 bacula-sd-01-sd JobId 256313: Fatal error: read.c:284 Error
>>> sending to File daemon. ERR=Broken pipe
>>> 10-Feb 15:18 bacula-sd-01-sd JobId 256313: Elapsed time=00:00:08,
>>> Transfer rate=14.46 K Bytes/second
>>> 10-Feb 15:18 bacula-sd-01-sd JobId 256313: Error: bsock.c:375 Socket has
>>> errors=1 on call to client:Hostname not found:9103
>>> 10-Feb 15:18 bacula-sd-01-sd JobId 256313: Fatal error: fd_cmds.c:142
>>> Read data not accepted
>>> 10-Feb 15:18 bacula-sd-01-sd JobId 256313: Error: bsock.c:375 Socket has
>>> errors=1 on call to client:Hostname not found:9103
>>> 10-Feb 15:18 bacula-dir JobId 256313: Error: Bacula bacula-dir 7.4.4
>>> (28Sep16):
>>> Build OS:               amd64-portbld-freebsd11.0 freebsd 11.0-RELEASE-p6
>>> Prev Backup JobId:      255998
>>> Prev Backup Job:        slocum_jail_snapshots.2017-02-05_03.04.00_46
>>> New Backup JobId:       256314
>>> Current JobId:          256313
>>> Current Job:            CopyToTape-Full-LTO4.2017-02-10_15.17.52_17
>>> Backup Level:           Full
>>> Client:                 crey-fd
>>> FileSet:                "EmptyCopyToTape" 2011-02-20 20:53:31
>>> Read Pool:              "FullFile" (From Job resource)
>>> Read Storage:           "bacula-sd-01-file" (From Pool resource)
>>> Write Pool:             "FullsLTO4" (From Job resource)
>>> Write Storage:          "tape01" (From Job resource)
>>> Catalog:                "MyCatalog" (From Client resource)
>>> Start time:             10-Feb-2017 15:17:55
>>> End time:               10-Feb-2017 15:18:05
>>> Elapsed time:           10 secs
>>> Priority:               430
>>> SD Files Written:       9
>>> SD Bytes Written:       115,699 (115.6 KB)
>>> Rate:                   11.6 KB/s
>>> Volume name(s):
>>> Volume Session Id:      98
>>> Volume Session Time:    1486562541
>>> Last Volume Bytes:      0 (0 B)
>>> SD Errors:              3
>>> SD termination status:  Error
>>> Termination:            *** Copying Error ***
>>> ###
>>>
>>> Second email, which corresponds to the above copy job:
>>> ###
>>> 10-Feb 15:17 bacula-dir JobId 256314: Using Device "LTO_0" to write.
>>> 10-Feb 15:17 tape01-sd JobId 256314: 3991 Bad autochanger "loaded? drive
>>> 0" command: ERR=Child exited with code 1.
>>> Results=cannot open SCSI device '/dev/pass3' - Permission denied
>>>
>>> 10-Feb 15:17 tape01-sd JobId 256314: 3991 Bad autochanger "loaded? drive
>>> 0" command: ERR=Child exited with code 1.
>>> Results=cannot open SCSI device '/dev/pass3' - Permission denied
>>>
>>> 10-Feb 15:18 tape01-sd JobId 256314: 3991 Bad autochanger "loaded? drive
>>> 0" command: ERR=Child exited with code 1.
>>> Results=cannot open SCSI device '/dev/pass3' - Permission denied
>>>
>>> 10-Feb 15:18 tape01-sd JobId 256314: 3991 Bad autochanger "loaded? drive
>>> 0" command: ERR=Child exited with code 1.
>>> Results=cannot open SCSI device '/dev/pass3' - Permission denied
>>>
>>> 10-Feb 15:18 tape01-sd JobId 256314: 3991 Bad autochanger "loaded? drive
>>> 0" command: ERR=Child exited with code 1.
>>> Results=cannot open SCSI device '/dev/pass3' - Permission denied
>>>
>>> 10-Feb 15:18 tape01-sd JobId 256314: 3991 Bad autochanger "loaded? drive
>>> 0" command: ERR=Child exited with code 1.
>>> Results=cannot open SCSI device '/dev/pass3' - Permission denied
>>>
>>> 10-Feb 15:18 tape01-sd JobId 256314: 3304 Issuing autochanger "load slot
>>> 10, drive 0" command for vol 000013L4.
>>> 10-Feb 15:18 tape01-sd JobId 256314: Fatal error: 3992 Bad autochanger
>>> "load slot 10, drive 0": ERR=Child exited with code 1.
>>> Results=cannot open SCSI device '/dev/pass3' - Permission denied
>>> ###
>>>
>>> --
>>> Dan Langille - BSDCan / PGCon
>>> [hidden email] <mailto:[hidden email]> <mailto:[hidden email]>

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
Bacula-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/bacula-users

Re: Hostname not found

Dan Langille
> On 02/10/2017 06:14 PM, Dan Langille wrote:
>>> On Feb 10, 2017, at 11:24 AM, Mike Fröhner <[hidden email]> wrote:
>>>
>>> What does your client config look like for crey-fd? It must be something like:
>>>
>>> Client {
>>>  Name = crey-fd
>>>  Address = <hostname>
>>>  ...
>>> }
>>>
>>> If you specify a hostname for Address this hostname must be resolvable.
>>
>> Yes:
>>
>> Client {
>>  Name           = crey-fd
>>  Address        = crey.int.unixathome.org
>>
>>
>>> Did you check this?
>>
>> Yes, that works:
>>
>> $ dig +short crey.int.unixathome.org
>> 10.55.0.10
>> $ dig +short -x 10.55.0.10
>> crey.int.unixathome.org.
>> $

> On Feb 10, 2017, at 12:18 PM, Mike Fröhner <[hidden email]> wrote:
>
> If bacula-dir and -sd run on different hosts: did you also verify that this
> works on the bacula-sd host and not just on the bacula-dir host?

They are on different hosts.  The checks were run on four hosts:

- bacula-dir host
- bacula-sd hosts (both read and write)
- a fourth host unrelated to the backup

--
Dan Langille - BSDCan / PGCon
[hidden email]
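
A minimal sketch of the first step of the check discussed above (find every Address directive so that each one can be resolved), assuming the configuration is plain text with one directive per line. The helper name `addresses_in_config` and the `sd.example.org` entry are hypothetical and used only for illustration; they are not part of Bacula.

```python
import re

# Matches lines such as "  Address = crey.int.unixathome.org".
# Directives like SDAddress or DirAddress, and commented-out lines, are skipped.
ADDRESS_RE = re.compile(
    r'^[ \t]*Address[ \t]*=[ \t]*"?([^"\s#;]+)"?',
    re.IGNORECASE | re.MULTILINE,
)


def addresses_in_config(config_text):
    """Return the value of every Address directive, in order of appearance."""
    return ADDRESS_RE.findall(config_text)


# Illustrative sample only: the Storage entry is made up for this sketch.
sample = """
Client {
  Name    = crey-fd
  Address = crey.int.unixathome.org
}
Storage {
  Name    = tape01
  # Address = old.example.org
  Address = "sd.example.org"
}
"""

for name in addresses_in_config(sample):
    print(name)
```

Each name it lists can then be checked with 'dig +short' and 'dig +short -x' on every host involved, as described in this message.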





Re: Hostname not found

dweimer
In reply to this post by Dan Langille

On 2017-02-10 10:02 am, Dan Langille wrote:

> NOTE: this message is not about resolving the permissions error.  It is about [what appears to be] a DNS issue.
>
> Please help me understand which hostname is not being resolved in these messages.  My checks with dig to verify DNS have found no errors.
>
> I have checked:
>
> Client:                 crey-fd
> Read Storage:           "bacula-sd-01-file"
> Write Storage:          "tape01" (From Job resource)
>
> I've checked the bacula configuration files, looked for the Address, and run 'dig +short' on the hostname
>
> Then, with the IP address from the previous dig, I do 'dig +short -x'.  I get the original hostname.
>
> Ideas please?
 
 
Dan, unfortunately this isn't a solution to your problem, but I can say you're not alone: I have had periodic issues with Bacula failing to resolve host names on FreeBSD. I just gave up and went to IP addresses. When it occurs, it has only been the Bacula process. I have tried testing every which way I can from the OS and always received successful resolution of names. I have had it happen after running fine for several days: then, without a reboot, service restart, or anything else, it just starts failing to resolve. Rebooting the server would sometimes fix it, but sometimes it wouldn't; I even removed Bacula and reinstalled it once without fixing the problem.

I just upgraded my home Bacula server to 7.4.5 this morning and, after seeing this, tried switching one of my clients over to a name instead of an IP; it did resolve OK and I could check the status of that client. I then checked with my storage daemon switched to a name as well. So it doesn't seem to be the latest update that broke your system; perhaps you have found the same bug I have periodically run into but have not yet found a solution to.
 
--
Thanks,
   Dean E. Weimer
   http://www.dweimer.net/


Re: Hostname not found

Dan Langille

On the server in question, are you using unbound?  I've found that the SD in question is using unbound, running FreeBSD 11, and there is a DNS issue on that host.

-- 
Dan Langille - BSDCan / PGCon






Re: Hostname not found

dweimer

I have had it occur on two separate servers, my home server and one at work. Both installations are fairly small, only backing up my other FreeBSD servers at work since they aren't supported by our main Tivoli installation. I can't say for sure when it first occurred. It has been happening at least since October 2015: that is the oldest last-modified date on my client configuration files, and that file has a commented-out Address= line with a hostname, followed by a replacement line with an IP. I can't remember which version of FreeBSD I was running at that time, or when the switch to Unbound occurred in base.

We have two larger installations at work, running in remote facilities on Linux, that haven't had the issue. However, they are also running older versions of Bacula, as their Linux distributions don't have the latest in their package management systems. They were installed on Linux to hopefully simplify a possible upgrade to Bacula Enterprise in the future. However, management has turned down the budget item every year, and my personal (likely biased) opinion that FreeBSD is better keeps making me want to switch them over.

I will switch my home system over to names to see if I can get it to occur again and possibly find something to help troubleshoot it, but it may take a while to occur.
 
 
--
Thanks,
   Dean E. Weimer
   http://www.dweimer.net/


Re: Hostname not found

Kern Sibbald
In reply to this post by Dan Langille
Hello,

I suspect that this is a problem with the FreeBSD networking implementation.  If I remember right, on FreeBSD, when doing name lookups, if the packet size is not *exactly* what FreeBSD wants, it fails the call.  On Linux and other machines (Solaris, Mac), as long as the packet size is equal to or greater than what is needed, the OS call succeeds.  If I am not mistaken, Bacula allocates space for the larger of IPv4 and IPv6 (which is always IPv6), and so if you are using an IPv4 network, Bacula may send OS calls with a packet size larger than actually required.

If this is the case, I would consider it a FreeBSD bug.  For me to fix it is a bit complicated, because I need to know exactly what call is failing and the values that FreeBSD wants.  By the way, it is possible this is already fixed in the Enterprise version where FreeBSD is supported too.  If that is the case, in my next round of backporting to start next week, it will get fixed.

Best regards,
Kern
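
To illustrate the size difference mentioned above, here is a small sketch. It assumes that "packet size" refers to the length of the socket address structure handed to the OS; that is an interpretation, since the message does not name the failing call. The two classes are hand-written mirrors of the BSD-style `sockaddr_in` and `sockaddr_in6` layouts, not Bacula code, and the sketch neither confirms nor refutes the hypothesis.

```python
import ctypes


class SockaddrIn(ctypes.Structure):
    """Hand-written mirror of the BSD-style IPv4 socket address layout."""
    _fields_ = [
        ("sin_len", ctypes.c_uint8),
        ("sin_family", ctypes.c_uint8),
        ("sin_port", ctypes.c_uint16),
        ("sin_addr", ctypes.c_uint8 * 4),
        ("sin_zero", ctypes.c_uint8 * 8),
    ]


class SockaddrIn6(ctypes.Structure):
    """Hand-written mirror of the BSD-style IPv6 socket address layout."""
    _fields_ = [
        ("sin6_len", ctypes.c_uint8),
        ("sin6_family", ctypes.c_uint8),
        ("sin6_port", ctypes.c_uint16),
        ("sin6_flowinfo", ctypes.c_uint32),
        ("sin6_addr", ctypes.c_uint8 * 16),
        ("sin6_scope_id", ctypes.c_uint32),
    ]


ipv4_len = ctypes.sizeof(SockaddrIn)
ipv6_len = ctypes.sizeof(SockaddrIn6)

# A buffer sized for "the larger of IPv4 and IPv6" is always the IPv6 size.
buffer_len = max(ipv4_len, ipv6_len)

print("IPv4 address structure:", ipv4_len, "bytes")
print("IPv6 address structure:", ipv6_len, "bytes")
print("extra bytes if the buffer size is passed for an IPv4 address:",
      buffer_len - ipv4_len)
```

Under that reading, a caller that always passes the buffer size, rather than the length matching the address family actually in use, would hand over a larger length than an IPv4 address needs.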


Re: Hostname not found

Dan Langille
In reply to this post by dweimer

I asked about unbound because I mistakenly thought I had discovered a problem with the WRITE SD after reading Kern's email.  I was wrong. It was a typo in the hostname on my dig query.

I run bind internally here. The hosts are running FreeBSD 10 or 11. Every Address directive contains a FQDN.  I am not using IP addresses anywhere in the Bacula configuration.

I have not upgraded to 7.4.5 yet; all the servers are on 7.4.4 and so are most of the clients.  I think your ADDRESS problem is not related to Bacula itself... because it's working here.

-- 
Dan Langille - BSDCan / PGCon






Re: Hostname not found

Dan Langille
In reply to this post by Kern Sibbald
On Feb 10, 2017, at 12:20 PM, Kern Sibbald <[hidden email]> wrote:

> Hello,
>
> I suspect that this is a problem with the FreeBSD networking implementation.  If I remember right on FreeBSD, when doing name lookups, if the packet size is not *exactly* what FreeBSD wants, it fails the call.  On Linux and other machines (Solaris, Mac), as long as the packet size is equal or greater than what is needed the OS call succeeds.  If I am not mistaken, Bacula allocates space for the larger of IPv4 and IPv6 (which is always IPv6), and so if you are using an IPv4 network, Bacula may send OS calls with a packet size larger than actually required.

I just spoke with a FreeBSD developer.  They are unaware of anything special in the FreeBSD ports tree for patching FreeBSD when it comes to doing name lookups.  Specifically, gethostby*(), getipnodeby*() just work...

If you can reproduce/encounter a situation which fails, we will look at it and fix it.  In short, I do not think this is an issue with the FreeBSD networking implementation.

I suspect it's a local DNS misconfiguration on one of my hosts.  Which one, I don't know yet.  Your first post mentioned the SDs, so I checked them. They seem OK now.  I will verify them again if I see the problem again.

For testing purposes, I will use dig +short and verify that all of these FQDNs resolve on the FD, the SD, the director, and from a fourth host:

The FQDN of the FD (even though it is not used in a Copy job)
The FQDN of the read SD
The FQDN of the write SD
The FQDN of the Director

I will also check that the PTR record for the A record also resolves back to the FQDN.

> If this is the case, I would consider it a FreeBSD bug.  For me to fix it is a bit complicated, because I need to know exactly what call is failing and the values that FreeBSD wants.  By the way, it is possible this is already fixed in the Enterprise version where FreeBSD is supported too.  If that is the case, in my next round of backporting to start next week, it will get fixed.

If this was the case, I'd expect apps on FreeBSD to be failing everywhere.  They aren't.  I've never patched anything for DNS issues either.

I suspect it's more likely to be an issue on one of the Bacula nodes in question (SD or FD) where there is a local DNS issue.

-- 
Dan Langille - BSDCan / PGCon
[hidden email]
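
The verification plan in this message (resolve each FQDN, then confirm that the PTR record maps back to the same name) can be sketched as follows. This is an illustration under stated assumptions, not Bacula code: the function names are hypothetical, and the defaults use the operating system's resolver through Python's `socket` module rather than dig. The example at the end uses stand-in resolvers built from the name and address quoted earlier in the thread, so it runs without network access.

```python
import socket


def system_forward(name):
    """Resolve a name to IPv4 addresses through the system resolver."""
    infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def system_reverse(ip):
    """Return the primary host name for an address via the system resolver."""
    return socket.gethostbyaddr(ip)[0]


def round_trip(fqdn, forward=system_forward, reverse=system_reverse):
    """Return (ok, detail); ok is True when an address of fqdn maps back to fqdn."""
    try:
        addresses = forward(fqdn)
    except OSError as exc:  # socket.gaierror and socket.herror derive from OSError
        return False, f"forward lookup failed: {exc}"
    wanted = fqdn.rstrip(".").lower()
    for ip in addresses:
        try:
            name = reverse(ip)
        except OSError:
            continue  # no usable PTR for this address; try the next one
        if name.rstrip(".").lower() == wanted:
            return True, f"{fqdn} -> {ip} -> {name}"
    return False, f"no address of {fqdn} resolves back to it: {addresses}"


# Stand-in resolvers using the values quoted earlier in the thread.
fake_a = {"crey.int.unixathome.org": ["10.55.0.10"]}
fake_ptr = {"10.55.0.10": "crey.int.unixathome.org."}

ok, detail = round_trip(
    "crey.int.unixathome.org",
    forward=fake_a.__getitem__,
    reverse=fake_ptr.__getitem__,
)
print(ok, detail)
```

Calling `round_trip(name)` with the default resolvers on the Director, both SDs, and the FD would go through the C library's lookup path, which also consults /etc/hosts and nsswitch.conf, whereas dig queries the DNS server directly. A difference between the two results on one host would point at that host's local resolver configuration.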







Re: Hostname not found

Dimitri Maziuk
On 02/10/2017 11:55 AM, Dan Langille wrote:

> I suspect it's more likely to be an issue on one of the Bacula nodes
> in question (SD or FD) where there is a local DNS issue.

I'm sure you checked already, but is there ns caching going on? It used
to be a real pita on irix and is back again on osx. Once it gets into
its little brain that host doesn't resolve, you get all sorts of fun
behaviours with ssh failing (irix) while nslookup says all's peachy, or
safari not connecting while ping works in terminal.

--
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



Re: Hostname not found

Dan Langille
In reply to this post by Dimitri Maziuk
> On Feb 10, 2017, at 1:16 PM, Dimitri Maziuk <[hidden email]> wrote:
>
> On 02/10/2017 11:55 AM, Dan Langille wrote:
>
>> I suspect it's more likely to be an issue on one of the Bacula nodes
>> in question (SD or FD) where there is a local DNS issue.
>
> I'm sure you checked already, but is there ns caching going on? It used
> to be a real pita on irix and is back again on osx. Once it gets into
> its little brain that host doesn't resolve, you get all sorts of fun
> behaviours with ssh failing (irix) while nslookup says all's peachy, or
> safari not connecting while ping works in terminal.


For what it's worth, none of these hosts are running OSX.  They get rebooted regularly (mostly for upgrades).

I'll keep that in mind if I see the issue again.

Thank you

--
Dan Langille - BSDCan / PGCon
[hidden email]




Re: Hostname not found

webmaster
In reply to this post by Dan Langille

Hi Dan,

I would recommend starting both bacula-sd and bacula-dir in debug mode and taking a close look at the console output. You could begin with debug mode on the SD only.
Use debug level 400 (-d 400). It has helped me a lot since I learned about this feature.
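
As a rough sketch of what that could look like for the SD: the wrapper name run_debug and the config path in the usage line are assumptions for illustration, not something from this thread. -f, -d and -c are the daemon's foreground, debug-level and config-file options.

```shell
# Run a Bacula daemon in the foreground with debug output on the console.
# usage: run_debug DAEMON CONFIG [LEVEL]
# run_debug is a hypothetical helper name; LEVEL defaults to 400.
run_debug() {
    daemon=$1
    config=$2
    level=${3:-400}
    # -f: stay in the foreground, -d: debug level, -c: configuration file
    "$daemon" -f -d "$level" -c "$config"
}
```

For example, after stopping the regular service: run_debug bacula-sd /usr/local/etc/bacula/bacula-sd.conf (adjust the path to wherever your bacula-sd.conf lives).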

Cheers,
-fuz

On 10 February 2017 at 18:55, Dan Langille <[hidden email]> wrote:

> On Feb 10, 2017, at 12:20 PM, Kern Sibbald <[hidden email]> wrote:
>
>> Hello,
>>
>> I suspect that this is a problem with the FreeBSD networking implementation.  If I remember right on FreeBSD, when doing name lookups, if the packet size is not *exactly* what FreeBSD wants, it fails the call.  On Linux and other machines (Solaris, Mac), as long as the packet size is equal or greater than what is needed the OS call succeeds.  If I am not mistaken, Bacula allocates space for the larger of IPv4 and IPv6 (which is always IPv6), and so if you are using an IPv4 network, Bacula may send OS calls with a packet size larger than actually required.
>
> I just spoke with a FreeBSD developer.  They are unaware of anything special in the FreeBSD ports tree for patching FreeBSD when it comes to doing name lookups.  Specifically, gethostby*(), getipnodeby*() just work...
>
> If you can reproduce/encounter a situation which fails, we will look at it and fix it.  In short, I do not think this is an issue with the FreeBSD networking implementation.
>
> I suspect it's a local DNS misconfiguration on one of my hosts.  Which ones, I don't know yet.  Your first post mentioned the SDs, so I checked them. They seem OK now.  I will verify them again if I see the error again.
>
> For testing purposes, I will use dig +short and verify that all of these FQDNs resolve on the FD, the SD, the director, and from a fourth host:
>
> The FQDN of the FD (even though it is not used in a Copy job)
> The FQDN of the read SD
> The FQDN of the write SD
> The FQDN of the Director
>
> I will also check that the PTR record for each A record resolves back to the FQDN.
>
>> If this is the case, I would consider it a FreeBSD bug.  For me to fix it is a bit complicated, because I need to know exactly what call is failing and the values that FreeBSD wants.  By the way, it is possible this is already fixed in the Enterprise version where FreeBSD is supported too.  If that is the case, in my next round of backporting to start next week, it will get fixed.
>
> If this was the case, I'd expect apps on FreeBSD to be failing everywhere.  They aren't.  I've never patched anything for DNS issues either.
>
> I suspect it's more likely to be an issue on one of the Bacula nodes in question (SD or FD) where there is a local DNS issue.
>
> --
> Dan Langille - BSDCan / PGCon
> [hidden email]






 


Re: Hostname not found

Dan Langille
In reply to this post by Dan Langille
For documentation: I found that one of my three name servers was not correctly resolving PTR records.  This query returns nothing:

$ dig +short -x 10.52.0.1 @10.55.0.1

Whereas the other two nameservers answer correctly:

$ dig +short -x 10.52.0.1 @10.55.0.13
bast.int.unixathome.org.

$ dig +short -x 10.52.0.1 @10.55.0.73
bast.int.unixathome.org.

That issue has since been fixed.  It took a while to track down.  That nameserver runs on pfSense and is configured through its GUI.  The cause is spelled out right there on the page:

Note: IN-ADDR.ARPA will be automaticaly included in config files when reverse zone option is checked.

The checkbox is four controls farther down the page.

Because the suffix is appended automatically, the expected zone 0.55.10.in-addr.arpa had actually been created as 0.55.10.in-addr.arpa.in-addr.arpa

I'll keep an eye on the Bacula logs to see if this shows up again.
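
For anyone wanting to repeat the per-nameserver check above, here is a minimal sketch. The function name check_ptr is made up for illustration, and it assumes dig is installed. It uses only the same dig +short -x ADDRESS @SERVER form shown above, and treats an empty answer or a dig diagnostic line (starting with ";;", such as a timeout) as a failure.

```shell
# Ask each nameserver for the PTR record of one address and report
# any server that gives no usable answer.
# usage: check_ptr ADDRESS NAMESERVER...
# check_ptr is a hypothetical helper name.
check_ptr() {
    addr=$1
    shift
    failed=0
    for ns in "$@"; do
        answer=$(dig +short -x "$addr" "@$ns")
        case $answer in
            # empty: no PTR record; ";;..." is a dig diagnostic (e.g. timeout)
            ''|';;'*)
                echo "FAIL $ns: no PTR record for $addr"
                failed=1
                ;;
            *)
                echo "ok   $ns: $answer"
                ;;
        esac
    done
    # non-zero exit status if any server failed
    return "$failed"
}
```

With the three servers from this message it would be called as: check_ptr 10.52.0.1 10.55.0.1 10.55.0.13 10.55.0.73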

Thank you.

-- 
Dan Langille - BSDCan / PGCon




Re: Hostname not found

Caribe Schreiber
In reply to this post by Dimitri Maziuk
It can also help to use the host <my.host.com> command instead of nslookup or dig.  It'll go through all the resolv.conf resolvers first instead of just going straight to the DNS server for an answer.  I've gotten burned by using dig instead of the host command a few times and it can be a real bummer.

~Caribe

On 02/10/2017 12:16 PM, Dimitri Maziuk wrote:
> On 02/10/2017 11:55 AM, Dan Langille wrote:
>
>> I suspect it's more likely to be an issue on one of the Bacula nodes
>> in question (SD or FD) where there is a local DNS issue.
>
> I'm sure you checked already, but is there ns caching going on? It used
> to be a real pita on irix and is back again on osx. Once it gets into
> its little brain that host doesn't resolve, you get all sorts of fun
> behaviours with ssh failing (irix) while nslookup says all's peachy, or
> safari not connecting while ping works in terminal.




--
Caribe Schreiber
Auction Harmony
Phone: (612) 605-7301 x 105
Fax: (612) 605-7302
www.AuctionHarmony.com

Re: Hostname not found

Josip Deanovic
On Friday 2017-02-10 15:12:57 Caribe Schreiber wrote:
> It can also help to use the host <my.host.com> command instead of
> nslookup or dig.  It'll go through all the resolv.conf resolvers first
> instead of just going straight to the DNS server for an answer.  I've
> gotten burned by using dig instead of the host command a few times and
> it can be a real bummer.

Caribe, nslookup and dig will also go through the list of name
servers specified in the /etc/resolv.conf file.

For the dig tool it is even specifically stated in its manual page.

And of course, none of them will read the /etc/hosts file.
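
To see the difference in practice, a minimal sketch: getent hosts goes through the name service switch (the sources listed in nsswitch.conf, typically /etc/hosts and then DNS), which is closer to what an application's resolver call sees, while dig asks DNS only. The function name compare_lookup is made up for illustration, and the comparison assumes an IPv4-only setup where getent returns an IPv4 address.

```shell
# Compare the system resolver's answer with a DNS-only answer for one name.
# usage: compare_lookup HOSTNAME
# compare_lookup is a hypothetical helper name.
compare_lookup() {
    name=$1
    # getent consults the sources in nsswitch.conf, so /etc/hosts entries count
    nss=$(getent hosts "$name" | awk '{print $1; exit}')
    # dig queries DNS only; keep the first line that looks like an IPv4 address
    dns=$(dig +short "$name" A | awk '/^[0-9.]+$/ {print; exit}')
    echo "nss=${nss:-none} dns=${dns:-none}"
    # exit status 0 only when both paths give the same answer
    [ "$nss" = "$dns" ]
}
```

A mismatch between the two values points at /etc/hosts or another non-DNS source overriding, or filling in for, what the name servers return.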

--
Josip Deanovic


Re: Hostname not found

Kern Sibbald
In reply to this post by Dan Langille
Hello Dan,

Well, I am happy to know that it is not Bacula related :-)

Kern




