[bug?] bacula-fd: strippath vs. accurate (checksums)


Markus Jung
Hi devs,

I think bacula-fd has a slight problem with the combination of strippath
and accurate backups using checksums.

Short description:
bacula-fd tries to open files using the stripped path instead of the
location where the file really resides. This most likely happens while
performing the accurate backup checks.

OS: Ubuntu 15.10
bacula-fd: 5.2.6 (stock Ubuntu package)
(I think the problem is present in recent Bacula versions, too)

Symptom: log messages like
> Error:      Cannot open /etc/lvm/archive/vg_hdd_00039-1823321258.vg: ERR=Datei oder Verzeichnis nicht gefunden.
(No such file or directory)

while running a backup on a snapshot volume. The file existed within
the snapshot, but not outside it. The backup itself runs on something like
> /mnt/snapshot/etc/...

strippath = 2 removes the two leading components of the snapshot prefix
(/mnt/snapshot).
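
To make the effect of strippath concrete, here is a minimal Python sketch of dropping leading path components. It is not Bacula's strip_path() implementation; the function name strip_components and the choice to leave too-short paths unchanged are assumptions made for illustration. The sample path is taken from the debug output in this report.

```python
def strip_components(path, count):
    """Drop the first `count` components of an absolute POSIX path.

    Hypothetical helper for illustration only, not Bacula code.
    Assumption: a path with too few components is returned unchanged.
    """
    parts = path.split("/")  # parts[0] is "" for an absolute path
    if count <= 0 or len(parts) - 1 <= count:
        return path
    return "/" + "/".join(parts[count + 1:])


real = ("/mnt/snapshot/mnt/vg_hdd/static/markus/Bilder/Fotos/2014/"
        "0131 Wald/NEF/DSC_0007.NEF")
stripped = strip_components(real, 2)
print(stripped)
```

The stripped name is what should be stored in the catalog, while the unstripped name is the only one guaranteed to exist on disk during the backup.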

And in fact, while bacula-fd is walking the snapshot directory tree, it
opens the original files and not the ones within the snapshot:
> lsof -p $(pidof bacula-fd) | tail
> bacula-fd 5778 root   10r   DIR 252,21     4096 3670018 /mnt/snapshot/mnt/vg_hdd/static/markus/Bilder/Fotos/2014
> bacula-fd 5778 root   11r   DIR 252,21     4096 3670311 /mnt/snapshot/mnt/vg_hdd/static/markus/Bilder/Fotos/2014/0131 Wald
> bacula-fd 5778 root   12r   DIR 252,21     4096 3670313 /mnt/snapshot/mnt/vg_hdd/static/markus/Bilder/Fotos/2014/0131 Wald/NEF
> bacula-fd 5778 root   13r   REG  252,7 14530555 3670387 /mnt/vg_hdd/static/markus/Bilder/Fotos/2014/0131 Wald/NEF/DSC_0007.NEF

The last line should point into /mnt/snapshot like the directories above
it, not at the original file.

I suppose the problem is related to src/filed/accurate.c. Below is an
excerpt from bacula-fd running at debug level 600:

> markusnb-fd: find.c:291-0 enter accept_file: fname=/mnt/snapshot/mnt/vg_hdd/static/markus/Bilder/Fotos/2014/0131 Wald/NEF/DSC_0026.NEF.xmp
> markusnb-fd: find_one.c:374-0 File ----: /mnt/snapshot/mnt/vg_hdd/static/markus/Bilder/Fotos/2014/0131 Wald/NEF/DSC_0007.NEF
> markusnb-fd: backup.c:1453-0 stripped=2 count=2 numsep=12 sep>count=1
> markusnb-fd: backup.c:1503-0 fname=/mnt/snapshot/mnt/vg_hdd/static/markus/Bilder/Fotos/2014/0131 Wald/NEF/DSC_0007.NEF stripped=/mnt/vg_hdd/static/markus/Bilder/Fotos/2014/0131 Wald/NEF/DSC_0007.NEF link=/mnt/vg_hdd/static/markus/Bilder/Fotos/2014/0131 Wald/NEF/DSC_0007.NEF
> markusnb-fd: htable.c:132-0 Leave hash_index hash=0x9e23d3a7a5e4fc78 index=61248
> markusnb-fd: htable.c:404-0 lookup return 7f5c1b3d53d8
> markusnb-fd: accurate.c:82-0 lookup </mnt/vg_hdd/static/markus/Bilder/Fotos/2014/0131 Wald/NEF/DSC_0007.NEF> ok
> markusnb-fd: crypto.c:607-0 crypto_digest_new jcr=7f5c1c0008e8
> markusnb-fd: verify.c:297-0 === digest_file
> markusnb-fd: bfile.c:963-0 open file /mnt/vg_hdd/static/markus/Bilder/Fotos/2014/0131 Wald/NEF/DSC_0007.NEF
> markusnb-fd: bfile.c:987-0 Open file 13
> markusnb-fd: verify.c:354-0 === read_digest
> markusnb-fd: bfile.c:1028-0 Close file 13
> markusnb-fd: htable.c:132-0 Leave hash_index hash=0x9e23d3a7a5e4fc78 index=61248
> markusnb-fd: htable.c:404-0 lookup return 7f5c1b3d53d8
> markusnb-fd: find_one.c:436-0 Non-directory incremental: /mnt/snapshot/mnt/vg_hdd/static/markus/Bilder/Fotos/2014/0131 Wald/NEF/DSC_0007.NEF

I think accurate.c:accurate_check_file() does not handle the stripped
prefix properly. It calls strip_path() and then performs all computations
until the very end using the stripped path, including the checksum
computation, which calls bopen() on the stripped path, if I understood
everything correctly. unstrip_path() is called only just before leaving
the function.
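
The suspected ordering problem can be modelled outside of Bacula. The following Python sketch is a simulation under stated assumptions, not the code in accurate.c: the names check_suspected, check_proposed, and demo are hypothetical, a dict stands in for the accurate hash table, SHA-256 is an arbitrary digest choice, and a temporary directory plays the role of /mnt/snapshot on a POSIX system.

```python
import hashlib
import os
import shutil
import tempfile


def strip_components(path, count):
    """Drop the first `count` components of an absolute POSIX path."""
    parts = path.split("/")
    if count <= 0 or len(parts) - 1 <= count:
        return path
    return "/" + "/".join(parts[count + 1:])


def file_digest(path):
    """Digest of the file contents; raises FileNotFoundError if missing."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def check_suspected(real_path, count, catalog):
    """Suspected behaviour: lookup AND digest both use the stripped name."""
    name = strip_components(real_path, count)
    if name not in catalog:
        return "new"
    return "same" if file_digest(name) == catalog[name] else "changed"


def check_proposed(real_path, count, catalog):
    """Possible fix: look up by stripped name, but read the real path."""
    name = strip_components(real_path, count)
    if name not in catalog:
        return "new"
    return "same" if file_digest(real_path) == catalog[name] else "changed"


def demo():
    root = tempfile.mkdtemp()  # stands in for /mnt/snapshot
    try:
        count = len(root.strip("/").split("/"))  # strip the whole prefix
        real = os.path.join(root, "only_in_snapshot_dir", "file.txt")
        os.makedirs(os.path.dirname(real))
        with open(real, "wb") as fh:
            fh.write(b"snapshot content")
        # Catalog entry as a previous backup would have stored it.
        catalog = {strip_components(real, count): file_digest(real)}
        try:
            suspected = check_suspected(real, count, catalog)
        except FileNotFoundError:
            suspected = "cannot open"
        return suspected, check_proposed(real, count, catalog)
    finally:
        shutil.rmtree(root)


print(demo())
```

In this model the stripped name only exists inside the snapshot, so the first variant fails to open it, which mirrors the "Cannot open" message above. If the stripped name did exist outside the snapshot, the first variant would silently checksum the wrong file instead.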

From a rough comparison of the Ubuntu source package with the git
branches of 7.0, 7.2 and 7.4, I think they are affected by this problem
as well.

One final remark: if strippath is used for something other than restoring
the real file hierarchy, this issue might be much more problematic,
because it would trigger for every file (and not just for every changed
one). However, I am unsure whether it actually affects the backup archive
or just spams the backup log.


Bacula-devel mailing list