What is the real meaning of the VolErrors column in the table Media

Panayiotis Gotsis
Hello all,

We are using Bacula version bacula_5.2.6+dfsg-9 in our environment.
Until recently, our setup saved backups to disk files. We recently
introduced a TS4500 library and have run into the following problem.

For one of the tapes in the TS4500 library, the VolErrors column of
the Media table shows a count of 1. We have seen tapes or disk files
marked with status Error before, but never an increase in this
counter on its own.

This is what is stored in the DB.

mysql> SELECT VolumeName,PoolId,LastWritten,VolJobs,VolFiles,VolBytes,VolErrors,VolStatus,VolRetention
    -> FROM Media WHERE LastWritten>0 AND PoolId=6 ORDER BY LastWritten;
+------------+--------+---------------------+---------+----------+---------------+-----------+-----------+--------------+
| VolumeName | PoolId | LastWritten         | VolJobs | VolFiles | VolBytes      | VolErrors | VolStatus | VolRetention |
+------------+--------+---------------------+---------+----------+---------------+-----------+-----------+--------------+
| EDE132L6   |      6 | 2017-06-15 01:20:43 |    2947 |     5361 | 2888346811392 |         0 | Full      |      7776000 |
| EDE467L6   |      6 | 2017-06-27 17:11:14 |     456 |      629 |  275665453056 |         1 | Append    |      7776000 |
+------------+--------+---------------------+---------+----------+---------------+-----------+-----------+--------------+
2 rows in set (0.00 sec)

It seems that some error was recorded for the EDE467L6 Volume, but
apparently not a serious one; otherwise the Volume would have been
marked as Error. The Volume is still being used for backups.
 
In the examples/sample-query.sql file, the section "List Volumes
likely to need replacement from age or errors" uses VolErrors in its
query. However, we have not found any reference to what the column
truly means. In addition, since we monitor the output of this query
via icinga, we know fairly precisely when the counter changed (give or
take 10 minutes), and we see no error related to this Volume within
that time period.

I have checked the source code for hints of where this value is set.
I have seen that it is part of the MEDIA_DBR structure, but it is not
clear what triggers its increase.

Can anyone shed some light on what it actually means, on whether some
backup job actually failed, and on what procedure I can use to
pinpoint the problem more precisely?

Thanks


--
------------------------------------------------------------
Panayiotis Gotsis                       [hidden email]
                                               +302107471091
System Engineer
Network Operations Center
Greek Research & Technology Network   - http://www.grnet.gr

_______________________________________________
Bacula-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/bacula-users
Re: What is the real meaning of the VolErrors column in the table Media

Kern Sibbald
Hello,

You are running a very old version of Bacula, so VolErrors probably
does not have any significant meaning. The concept is that when the
Storage daemon sees an error with a volume, it increments that field.
If you want to see what is implemented, you will need to look at the
VOLUME_CAT_INFO structure in the SD.

Note: the volume status can change from Append to Error and then back
to Append, and VolErrors is, for the moment, a lifetime count. If it
were a very big number I would probably want to know what was going
on. Otherwise, it is probably nothing to worry about.
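
If you do want to know what was going on, one approach is to list the
jobs that wrote to the volume around the time the counter changed and
look at their status and error counts. The sketch below is only an
illustration. It assumes the catalog's Job, JobMedia and Media tables
with the column names shown (check them against your catalog version),
runs against an invented in-memory sqlite3 stand-in so that it works
anywhere, and the helper names jobs_on_volume and demo_catalog are made
up for the example.

```python
import sqlite3

# Jobs that wrote to a given volume and overlapped a time window.
# Table and column names are assumed from the Bacula catalog schema;
# verify them against your own catalog before relying on this.
JOBS_ON_VOLUME = """
SELECT DISTINCT Job.JobId, Job.Name, Job.JobStatus, Job.JobErrors,
       Job.StartTime, Job.EndTime
FROM Media
JOIN JobMedia ON JobMedia.MediaId = Media.MediaId
JOIN Job      ON Job.JobId = JobMedia.JobId
WHERE Media.VolumeName = ?
  AND Job.StartTime <= ?
  AND Job.EndTime   >= ?
ORDER BY Job.StartTime
"""


def jobs_on_volume(conn, volume, window_start, window_end):
    """Return jobs that used `volume` and overlapped the given window."""
    # Overlap test: the job started before the window ended and
    # ended after the window started.
    cur = conn.execute(JOBS_ON_VOLUME, (volume, window_end, window_start))
    return cur.fetchall()


def demo_catalog():
    """Build a tiny stand-in catalog. The job rows are invented."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Media (MediaId INTEGER, VolumeName TEXT, VolErrors INTEGER);
        CREATE TABLE Job (JobId INTEGER, Name TEXT, JobStatus TEXT,
                          JobErrors INTEGER, StartTime TEXT, EndTime TEXT);
        CREATE TABLE JobMedia (JobId INTEGER, MediaId INTEGER);
        INSERT INTO Media VALUES (1, 'EDE132L6', 0), (2, 'EDE467L6', 1);
        INSERT INTO Job VALUES
          (101, 'job-a', 'T', 0, '2017-06-27 16:00:00', '2017-06-27 16:30:00'),
          (102, 'job-b', 'T', 1, '2017-06-27 17:00:00', '2017-06-27 17:11:14'),
          (103, 'job-c', 'T', 0, '2017-06-27 17:05:00', '2017-06-27 17:20:00');
        INSERT INTO JobMedia VALUES (101, 2), (102, 2), (102, 2), (103, 1);
    """)
    return conn


if __name__ == "__main__":
    conn = demo_catalog()
    for row in jobs_on_volume(conn, "EDE467L6",
                              "2017-06-27 16:55:00", "2017-06-27 17:15:00"):
        print(row)
```

Against a real MySQL catalog you would run the same SELECT through
your MySQL client or driver, adjusting the placeholder syntax, and then
read the job reports for whatever JobIds come back.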

To the best of my knowledge nothing ever clears the field once it is set.
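
Because the count is a lifetime one and is apparently never cleared, a
monitoring check that alerts whenever VolErrors is non-zero would
presumably keep firing for that volume indefinitely. Comparing two
successive readings and reporting only increases may be more useful.
A minimal sketch, assuming you already collect VolumeName/VolErrors
pairs from the catalog on each run; the function name and the "before"
values are hypothetical:

```python
def volerrors_increases(previous, current):
    """Return {volume: (old, new)} for volumes whose VolErrors grew.

    `previous` and `current` map VolumeName -> VolErrors, for example
    built from two successive runs of a catalog query. Volumes seen for
    the first time in `current` are compared against 0.
    """
    changed = {}
    for volume, new in current.items():
        old = previous.get(volume, 0)
        if new > old:
            changed[volume] = (old, new)
    return changed


if __name__ == "__main__":
    # Hypothetical earlier reading, followed by the values from the thread.
    before = {"EDE132L6": 0, "EDE467L6": 0}
    after = {"EDE132L6": 0, "EDE467L6": 1}
    print(volerrors_increases(before, after))
```

The time between the two readings then bounds when the error was
counted, which narrows down which job reports to read.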

Best regards,

Kern


On 06/28/2017 12:04 PM, Panayiotis Gotsis wrote:

> Can anyone shed some light on what it actually means, on whether some
> backup job actually failed, or what kind of procedure can I use to
> more specifically pinpoint the problem?