
Database Size


Database Size

James Chamberlain
Hi all,

I’m getting a touch concerned about the size of my Bacula database, and was wondering what I can do to prune it, compress it, or otherwise keep it at a manageable size.  The database itself currently stands at 324 GB, and is using 90% of the file system it’s on.  I’m running Bacula 7.4.0 on CentOS 6.8, with PostgreSQL 8.4.20 as the database.  My file and job retention times are set to 180 days, and my volume retention time is set to 365 days.  Is there any other information I can share which would help you help me track this down?

Thanks,

James
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Bacula-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/bacula-users

Re: Database Size

Josip Deanovic
On Wednesday 2017-03-15 12:57:33 James Chamberlain wrote:

> Hi all,
>
> I’m getting a touch concerned about the size of my Bacula database, and
> was wondering what I can do to prune it, compress it, or otherwise keep
> it at a manageable size.  The database itself currently stands at 324
> GB, and is using 90% of the file system it’s on.  I’m running Bacula
> 7.4.0 on CentOS 6.8, with PostgreSQL 8.4.20 as the database.  My file
> and job retention times are set to 180 days, and my volume retention
> time is set to 365 days.  Is there any other information I can share
> which would help you help me track this down?


Hi James!

Are you performing a lot of Verify jobs?
Are you using "AutoPrune = yes" in your Client and Pool resources?
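For reference, those directives live in the Client and Pool resources of bacula-dir.conf. A minimal sketch (all names and values here are illustrative, not taken from your actual configuration):

--------------
Client {
  Name = myhost-fd
  Address = myhost.example.com
  Password = "secret"
  Catalog = MyCatalog
  AutoPrune = yes
  File Retention = 180 days
  Job Retention = 180 days
}

Pool {
  Name = Default
  Pool Type = Backup
  AutoPrune = yes
  Volume Retention = 365 days
}
--------------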

--
Josip Deanovic


Re: Database Size

James Chamberlain

> On Mar 15, 2017, at 10:17 PM, Josip Deanovic <[hidden email]> wrote:
>
> On Wednesday 2017-03-15 12:57:33 James Chamberlain wrote:
>> Hi all,
>>
>> I’m getting a touch concerned about the size of my Bacula database, and
>> was wondering what I can do to prune it, compress it, or otherwise keep
>> it at a manageable size.  The database itself currently stands at 324
>> GB, and is using 90% of the file system it’s on.  I’m running Bacula
>> 7.4.0 on CentOS 6.8, with PostgreSQL 8.4.20 as the database.  My file
>> and job retention times are set to 180 days, and my volume retention
>> time is set to 365 days.  Is there any other information I can share
>> which would help you help me track this down?
>
>
> Hi James!
>
> Are you performing a lot of Verify jobs?
> Are you using "AutoPrune = yes" in your Client and Pool resources?


Hi Josip,

Yes, I’ve got "AutoPrune = yes" in all my Client and Pool resources.

Thanks,

James

Re: Database Size

James Chamberlain
On Mar 16, 2017, at 3:29 AM, Mikhail Krasnobaev <[hidden email]> wrote:
 
Good day,

do you run any maintenance jobs on the database?

Like:
--------------
[root@1c83centos ~]# cat /etc/crontab
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/
 
# dump all databases once every 24 hours
45 4 * * * root nice -n 19 su - postgres -c "pg_dumpall --clean" | gzip -9 > /home/pgbackup/postgres_all.sql.gz
 
# vacuum all databases every night (full vacuum on Sunday night, lazy vacuum every other night)
45 3 * * 0 root nice -n 19 su - postgres -c "vacuumdb --all --full --analyze"
45 3 * * 1-6 root nice -n 19 su - postgres -c "vacuumdb --all --analyze --quiet"
 
# re-index all databases once a week
0 3 * * 0 root nice -n 19 su - postgres -c 'psql -t -c "select datname from pg_database order by datname;" | xargs -n 1 -I"{}" -- psql -U postgres {} -c "reindex database {};"'
-----------------
vacuumdb is a utility for cleaning a PostgreSQL database. vacuumdb will also generate internal statistics used by the PostgreSQL query optimizer.
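One more thing worth doing before and after the vacuum runs: check where the space actually is. Assuming the catalog database is named "bacula" (adjust if yours differs), something like this lists the ten largest tables -- in a Bacula catalog it is almost always File, Path and Filename at the top:

--------------
su - postgres -c 'psql bacula -c "SELECT relname, pg_size_pretty(pg_total_relation_size(relid)) FROM pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC LIMIT 10;"'
--------------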

I don’t believe I’ve been doing any of this.  I’ll read up on the documentation and see about putting these into place.

Thanks!

James


Re: Database Size

Kern Sibbald

Hello,

I recently took a closer look at my catalog when an upgrade of my backup server from 14.04 to 16.04 failed (I have 6 systems where the upgrade totally failed and left me with a broken system). I reloaded the Bacula catalog from scratch, and in doing so I realized that it contained lots and lots of old records from jobs I had run several years ago.  This happens when you create a job or a client and then stop using that job or client (or even remove the client), so that no more jobs for that client run.  What is important is that Bacula prunes only when a job runs, unless you prune manually; if jobs never run, the retention periods never apply, and you end up with lots and lots of unused (orphaned) records in the catalog.  The only way to clean it up is to see which jobs exist in the database and prune/purge those which are no longer used -- this is done manually with bconsole.
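By way of illustration only (the client name and job id here are made up), such a bconsole session looks roughly like this -- note that prune respects the retention periods while delete does not, so check with list first:

--------------
# bconsole
*list clients
*list jobs client=oldhost-fd
*prune files client=oldhost-fd
*prune jobs client=oldhost-fd
*delete jobid=1234
--------------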

The same happens if you have lots and lots of temporary files that get backed up.  There are mail programs that create a temporary file for each email, then delete it a day or two later.  If these files are backed up even once, they will create lots of name entries in the database.  This can be cleaned up by using dbcheck.
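For what it is worth, a typical dbcheck invocation looks something like the line below (the config path is an assumption -- adjust for your install; run it without -b first to get the interactive menu and see what it would touch):

--------------
dbcheck -b -f -c /etc/bacula/bacula-dir.conf
--------------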

Finally, if you can afford a bit of downtime on your database, first back it up, then drop it and recreate it from the backup.  This produces a database that is nicely compacted.  If you regularly run vacuums this is probably not necessary, but in extreme cases, such as after deleting hundreds of old backup jobs or clients, it can be a quick way to compact the database.
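Roughly, and only as a sketch (it assumes the catalog database is named "bacula" and owned by a "bacula" role, that the Director is stopped for the duration, and that /path/with/space is a placeholder for a filesystem with enough room for the dump):

--------------
service bacula-dir stop
su - postgres -c "pg_dump bacula > /path/with/space/bacula.sql"
su - postgres -c "dropdb bacula"
su - postgres -c "createdb -O bacula bacula"
su - postgres -c "psql bacula < /path/with/space/bacula.sql"
service bacula-dir start
--------------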

Note also, your retention periods are quite long so if you have lots of jobs (more than 100) that run every night, you will need a big database. 

Best regards,

Kern



Re: Database Size

Josip Deanovic
On Thursday 2017-03-16 15:31:37 Kern Sibbald wrote:

> Hello,
>
> I recently took a look at my catalog a bit more in detail when an
> upgrade of my backup server from 14.04 to 16.04 failed (I have 6
> systems where the upgrade totally failed and left me with a broken
> system), and so I reloaded the Bacula catalog from scratch and in doing
> so I realized that there were lots and lots of old records in it from
> jobs that I had run several years ago.  This happens when you create a
> job or a client, then stop using that job or client (or even remove the
> client) so that no more jobs for that client run.  What is important is
> that Bacula prunes only when a job runs unless you do it manually, and
> if jobs never run, the retention periods never apply and you end up
> with lots and lots of unused (orphaned) records in the catalog.  The
> only way to clean it up is to see what jobs exist in the database and
> prune/purge those which are no longer used -- this is done manually with
> bconsole.

I believe that dbcheck can be used to clean the orphaned records.

--
Josip Deanovic


Re: Database Size

James Chamberlain
Hi Kern,

If 100 is a large number of jobs, I have a relatively small number (23).  I don’t have any “former” clients that I’m not backing up anymore.  One thing I *do* know is that I have an absolute ton of tiny little files, though I’m pretty sure that most of them stick around.  According to my statistics here, Bacula has 1,284,029,677 files in its catalog.  I can probably afford a little downtime on my database, so I may take that option if I get above 95% utilization on the file system.

If my retention periods are quite long, do you have any recommendations on what would be more typical values?

Thanks,

James

