[PATCH] [REVIEW:3-5,3-6] size optimisation

Lionel Elie Mamane

I'm unsure whether I should apply this to libreoffice-3-6 or maybe
even libreoffice-3-5; I'd be glad of your opinions.

The issue is that embedded HSQLDB does not reclaim space occupied by
deleted rows; it only overwrites them with new rows. So basically this
means that the data portion of an .odb file NEVER shrinks and only
grows or stagnates.

In particular, people that try to make a minimal reproduction case for
bug reports by deleting a lot of rows still have big files that don't
fit in our bugzilla's size limit (that's how I noticed this issue).

This patch tells HSQLDB, on each flush, to "defrag" the database. This
can take time for big databases :-| and will happen at least for each
file save operation.

The alternative would be to introduce a UI element
"compress/cleanup/defrag database", but:

1) It would necessarily be specific to sdbc(x) direct drivers as AFAIK
   odbc / jdbc / ... don't have a standard way to do such an
   operation.

2) It is not "do the right thing by default"

3) People using big databases should switch to a "real" database
   system anyway (and use LibreOffice base as a graphical front-end to
   it).

4) More work, and touching the UI, so I won't do it by myself. If we
   decide it is the better idea and someone wants to collaborate on
   that...

--
Lionel

_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice

Attachment: 0001-embedded-HSQLDB-reclaim-space-occupied-by-deleted-ro.patch (1K)

Miklos Vajna

Hi Lionel,

On Tue, Jul 03, 2012 at 08:33:13PM +0200, Lionel Elie Mamane <[hidden email]> wrote:
> The alternative would be to introduce a UI element
> "compress/cleanup/defrag database", but:
>
> 1) It would necessarily be specific to sdbc(x) direct drivers as AFAIK
>    odbc / jdbc / ... don't have a standard way to do such an
>    operation.

Hm, but what you're doing right now is just executing a statement,
that's possible with any odbc/jdbc as well, right?

Anyway I understand if you want to avoid all the heavy-lifting, but then
I would avoid this hack for -3-5 -- and for -3-6 you can push it
yourself.

Miklos

Lionel Elie Mamane

On Wed, Jul 04, 2012 at 09:30:20AM +0200, Miklos Vajna wrote:
> On Tue, Jul 03, 2012 at 08:33:13PM +0200, Lionel Elie Mamane <[hidden email]> wrote:

>> The alternative would be to introduce a UI element
>> "compress/cleanup/defrag database", but:

>> 1) It would necessarily be specific to sdbc(x) direct drivers as AFAIK
>>    odbc / jdbc / ... don't have a standard way to do such an
>>    operation.

> Hm, but what you're doing right now is just executing a statement,
> that's possible with any odbc/jdbc as well, right?

Yes, but *which* statement that is depends on the underlying database
engine. For example:

 * HSQLDB: "CHECKPOINT DEFRAG;" will do it on the whole database

 * MySQL:  "OPTIMIZE TABLE foo, bar, qux;" will do it on tables foo,
            bar and qux. To do it on the whole database, you need to
            list all tables.

 * PostgreSQL: "VACUUM FULL ANALYZE;" will do it on the whole database
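
The engine dependence in the list above can be sketched as a small lookup. This is only an illustration in Java, not the attached patch: the class and method names (CompactSql, statementsFor) and the engine keys are hypothetical, and only the SQL strings come from the list above. It assumes the table names are plain identifiers that need no quoting.

```java
import java.util.List;

// Hypothetical helper, not part of the patch or of any LibreOffice API.
// It maps an engine key to the space-reclaiming statement(s) listed above.
final class CompactSql {
    private CompactSql() {}

    // "tables" is only consulted for MySQL, which works per table.
    static List<String> statementsFor(String engine, List<String> tables) {
        switch (engine) {
            case "hsqldb":
                // One statement covers the whole database.
                return List.of("CHECKPOINT DEFRAG");
            case "mysql":
                // No whole-database form: every table has to be named.
                if (tables.isEmpty()) {
                    return List.of();
                }
                return List.of("OPTIMIZE TABLE " + String.join(", ", tables));
            case "postgresql":
                // One statement covers the whole database.
                return List.of("VACUUM FULL ANALYZE");
            default:
                throw new IllegalArgumentException(
                        "no known statement for engine: " + engine);
        }
    }
}
```

A caller would still have to find out which engine it is talking to and, for MySQL, enumerate the tables first, which is the part a generic UI element could not do through odbc / jdbc alone.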

--
Lionel

Miklos Vajna

On Wed, Jul 04, 2012 at 10:34:36AM +0200, Lionel Elie Mamane <[hidden email]> wrote:

> Yes, but *which* statement that is depends on the underlying database
> engine. For example:
>
>  * HSQLDB: "CHECKPOINT DEFRAG;" will do it on the whole database
>
>  * MySQL:  "OPTIMIZE TABLE foo, bar, qux;" will do it on tables foo,
>             bar and qux. To do it on the whole database, you need to
>             list all tables.
>
>  * PostgreSQL: "VACUUM FULL ANALYZE;" will do it on the whole database

Ah, makes sense -- and handling that correctly would indeed require a
new API.

Michael Meeks

Hi Lionel,

On Tue, 2012-07-03 at 20:33 +0200, Lionel Elie Mamane wrote:
> I'm unsure whether I should apply this to libreoffice-3-6 or maybe
> even libreoffice-3-5; I'd be glad of your opinions.

        Sounds reasonable for -3-6 to me (but doesn't require approval there as
a bug fix:-).

> This patch tells HSQLDB, on each flush, to "defrag" the database. This
> can take time for big databases :-| and will happen at least for each
> file save operation.

        I would add a magic env-var to turn that on/off - so people can (at a
pinch) disable it if need be (and we can isolate its effect better).
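
Such a switch could look roughly like the sketch below. It is an assumption-laden illustration in Java, not code from the patch: the variable name EXAMPLE_DISABLE_HSQLDB_DEFRAG and the class DefragSwitch are made up here, and the thread does not say what the real variable would be called or whether it would enable or disable the defrag.

```java
import java.util.Map;

// Hypothetical sketch of an environment-variable off switch.
final class DefragSwitch {
    // Made-up name; the thread does not specify one.
    static final String VAR = "EXAMPLE_DISABLE_HSQLDB_DEFRAG";

    private DefragSwitch() {}

    // Defrag stays on unless the variable is set to a non-empty value.
    // The environment is passed in as a map so the check is easy to test.
    static boolean defragEnabled(Map<String, String> env) {
        String value = env.get(VAR);
        return value == null || value.isEmpty();
    }
}
```

In real use the caller would pass System.getenv() and skip the defrag statement on flush when defragEnabled returns false.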

> The alternative would be to introduce a UI element
> "compress/cleanup/defrag database", but:
>
> 1) It would necessarily be specific to sdbc(x) direct drivers as AFAIK
>    odbc / jdbc / ... don't have a standard way to do such an
>    operation.

        :-)

> 4) More work, and touching the UI, so I won't do it by myself. If we
>    decide it is the better idea and someone wants to collaborate on
>    that...

        We could add an easy hack or something. If we could abstract the
hoovering in some nice way in the code, so this is possible later, I'd
be happier - but - you're the boss in base :-)

        So - I'd just go for it, and wait for user feedback during 3.6 beta.

        Thanks !

                Michael.

--
[hidden email]  <><, Pseudo Engineer, itinerant idiot

Alexander Thurgood

On 04/07/12 10:34, Lionel Elie Mamane wrote:

Hi Lionel,

A big thumbs up from me. With a bit of luck, it might make hsqldb
embedded dbs a bit less corruption prone...


Alex
