Quantcast

[HELP] fdo#51239 how to debug "it is slow"

classic Classic list List threaded Threaded
13 messages Options
Lionel Elie Mamane Lionel Elie Mamane
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

[HELP] fdo#51239 how to debug "it is slow"

I'd appreciate some help on fdo#51239 which is of the form "doing this
is horribly slow"; how can I profile LibO, i.e. find out where it
spends all its time; if it spends a lot of time in function FOO, can I
also get information on what the call stack looks like "most of the
time"?

Either someone teaches me how to do it (keep in mind the "time spent"
might be in C++ or Java code), or someone just does it and reports
findings in the bug?


Thanks in advance,

--
Lionel
_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Lubos Lunak Lubos Lunak
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: [HELP] fdo#51239 how to debug "it is slow"

On Wednesday 11 of July 2012, Lionel Elie Mamane wrote:
> I'd appreciate some help on fdo#51239 which is of the form "doing this
> is horribly slow"; how can I profile LibO, i.e. find out where it
> spends all its time; if it spends a lot of time in function FOO, can I
> also get information on what the call stack looks like "most of the
> time"?

 Have a look at Callgrind (http://valgrind.org/docs/manual/cl-manual.html).

 I find it rather straightforward to use (in short: valgrind --tool=callgrind
<app>; kcachegrind callgrind.out.<number> and keep clicking on functions in
the 'Callees' tab), especially with the manual explaining things, but if you
run into problems or advanced topics, feel free to ask.

> Either someone teaches me how to do it (keep in mind the "time spent"
> might be in C++ or Java code), or someone just does it and reports
> findings in the bug?

 I have no idea how Callgrind handles Java.

--
 Lubos Lunak
 [hidden email]
_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Noel Grandin Noel Grandin
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: [HELP] fdo#51239 how to debug "it is slow"

In reply to this post by Lionel Elie Mamane
try googling for "profiling C++" or "profiling on <insert your operating
system here>"

On 2012-07-11 11:48, Lionel Elie Mamane wrote:

> I'd appreciate some help on fdo#51239 which is of the form "doing this
> is horribly slow"; how can I profile LibO, i.e. find out where it
> spends all its time; if it spends a lot of time in function FOO, can I
> also get information on what the call stack looks like "most of the
> time"?
>
> Either someone teaches me how to do it (keep in mind the "time spent"
> might be in C++ or Java code), or someone just does it and reports
> findings in the bug?
>
>
> Thanks in advance,
>



Disclaimer: http://www.peralex.com/disclaimer.html


_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Lionel Elie Mamane Lionel Elie Mamane
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: [HELP] fdo#51239 how to debug "it is slow"

In reply to this post by Lubos Lunak
On Wed, Jul 11, 2012 at 12:10:09PM +0200, Lubos Lunak wrote:
> On Wednesday 11 of July 2012, Lionel Elie Mamane wrote:

>> I'd appreciate some help on fdo#51239 which is of the form "doing this
>> is horribly slow"; how can I profile LibO, i.e. find out where it
>> spends all its time; if it spends a lot of time in function FOO, can I
>> also get information on what the call stack looks like "most of the
>> time"?

>  Have a look at Callgrind (http://valgrind.org/docs/manual/cl-manual.html).

Thanks for the pointer; it did not handle Java code in a meaningful
way and it did not follow costs across threads, but it pointed me to
the C++ code calling the long-running Java code, which was started in
a fresh thread, but I successfully "guessed" what created this fresh
thread, and so I got started.

So the issue is resolved in 3.6 & master; 3.5 will take more effort to
backport.

--
Lionel
_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Stephan Bergmann-2 Stephan Bergmann-2
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: [HELP] fdo#51239 how to debug "it is slow"

On 07/18/2012 10:06 AM, Lionel Elie Mamane wrote:
> The issue *originally* reported in fdo#51239, which has been bumped to
> fdo#52170: in LibreOffice 3.5.2, much slower than "before", on
> GNU/Linux, but NOT on Windows. Unknown on Mac. Not entirely clear what
> "before" it is. *This* issue *could* be JVM-thread-attach/detach
> related, or some other "interaction between Java and C++" problem. My
> "hunch" in this direction is because it does not happen on MS Windows.
> So if you could look into it, even if only to rule it out, that would
> be useful.

Looked into it on x64 Linux master, against both an 1.7.0 openjdk (as
included in Fedora 17) and a jre-6u33-linux-x64 downloaded from Oracle.
  In either case, it does take significantly to open the Sales Invoices
table (from <https://bugs.freedesktop.org/attachment.cgi?id=63257>), and
then it takes ages to jump to the end.  However, I could not find
anything obviously suspicious at the JVM level.

I locally reverted
<http://cgit.freedesktop.org/libreoffice/core/commit/?id=bb59742bcf4883af5876a2ffadcc4a689e414b60>
"Confine JDBC driver to thread-affine apartment for Java 6 performance"
and then everything became even more slow, so it apparently is not the
case that the effect of that fix has meanwhile regressed again.

I noticed that there are truckloads of calls to
m_pVm->AttachCurrentThread from jvmaccess::VirtualMachine::attachThread
(jvmaccess/source/virtualmachine.cxx).  Such "outermost" calls into JNI
(i.e., not recursively from a thread that is already in a JNI call,
where jvmaccess::VirtualMachine::attachThread would be cheap) are always
expensive (and the changes to the JVM that we tried to address with
bb59742bcf4883af5876a2ffadcc4a689e414b60 only made them extra-extra
expensive, when they already had been expensive).  The LO database code
is notorious for making truckloads of such expensive outermost JNI
calls, but I do not know if this has increased substantially recently
and so alone could explain any recent performance regressions.

Stephan
_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Lionel Elie Mamane Lionel Elie Mamane
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: [HELP] fdo#51239 how to debug "it is slow"

On Thu, Jul 19, 2012 at 02:12:30PM +0200, Stephan Bergmann wrote:
> On 07/18/2012 10:06 AM, Lionel Elie Mamane wrote:

>> The issue *originally* reported in fdo#51239, which has been bumped to
>> fdo#52170: in LibreOffice 3.5.2, much slower than "before", on
>> GNU/Linux, but NOT on Windows. Unknown on Mac. Not entirely clear what
>> "before" it is. *This* issue *could* be JVM-thread-attach/detach
>> related, or some other "interaction between Java and C++" problem. My
>> "hunch" in this direction is because it does not happen on MS Windows.
>> So if you could look into it, even if only to rule it out, that would
>> be useful.

> Looked into it on x64 Linux master (...)

Thanks.

> it does take significantly to open the Sales Invoices table (from
> <https://bugs.freedesktop.org/attachment.cgi?id=63257>), and then it
> takes ages to jump to the end.  However, I could not find anything
> obviously suspicious at the JVM level.

> I noticed that there are truckloads of calls to
> m_pVm->AttachCurrentThread from
> jvmaccess::VirtualMachine::attachThread
> (jvmaccess/source/virtualmachine.cxx).  Such "outermost" calls into
> JNI (i.e., not recursively from a thread that is already in a JNI
> call, where jvmaccess::VirtualMachine::attachThread would be cheap)
> are always expensive (...).  The LO database code is notorious for
> making truckloads of such expensive outermost JNI calls, (...)

Try to give me an idea what would cause such calls. Is it Java code
calling C(++) code (presumably through UNO) or C(++) code calling Java
code (again presumably through UNO)?

For example, to minimise the number of Java calls from C++ , maybe we
could, when we retrieve a whole row through JDBC, doing this from Java
code instead of from C++ code:

for (i=0; i < row->getNumberOfColumns; ++i)
   cacheEntry[i]=row->getColumnValue(i);

This would bring us down from one call per column to one call per row,
hopefully without duplicating *too* *much* logic between a C++ version
and a Java version.

Or maybe if we know that we are about to make a truckload of Java
calls from C++, wrap it in a Java trampoline that just calls the C++
function that will make many Java calls? So that we are in the
"recursive" case? Hmm... that could be hackish, since it would involve
generic code looking at whether it is connecting to the database
engine through JDBC (Java) or some other way.

A breakpoint at jvmaccess::VirtualMachine::attachThread in gdb does
not seem to be immediately useful; the backtrace does not show me
where that "comes from". Is there a way to see that?

--
Lionel
_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Stephan Bergmann-2 Stephan Bergmann-2
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: [HELP] fdo#51239 how to debug "it is slow"

On 07/19/2012 02:38 PM, Lionel Elie Mamane wrote:

> On Thu, Jul 19, 2012 at 02:12:30PM +0200, Stephan Bergmann wrote:
>> On 07/18/2012 10:06 AM, Lionel Elie Mamane wrote:
>>> The issue *originally* reported in fdo#51239, which has been bumped to
>>> fdo#52170: in LibreOffice 3.5.2, much slower than "before", on
>>> GNU/Linux, but NOT on Windows. Unknown on Mac. Not entirely clear what
>>> "before" it is. *This* issue *could* be JVM-thread-attach/detach
>>> related, or some other "interaction between Java and C++" problem. My
>>> "hunch" in this direction is because it does not happen on MS Windows.
>>> So if you could look into it, even if only to rule it out, that would
>>> be useful.
>
>> I noticed that there are truckloads of calls to
>> m_pVm->AttachCurrentThread from
>> jvmaccess::VirtualMachine::attachThread
>> (jvmaccess/source/virtualmachine.cxx).  Such "outermost" calls into
>> JNI (i.e., not recursively from a thread that is already in a JNI
>> call, where jvmaccess::VirtualMachine::attachThread would be cheap)
>> are always expensive (...).  The LO database code is notorious for
>> making truckloads of such expensive outermost JNI calls, (...)
>
> Try to give me an idea what would cause such calls. Is it Java code
> calling C(++) code (presumably through UNO) or C(++) code calling Java
> code (again presumably through UNO)?

C++ code in connectivity and/or dbacess calling Java, not through UNO
but directly talking JNI (though going through UNO would not actually
make a difference to the performance implications).  Grepping for
jvmaccess::VirtualMachine::AttachGuard in connectivity and dbaccess
should show the places where this is done.

> For example, to minimise the number of Java calls from C++ , maybe we
> could, when we retrieve a whole row through JDBC, doing this from Java
> code instead of from C++ code:
>
> for (i=0; i < row->getNumberOfColumns; ++i)
>     cacheEntry[i]=row->getColumnValue(i);
>
> This would bring us down from one call per column to one call per row,
> hopefully without duplicating *too* *much* logic between a C++ version
> and a Java version.

Yes, this could be helpful.  (When I discussed this with Ocke years ago,
he always claimed there it was not possible to redesign the
connectivity/dbaccess code to reduce the number of such "outermost" JNI
calls, though.)

> A breakpoint at jvmaccess::VirtualMachine::attachThread in gdb does
> not seem to be immediately useful; the backtrace does not show me
> where that "comes from". Is there a way to see that?

That might be due to
<http://cgit.freedesktop.org/libreoffice/core/commit/?id=bb59742bcf4883af5876a2ffadcc4a689e414b60>
"Confine JDBC driver to thread-affine apartment for Java 6 performance"
offloading part of the code off the main thread into a dedicated thread.
  You can try reverting that locally (at the expense of even slower
performance).

Stephan
_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Lionel Elie Mamane Lionel Elie Mamane
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: [HELP] fdo#51239 how to debug "it is slow"

On Thu, Jul 19, 2012 at 02:53:35PM +0200, Stephan Bergmann wrote:
> On 07/19/2012 02:38 PM, Lionel Elie Mamane wrote:
>> On Thu, Jul 19, 2012 at 02:12:30PM +0200, Stephan Bergmann wrote:
>>> On 07/18/2012 10:06 AM, Lionel Elie Mamane wrote:

>>> I noticed that there are truckloads of calls to
>>> m_pVm->AttachCurrentThread from
>>> jvmaccess::VirtualMachine::attachThread
>>> (jvmaccess/source/virtualmachine.cxx).  Such "outermost" calls into
>>> JNI (i.e., not recursively from a thread that is already in a JNI
>>> call, where jvmaccess::VirtualMachine::attachThread would be cheap)
>>> are always expensive (...).  The LO database code is notorious for
>>> making truckloads of such expensive outermost JNI calls, (...)

>> Try to give me an idea what would cause such calls. Is it Java code
>> calling C(++) code (presumably through UNO) or C(++) code calling Java
>> code (again presumably through UNO)?

> Grepping for jvmaccess::VirtualMachine::AttachGuard in connectivity
> and dbaccess should show the places where this is done.

It is wrapped in a SDBThreadAttach class, and this grep yields the
expected places. Good, I see now.

>> For example, to minimise the number of Java calls from C++ , maybe we
>> could, when we retrieve a whole row through JDBC, doing this from Java
>> code instead of from C++ code:

>> for (i=0; i < row->getNumberOfColumns; ++i)
>>     cacheEntry[i]=row->getColumnValue(i);

>> This would bring us down from one call per column to one call per row,
>> hopefully without duplicating *too* *much* logic between a C++ version
>> and a Java version.

> Yes, this could be helpful.  (When I discussed this with Ocke years
> ago, he always claimed there it was not possible to redesign the
> connectivity/dbaccess code to reduce the number of such "outermost"
> JNI calls, though.)

Well, I haven't thought of a clean way to do it yet; several
approaches exist. I'll have to sleep on it. Listing them here mostly
not to forget them.

One idea is to extend our SDBC API (which is basically a C++ version
of JDBC, which is an extension of ODBC, which is an extension of
X/Open SQL CLI, ...); it currently has only "retrieve one column
value"; we'd add an *optional* "get a whole row". If the optional
extension is not implemented, the code would fallback to a C++
implementation of "get a whole row" that makes multiple calls to
"retrieve one column value".

Replace source/drivers/jdbc/JStatement.cxx (and friends), which *are*
called through UNO, by a full-Java implementation; ah no, that
wouldn't help, since each UNO call would pass the expensive C++/Java
barrier :-|

Change source/drivers/jdbc/JStatement.cxx (and friends) to retrieve
the whole row (through an ad-hoc Java function) and cache it? Which
would make us a *third* level of caching, but well... Maybe it can
cooperate with KeySet so that KeySet does not cache again in this
case. Hmmm... the caching nature of KeySet was introduced because of
ODBC limitations; maybe we could push that to drivers (through a
helper wrapper class they can use to provide this):

 - the SDBC-ODBC layer would use the helper class "as is"; since it is
   to blame for the limitations that introduce the need for
   caching... Only it would suffer from the consequences.

 - the SDBC-JDBC layer would have a special implementation that
   fetches the whole row in one ad-hoc Java function call.

I'm fond of my trampoline idea, but 't would have to be done in a
clean way; not sure yet how.

--
Lionel
_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Michael Meeks-2 Michael Meeks-2
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: [HELP] fdo#51239 how to debug "it is slow"


On Thu, 2012-07-19 at 15:47 +0200, Lionel Elie Mamane wrote:
> I'm fond of my trampoline idea, but 't would have to be done in a
> clean way; not sure yet how.

        Heh ;-) I'm still holding my breath for the end-of-problem Firebird
solution myself ...

        I wonder if there are some sillies that we could tighten up there
though - eg. there could be random ref/unref thrash in the Java / bridge
traffic that might be easy to reduce. In theory there is this beautiful
built-in UNO tracing logic that I've never used - that can show us all
the calls that happen between various pieces - Stephan: could that get
us a trace of the back-to-back cross-language thrash ?

        Presumably just looking at that would give us some immediate
inspiration :-)

        ATB,

                Michael.

--
[hidden email]  <><, Pseudo Engineer, itinerant idiot

_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Stephan Bergmann-2 Stephan Bergmann-2
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: [HELP] fdo#51239 how to debug "it is slow"

On 07/19/2012 06:03 PM, Michael Meeks wrote:

> On Thu, 2012-07-19 at 15:47 +0200, Lionel Elie Mamane wrote:
>> I'm fond of my trampoline idea, but 't would have to be done in a
>> clean way; not sure yet how.
>
> Heh ;-) I'm still holding my breath for the end-of-problem Firebird
> solution myself ...
>
> I wonder if there are some sillies that we could tighten up there
> though - eg. there could be random ref/unref thrash in the Java / bridge
> traffic that might be easy to reduce. In theory there is this beautiful
> built-in UNO tracing logic that I've never used - that can show us all
> the calls that happen between various pieces - Stephan: could that get
> us a trace of the back-to-back cross-language thrash ?

This is about direct use of JNI, not via UNO.

Stephan
_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Lionel Elie Mamane Lionel Elie Mamane
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: [HELP] fdo#51239 how to debug "it is slow"

In reply to this post by Michael Meeks-2
On Thu, Jul 19, 2012 at 05:03:32PM +0100, Michael Meeks wrote:
> On Thu, 2012-07-19 at 15:47 +0200, Lionel Elie Mamane wrote:

>> I'm fond of my trampoline idea, but 't would have to be done in a
>> clean way; not sure yet how.

> Heh ;-) I'm still holding my breath for the end-of-problem Firebird
> solution myself ...

Even if we switch to Firebird for our embedded database, we still want
to support connecting to any database through any JDBC driver, so this
question of heavy C++ <-> Java cost will still be on the table.

--
Lionel
_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Noel Grandin Noel Grandin
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: [HELP] fdo#51239 how to debug "it is slow"

In reply to this post by Lionel Elie Mamane

On 2012-07-19 14:38, Lionel Elie Mamane wrote:

> For example, to minimise the number of Java calls from C++ , maybe we
> could, when we retrieve a whole row through JDBC, doing this from Java
> code instead of from C++ code:
>
> for (i=0; i < row->getNumberOfColumns; ++i)
>     cacheEntry[i]=row->getColumnValue(i);
>
> This would bring us down from one call per column to one call per row,
> hopefully without duplicating *too* *much* logic between a C++ version
> and a Java version.
>
>

For bonus points, retrieve a whole chunk of rows at a time, say around 1000.



Disclaimer: http://www.peralex.com/disclaimer.html


_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Michael Meeks-2 Michael Meeks-2
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: [HELP] fdo#51239 how to debug "it is slow"

In reply to this post by Stephan Bergmann-2

On Thu, 2012-07-19 at 18:09 +0200, Stephan Bergmann wrote:
> This is about direct use of JNI, not via UNO.

        Ah - I wonder ... ;-) if it is the transition that is the problem;
could we have a Java thread polling a loopback socket and sucking simple
commands & parsing & executing them, then passing the data back in some
other form ;-> It'd be hideous if that actually was quicker than a
direct JNI call but ... ;-) it's possible. Particularly in the world of
under-utilized multi-threaded CPUs it might not be -so- bad.

        Bit of a pain cross-platform, but might work for suitably chunky
methods; I wonder - could we use the wonder binary-urp magic to
prototype it [ though I guess we'd need a native Java implementation of
that - hmm ].

        Urgh !

                Michael.

--
[hidden email]  <><, Pseudo Engineer, itinerant idiot

_______________________________________________
LibreOffice mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/libreoffice
Loading...