Render differences of DOCX between Word and LO/AOO

Jens Tröger

Hello,

I am not sure if this forum is the right place to bring up this topic for discussion, or if another one is more appropriate; please advise.

Working with a larger Word DOCX file I noticed a few issues.  First off, the document’s statistics:

  MS Word: 427 pages, 96,521 words, 563,392 characters (incl. spaces)
  OpenOffice: 435 pages, 99,413 words, 577,397 characters (incl. spaces)

The pages make sense (see below) but how is it possible that both word count and character count of the same document differ by the thousands?
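To put the gap in proportion, here is a small sketch that computes the absolute and relative differences from the figures quoted above. The dictionary and function names are only illustrative; the numbers are the ones from the post. The counts work out to roughly 3% more words and 2.5% more characters on the OpenOffice side.

```python
# Figures quoted in the post above.
word_stats = {"pages": 427, "words": 96521, "chars": 563392}
openoffice_stats = {"pages": 435, "words": 99413, "chars": 577397}


def relative_gaps(a, b):
    """Return {metric: (absolute difference, percent difference relative to a)}."""
    return {
        key: (b[key] - a[key], round(100 * (b[key] - a[key]) / a[key], 2))
        for key in a
    }


gaps = relative_gaps(word_stats, openoffice_stats)
for metric, (diff, pct) in gaps.items():
    print(f"{metric}: {diff:+d} ({pct:+.2f}%)")
```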

Regarding the page count, please see the attached image.  It looks to me as if word spacing and character kerning are slightly different, resulting in a cumulative “error” that causes different numbers of words per line.  This shows in the footnote, whose box is stretched, thus causing the new paragraph to be pushed to the next page (Word) or not (LibreOffice).
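The cumulative effect described above can be illustrated with a toy greedy line breaker. This is only a sketch under simplifying assumptions (fixed word widths, one uniform space width, no hyphenation or justification); it is not how Word or LibreOffice actually lay out text, and `count_lines` is a hypothetical helper.

```python
def count_lines(word_widths, space_width, line_width):
    """Greedy line breaking: put each word on the current line if it still fits."""
    if not word_widths:
        return 0
    lines = 1
    used = None  # width consumed on the current line; None means the line is empty
    for width in word_widths:
        if used is None:
            used = width
        elif used + space_width + width <= line_width:
            used = used + space_width + width
        else:
            lines += 1
            used = width
    return lines


# 100 identical words, chosen so that exactly five fit per line with a 1.0 space.
words = [5.0] * 100
narrow = count_lines(words, space_width=1.0, line_width=29.0)   # five words per line
wide = count_lines(words, space_width=1.01, line_width=29.0)    # fifth word no longer fits
print(narrow, wide)
```

In this deliberately extreme case a 1% wider space changes the number of lines from 20 to 25, because every line sits right at the breaking threshold. Real text is far less sensitive per line, but the same mechanism adds up over hundreds of pages.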

The net effect of this ripples through the entire document, resulting in different page counts and paragraphs being placed on different pages. For example, the document I mentioned above contains chapter 41 in Word on page 195 (effective 232 of 427) whereas the same chapter 41 in LibreOffice is on 183 (effective 212 of 435).

This, in turn, becomes a problem if the author types out page numbers instead of using internal references: the page numbers in the text then in no way match the page numbers of the document.

Any explanation of the different word and character counts?  And is there a recommended way to close the “rendering gap” between Word and LibreOffice?

Thanks!
Jens

PS: Notice how LibreOffice didn’t get the font for the headings right!

_______________________________________________
LibreOffice mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/libreoffice
Regina Henschel

Re: Render differences of DOCX between Word and LO/AOO

Hi Jens,

Jens Tröger wrote on 23.01.2018 at 04:26:
> Hello,
>
> I am not sure if this forum is the right place to bring up this topic
> for discussion, or if another one is more appropriate; please advise.

This list is for developers to discuss problems in the code. For
discussion about using LibreOffice use the list
[hidden email] or use Ask.
https://www.libreoffice.org/get-help/mailing-lists/
https://wiki.documentfoundation.org/Ask/Getting_Started

If you are talking about a specific document, you should make the document
available somewhere. The [hidden email] list does not
allow attachments.

Kind regards
Regina
_______________________________________________
Toki

Re: Render differences of DOCX between Word and LO/AOO

In reply to this post by Jens Tröger
On 01/23/2018 03:26 AM, Jens Tröger wrote:

> Any explanation of the different word and character counts?

Depending upon how one counts, there are either three or four different
definitions of what constitutes a character. Microsoft Word and
LibreOffice use different definitions, which results in a
different character count for the same text.
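As an illustration of how one string can yield several "character" counts, here is a minimal sketch. The four measures below are common ways of counting and are my own choice for the example; the thread does not say which definitions Word or LibreOffice actually use, and `char_counts` is a hypothetical helper.

```python
import unicodedata


def char_counts(text):
    """Count the 'characters' of text under four different definitions."""
    return {
        # Bytes when the text is stored as UTF-8.
        "utf8_bytes": len(text.encode("utf-8")),
        # 16-bit units when stored as UTF-16 (characters outside the BMP take two).
        "utf16_units": len(text.encode("utf-16-le")) // 2,
        # Unicode code points, which is what len() reports for a Python str.
        "code_points": len(text),
        # Code points that are not combining marks (a rough stand-in for
        # user-perceived characters; assumption: no full grapheme segmentation).
        "base_chars": sum(1 for ch in text if unicodedata.combining(ch) == 0),
    }


# "Tröger" written with a combining diaeresis, followed by a space and an emoji.
sample = "Tro\u0308ger \U0001F600"
counts = char_counts(sample)
print(counts)
```

For this short sample the four counts are 13, 10, 9 and 8, so even nine code points are enough to produce four different answers.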

Each language has between two and seven ways of defining what
constitutes a word, and, consequently, how to count the number of words
in a document.

LibreOffice and Microsoft Office selected different definitions of what
constitutes a word, which results in each program claiming a different
number of words in the same document.
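The same holds for words. The sketch below applies three plausible word definitions to one sentence; they are assumptions picked for illustration, not the rule sets either program implements, and `word_counts` is a hypothetical helper.

```python
import re


def word_counts(text):
    """Count the words of text under three different definitions."""
    tokens = text.split()
    return {
        # Every whitespace-separated token counts, including a lone dash.
        "whitespace_tokens": len(tokens),
        # Only tokens that contain at least one letter or digit count.
        "tokens_with_alnum": sum(1 for t in tokens if any(c.isalnum() for c in t)),
        # Every run of letters/digits counts, so hyphens and apostrophes split words.
        "alnum_runs": len(re.findall(r"\w+", text)),
    }


sample_text = "state-of-the-art e-mail - don't stop"
result = word_counts(sample_text)
print(result)
```

Here the three definitions give 5, 4 and 9 words for the same five-token sentence. Hyphenated compounds, contractions and free-standing dashes are exactly the cases where counts diverge, and a long document contains thousands of them.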

The simple fix would be an extension that supplies the number of
characters, according to each of those definitions, and then provides
the number of words, according to each of the various rule sets used by
each language.

Assuming the extension drops languages not used in the document, you'd
have a page displaying five lines of character counts and between five
and ten lines of word counts.

Incorporating such an extension into LibO would be counter-productive,
because it would confuse far more people than it would help.

Regina wrote:
>to discuss problems in the code.

The question reflects an issue with the code. More specifically, why
German word counts are used, instead of malformed US_English word counts.

jonathon


_______________________________________________
Jens Tröger

Re: Render differences of DOCX between Word and LO/AOO

Thank you Jonathon for the char/word count explanation.

However, I’m more concerned about the differences in rendering (which in
turn accounts for the large difference in page count) and the loss of font
use (see the heading). If need be I can provide a test document that
demonstrates these issues.



--
Sent from: http://nabble.documentfoundation.org/Dev-f1639786.html
_______________________________________________
Chris Sherlock

Re: Render differences of DOCX between Word and LO/AOO

Can you submit a bug?


Chris


On 1 Feb 2018, at 4:21 am, Jens Tröger <[hidden email]> wrote:

> Thank you Jonathon for the char/word count explanation.
>
> However, I’m more concerned about the differences in rendering (which in
> turn accounts for the large difference in page count) and the loss of font
> use (see the heading). If need be I can provide a test document that
> demonstrates these issues.
_______________________________________________
Jens Tröger

Re: Render differences of DOCX between Word and LO/AOO