Re: line breaks
by Kees Bakker - 12/24/06 8:06 AM
In Reply to: Example by thegreg82
As I suspected, this is more complex than it seems.
If I copy and paste the text about the cold war from your post into Word, it shows up with a new-line character (same as when you press shift-enter in Word) at the end of each line. That's fully correct. If you have a look at the html-source of the message, there are <br> line-breaks in it, as shown in
"During the Cold War, the United States maintained nuclear forces that were<BR>sized and structured to deter any attack by the Soviet Union and its Warsaw Pact<BR>allies, and if deterrence failed, to defeat the Soviet Union. In"
A <br> is the 'new-line' command in html. And the only reason it's there is that the designer of the webpage (or the program he used) put it there intentionally to force a new line. So Word obeys the intention of the maker of the webpage. Nothing wrong with that.
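To make the difference concrete, here is a minimal Python sketch of how a program might turn html into plain text: every <br> becomes a real new line, while ordinary whitespace in the source (including raw line ends) collapses to a single space. The class and function names are made up for the illustration, and this is certainly not how Word itself does it, only the general idea.

```python
import re
from html.parser import HTMLParser


class BrAwareText(HTMLParser):
    """Hypothetical helper: collects text and honours <br> as a forced new line."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <BR> and <br> are treated alike
        if tag == "br":
            self.parts.append("\n")

    def handle_data(self, data):
        # Whitespace in the html source is not a line break: collapse it
        self.parts.append(re.sub(r"\s+", " ", data))


def html_to_text(html):
    parser = BrAwareText()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


snippet = "nuclear forces that were<BR>sized and structured\nto deter any attack"
print(html_to_text(snippet))
```

The printed text breaks after "were" (because of the <BR>) but not after "structured", where the source only contains an ordinary line end.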
Your link is to a pdf-file. If I open that (either with Acrobat or with Foxit reader, and either locally or from the web) and use the text selection tool to copy part of the text to Word, the end of each line shows up as a paragraph marker in Word. That has absolutely nothing to do with the web; it's just that the Adobe or Foxit programmers thought this the right thing to do. Go and complain to them.
The last case: normal html. I copied a small piece of text from www.time.com. In the browser it looks like
This was the year of the web generation, a year
that saw the rise of a new digital democracy.
Meet 15 of the web generation's biggest movers
but that's just because my browser's (IE 6) rendering engine puts it on the screen that way to fit it into the available space (the column size). If you look at the html-source, you see there are no <br>-tags inside, so the designer chose to have IE determine the exact lay-out (as usual). And if I paste this into Word, it shows up just as you expect, as one paragraph without line breaks. Well, in fact it shows up as a bulleted list, because the designer of the webpage enclosed it in a <li>-tag.
I can see nothing wrong with the way Word handles copies from an html-source.
As I said, you might have doubts about the way the text-tool of some pdf-readers handles a new line in the document, but that's quite another subject. It might be inherent to the way a .pdf-document is structured internally, but I couldn't tell you that.
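A small Python sketch of that idea: the source text holds no line breaks at all, and the "rendering" step decides where lines end based on the column width. The render function is a made-up stand-in, and a real browser measures width in pixels rather than characters, so take this only as an approximation.

```python
import textwrap

# One paragraph with no line breaks of its own (like html without <br>-tags)
source = ("This was the year of the web generation, a year "
          "that saw the rise of a new digital democracy. "
          "Meet 15 of the web generation's biggest movers")


def render(text, column_width):
    """Hypothetical stand-in for a rendering engine: wrap text to a column."""
    return textwrap.fill(text, width=column_width)


narrow = render(source, 48)   # narrow column: more, shorter lines
wide = render(source, 80)     # wide column: fewer, longer lines

print(narrow)
print()
print(wide)
```

The line ends differ between the two widths, yet the source is identical in both cases. That is why a program that copies the source, rather than the picture on the screen, pastes it as one paragraph.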