HOWTO: Removing Carriage Returns (Line Breaks, Form Feeds) from Documents

 

I wrote the original version of this article in June 2007. This week, I’m expanding it, because it seems that a lot of people want to remove carriage returns (which many people also call line breaks or form feeds) from their documents!

I bet that the most common case is forwarding an email that’s already been forwarded a couple of times. The original email program inserted carriage returns after 72 (or so) characters on each line. Then, each email program that forwards the message inserts "> " at the beginning of each line, so the lines get longer and longer, and get wrapped at column 72 again and again.
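To make the forwarding problem concrete, here is a minimal Python sketch that strips those leading "> " markers from each line. The function name and the sample text are made up for illustration, and the pattern assumes the markers are plain ">" characters optionally followed by a space or tab.

```python
import re

# One or more quote markers at the start of a line: ">", "> >", ">> ", etc.
QUOTE_PREFIX = re.compile(r"^(?:>[ \t]?)+")


def strip_quote_markers(text: str) -> str:
    """Remove the leading '> ' forwarding markers from every line."""
    return "\n".join(QUOTE_PREFIX.sub("", line) for line in text.splitlines())


# Hypothetical sample of a twice-forwarded message.
forwarded = "> > Hello there, this is a\n> > forwarded message.\n> Thanks!"
print(strip_quote_markers(forwarded))
```

This only removes the markers; the extra line breaks are still there and have to be dealt with separately.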

I had answered a question about Notepad++, a really nice — and free — text editor that’s designed for programmers of any type (but is much better than the regular Notepad for everyone).

The question was "How can I remove formfeeds using Notepad++?"

Tech Tip
If that doesn’t make sense, let’s use the answer below as an example. A carriage return (an old typewriter term) is the character that tells line (3) below to be on a different line than line (2). Strictly speaking, a form feed is a different character (a page break), but this article, like the question, uses the term loosely for an ordinary line break. Sometimes, we want to remove those characters, along with paragraph marks or other non-printing characters that are used for formatting documents.
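If you are curious which characters are actually involved, the short Python sketch below lists the character codes in a piece of text. The helper name is made up for this example. A carriage return is code 13, a line feed is code 10, and a form feed is code 12; Windows text files normally end each line with a carriage return followed by a line feed.

```python
def code_points(text: str) -> list[int]:
    """Return the numeric character code of every character in the text."""
    return [ord(ch) for ch in text]


# The three non-printing characters this article keeps mentioning.
NAMED = {"carriage return": "\r", "line feed": "\n", "form feed": "\f"}
for name, ch in NAMED.items():
    print(name, ord(ch))

# A Windows-style line ending between the letters "a" and "b".
print(code_points("a\r\nb"))
```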

Of course, you can always go to each one and then hit the Delete key or the Backspace key and remove them one by one. But, the goal was to do it more efficiently.

So, the answer was:


Steps:
1) Using your mouse, highlight a formfeed/carriage return by starting at the end of one line and highlighting to the beginning of the next line.
2) Control-C to copy
3) Control-H to open the Replace dialog box
4) Click in the Find What box
5) Control-V to paste the formfeed/carriage return
6) Don’t put anything in the Replace With textbox
7) Click on the Replace All button.

That gets them all.
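For readers who would rather script it, the steps above amount to replacing every line break with nothing. Here is a minimal Python sketch of that idea; it is not how Notepad++ does it internally, and the function name is made up for this example.

```python
import re

# CRLF (Windows), a lone CR, and a lone LF each count as one line break.
LINE_BREAK = re.compile(r"\r\n|\r|\n")


def remove_line_breaks(text: str, replacement: str = "") -> str:
    """Replace every line break in the text with the given replacement."""
    return LINE_BREAK.sub(replacement, text)


sample = "one\r\ntwo\nthree"
print(remove_line_breaks(sample))       # words are glued together
print(remove_line_breaks(sample, " "))  # words stay separated by a space
```

Replacing with an empty string glues the last word of one line to the first word of the next, so a single space is often the better replacement. The same applies to the Replace With box in the steps above.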

However, that got me thinking about doing the same thing in other editors like Notepad, Wordpad, Microsoft Word and OpenOffice Writer.

With Notepad, the plain text editor that comes as part of Windows, you’re out of luck — you can’t do this. The "Replace" dialog box does not work with non-printing characters. The only possible option is to take them out one by one.

With Wordpad, the simple word processor that is also included in Windows (as C:\Windows\wordpad.exe or C:\Windows\write.exe), you’re similarly out of luck.

Microsoft Word and OpenOffice Writer are at least consistent with Notepad and Wordpad here: pasting a copied line break into the Replace dialog box doesn’t work in either of them.

Microsoft Word has a better solution than OpenOffice for this function. Word has an Advanced button on its Replace dialog box, shown in the following images. The first image shows the dialog box and the top of the Special (characters) list. The second image shows the full Special list.

Word 2003 Replace dialog box

Word 2003 Replace dialog box Special characters

Microsoft Word also has a non-obvious solution: open the Find and Replace dialog box, click on the Replace tab, and type "^p" (that’s shift-6 and a lower-case P) in the "Find what:" box. Although "^p" looks like the common shorthand for Control-p, Word interprets it as its code for a paragraph mark. Leave the "Replace with:" box empty (or type a single space) and click Replace All.

Tech Tip
Why is this non-obvious? Doesn’t Control-p mean paragraph? No, it doesn’t. Control-p (which generates the same computer code as Control-P) is implemented in Windows and DOS as a command to Print. In most Windows programs, typing Control-P opens the Print dialog box.
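As a side note on why Control-p and Control-P generate the same code: in the traditional ASCII convention, holding Control keeps only the low five bits of the letter's code, so upper and lower case collapse to the same value. The Python sketch below illustrates that convention with a made-up helper name. It describes classic ASCII control codes, not how Windows programs process keyboard shortcuts.

```python
def control_code(letter: str) -> int:
    """ASCII control code for Control+letter (keeps the low five bits)."""
    return ord(letter) & 0x1F


print(control_code("p"), control_code("P"))  # both are 16
print(control_code("M"))  # 13, the carriage return
print(control_code("J"))  # 10, the line feed
print(control_code("L"))  # 12, the form feed
```

Under that convention, the carriage return itself is Control-M, which is why line endings are sometimes shown as "^M".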

OpenOffice Writer addresses this task in a very strange way. Perhaps the best way to describe it is by quoting the help file, where I found the following procedure:

Removing Line Breaks
Use the AutoFormat feature to remove line breaks that occur within sentences. Unwanted line breaks can occur when you copy text from another source and paste it into a text document.

This AutoFormat feature only works on text that is formatted with the “Default” paragraph style.

1. Choose Tools – AutoCorrect.
2. On the Options tab, ensure that Combine single line paragraphs if length greater than 50% is selected. To change the minimum percentage for the line length, double-click the option in the list, and then enter a new percentage.
3. Click OK.
4. Select the text containing the line breaks that you want to remove.
5. In the Apply Style box on the Formatting bar, choose Default.
6. Choose Format – AutoFormat – Apply.

It actually worked! I assumed that my default installation of OpenOffice 2.2 was set up properly (for this procedure), so I started at the 4th step. It worked just fine. But, this was a very unusual, non-intuitive process.
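The "combine single line paragraphs if length greater than 50%" idea can be sketched in a few lines of Python. This is only a guess at the general heuristic, not OpenOffice's actual implementation: it assumes the reference width is the longest line in the text (or a width you pass in), and it treats any line at or below the threshold, or any blank line, as the end of a paragraph. The function and parameter names are made up for this example.

```python
def combine_short_paragraphs(lines, min_fraction=0.5, width=None):
    """Join wrapped lines into paragraphs.

    A line longer than min_fraction * width is assumed to have been wrapped,
    so it is joined to the line that follows. A shorter line, or a blank
    line, is assumed to end the paragraph.
    """
    lines = [line.strip() for line in lines]
    if width is None:
        # Assumption: the longest line approximates the wrap column.
        width = max((len(line) for line in lines), default=0)

    paragraphs, current = [], []
    for line in lines:
        if not line:  # a blank line always ends the current paragraph
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        current.append(line)
        if len(line) <= min_fraction * width:  # short line: paragraph ends
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


wrapped = [
    "This is a long line that was wrapped",
    "by an email program at a fixed",
    "width.",
    "A second paragraph starts here and",
    "ends here.",
]
for paragraph in combine_short_paragraphs(wrapped):
    print(paragraph)
```

A heuristic like this will guess wrong on text where a paragraph happens to end with a long last line, which may be one reason the feature exposes the percentage as a setting.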

Winner, among Notepad, Wordpad, Word and Writer: Microsoft Word.

 


Let me know what you think of this article - please post your comment below....

Comments

  1. Thanks for the Notepad++ tip. Found this by looking for this exact thing. Appreciated.

  2. Charles Heineke says:

    I’ve been using a free program for that purpose and for email cleanup for years. It’s called emailStripper, available from http://www.papercut.com/emailStripper.htm.

    You can take those snippets from an email you want to send somewhere else and paste it into emailStripper’s dialog box, and it will strip out all of those > characters and strip out unneeded line endings, reformatting it into continuous text.

    Another free utility I use in conjunction with things like this is PureText. It lets me paste copied text into a document either as text formatted as it was originally or as “pure text”, which then adopts the formatting of the document into which it’s being pasted. You select which by which hot key you use to paste it. I use it frequently during my day, for various copy and paste operations. It’s available from http://www.stevemiller.net/puretext/. I wouldn’t want to be without it.

  3. Word 2010, Windows 7 Pro x64 – works perfectly, although the screenshots are obviously a little different. I was fooled by “Paragraph Character” which produces ^v to be replaced. As noted, it is “Paragraph Mark” ^p that gets the job done. THANKS!

  4. Word! Brilliant! This article was a gemstone in the pile of rubble the internet is! Thank you!

  5. Echoing everything above – thank you for the solution in Word!

  6. Here’s an easy-to-use utility for removing unwanted line breaks:

    http://www.gcbhacks.dreamhosters.com/scheme-web-apps/line-unwrap.cgi

    It attempts to leave formatting such as paragraph breaks, lists, and headings intact. I’m the author and I’d be interested in anyone’s feedback for improving it.

    • Thanks for the online tool – very useful, definitely one of the best tools I’ve found for getting copied .pdf text cleaned up before pasting. Thanks, hope you get lots of hits and the inspiration to keep it available!!!

  7. In Notepad++, go to replace window (Ctrl+H) and select extended mode (Alt+X)
    Then enter “\n” (without quotes, of course) in the ‘find’ text input and “\r” in ‘replace with’ text input. Then click “Replace All”

    Tip 1: you can always do the reverse or just remove double breaks using: Find=“\r\r”, Replace=“\r”, or similarly for “\n” as well…

    Tip 2: Regular expression mode is also very handy in most cases… look at that.

  8. William Plumer, Jr. says:

    try email strippit to remove carriage returns

  9. William Plumer, Jr. says:

    sorry, emailstripper
