
What's the problem?

A text file is made up of characters. Which you know, of course -- bear with me just a moment. Some of those are printable characters -- letters, numbers, punctuation, and symbols; others are control characters. Internally, the computer sees everything as a number -- a sequence of 1s and 0s. So there's one number that represents the letter K, and another number for the end of a line -- called, appropriately, the newline character.
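If you'd like to see those numbers for yourself, here is a small sketch in Python. It assumes the usual ASCII/Unicode numbering, which the text above doesn't name, and the variable names are just for illustration.

```python
# ord() gives the number behind a character (assuming ASCII/Unicode numbering).
k_code = ord("K")
newline_code = ord("\n")

print(k_code)                 # 75
print(newline_code)           # 10, the newline control character
print(format(k_code, "08b"))  # 01001011, the same number as 1s and 0s
```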

So, when your word processor or text editor sees a newline character, it ends one line and starts a new one, right?

Well, not exactly. Unfortunately -- and I don't know why, so don't ask -- UNIX, MacOS, and DOS and MS-Windows (the last two hereafter lumped together as DOS) use different control characters to indicate the end of a line. UNIX uses just the newline character (sometimes written as LF, \n, or ^J); MacOS uses just the carriage return character (CR, \r, or ^M); and DOS uses both, carriage return first and then newline (CRLF, \r\n, or ^M^J).
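To make the three conventions concrete, here is a minimal sketch in Python that counts which line endings a chunk of raw bytes contains. The function name count_line_endings and the sample strings are made up for this illustration; nothing here comes from a particular tool.

```python
def count_line_endings(data: bytes) -> dict:
    """Count each end-of-line convention found in raw file contents."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair also contains one CR and one LF, so subtract the
    # pairs to get the characters that stand alone.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    return {"CRLF": crlf, "CR": cr, "LF": lf}


# Hypothetical sample data: the same two lines in each convention.
unix_text = b"one\ntwo\n"
mac_text = b"one\rtwo\r"
dos_text = b"one\r\ntwo\r\n"

for name, text in [("UNIX", unix_text), ("Mac", mac_text), ("DOS", dos_text)]:
    print(name, count_line_endings(text))
```

Reading the file in binary mode (open(path, "rb")) matters if you try this on real files, since Python's text mode translates line endings by default and would hide the difference.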

So, unless your software is smart enough to deal with the differences, a UNIX or Mac text file will appear as one very long line on a DOS system. A DOS text file displayed on UNIX or a Mac will have the correct line breaks, but will also have an extra control character on every line (on UNIX, a leftover carriage return at the end of the line, often displayed as ^M). If your software displays a symbol for that character, the file is annoying to read. More importantly, the extra character may cause errors in automated text processing.
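Here is a rough sketch, again in Python, of both halves of the problem: the leftover carriage return you get when DOS text is split on newlines alone, and one simple way to convert everything to UNIX-style endings. The helper name to_unix is hypothetical, and this is only one possible approach.

```python
def to_unix(data: bytes) -> bytes:
    """Convert DOS (CRLF) and Mac (CR) line endings to UNIX (LF)."""
    # Replace CRLF pairs first; doing the lone-CR step first would turn
    # each CRLF into two newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


dos_text = b"one\r\ntwo\r\n"

# Splitting DOS text on the newline alone leaves a stray CR on each line.
print(dos_text.split(b"\n"))           # [b'one\r', b'two\r', b'']

# After conversion the lines come out clean.
print(to_unix(dos_text).split(b"\n"))  # [b'one', b'two', b'']
```

That stray \r is exactly the sort of thing that trips up automated processing: a comparison such as line == b"one" quietly fails on the unconverted text.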


	      
Matt Gushee

Last modified: Sun Oct 17 11:08:43 EDT 1999