## Wednesday, September 19, 2007

### Mixing DOS and Unix

I wrote a Java program using Notepad on Windows, and when I opened it in the vi editor on a Unix machine, I got a ^M character at the end of each line.

E.g.:
import java.io.*;^M
^M
/**^M
*^M
*This class is used to manage the o/p stream of the application.^M
* ^M
*@author Niketan Pansare^M
*@version 1.0 ^M
* ^M
*/^M
public class OutputManager^M
{^M
OutputStream outputStream; ^M
^M
/**^M
* This constructor uses standard i/o as default stream.^M
*/^M

and so on ...
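
The ^M that vi displays is the carriage return character (byte 0x0D, written `\r` in most languages): Windows ends each line with a carriage return plus a line feed, while Unix uses a line feed alone. As a rough check outside the editor, the following Python sketch counts both kinds of line ending in a byte string. The function name `count_line_endings` and the sample bytes are made up for illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count DOS (CR LF) and Unix (bare LF) line endings in raw bytes."""
    crlf = data.count(b"\r\n")
    # Every CR LF pair also contains an LF, so subtract the pairs
    # to get the number of bare LF endings.
    lf_only = data.count(b"\n") - crlf
    return {"dos": crlf, "unix": lf_only}


# Hypothetical sample: two DOS-style lines followed by one Unix-style line.
sample = b"import java.io.*;\r\n\r\npublic class OutputManager\n"
counts = count_line_endings(sample)
print(counts)
```

A non-zero "dos" count suggests the file was saved with Windows line endings and will show ^M in vi.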

You can remove all the extra ^M characters in the vi editor as follows:
1. Go to command mode (press Esc; if you are already in command mode, you will hear a beep).
2. Then type
   `:%s/^M$//g`

Don't copy and paste the line above: the `^M` must be a real control character, not a caret followed by an M. To enter it, press CTRL+V and then CTRL+M (i.e. ^V then ^M).

What does the command mean? For substitution you use the following command:

`:[range]s/[pattern]/[string]/[options]`

The `s` here means "substitute a pattern with a string". The range can be:

- `{number}`: an absolute line number
- `.`: the current line
- `$`: the last line in the file
- `%`: equal to `1,$` (the entire file)

You can also use `#` instead of `/` as the separator.

Technically, a pattern is defined as follows.

The definition of a pattern:

Patterns may contain special characters, depending on the setting of the 'magic' option.

1. A pattern is one or more branches, separated by `\|`. It matches anything that matches one of the branches. Example: `foo\|beep` matches "foo" and "beep".
2. A branch is one or more pieces, concatenated. It matches a match for the first, followed by a match for the second, etc. Example: `foo[0-9]beep` first matches "foo", then a digit and then "beep".
3. A piece is an atom, possibly followed by one of these (magic form first, nomagic form second):
   - `*` / `\*`: matches 0 or more of the preceding atom
   - `\+` / `\+`: matches 1 or more of the preceding atom {not in Vi}
   - `\=` / `\=`: matches 0 or 1 of the preceding atom {not in Vi}

   Examples:
   - `.*` / `.\*`: matches anything, also empty string
   - `^.\+$` / `^.\+$`: matches any non-empty line
   - `foo\=` / `foo\=`: matches "fo" and "foo"
4. An atom can be one of the following (again magic form first, nomagic form second):
   - `^` / `^`: at beginning of pattern, matches start of line
   - `$` / `$`: at end of pattern or in front of `\|`, matches end of line
   - `.` / `\.`: matches any single character
   - `\<` / `\<`: matches the beginning of a word
   - `\>` / `\>`: matches the end of a word
   - `\i` / `\i`: matches any identifier character (see 'isident' option) {not in Vi}
   - `\I` / `\I`: like `\i`, but excluding digits {not in Vi}
   - `\k` / `\k`: matches any keyword character (see 'iskeyword' option) {not in Vi}
   - `\K` / `\K`: like `\k`, but excluding digits {not in Vi}
   - `\f` / `\f`: matches any file name character (see 'isfname' option) {not in Vi}
   - `\F` / `\F`: like `\f`, but excluding digits {not in Vi}
   - `\p` / `\p`: matches any printable character (see 'isprint' option) {not in Vi}
   - `\P` / `\P`: like `\p`, but excluding digits {not in Vi}
   - `\e` / `\e`: matches `<Esc>`
   - `\t` / `\t`: matches `<Tab>`
   - `\r` / `\r`: matches `<CR>`
   - `\b` / `\b`: matches `<BS>`
   - `~` / `\~`: matches the last given substitute string
   - `\(\)` / `\(\)`: a pattern enclosed by escaped parentheses (e.g., `\(^a\)`) matches that pattern
   - `x` / `x`: a single character, with no special meaning, matches itself
   - `\x` / `\x`: a backslash followed by a single character, with no special meaning, matches the single character
   - `[]` / `\[]`: a range. This is a sequence of characters enclosed in `[]` or `\[]`. It matches any single character from the sequence. If the sequence begins with `^`, it matches any single character NOT in the sequence. If two characters in the sequence are separated by `-`, this is shorthand for the full list of ASCII characters between them. E.g., `[0-9]` matches any decimal digit. To include a literal `]` in the sequence, make it the first character (following a possible `^`). E.g., `[]xyz]` or `[^]xyz]`. To include a literal `-`, make it the first or last character.

If the 'ignorecase' option is on, the case of letters is ignored.

It is impossible to have a pattern that contains a line break.

Examples:

- `^beep(`: probably the start of the C function "beep".
- `[a-zA-Z]$`: any alphabetic character at the end of a line.
- `\<\I\i*` or `\(^\|[^a-zA-Z0-9_]\)[a-zA-Z_]\+[a-zA-Z0-9_]*`: a C identifier (will stop in front of it).
- `\(\.$\|\. \)`: a period followed by end-of-line or a space. Note that `\(\. \|\.$\)` does not do the same, because `$` is not end-of-line in front of `\)`. This was done to remain Vi-compatible.
- `[.!?][])"']*\($\|[ ]\)`: a search pattern that finds the end of a sentence, with almost the same definition as the ")" command.

Technical detail: `<Nul>` characters in the file are stored as `<NL>` in memory. In the display they are shown as "^@". The translation is done when reading and writing files. To match a `<Nul>` with a search pattern you can just enter CTRL-@ or "CTRL-V 000". This is probably just what you expect. Internally the character is replaced with a `<NL>` in the search pattern. What is unusual is that typing CTRL-V CTRL-J also inserts a `<NL>`, and thus also searches for a `<Nul>` in the file. {Vi cannot handle `<Nul>` characters in the file at all}
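
For readers more used to other regex dialects, a few of the simpler patterns from the definition above can be tried in Python. This is only a rough sketch: Python's `re` module is not Vim's engine, and the translations used here (`\|` to `|`, `\+` to `+`, `\=` to `?`) are assumed equivalents for these simple cases only. The helper `matches_whole` is a made-up name.

```python
import re

# Each pair is (Vim 'magic' pattern from the text, assumed Python equivalent).
EQUIVALENTS = [
    (r"foo\|beep", r"foo|beep"),          # branches: \| becomes |
    (r"foo[0-9]beep", r"foo[0-9]beep"),   # ranges are written the same way
    (r"^.\+$", r"^.+$"),                  # \+ becomes +
    (r"foo\=", r"foo?"),                  # \= becomes ?
]


def matches_whole(python_pattern: str, text: str) -> bool:
    """Return True if the Python pattern matches the entire text."""
    return re.fullmatch(python_pattern, text) is not None


for vim_pattern, python_pattern in EQUIVALENTS:
    print(vim_pattern, "->", python_pattern)

print(matches_whole(r"foo?", "fo"))    # "fo" is matched, as the text says
print(matches_whole(r"^.+$", ""))      # an empty line is not matched
```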

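Seems complex?
Let us stick to our example for the time being.

`$` means end of line. Therefore, `^M$` means any ^M character at the end of a line.
It is replaced by nothing (since in our example the replacement string is empty).

The `g` at the end means "global": replace every match on a line rather than only the first. Since `^M$` can match at most once per line, the command works the same without it.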
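
The same substitution can be sketched outside vi. The Python snippet below is an assumed rough equivalent of `:%s/^M$//`, not what vi does internally: it removes a carriage return (`\r`, the character vi shows as ^M) only where it sits at the end of a line. The function name `strip_trailing_cr` and the sample text are hypothetical.

```python
import re


def strip_trailing_cr(text: str) -> str:
    """Remove a carriage return that sits at the end of any line."""
    # re.MULTILINE makes $ match just before each newline (and at the end
    # of the text), which plays the role of $ in the vi pattern.
    return re.sub(r"\r$", "", text, flags=re.MULTILINE)


# Hypothetical DOS-style input with CR LF line endings.
dos_text = "public class OutputManager\r\n{\r\n"
unix_text = strip_trailing_cr(dos_text)
print(repr(unix_text))
```

A carriage return in the middle of a line is left alone, just as with the anchored vi pattern. When reading a real file for this purpose, open it with `newline=""` so that Python does not translate the line endings before the substitution runs.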