Save a Unix manpage as plain text
Posted January 19th, 2004 in Linux/Unix/BSD and Man Pages (Updated May 24th, 2005)
A manpage (short for "manual page") is a Unix or Linux help
documentation file. There are man pages for just about every Unix command
and utility. To view a man page from the command line, you simply
enter man followed by the name of the command. For example,
man man
man ls
man find
would bring up the man page for the "man", "ls" and "find" commands respectively.
Manpages are written in roff markup and formatted for the terminal by nroff, which embeds formatting information in the text to indicate when it should be bold or underlined (some pagers show underlined text in a particular colour instead). You could output a man page to a text file by issuing the following (which would output the manpage for "man" to the file man.txt):
man man > man.txt
The only problem with this is that the file still contains a lot of additional formatting garbage and repeated characters. The first few lines of the resulting file look like this:
man(1)                                                                  man(1)
N^HNA^HAM^HME^HE
man - format and display the on-line manual pages
manpath - determine user's search path for man pages
S^HSY^HYN^HNO^HOP^HPS^HSI^HIS^HS
m^Hma^Han^Hn [-^H-a^Hac^Hcd^Hdf^HfF^HFh^Hhk^HkK^HKt^Htw^HwW^HW]
[-^H--^H-p^Hpa^Hat^Hth^Hh] [-^H-m^Hm
_^Hs_^Hy_^Hs_^Ht_^He_^Hm] [-^H$
[-^H-M^HM
_^Hp_^Ha_^Ht_^Hh_^Hl_^Hi_^Hs_^Ht] [-^H-P^HP
_^Hp_^Ha_^Hg_^He_^Hr] [-^H-S^HS
_^Hs_^He_^Hc_^Ht_^Hi_^Ho_^Hn_^H__^Hl_^Hi_^Hs_^Ht$
D^HDE^HES^HSC^HCR^HRI^HIP^HPT^HTI^HIO^HON^HN
m^Hma^Han^Hn formats and displays the on-line manual pages. If you specify _^Hs_^He_^Hc_^H-
_^Ht_^Hi_^Ho_^Hn, m^Hma^Han^Hn only looks in that section of the manual. _^Hn_^Ha_^Hm_^He is normally
the name of the manual page, which is typically the name of a command, function
or file. However, if _^Hn_^Ha_^Hm_^He contains a slash (/^H/) then m^Hma^Han^Hn
interprets it as a file specification, so that you can do m^Hma^Han^Hn .^H./^H/f^Hfo^Hoo^Ho.^H.5^H5
or even m^Hma^Han^Hn /^H/c^Hcd^Hd/^H/f^Hfo^Hoo^Ho/^H/b^Hba^Har^Hr.^H.1^H1.^H.g^Hgz^Hz.
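Each ^H in the sample above is a backspace character. Bold text is written as a character, a backspace, and the same character again (N^HN), while underlined text is written as an underscore, a backspace, and then the character (_^Hs). The following is a minimal sketch, assuming a POSIX shell and sed, that reproduces the encoding by hand and strips it by deleting every character that is followed by a backspace. The helper name strip_overstrike is made up for this illustration, and the sed expression is only meant to show how the encoding works.

```shell
# Backspace character (displayed as ^H in the sample), obtained via printf.
bs=$(printf '\b')

# strip_overstrike is a hypothetical helper name. It deletes each
# "character + backspace" pair, so only the character printed last survives.
strip_overstrike() {
    sed "s/.${bs}//g"
}

# Bold "NAME" is encoded as N^HN A^HA M^HM E^HE.
bold=$(printf 'N\bNA\bAM\bME\bE\n' | strip_overstrike)

# Underlined "system" is encoded as _^Hs _^Hy _^Hs _^Ht _^He _^Hm.
underlined=$(printf '_\bs_\by_\bs_\bt_\be_\bm\n' | strip_overstrike)

printf '%s\n%s\n' "$bold" "$underlined"
```

Both forms collapse to the plain word because the character after each backspace is the one that would remain visible on a terminal or printer.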
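The correct way to output a man page into a plain text file is to pipe it through col -b, which drops the backspaces and keeps only the last character written to each position. The following command writes the man page for "man" to a file called man.txt: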
man man | col -b > man.txt
The file now contains clean text (the same lines as in the example above are shown):
man(1)                                                                  man(1)
NAME
man - format and display the on-line manual pages
manpath - determine user's search path for man pages
SYNOPSIS
man [-acdfFhkKtwW] [--path] [-m system] [-p string] [-C config_file]
[-M pathlist] [-P pager] [-S section_list] [section] name
...
DESCRIPTION
man formats and displays the on-line manual pages. If you specify sec-
tion, man only looks in that section of the manual. name is normally
the name of the manual page, which is typically the name of a command,
function, or file. However, if name contains a slash (/) then man
interprets it as a file specification, so that you can do man ./foo.5
or even man /cd/foo/bar.1.gz.
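If you save man pages often, the man and col -b pipeline can be wrapped in a small shell function. This is only a sketch, assuming a POSIX shell with man and col installed. The function name save_manpage and its default output name (the command name plus .txt) are my own choices for illustration, not standard commands or conventions.

```shell
# save_manpage is a hypothetical helper name, not a standard command.
# Usage: save_manpage NAME [OUTFILE]
# OUTFILE defaults to NAME.txt in the current directory.
save_manpage() {
    name=${1:-}
    if [ -z "$name" ]; then
        echo "usage: save_manpage NAME [OUTFILE]" >&2
        return 2
    fi
    outfile=${2:-$name.txt}

    # col -b removes the backspaces and keeps only the last character
    # written to each column position.
    man "$name" | col -b > "$outfile"

    # The pipeline's exit status is col's, not man's, so detect a missing
    # man page by checking whether anything was written.
    if [ ! -s "$outfile" ]; then
        rm -f "$outfile"
        return 1
    fi
}

# Example (not run here): save_manpage ls        writes ls.txt
#                         save_manpage find f.txt writes f.txt
```

Checking for an empty file is a simple way to notice a missing man page, since the shell reports only the exit status of the last command in a pipeline.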


