UNHTML
UNHTML is a command-line application that converts an HTML (.htm) file into a plain text (.txt) file. It is released as open source under the GNU GPL, with source code included.
The program is distributed as a ZIP package: download it to a temporary directory and unpack it to the destination folder.
The download links for manual installation are:
UNHTML (23/1/2021, A. Doff)
hobbes.nmsu.edu/download/pub/os2/apps/webdev/UNHTML_1996-07-19.zip
UNHTML v. 1.5 (19/7/1996, Stephen Loomis), Readme/What's new:
--==| UNHTML v1.5 |==--
(C)opyright 1996 by Jawed Karim <Jawed.Karim-1@Umn.edu>
What's New
==========
UNHTML v1.3 --> v1.5 :
o The DOS executable has been compiled with a newer version
of djgpp.
o The manual editing request by unhtml is now optional and can
be turned on and off with the -e command line option.
o Displaying the output file while processing the input file on
screen is optional and is controlled with the -d command line
option.
o Unhtml's output style has been modified and a small help screen
was added.
UNHTML v1.0 --> v1.3 :
o The output files contain fewer empty lines, thus
reducing their size.
o An ELF executable for Linux is included.
o An editor can be launched after completion to
manually edit the output file.
o UNHTML counts how many HTML tags were removed.
o Special character symbols '&' and ';' no longer
cause trouble within '<' and '>'.
Instructions
============
UnHTML v1.5 (C)opyright 1996 Jawed Karim <Jawed.Karim-1@Umn.edu>
Usage: unhtml <inputfile> <outputfile> [-d][-e]
<inputfile> : The file that contains HTML code.
<outputfile>: After removing the HTML code, the text
will be written to this file.
[-d] : Tells Unhtml to display the output file on screen
while processing it.
[-e] : Causes unhtml to ask the user for manual editing.
==> Edit index.txt manually [y] ?
If you would like to edit the output file manually with a text
editor, press 'y' at this point. If not, just hit enter. UNHTML
will attempt to execute a file, depending on which system you are
using.
under Linux: command 'pico' will be executed
under MSDOS: command 'edit' will be executed
under OS/2 : command 'tedit' will be executed
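As a minimal sketch of the usage line above, the following POSIX shell
function converts every .htm file in a directory. It is not part of
UNHTML: the names convert_all and UNHTML_BIN are hypothetical, and it
assumes the unhtml executable can be found through your PATH. Neither -d
nor -e is passed, so going by the option descriptions above, nothing
should be displayed on screen and no edit prompt should appear.

```shell
#!/bin/sh
# Sketch only: batch conversion built on "unhtml <inputfile> <outputfile>".
# UNHTML_BIN is a hypothetical override for the location of the executable.
UNHTML_BIN="${UNHTML_BIN:-unhtml}"

# convert_all DIR: write DIR/name.txt for every DIR/name.htm
convert_all() {
    dir="$1"
    for input in "$dir"/*.htm; do
        # Skip the unexpanded pattern when the directory has no .htm files.
        [ -e "$input" ] || continue
        # Replace the .htm suffix with .txt to build the output name.
        output="${input%.htm}.txt"
        "$UNHTML_BIN" "$input" "$output"
    done
}
```

For example, `convert_all ./pages` would call unhtml once for each .htm
file found in ./pages.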
Should you get an error message under MSDOS or OS/2, create a
batch file that points to an editor. An example of a DOS
batch file:
---CUT HERE---
c:\dos\edit %1
---CUT HERE---
Save this file as 'EDIT.BAT' in the same directory as UNHTML, or in
a directory that is listed in your PATH variable.
Accordingly, the OS/2 batch file would look like this:
---CUT HERE---
c:\os2\tedit.exe %1
---CUT HERE---
Save this file as 'TEDIT.CMD' in the same directory as UNHTML, or in
a directory that is listed in your PATH variable.
Under Linux, if you get an error message, make a symbolic link
that points to whichever editor you use. Name the link 'pico'.
For more help, see: man ln
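The symbolic link described above could be created along these lines.
This is a sketch under assumptions, not something shipped with UNHTML:
link_pico and its two arguments are hypothetical names, and the target
directory must already exist and be listed in your PATH.

```shell
#!/bin/sh
# Sketch only: make a link named 'pico' that points to another editor.
# link_pico EDITOR_PATH LINK_DIR (both names are hypothetical)
link_pico() {
    editor_path="$1"   # full path to the editor you actually use
    link_dir="$2"      # an existing directory that is listed in your PATH
    # -s creates a symbolic link; -f replaces an existing link named 'pico'.
    ln -sf "$editor_path" "$link_dir/pico"
}
```

For example, `link_pico /usr/bin/vi "$HOME/bin"` would make UNHTML's
'pico' command start vi, provided $HOME/bin is in your PATH.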
OS/2 Warp
=========
This executable requires the EMX Runtime, version 0.9b or
higher. It is available at:
ftp://hobbes.nmsu.edu/os2/unix/emx09b/emxrt.zip
This is worth getting since you will be able to use long filenames with
UNHTML for OS/2.
Linux
=====
This ELF executable has been tested under Linux 1.2.13.
MSDOS
=====
Unless you are running UNHTML for MSDOS in a DOS window under OS/2 or
Windows (95/3.1/NT), you need to have the file CWSDPMI.EXE in a directory
listed in your PATH variable, or in the same directory as UNHTML.
Where to find updates
=====================
New UNHTML versions will be posted on:
http://umn.edu/~kari0022
or search for "Jawed Karim" on Yahoo! (http://www.yahoo.com)
or email Jawed Karim at:
Jawed.Karim-1@umn.edu
kari0022@gold.tc.umn.edu
hobbes.nmsu.edu/download/pub/os2/apps/webdev/UnHTML_1-5.zip
This work is licensed under a Creative Commons Attribution 4.0 International License.
Comments
A. Doff
Sat, 24/12/2022 - 16:18
A new release with a bug fix