What are the functional differences between the Windows 95/98,
Windows NT, and OS/2 versions of WebMaven?
There are no functional differences in WebMaven between the different
platforms.
Is a WebMaven registration key good for all of the operating
system platforms supported by WebMaven or is a different key needed to
register each variant?
A single WebMaven key can be used across all operating system
platforms. Also, if both a personal registration key and an
enterprise key are present, the enterprise key takes precedence.
After purchasing a personal (non-enterprise) key, is there an
upgrade path to purchasing an enterprise key?
Yes, 70% of the actual purchase price of the personal registration
key can be applied to the suggested retail price of the enterprise key.
Can I run multiple copies of WebMaven concurrently?
Yes, subject to the resources available to your computer (line speed, available
disk space, etc.), there is no limit to the number of instances of
WebMaven which can run concurrently.
WebMaven can be installed on a server and run concurrently from
any number of clients. The only restriction regarding multiple copies
of WebMaven running concurrently is that only one copy may reference
a specific local path at a time.
What happens if my Internet dial-up connection breaks while WebMaven is
retrieving files?
So long as the dialer you use to connect to your ISP
(Internet service provider) automatically redials after a broken
connection and completes a new connection within 75 seconds, WebMaven
will continue as if the interruption had not occurred. If the
interruption lasts longer than 75 seconds, WebMaven will prompt you
to confirm whether to continue once a new connection has been made.
If a new connection is not made within the allotted
time, you are asked whether WebMaven should terminate. When WebMaven is
ended before all of the appropriate files are retrieved, a WebMaven.CHK
file is created. The .CHK file allows WebMaven to be restarted from where
it left off.
Why does the WebMaven main task appear to stall periodically when
retrieving certain sites?
WebMaven has to obtain the IP address for each domain name from the
domain name server you have specified in your configuration. Though WebMaven
only checks each domain name once, it still depends on responses
from the domain name server. Therefore, a page that references many
different domain names can incur a long wait.
This is the same reason why pages with a lot of active links
(non-clickable references like images) load into your browser slowly.
WebMaven reports this time in the Domain Name Server Lookup Time Report,
which is available with the enterprise version of WebMaven.
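The once-per-domain lookup described above can be pictured with a small cache. This is only a sketch and not WebMaven's actual code: the class name LookupCache and the injectable resolver argument are hypothetical, chosen so the idea can be tried without a live name server.

```python
import socket
import time


class LookupCache:
    """Resolve each host name at most once and total the time spent waiting."""

    def __init__(self, resolver=socket.gethostbyname):
        self.resolver = resolver      # assumption: any callable, host -> IP string
        self.addresses = {}           # host name -> IP address (None if unresolved)
        self.lookup_seconds = 0.0     # total time spent waiting on the name server

    def resolve(self, host):
        key = host.lower()            # domain names are not case sensitive
        if key not in self.addresses:
            started = time.perf_counter()
            try:
                self.addresses[key] = self.resolver(key)
            except OSError:           # socket.gaierror is a subclass of OSError
                self.addresses[key] = None   # remember failures so they are not retried
            self.lookup_seconds += time.perf_counter() - started
        return self.addresses[key]
```

A page that references fifty different domains still costs fifty name server round trips the first time through, which is where the apparent stall comes from.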
Can I specify multiple remote paths on the same domain to be
retrieved together?
Yes, with an enterprise registration key you can specify any number of
paths on the same, or different, domains to be retrieved together.
Continuity of links between the specified paths is preserved. Domain
names with and without a leading www. or ftp. are considered synonymous
so long as the names resolve to the same IP address.
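The synonym rule can be sketched as follows. The function names and the resolve callable (host name in, IP address or None out) are assumptions made for illustration; WebMaven's own comparison may differ in detail.

```python
def canonical_host(host):
    """Lower-case the name and strip one leading 'www.' or 'ftp.' label."""
    name = host.lower()
    for prefix in ("www.", "ftp."):
        if name.startswith(prefix):
            return name[len(prefix):]
    return name


def are_synonymous(host_a, host_b, resolve):
    """Two names are synonyms only if they match once the prefix is
    stripped AND both resolve to the same IP address."""
    if canonical_host(host_a) != canonical_host(host_b):
        return False
    ip_a, ip_b = resolve(host_a), resolve(host_b)
    return ip_a is not None and ip_a == ip_b
```

The address check matters: www.example.com and ftp.example.com hosted on different machines would not be merged under this rule.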
How does WebMaven handle non-HTML related protocols -
specifically mailto:?
In general, all non-HTML protocol tags result
in the creation of out of tree pages. Mailto: links are left intact
in the localized files. Therefore, mail messages can be created off
line if your mail program supports deferred transmission of messages.
How does WebMaven handle forms on a retrieved page?
It depends on the METHOD= attribute in the form. If METHOD=POST,
the form submission can only be handled by the server; the URI
specified in the ACTION= attribute is dead on the local page.
If METHOD=GET then the URI specified in the ACTION= attribute is
handled like any other link.
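The METHOD= rule above amounts to a small decision, sketched here. The function name form_action_link is hypothetical, and treating a missing METHOD= as GET follows the HTML default rather than anything stated about WebMaven.

```python
from urllib.parse import urljoin


def form_action_link(page_url, method, action):
    """Return the absolute URI to follow for a form, or None when the
    form can only be handled by the server (METHOD=POST)."""
    if method.strip().upper() == "POST":
        return None                      # dead on the localized page
    return urljoin(page_url, action)     # METHOD=GET: treated like any other link
```

For example, a GET form on http://example.com/a/b.html with ACTION=search.cgi yields http://example.com/a/search.cgi as a link to retrieve.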
Why do links that are specified as options in a form selection
list (e.g. <OPTION VALUE=file_name>) function properly
when the remote page is accessed via a browser but WebMaven reports
that the files could not be retrieved (HTTP error 403 or 404)?
More than likely the form processing script on the server
compensates for the path value specified in the VALUE clause whereas
WebMaven references the remote file exactly as it is stated.
This is the result of poor Web design.
Why is there sometimes a difference between where a link takes me
with a page that WebMaven has localized vs. the same link when I look at
the on-line page?
If a page contains erroneous HTML (e.g. http:/www.abc.com
- note the omitted slash) WebMaven, as well as different browsers,
may interpret the incorrect HTML differently.
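As one concrete illustration, Python's standard URL parser (used here only as an example of a strict parser, not as a stand-in for WebMaven or any browser) reads the malformed URI quite differently from the well-formed one:

```python
from urllib.parse import urlsplit

good = urlsplit("http://www.abc.com")
bad = urlsplit("http:/www.abc.com")    # one slash omitted

# Well-formed: the host is recognized as the network location.
print(repr(good.netloc), repr(good.path))
# Malformed: no host at all; the would-be host name becomes part of the path.
print(repr(bad.netloc), repr(bad.path))
```

Other software may instead repair the URI silently or treat it as a relative reference, which is why the same link can lead to different places.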
Why does the WebMaven Syntax Error report
indicate numerous HTML errors yet the referenced page renders
correctly both in the on-line page as well as the page that has been
retrieved and localized by WebMaven?
The WebMaven Syntax Error Report includes HTML tags that contain
strings which do not comply with the HTML specification. The pages
are rendered correctly because of the "forgiving" style in which
most popular browsers operate.
In many instances, correcting these HTML errors will result in a
page rendering more quickly by your browser. In some cases,
correcting the HTML errors (particularly ambiguous paths - i.e. URIs
that end with a directory name and no trailing slash) will also eliminate
the need for your browser to request the page twice.
Why are there so many INDEX!.HTM files in the directories created
by WebMaven?
When an HTML link is specified with a path but no file name (e.g.
http://cfsrexx.com/WebMaven/) neither WebMaven nor your favorite
browser can identify the file name that will be served. Since
WebMaven must have a local file name to save the file, it uses a
default value of INDEX!.HTM. Registered
copies of WebMaven allow the user to specify any default file name,
overall or by site.
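The fallback to a default file name can be sketched like this. The function name local_file_name is hypothetical, and the sketch ignores query strings and other details that a real retrieval tool has to handle.

```python
from urllib.parse import urlsplit

DEFAULT_NAME = "INDEX!.HTM"   # the default named in the answer above


def local_file_name(uri, default_name=DEFAULT_NAME):
    """Pick the name to save a URI under, falling back to the default
    when the URI path is empty or ends with a slash."""
    path = urlsplit(uri).path
    last = path.rsplit("/", 1)[-1]    # text after the final slash, possibly ""
    return last or default_name
```

Here http://cfsrexx.com/WebMaven/ maps to INDEX!.HTM, while a URI that names a file keeps that file's name.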
Why do I see file names which have an underscore followed by 7 characters
inserted before the file extension?
When possible, the same directory and file names used on the server
are used by WebMaven (case preserved) when it retrieves files.
When WebMaven encounters a name on a server that is not compatible
with the local client file system, it must create a suitable, unique name.
Why do some of the directory names created by WebMaven include the domain
name (without a preceding www. or ftp.)?
Out of tree directories include the abbreviated domain name to assure that paths are unique.
Why do I get different results with WebMaven when I specify
a textual domain name (e.g. http://disneyland.com) than when
I specify the dotted IP address for that domain name (e.g. http://208.218.3.18)?
This has nothing to do with WebMaven; rather, it is the way that
Internet routing and Web servers work. To see for yourself,
specify the dotted IP address for Disneyland (as shown above) and see
what you get for a page. Then, enter the textual domain name (as shown
above) and you will wind up at http://www.disney.com/Disneyland/.
Can I retrieve files with WebMaven onto a drive that uses the 8.3
file naming convention?
Yes and no. All of the Windows file systems (FAT16, FAT32, and NTFS) support long file names;
however, OS/2 FAT partitions cannot be used to retrieve Web sites. OS/2
users must specify an HPFS partition in the Local path value in the
Site Properties notebook for WebMaven.
Why does the time stamp (date and time) differ between the server
and the retrieved file?
WebMaven converts the server time stamp, specified in GMT, to its
local equivalent.
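The conversion itself is simple to reproduce. This sketch uses Python's standard library and a hypothetical function name; it shows the kind of conversion described, not WebMaven's own routine.

```python
from email.utils import parsedate_to_datetime


def server_stamp_to_local(http_date):
    """Convert an HTTP date header value (expressed in GMT) to local time."""
    gmt = parsedate_to_datetime(http_date)   # timezone-aware, offset zero
    return gmt.astimezone()                  # same instant, local UTC offset
```

Calling server_stamp_to_local("Sun, 06 Nov 1994 08:49:37 GMT") returns the same instant expressed in the time zone of the machine it runs on, so the displayed hour differs from the server's unless the machine is itself on GMT.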
Why do the files created by WebMaven for out of tree links and
HTTP errors use a single line feed character (vs. a carriage return /
line feed pair) to terminate each line of the HTML file even though I
am running on a Windows or OS/2 client?
WebMaven uses the same line termination sequence (either line feed or
carriage return / line feed) that was used in the retrieved page.
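A sketch of how a generated page could follow the retrieved page's line termination is shown below. Both function names are hypothetical, and choosing the more frequent terminator is an assumption; the answer above says only that the same sequence is used.

```python
def detect_line_ending(page_bytes):
    """Return the line terminator used by a retrieved page."""
    crlf = page_bytes.count(b"\r\n")
    lf_only = page_bytes.count(b"\n") - crlf   # line feeds not preceded by CR
    return b"\r\n" if crlf > lf_only else b"\n"


def write_generated_page(lines, like_page):
    """Join generated lines with the same terminator as the retrieved page."""
    eol = detect_line_ending(like_page)
    return eol.join(lines) + eol
```

A page served from a Unix host typically contains bare line feeds, so the files generated alongside it do too, regardless of the client operating system.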
How does WebMaven handle cookies?
WebMaven processes cookie requests independently from your
browser. In other words, WebMaven manages cookies by receiving Set-Cookie
data and returning that cookie data when appropriate.
The cookie data collected by WebMaven is not kept from run to run but
is detailed in the Cookie Report.
The WebMaven Cookie Report requires the appropriate WebMaven registration key.
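The receive-and-return cycle, with nothing kept between runs, can be sketched with an in-memory store. The class RunCookies is hypothetical, and the sketch deliberately ignores path, domain, and expiry matching, which a real HTTP client must honor.

```python
from http.cookies import SimpleCookie


class RunCookies:
    """Cookies held in memory for one run only; nothing is written to disk."""

    def __init__(self):
        self.by_host = {}                 # host name -> SimpleCookie

    def receive(self, host, set_cookie_value):
        """Record the value of a Set-Cookie response header."""
        jar = self.by_host.setdefault(host.lower(), SimpleCookie())
        jar.load(set_cookie_value)

    def header_for(self, host):
        """Build the Cookie request header value for a host, or None."""
        jar = self.by_host.get(host.lower())
        if not jar:
            return None
        return "; ".join(f"{m.key}={m.value}" for m in jar.values())
```

Because the store lives only in memory, a second run starts with no cookies, which matches the behavior described above.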
Can WebMaven process Web sites that use dynamically built URIs?
Unfortunately, the answer is "sometimes". Dynamically constructed
URIs are a very complex issue and can result in URIs that WebMaven cannot
process correctly.
WebMaven has a lite JavaScript interpreter built into it that
can handle some URIs created within a script. However, if a script file
is in a different directory than the page that references it (e.g. <SCRIPT SRC="directory/file_name.js">),
WebMaven will not be able to resolve the dynamically created URIs.
What happens if I retrieve files from a Unix server to a local
client where the file system doesn't distinguish case yet
preserves the case of file names and directories (i.e. Windows 95/98, NT,
OS/2)?
With a WebMaven registration key, WebMaven defaults to preserving
case for all retrieved files and directories. The default
case preservation setting can be changed
for each site. When two different file names or directories
are found with the same characters except for case, WebMaven will
alter the second and subsequent occurrences of the name with trailing
exclamation points (!) until the name is unique. Telling WebMaven to
preserve case may result in multiple occurrences of the same
file being stored on the local client if the Web site contains the
same URIs with differing case. If you are going to retrieve a site
known to have a case-insensitive file system (e.g.
Windows NT), space and download time will be saved by specifying
case-insensitive URIs for the site.
Running WebMaven without a registration key causes a file named
FileName.DOC and a file named filename.doc
to be considered the same file; and only the first occurrence found
by WebMaven will be retrieved.
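The renaming applied when case is preserved can be sketched as follows. The function name is hypothetical, and placing the exclamation points before the file extension is an assumption made by analogy with INDEX!.HTM; the exact position WebMaven uses is not stated here.

```python
import os


def unique_local_name(name, taken):
    """Return `name`, altered with exclamation points until it no longer
    collides (ignoring case) with any name recorded in `taken`."""
    stem, ext = os.path.splitext(name)
    candidate = name
    while candidate.lower() in taken:
        stem += "!"                      # assumption: marks go before the extension
        candidate = stem + ext
    taken.add(candidate.lower())         # remember the name in case-folded form
    return candidate
```

With an empty set of taken names, saving FileName.DOC and then filename.doc gives FileName.DOC and filename!.doc, so both server files survive on a file system that does not distinguish case.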
Can I use WebMaven to verify the links on my own Web site?
Yes, WebMaven creates a bad HTML link report.
WebMaven also creates a bad IP
address report. This report details any unresolved domain names
or IP addresses for the retrieved site.
Both of these reports can also, at the user's option,
e-mail a copy of the report detail to "Webmaster" at the domain.
If "Webmaster" does not appear as an e-mail address within the retrieved
files, the most commonly used e-mail address in the retrieved files at
the domain will be used as the mailto: address for the reports.
With Navigator 3 or later, or Internet Explorer 4.0 or later, this
e-mail message is completely constructed by WebMaven. All that is
necessary to send the report to the Webmaster, or alternate recipient, is to open the report
file (WebMavenDomainException.htm or WebMavenLinksException.htm in the local path
directory), click on "E-mail report to ...", and then send it
from your mail program.
Note:
E-mail programs other than Netscape or
Internet Explorer may not support this facility in which case the
contents of the report must be cut and pasted into the mail message.
The To: and Subject: fields of the mail message should be correct as
generated by WebMaven.
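The recipient rule described above (Webmaster if it appears, otherwise the most common address at the domain) can be sketched as follows. The function name and the simplified mailto: pattern are assumptions for illustration only.

```python
import re
from collections import Counter

# Simplified pattern: the address part of a mailto: link, up to a quote,
# whitespace, angle bracket, or "?" (where any subject parameters begin).
ADDRESS = re.compile(r"mailto:([^\s\"'<>?]+@[^\s\"'<>?]+)", re.IGNORECASE)


def report_recipient(pages, domain):
    """Choose the report recipient from mailto: links found in `pages`."""
    suffix = "@" + domain.lower()
    found = [a.lower() for page in pages for a in ADDRESS.findall(page)]
    at_domain = [a for a in found if a.endswith(suffix)]
    if "webmaster" + suffix in at_domain:
        return "webmaster" + suffix
    if not at_domain:
        return None                      # no address at this domain was found
    return Counter(at_domain).most_common(1)[0][0]
```

Addresses at other domains are ignored in this sketch, on the reading that the report should go to someone responsible for the retrieved site.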
Is there any convenient way to see what Java classes WebMaven has
retrieved?
Yes, with a WebMaven registration key the Java class report can be generated.
What HTML tags does WebMaven handle?
WebMaven complies with the published HTML 4.0
specification. Therefore, it processes all HTML 4.0 tags which
reference URIs. See the HTML tag
processing table.
What HTTP client level does WebMaven present to a server?
If WebMaven is unregistered, it will function only as an HTTP/1.0 client.
With a WebMaven registration key, the user can select
either HTTP/1.1 (default) or HTTP/1.0
globally or per site.
When running as an HTTP/1.1 client, WebMaven is fully compliant with
RFC 2068 (the HTTP/1.1 specification).
If I enter my ID and password in the WebMaven site properties
notebook, is it exposed to snoopers?
No, WebMaven encodes the password in all of the WebMaven.IPT
files. The WebMaven.IPT files represent the settings notebook
repository for each respective local path. Though the .IPT files
themselves are plain ASCII and can be changed with your favorite
editor, the password entry MUST never be changed.
How does WebMaven improve download time?
WebMaven downloading is multi-tasked. That means that while the
main task parses the HTML pages, the download tasks
retrieve files from the Web server asynchronously.
However, sites that have an exceptionally large number of links to
other domains, particularly domains that do not respond, defeat all
of the overlap that WebMaven would normally realize. This delay time is
spent communicating with the domain name server used by your ISP.
This is one of the reasons that some Web pages, particularly those with
a large number of links to banner ads, take so long for your browser to render.
This name server time is reported as a separate value in the property
values for the site.
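The overlap between the main task and the download tasks can be sketched with a queue and a few worker threads. This illustrates the general technique only; the function retrieve_all, the fetch callable, and the worker count are hypothetical and say nothing about how WebMaven is built internally.

```python
import queue
import threading


def retrieve_all(uris, fetch, workers=4):
    """Fetch every URI using a pool of download tasks fed from a queue."""
    pending = queue.Queue()
    results = {}
    lock = threading.Lock()

    def download_task():
        while True:
            uri = pending.get()
            if uri is None:              # stop marker from the main task
                break
            data = fetch(uri)            # runs while other tasks keep working
            with lock:
                results[uri] = data

    threads = [threading.Thread(target=download_task) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for uri in uris:                     # main task: parse pages, queue links found
        pending.put(uri)
    for _ in threads:
        pending.put(None)                # one stop marker per download task
    for thread in threads:
        thread.join()
    return results
```

If every fetch has to wait on a slow name server first, the workers all sit idle at the same time, which is the loss of overlap described above.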
Can I use WebMaven with a proxy server?
All registered versions of WebMaven can be used through a proxy server.
What is the WebMaven.!!! file in the local path directory?
The WebMaven.!!! file is a sentinel that the download subtasks use to
determine that the main task is still running. Removing this file
causes the subtasks to erroneously terminate.
In the event that WebMaven terminates abnormally, which we hope never
happens, the WebMaven.!!! file should have been erased. If it is not, it
indicates a malfunction in the WebMaven engine.
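The sentinel idea can be sketched in a few lines. Only the file name WebMaven.!!! comes from the answer above; the function names and the point at which the check is made are assumptions.

```python
import os


def main_task_alive(local_path, sentinel="WebMaven.!!!"):
    """A subtask's check: the main task is considered alive while the
    sentinel file exists in the local path directory."""
    return os.path.exists(os.path.join(local_path, sentinel))


def run_subtask(local_path, jobs, fetch):
    """Process jobs until none remain or the sentinel disappears."""
    done = []
    for job in jobs:
        if not main_task_alive(local_path):
            break                        # main task gone: stop early
        done.append(fetch(job))
    return done
```

This also shows why deleting the file by hand is harmful: from a subtask's point of view, a missing sentinel is indistinguishable from the main task having ended.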
Why do I occasionally see the [image]
placeholder on localized pages but not on the live page?
This image replaces graphic files that WebMaven could not retrieve.
The main reason WebMaven can't retrieve an image is that
the URI for the image was constructed with either JavaScript or Java.
However, it is still possible for a broken link to appear in pages retrieved by WebMaven
if an HTML file is sent by the server in place of the graphic file. This rarely occurs.
Why, on occasion, are there missing images on a retrieved page?
If WebMaven finds a URI constructed with JavaScript and rendered
via the document.write() or
document.writeln() functions, it brackets the JavaScript
function with comment indicators.
Why does WebMaven look so complicated? I simply want to retrieve Internet site information.
Quite the contrary -- WebMaven is not complicated at all. Simply install it, specify a
local and remote path and push "Start".
The options that are available are intended for the more
experienced user, or the novice user after he/she becomes comfortable
with WebMaven. Even most enterprise users will run with the default
options.
Also, hint text is shown as you navigate around WebMaven, and
context sensitive help is available. Simply give any object in the WebMaven
windows the focus (either with your mouse or the tab key) and press
F1. A full explanation of the item will be displayed.
Is there anything that WebMaven can't do (within its stated
purpose, of course)?
Yes, there are some things that are beyond WebMaven's capability:
- WebMaven cannot retrieve some URIs which are dynamically created via JavaScript or Java.