| PRELIMINARY
On the end-user side, Introit has built all of its products around a
common Introit product core. The product core automatically acquires
content, its binning engine places the acquired datasheets into a
taxonomy, and its extraction engine semi-automatically extracts text,
figures, and table cell content from electronic component datasheets.
Figure 1 illustrates the Introit product core, which includes the
following:
- Web agent: a web agent for pdf, html, doc, xls and txt documents, with
a search engine defeat mechanism and built-in DoS (denial of service)
control. Given a single generic part number or many part numbers, the
web agent defeats the target website's search engine and returns the
target documents and the associated URLs.
- Binning: this module automatically places the downloaded datasheets
into a taxonomy organized by generic number and by manufacturer.
- Extraction engine: this module extracts the appropriate text strings,
images, and table cell content from the datasheet based on pre-defined
text types, image types, and attributes.

Figure 1. Introit Product Core.
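
The binning step described above can be illustrated with a minimal
sketch. It is not Introit's implementation: the Datasheet record, its
field names, the bin_datasheets function, and the choice to normalize
generic numbers to upper case are all assumptions made for illustration.
The sketch only shows one plausible way to group downloaded datasheets
first by generic number and then by manufacturer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Datasheet:
    # Hypothetical record layout; the source does not describe one.
    url: str
    manufacturer: str
    generic_number: str


def bin_datasheets(sheets):
    """Group datasheet URLs by generic number, then by manufacturer."""
    taxonomy = defaultdict(lambda: defaultdict(list))
    for sheet in sheets:
        # Assumption: generic numbers are compared case-insensitively.
        generic = sheet.generic_number.upper()
        taxonomy[generic][sheet.manufacturer].append(sheet.url)
    # Return plain dicts so the result compares and prints predictably.
    return {generic: dict(by_maker) for generic, by_maker in taxonomy.items()}
```

Calling bin_datasheets with two datasheets for the same generic number
from different manufacturers would yield one top-level entry for that
generic number with one sub-entry per manufacturer. A nested mapping
keeps the sketch short; a production taxonomy would likely need more
levels and a persistent store.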
|