Web spiders updating dynamic pages
dtSearch 6 and dtSearch 7 include a built-in web spider for indexing and searching internal or publicly accessible web sites. The dtSearch Spider automatically recognizes and supports HTML, PDF, and XML, as well as other online text documents such as word processor files and spreadsheets. When indexing a site, you specify a crawl depth; entering a crawl depth of 4, for example, reaches pages four levels deep into the site.

After a search, the dtSearch Spider will display retrieved HTML or PDF files with hit highlighting and with all links and images intact. dtSearch uses built-in HTML file converters to convert other text formats, such as word processor and spreadsheet files, to HTML for display with highlighted hits. If the web site has changed since it was indexed, hit highlighting will fall on the wrong words. To ensure that highlighting is correct, you can use the caching feature to have dtSearch store web pages as they are indexed, so hit highlighting is done using the stored data.

The archive of stored pages is known as the repository and is designed to store and manage the collection of web pages. A repository is similar to any other system that stores data, such as a modern-day database; the only difference is that a repository does not need all the functionality offered by a database system. The repository stores only HTML pages, and each page is stored as a distinct file.
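A minimal sketch of the repository idea described above: each crawled HTML page is cached as a distinct file at index time, so that highlighting can later run against the stored copy even if the live page has changed. This is an illustration only, not dtSearch's actual storage format; the `PageRepository` class and its file-naming scheme are assumptions for the example.

```python
import hashlib
from pathlib import Path

class PageRepository:
    """Illustrative page store: each HTML page is kept as a distinct file."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, url):
        # A hash of the URL gives a stable, filesystem-safe file name.
        name = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.root / (name + ".html")

    def store(self, url, html):
        # Cache the page as it is indexed, one file per page.
        self._path(url).write_text(html, encoding="utf-8")

    def load(self, url):
        # Hit highlighting can later use this stored copy,
        # even if the live page has changed since indexing.
        return self._path(url).read_text(encoding="utf-8")
```

Because pages are stored as distinct files rather than rows in a database, the repository needs no query planner or transaction machinery, matching the point above that it requires less functionality than a full database system.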
To index a web site in dtSearch, select "Add web" in the Update Index dialog box. The crawl depth is the number of levels into the web site that dtSearch will reach when looking for pages; a crawl depth of 1, for example, reaches only pages on the site linked directly from the home page.

For developers, the dtSearch Text Retrieval Engine includes a Spider API. For API documentation, see the dtSearchNetApi2 help file.
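The crawl-depth behavior can be sketched as a breadth-first traversal that stops following links once the depth limit is reached. This is a hypothetical illustration of the concept, not dtSearch's implementation; `get_links` stands in for fetching and parsing a page over HTTP, and the tiny in-memory site is invented for the example.

```python
from collections import deque

def crawl(start, get_links, max_depth):
    """Breadth-first crawl: visit pages up to max_depth link-levels from start."""
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:
            continue  # do not follow links beyond the crawl depth
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited

# In-memory stand-in for a site: home links to a and b, and a links to c.
site = {"home": ["a", "b"], "a": ["c"], "b": [], "c": []}

# Crawl depth 1 reaches only pages linked directly from the home page.
print(crawl("home", site.get, 1))  # → ['home', 'a', 'b']
```

Raising the depth to 2 would also reach `c`, one more level down, which mirrors how a larger crawl depth in the Update Index dialog pulls in deeper pages.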