In this article I will guide you through the steps of creating a simple but effective
search engine. We will divide our work into two steps :
- Step 1 : Create a set of ASP pages to index the site content.
- Step 2 : Create a search engine that offers keyword-specific, database-driven
search to our visitors.
Requirements :
- PWS / IIS
- Microsoft Access Database
We will begin by creating a set of ASP pages that index the site content and insert it into
the database. Here you will see how easy it is in practice to work with databases in ASP.
The SQL select, insert, update and delete statements will all come into play.
Creating HTML Form Page
We begin by creating an HTML Form page called 'addtodb.htm'. As its name suggests, you will
use it to enter the URLs of pages to index. Create a new HTML page, name it 'addtodb.htm',
copy and paste the following HTML code into it, and save it :
<html>
<head>
<style>
body { font-family : Verdana; font-size : 8pt; }
input { font-family : Verdana; font-size : 8pt;
height : 20; width : 250; }
</style>
</head>
<body>
Insert / Update File : ( using HTTP )
<form action="addtodb.asp">
<input type="text" name="look_for"><br>
<input type="submit" value=" " style="height : 17;
width : 17;"> Submit ->
</form>
<br><br><br>
Insert / Update File : ( using FileSystemObject )
<form action="addtodb_fso.asp">
Base URL :-<br>
<input type="text" name="base_url"><br>
Absolute path to the File :-<br>
<input type="text" name="look_for"><br>
<input type="submit" value=" " style="height : 17;
width : 17;"> Submit ->
</form>
<br><br><br>
Delete File :
<form action="delfromdb.asp">
<input type="text" name="del"><br>
<input type="submit" value=" " style="height : 17;
width : 17;"> Submit ->
</form>
</body>
</html>
What does Form 1 do ?
As you might have guessed from the code, the page offers three Forms to the site admin ( you ).
The first one is used to enter complete URLs including the http:// prefix ( e.g. http://www.stardeveloper.com/default.asp ).
This Form will take you to the 'addtodb.asp' page, which indexes the page over HTTP. This is the
easiest and most effective way to index pages. It enables you to index any of your site's pages, even when they span
multiple hosts. It is also useful when the title, description and keywords tags in your pages are dynamically
generated.
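To make the HTTP approach concrete, here is a minimal sketch of how an ASP page could download another page's HTML. It uses the MSXML2.ServerXMLHTTP object as a stand-in and assumes MSXML is installed on the server. The 'addtodb.asp' page shown later in this article uses its own component instead, and the function name FetchPage is made up for illustration.

```vbscript
<%
' Hypothetical helper (not part of the original article): fetch a page over
' HTTP with MSXML2.ServerXMLHTTP. Assumes MSXML is installed on the server.
Function FetchPage(url)
    Dim http
    Set http = Server.CreateObject("MSXML2.ServerXMLHTTP")
    http.open "GET", url, False      ' False = synchronous request
    http.send
    If http.status = 200 Then
        FetchPage = http.responseText
    Else
        FetchPage = ""               ' treat any non-200 response as "nothing to index"
    End If
    Set http = Nothing
End Function
%>
```

Because the page is requested through the web server, any ASP code in it runs first, so the indexer sees the same title and meta tags a visitor's browser would receive.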
What does Form 2 do ?
The second Form, instead of asking for the complete URL ( like the first Form ), asks for the base URL and
the absolute path to the page to index. This Form is optional. You should use it only when you cannot use
HTTP to index site pages ( more on that later ). In the base URL field, enter the base URL
of your site, e.g. http://www.yoursite.com. In the absolute path field, enter the absolute
path to the page on your site, e.g. /default.asp. This Form leads to the 'addtodb_fso.asp' page, which
indexes the page using FileSystemObject. FileSystemObject is one of the scripting objects available to
ASP pages.
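As a rough illustration of the FileSystemObject route, the sketch below reads a page's source from disk given its site-relative path. It is not the article's 'addtodb_fso.asp' code, which is not shown in this part. The function name ReadPageFromDisk is hypothetical, and the sketch assumes the file lives on the same server as the script.

```vbscript
<%
' Hypothetical helper: read a page from disk given its site-relative path
' (e.g. "/default.asp"). Assumes the file is on the same web server.
Function ReadPageFromDisk(virtualPath)
    Const ForReading = 1
    Dim fso, physicalPath, stream
    ReadPageFromDisk = ""
    Set fso = Server.CreateObject("Scripting.FileSystemObject")
    physicalPath = Server.MapPath(virtualPath)   ' map the site path to a disk path
    If fso.FileExists(physicalPath) Then
        Set stream = fso.OpenTextFile(physicalPath, ForReading)
        ' ReadAll raises an error on an empty file, so check first
        If Not stream.AtEndOfStream Then ReadPageFromDisk = stream.ReadAll
        stream.Close
        Set stream = Nothing
    End If
    Set fso = Nothing
End Function
%>
```

Reading from disk returns the raw source without executing it, which is why the HTTP Form is the better choice when the title or meta tags are generated dynamically.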
What does Form 3 do ?
The third Form asks for the complete URL of the page, e.g. http://www.stardeveloper.com/default.asp. You
should use it when you want to delete a page's entry from the database. Note that this action will not delete the page itself;
it only deletes the indexed info for that page from the database. After this action the page will no longer be
shown in the search engine results.
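A delete page along these lines could look like the sketch below. The real 'delfromdb.asp' is not shown in this part of the article, so treat this as an assumption-laden outline: it reuses the 'all_pages' table and the strDB connection string that 'addtodb.asp' relies on later, and it assumes strDB is defined in 'editme.asp'.

```vbscript
<!--#include file="editme.asp"-->
<%
' Sketch only: not the article's actual delfromdb.asp.
' Assumes editme.asp defines strDB, as addtodb.asp expects.
Dim strURL, con, affected
strURL = Request.QueryString("del")

Set con = Server.CreateObject("ADODB.Connection")
con.Open strDB
' Double any single quotes so the URL cannot break out of the SQL string.
' The second argument receives the number of rows the statement affected.
con.Execute "delete from all_pages where url = '" & _
    Replace(strURL, "'", "''") & "'", affected
con.Close
Set con = Nothing

If affected > 0 Then
    Response.Write "<b>Entry deleted for " & Server.HTMLEncode(strURL) & ".</b>"
Else
    Response.Write "No entry found for " & Server.HTMLEncode(strURL) & "."
End If
%>
```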
Creating Access Database
Start Microsoft Access and create a new database called 'directory.mdb'. Now create a table in
design view and name it 'all_pages'. Put six fields in this table, with the 'id' field as
the primary key. The names of the fields and their data types are shown in the image below :
"all_pages" Table
Note : if you are unsure or find this difficult, you can download the Access database discussed in
this tutorial at the end. So stay cool.
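If you would rather create the table from a script than from design view, something like the following one-off page may work. The six field names are taken from the SQL statements in 'addtodb.asp'; the data types and sizes are guesses, since the original table definition is only available as an image. The sketch also assumes 'directory.mdb' already exists and that strDB is defined in 'editme.asp'.

```vbscript
<!--#include file="editme.asp"-->
<%
' Sketch only: field names come from the SQL in addtodb.asp;
' the data types and sizes below are guesses, not the author's schema.
Dim con
Set con = Server.CreateObject("ADODB.Connection")
con.Open strDB   ' strDB is assumed to be defined in editme.asp
con.Execute "CREATE TABLE all_pages (" & _
    "id COUNTER CONSTRAINT pk_all_pages PRIMARY KEY, " & _
    "title TEXT(255), " & _
    "description MEMO, " & _
    "keywords MEMO, " & _
    "url TEXT(255), " & _
    "mydate DATETIME)"
con.Close
Set con = Nothing
Response.Write "Table all_pages created."
%>
```

Run such a page once and then remove it, since a second run would fail because the table already exists.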
Creating 'addtodb.asp' page
Open Notepad or your favorite text editor and create a new page. Save it as 'addtodb.asp'.
Now copy and paste the following code into it :
<!--#include file="editme.asp"-->
<html>
<head>
<style>
body { font-family : Verdana; font-size : 8pt; }
</style>
</head>
<body>
<%
On Error Resume Next
Dim geturl, title, description, keywords, strURL, strDB, con, results
' URL
strURL = Request.QueryString("look_for")
Set geturl = CreateObject("Stardeveloper.GetURL")
strFileContents = geturl.Get(strURL)
Set geturl = Nothing
' Keywords
key1 = InStr(1, strFileContents, "<meta name=""keywords"" content=""", 1)
key1 = key1 + Len("<meta name=""keywords"" content=""")
key2 = InStr(key1, strFileContents, """>", 1)
keywords = "," & Trim(Mid(strFileContents, key1, (key2 - key1))) & ","
keywords = Replace (keywords, "'", " ")
' Description
desc1 = InStr(1, strFileContents, "<meta name=""description"" content=""", 1)
desc1 = desc1 + Len("<meta name=""description"" content=""")
desc2 = InStr(desc1, strFileContents, """>", 1)
description = Trim(Mid(strFileContents, desc1, (desc2 - desc1)))
description = Replace (description, "'", " ")
' Title
tit1 = InStr(1, lcase(strFileContents), "<title>", 1)
tit1 = tit1 + Len("<title>")
tit2 = InStr(tit1, strFileContents, "</title>", 1)
title = Trim(Mid(strFileContents, tit1, (tit2 - tit1)))
title = Replace (title, "'", " ")
' Our Connection Object
Set con = CreateObject("ADODB.Connection")
con.Open strDB
Set results = con.Execute("select title, description, keywords " & _
"from all_pages where url = '" & strURL & "'")
' If the returned recordset is empty then add the URL with its accompanying
' info to the database
If results.EOF Then
con.Execute("insert into all_pages (title, description, keywords, url, " & _
"mydate) values ('" & title & "', '" & description & "', '" & _
keywords & "', '" & strURL & "', '" & date & "')")
Set rs = con.Execute("select count(url) as total_count from all_pages")
cnt = rs("total_count")
Set rs = Nothing
Response.Write "<b>New account successfully created for " & strURL & _
".</b>" & vbcrlf
Response.Write "<br>"
Response.Write "Total Pages Indexed : " & cnt & vbcrlf
Else
' But if the returned recordset is not empty, i.e. we have already added
' title, desc, keywords etc. for this URL, then update that information with the
' new one.
con.Execute("update all_pages set title = '" & title & "', description = " & _
"'" & description & "', keywords = '" & keywords & "', mydate = #" _
& date & "# where url = '" & strURL & "'")
Response.Write "<b>Account updated successfully.</b>"
End If
' Done. Now release Objects
Set results = Nothing
con.Close
Set con = Nothing
%>
</body></html>
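The three extraction steps in the listing above assume that each tag is present in the page. When a tag is missing, InStr returns 0 and the Mid call can pick up unrelated text, an error that On Error Resume Next hides. One way to guard against this is a small helper like the sketch below. The name ExtractBetween is made up for illustration and is not part of the original code.

```vbscript
<%
' Hypothetical helper: return the text between startMarker and endMarker,
' or "" when either marker is missing. Matching is case-insensitive.
Function ExtractBetween(text, startMarker, endMarker)
    Dim startPos, endPos
    ExtractBetween = ""
    startPos = InStr(1, text, startMarker, vbTextCompare)
    If startPos = 0 Then Exit Function        ' opening marker not found
    startPos = startPos + Len(startMarker)
    endPos = InStr(startPos, text, endMarker, vbTextCompare)
    If endPos = 0 Then Exit Function          ' closing marker not found
    ExtractBetween = Trim(Mid(text, startPos, endPos - startPos))
End Function

' Usage with the same markers as the page above
title = ExtractBetween(strFileContents, "<title>", "</title>")
description = ExtractBetween(strFileContents, _
    "<meta name=""description"" content=""", """>")
%>
```

With this helper a page that lacks, say, a description tag is stored with an empty description instead of a stray fragment of HTML.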