Appendix G

Searching the Web

One of the questions users ask most often is: how does one search for information on the Internet? This appendix, contributed by Ms. Vasumathi Sriganesh, covers a topic that most books on the Internet often overlook. She gives general guidelines on searching the Web and on using search engines effectively. This is a very welcome addition to the user's guide.

Ms. Vasumathi Sriganesh can be reached at:
vsri@bom2.vsnl.net.in

Searching the Web

Imagine this hypothetical situation. A gentleman - let’s call him "Mr. Searcher" - feels that he would like to find ALL the information in the world about "love". He puts out inquiries through various media. ALL the people in the world respond to his query and send him books, videos, pamphlets, folders, floppy disks, CDs and so on with information on love.

Going through a few of these, he finds that there are some book catalogues, some love poems, some fiction pieces on love in marriage, among families, friends ... and so on. Oops - he realizes - what he actually needs is only ‘love and romance in the teenage years’. The quest changes - he starts filtering and separating such documents. And - horrors - there are about 100 relating to ‘his quest’, and even these are in different contexts, applications ... and there are still a few thousand left to be scanned!

This is almost exactly what happens when one searches the net. Remember - the Internet is a collection of computers, worldwide. Into each computer has gone information compiled by a number of people. That compilation, in turn, draws on work done by a still larger number of people.

To a novice, it seems wildly exciting at first - to have a huge array of "hits" to be scanned. Wow, a few keystrokes and the world is at one’s feet! He starts going through them. Somewhere down the line, a gnawing feeling creeps in: "while this is interesting, it is not exactly what I am looking for". Most searches end in one of two ways:

i) One finds something worthwhile
ii) One does not.

The latter happens because:

People do not know how to search.

More important:

People are not able to state clearly what they want.

Let us try some demystifying:

Searching the net is a lot like searching a library.

You can either browse the library, or

search for a specific book by its author or title, or

search for books on a subject.

The last is a very useful search, but few people realize that librarians put in very hard work to make this search easy.

Barring a few exceptions, most areas of the web have not been organized. The Internet is thus a huge unindexed, uncatalogued library, for the obvious reason that nobody controls it.

How do you browse the net? Try

http://www.yahoo.com/

One finds a whole lot of categories - Business, Computers, Health and so on. One could browse these along with their sub-categories.

a. How do you search? (The tricky part).

On the web, a search is usually done by putting a keyword in a ‘search box’ and getting ‘hits’. When the computer retrieves documents for the search term (let us take Mr. Searcher’s word - ‘love’), the hits are documents which contain the ‘word’ love. The documents may or may not be about love. (You’d be surprised when you see how many are not).

Why the ‘may not’ - did you ask? The explanation is simple - a search engine pulls out the documents and ‘ranks’ them by the number of times the word has appeared. It also gives priority to a document if the word has appeared in the title or in the URL (whatever appears as http:// blah blah ...).

With such criteria, the search will yield all sites which have this word; it doesn’t matter what context the word appears in. "Mr. Searcher", for example, may come up with a ‘hit’ which reads something like this:

Newton’s love of physics

Newton said ‘Today I love myself for discovering gravity, I love the world, because I feel so good. I love science and I especially love Physics. I do hope in future my kids love Physics as much as I do. If they do I’ll love them more than ever ...’ and so on.

Get the point? Yes - the word ‘love’ has appeared - ooh - ever so many times!

(The really cool example is of a searcher who keyed in the word "Penthouse", looking for the magazine, and retrieved the site of a real estate agent!)
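The ranking rule described above can be tried out in a few lines of Python. This is only a rough sketch, not how any particular search engine works: the bonus values, the sample pages and the example.org addresses are all made up for illustration. It counts how often the word appears in the body and adds a bonus when the word is in the title or the URL.

```python
import re

# Hypothetical weights: the text says only that a title or URL match
# gets priority, not by how much.
TITLE_BONUS = 5
URL_BONUS = 5

def count_word(text, word):
    """Count whole-word, case-insensitive occurrences of word in text."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

def score(page, word):
    """Occurrences in the body, plus a bonus for a title or URL match."""
    total = count_word(page["body"], word)
    if count_word(page["title"], word):
        total += TITLE_BONUS
    if word.lower() in page["url"].lower():
        total += URL_BONUS
    return total

def rank(pages, word):
    """Return the pages that contain the word, highest score first."""
    hits = [p for p in pages if score(p, word) > 0]
    return sorted(hits, key=lambda p: score(p, word), reverse=True)

# Made-up sample pages.
pages = [
    {"url": "http://example.org/teen-advice",
     "title": "Coping with teenage romance",
     "body": "Advice for teenagers on love and relationships."},
    {"url": "http://example.org/newton",
     "title": "Newton's love of physics",
     "body": "I love the world. I love science and I especially love Physics."},
]

for page in rank(pages, "love"):
    print(score(page, "love"), page["title"])
```

With these sample pages, the Newton page comes out on top even though it is not about love at all, which is exactly Mr. Searcher's problem.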

b. How does one refine a search?

Important: Refine the word search

It is a very useful exercise to print out the HELP area of a few search engines and follow instructions on how to key in search terms. Most search engines have similar commands, but there may be minor differences.

Just one example here - let us take AltaVista - and put the following in the search box:

+love +teenage +coping

and click on "search".

This would retrieve sites which have all these words (oh no, not concepts yet - that comes later). Automatically, sites not having ‘teenage’ and ‘coping’ would be eliminated. That many fewer sites for Mr. Searcher to scan!

In the final analysis, remember: the web searches for a "WORD", and not for a concept along with its synonyms, antonyms and related words.
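The effect of the ‘+’ sign can be imitated with a short Python sketch. It is an illustration of the idea only, not AltaVista's actual program; the function names and the sample documents are invented. A document is kept only if it contains every word marked with ‘+’.

```python
import re

def words_in(text):
    """Return the set of lower-cased words found in text."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def required_terms(query):
    """Return the words that carry a leading + in the query."""
    return [t[1:].lower() for t in query.split()
            if t.startswith("+") and len(t) > 1]

def matches(text, query):
    """True only if every required word occurs in the text."""
    present = words_in(text)
    return all(term in present for term in required_terms(query))

# Made-up sample documents.
documents = [
    "Coping with love in the teenage years",
    "Newton's love of physics",
    "Teenage love poems",
]

query = "+love +teenage +coping"
hits = [d for d in documents if matches(d, query)]
print(hits)
```

Only the first sample document survives, because the other two lack at least one of the three required words. Note that the match is still on words, not on meaning.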

c. Concept searching:

Search engines like "Excite" tackle this to some extent - you can put in a word and ask it to be searched ‘in context’. But the best search engine in this regard is "Yahoo". Yahoo is a combination of a search engine and a directory.

For instance, Mr. Searcher could browse the categories under children’s health, or psychology and even search for "teenager" or "adolescents" within these categories. The search is then done "in context".
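A directory search "in context" can be pictured with the small Python sketch below. It assumes a made-up directory in which every entry carries a category path; the category names and titles are hypothetical and are not taken from Yahoo. The word is looked for only among entries filed under the chosen category.

```python
# Made-up directory entries, each filed under a category path.
DIRECTORY = [
    {"category": "Health/Children's Health",
     "title": "Understanding adolescents"},
    {"category": "Science/Physics",
     "title": "Newton's love of physics"},
    {"category": "Social Science/Psychology",
     "title": "The teenager and the family"},
]

def search_in_category(entries, category_prefix, word):
    """Return titles containing word, restricted to one category branch."""
    word = word.lower()
    return [e["title"] for e in entries
            if e["category"].startswith(category_prefix)
            and word in e["title"].lower()]

print(search_in_category(DIRECTORY, "Health", "adolescents"))
print(search_in_category(DIRECTORY, "Health", "love"))
```

Because the Newton page is filed under Physics, a search for "love" inside the Health branch never sees it. The category does the work that the bare word cannot.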

d. Specific searches

For a very specific search, it is always better to search a very specific area - just like one would normally look up a telephone directory for a phone number of a VIP, and not a book which is a biographical sketch about him (the book may contain his phone number, but that is not the obvious source, as is the telephone directory).

In the context of the medical profession (in which I specialize in information searches), I use the MEDLINE database, which contains references and abstracts of articles published in standard medical journals, for searching information relating to diseases and their management. Since this is a ‘limited database’ (about 4000 journals published in the world, covering the period 1966 to present) compiled by an institution (the National Library of Medicine in the USA), it is very structured, and organized, so searches can be very specific.

There are authoritative sources like these in different fields. Some of these require additional passwords, registration and payment; some are available without extra cost. One must seek out these valuable sources, and choose the right paths to search.

Limitations with the Internet

I have emphasized that web searches concentrate on searching for a "word". Good usage of a language demands that one uses a variety of words, and not the same word again and again, as the effect would be jarring. A really good writer would have used each of these words and phrases just once. So, for a really thorough search, Mr. Searcher would need to search for words like ‘teenager’, ‘adolescent’, ‘young girl’, ‘young adult’ or ‘young lady’ in addition to ‘teenage’, and also for ‘romance’, ‘deeply affectionate’, ‘strong liking’ and a host of other words related to love. A relevant, good site is likely to be missed or ranked low when one searches with only one term.
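One way to picture searching with related words is the Python sketch below. It assumes a small hand-made table of related words (the table, the function names and the sample sentences are invented for illustration) and accepts a page if, for each idea, at least one of the related words or phrases appears. The matching is a crude substring test, so it is a sketch of the idea rather than a finished tool.

```python
# Hand-made table of related words, built from the examples in the text.
RELATED = {
    "teenage": ["teenage", "teenager", "adolescent", "young girl",
                "young adult", "young lady"],
    "love": ["love", "romance", "deeply affectionate", "strong liking"],
}

def expand(term):
    """Return the term together with its related words, if any are listed."""
    return RELATED.get(term, [term])

def matches_concepts(text, terms):
    """True if, for every term, at least one related word occurs in text."""
    lowered = text.lower()
    return all(any(alt in lowered for alt in expand(term))
               for term in terms)

print(matches_concepts("A guide to romance for the adolescent reader",
                       ["teenage", "love"]))
print(matches_concepts("Newton's love of physics", ["teenage", "love"]))
```

The first sentence contains neither ‘teenage’ nor ‘love’, yet it is accepted because ‘adolescent’ and ‘romance’ stand in for them. The second is rejected because nothing in it relates to ‘teenage’.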

  1. The Internet is not a controlled institution. No ‘publisher’ checks the authenticity of the contents. So, there is no guarantee that everything found through a net search is of value or authentic.
  2. Not everything on the net is free. Some areas require additional payment. Yes, there are a number of electronic journals which are free, and yet there are a number which are available only on extra payment, or are only partially free, and so on.
  3. URLs move or are removed. Contents change. So what one finds today may be difficult to find tomorrow, or even a few hours later, if the site manager / web master has decided to change the page.

In a nutshell, remember -

For further inquiries contact

Vasumathi Sriganesh: vsri@bom2.vsnl.net.in