TELUSA Archives and their FATE - Database & Search
PALANA (nparinand@cas.org)
Tue, 22 Apr 1997 14:37:15 -0400
Chy. Sreenivas Parachuri has been constantly raising this issue with me.
So, I wrote a reply to him. Smt. Savitri raised a very genuine issue with
regard to searching, and I took it a bit more seriously. I think we should
do something about the search strategies. This is what I feel. Let us put aside
what Smt. Savitri remarked and what we are going to say in reply to her message.
She left us something to think about and act upon.
Please read this.
Regards
pAlana
--------------cut-------------------cut----------------------cut--------------
What Smt. Savitri Machiraju mentioned about finding an article drowned in the
muddle of nonsensical posts is also my concern. We Telugus in this country
comprise a major chunk of the computer scientists in the US and certainly most
of them are in Telusa. They can bring an elephant onto our screens. They can
come up with a nice tool to help our readers/searchers/users to search and find
a topic of their interest/need.
With my experience in database building and search strategies, I propose the
following:
1) We need a controlled terminology to search the articles/messages archived in
the Telusa database.
Different authors call a "thing" or an "idea" by different names.
e.g. "raamuDu", "daSaradha tanayuDu", "sItApati", "Bharata's Bro" - all should
be equated to "raamudu" and "Rama". Then it is easy to search the idea.
e.g. "Sri Sri", "SrIramgam", "mahAkavi", "prajAkavi", "Sriirangam Sriinivaasa
raavu" - should all be equated to "Sri Sri".
2) Keeping that in mind, the natural language (terminology) should be
considered to establish links with the controlled vocabulary.
e.g. Sreenivas Parachuri says in his essay "telugu calanacitramlO palle
paaTalu". That is his terminology. As a database builder, I would maintain an
index :1) Folk songs in Telugu Movies/Motion Pictures/Cinema.
2) Colloquial hymns in Andhra Movies/Motion Pictures/Cinema.
3) Village songs in Telugu Movies/Motion Pictures/Cinema.
4) Ja(a)napada Telugu literature and music on Telugu Silver Screen.
5) Andhra Cinema and Folklore Vocal Music.
6) Composers of Common Folk Telugu and Andhra Music in Telugu Cine
Field.
7) Farmers and workers of Andhra, their songs and Telugu Movie
Industry.
See how many variations there are! The author of an essay has his/her liberty.
The author may call the same thing by many names. If a post on the single
topic "Telugu Folk Music in Films" is made by Sreeni, and you search
using the words "Andhra Janapada Songs in Telugu Cinema" - Good Luck! You
will never find it.
Proposal: What we have to do is:
1) Set up "Global Search Strategy".
2) Index topics under controlled terminology.
We have to come up with controlled terminology.
Each controlled term should be the one in more than 50% circulation/usage.
3) Set up links between author's "natural language" and
archive's "controlled terminology/vocabulary".
4) Build a Search Engine for that.
With HTML it is very easy to do.
Now there is a meaning for our archives.
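To make the idea of links concrete, here is a rough sketch in Python of how natural-language terms might be tied to controlled terms and then used for searching. It is only an illustration under my own assumptions: the names THESAURUS, controlled_term, build_index and search are made up for this example, the post IDs are invented, and I assume that lookups ignore letter case and extra spaces. The variants themselves are the ones listed above.

```python
# Hypothetical thesaurus: each natural-language variant points to ONE controlled term.
THESAURUS = {
    "raamuDu": "Rama",
    "daSaradha tanayuDu": "Rama",
    "sItApati": "Rama",
    "Bharata's Bro": "Rama",
    "Sri Sri": "Sri Sri",
    "SrIramgam": "Sri Sri",
    "mahAkavi": "Sri Sri",
    "prajAkavi": "Sri Sri",
    "Sriirangam Sriinivaasa raavu": "Sri Sri",
}


def normalize(term):
    """Collapse whitespace and ignore case (an assumption made for lookup only)."""
    return " ".join(term.split()).casefold()


LOOKUP = {normalize(variant): term for variant, term in THESAURUS.items()}


def controlled_term(term):
    """Return the controlled term for a variant, or None if it is not linked yet."""
    return LOOKUP.get(normalize(term))


def build_index(posts):
    """posts maps a post ID to the natural-language terms an indexer pulled from it."""
    index = {}
    for post_id, terms in posts.items():
        for term in terms:
            controlled = controlled_term(term)
            if controlled is not None:
                index.setdefault(controlled, set()).add(post_id)
    return index


def search(index, query):
    """Translate the query to its controlled term, then look up the posts."""
    controlled = controlled_term(query)
    if controlled is None:
        return []
    return sorted(index.get(controlled, set()))


# Invented sample data, for illustration only.
posts = {
    "post-1": ["sItApati"],
    "post-2": ["mahAkavi"],
    "post-3": ["raamuDu", "prajAkavi"],
}
index = build_index(posts)
print(search(index, "daSaradha tanayuDu"))  # ['post-1', 'post-3']
```

The point of the design is that neither the author nor the searcher has to know the controlled term; both sides are translated through the same thesaurus, so a search on one variant finds posts written with another.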
3) For information purposes, there is no such thing as a "Good Posting" or a "Junk
Posting". All are the same. The author of a posting should do the following:
1) Please give a short and condensed title.
2) Give Key words for your title and message.
e.g. "telugu raamaayaNaalalO strI citraNamu".
Keywords: telugu raamaayaNam strI SIlamu
Andhra raamaayaNam aaDavaari caaritra citraNa
Telugu Ramayana Woman Character
Andhra Ramayanam Female Character Literary Criticism
Please be liberal in using both English Equivalents and Telugu
Terms. It will be easy for those Non-Telugu speaking researchers
in Telugu and Andhra literature, culture, and history. We are
doing a favor to all of us and them.
This will also help the indexers of the archives to build links
between the natural language and controlled terminology.
4) Telusa List Operators (Chy. Ratnakar) should build a couple of fields:
In the message, at the top, the writer should do the following:
Type: essay/discussion/article/story/poem/culture/music/drama
Keywords: pertinent keywords
This will enable the indexer to build a search strategy.
We should do this ASAP for the new postings.
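As a rough sketch of how an indexer's program might read these two fields, here is a small Python example. It rests on assumptions of mine, not on anything the list software does today: the fields are taken to be the very first lines of the message, keywords are taken to be separated by spaces (as in the example keywords above), and the names ALLOWED_TYPES and parse_header are made up for illustration.

```python
# Message types proposed above; anything else is treated as "no type given".
ALLOWED_TYPES = {
    "essay", "discussion", "article", "story",
    "poem", "culture", "music", "drama",
}


def parse_header(message):
    """Read leading 'Type:' and 'Keywords:' lines; stop at the first other line."""
    fields = {"type": None, "keywords": []}
    for line in message.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip().lower()
        if not sep or name not in ("type", "keywords"):
            break  # the header block is over; the rest is the message body
        if name == "type":
            value = value.strip().lower()
            fields["type"] = value if value in ALLOWED_TYPES else None
        else:
            # Assumption: keywords are separated by whitespace.
            fields["keywords"].extend(value.split())
    return fields


message = (
    "Type: Essay\n"
    "Keywords: telugu raamaayaNam strI SIlamu\n"
    "\n"
    "The body of the posting starts here.\n"
)
print(parse_header(message))
```

A posting without these lines simply comes back with no type and no keywords, so old-style postings would not break the indexer.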
5) Once everything above is set and done, we have to visit the old archives.
Remove the non-Telugu postings.
Keep all the discussions, essays, and other stuff.
Categorize them.
Keep them in packages in designated boxes (dockets).
Name the boxes.
Provide key words from the titles.
Read the essays quickly and pull out the natural language terms.
Give the controlled vocabulary terms for Telusa Archives.
Set up links between natural language and controlled vocabulary.
Build a search program.
Come up with a search engine.
Release it on HTML WWW.
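The first few steps for the old archives (remove, categorize, box, and pull key words from titles) could be sketched roughly as below in Python. Everything here is hypothetical: the record fields "title", "category" and "is_telugu_related" stand in for judgments a human indexer would make by reading each post, the stop-word list is a tiny made-up sample, and the function names are mine.

```python
from collections import defaultdict

# A tiny, made-up stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}


def title_keywords(title):
    """Keep the words of a title that are not stop words."""
    words = [word.strip('.,:;!?"()') for word in title.split()]
    return [word for word in words if word and word.lower() not in STOP_WORDS]


def file_into_boxes(posts):
    """Drop non-Telugu posts, then group the rest into boxes named by category."""
    boxes = defaultdict(list)
    for post in posts:
        if not post["is_telugu_related"]:
            continue  # the indexer marked this post for removal
        boxes[post["category"]].append(
            {"title": post["title"], "keywords": title_keywords(post["title"])}
        )
    return dict(boxes)


# Invented sample records, for illustration only.
old_posts = [
    {"title": "Folk Songs in Telugu Movies", "category": "essay",
     "is_telugu_related": True},
    {"title": "Car for sale", "category": "discussion",
     "is_telugu_related": False},
    {"title": "The Poetry of Sri Sri", "category": "essay",
     "is_telugu_related": True},
]
boxes = file_into_boxes(old_posts)
print(sorted(boxes))  # ['essay']
```

The title key words that come out of this step are still natural language; they would then be linked to the controlled vocabulary in the later steps.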
Hope people will be interested in this. This is a Herculean task.
Prasad Chodavarapu discussed this with me a long time ago. He is willing to help
in this. Madan Parigi (who is enjoying the Purna Market Mangoes and Pulla
Reddy Sweets now) has some exposure to what I talked about. Sreenivas
Parachuri discussed this with me a number of times before, and he hit me hard
again now on this issue. Ram Sanka (where are you?) is also knowledgeable in this.
Bhishmacharya Kanneganti Ramarao and Dronacharya, of Computer Science, are
there to help us. Our Abhimanyu, Suresh Kolichala, has already done a great
deal for both Telusa and SCIT. He knows not only how to get into the byte vyUha
but, seasoned as he now is, also how to get out and get around the program-kauravaas.
We have a wealth of Computer-Brains. Please do something.
I am willing to provide the background and strategies to search the archives.
Hope I make some sense.
regards
narasi
Disclaimer: Opinions are mine only.