Wednesday, February 2, 2011

Programming Ideas for Text Simplification

Hellooo........

Today, I must say, is a "THINKING DAY" for all the text-simplification mates. For the past 2 hours we have been thinking continuously about how to implement the simplification of text.


The task given to us is to find a possible solution to the following question:
How do we simplify a text by replacing words with synonyms, and then check that the sentence is still appropriate after the replacement? The ultimate output should be simplified text which makes sense!
And here is my part on the same......

Given a text, we first scan it sentence by sentence.

In each sentence we identify the complicated words by taking the frequency count of every word in the sentence, excluding prepositions and conjunctions.

We take only the top three complicated words, i.e. the 3 words with the lowest frequency counts.
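A rough sketch of this selection step, under some assumptions of mine: the frequency counts come from a word-to-count table that the caller supplies (the post does not say which corpus they come from), the stop list is a small illustrative one, and the names `STOP_WORDS`, `rarest_words` and `word_freq` are hypothetical.

```python
import re

# Illustrative stop list only: the idea is to skip prepositions and
# conjunctions, but the exact list of words here is an assumption.
STOP_WORDS = {"in", "on", "at", "of", "to", "for", "with", "by", "from",
              "and", "or", "but", "nor", "so", "yet"}


def rarest_words(sentence, word_freq, k=3):
    """Return up to k distinct words of the sentence, rarest first.

    word_freq maps a lowercase word to its frequency count. Words missing
    from the table are treated as count 0, i.e. as the rarest (an assumption).
    """
    tokens = re.findall(r"[a-z']+", sentence.lower())
    candidates = []
    for tok in tokens:
        # Skip stop words and keep each word only once, in sentence order.
        if tok not in STOP_WORDS and tok not in candidates:
            candidates.append(tok)
    # sorted() is stable, so words with equal counts stay in sentence order.
    return sorted(candidates, key=lambda w: word_freq.get(w, 0))[:k]
```

Calling `rarest_words(sentence, word_freq)` gives the three words that the replacement step would then work on.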

From this combination of three words, replace the word with the least frequency with a synonym, then check the appropriateness of the new sentence with a Google search, using the number of hits it shows.
Next take the word with the next-lowest frequency and replace it, and do the same for the third word too.
(e.g. given a, b, c: replace a and check, replace b and check, replace c and check)

Then replace two words at a time (a, b, c: replace ab and check, replace bc and check, replace ac and check).

Finally, replace all 3 words and check.

Store all the results from the Google hits and keep the replacement with the maximum count......
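The replace-and-check rounds above (one word, then two, then all three) could be sketched like this. It is only an outline under assumptions: `replacements` is a hypothetical word-to-synonym mapping prepared beforehand, `hit_count` is a placeholder function supplied by the caller that returns the number of search hits for a sentence (no real search API is called here), and the sentence is assumed to be lowercase and split on whitespace.

```python
from itertools import combinations


def candidate_sentences(sentence, replacements):
    """Yield (replaced_words, new_sentence) for every non-empty subset.

    With three words a, b, c this gives 7 candidates:
    a, b, c, then ab, ac, bc, then abc.
    """
    words = list(replacements)
    tokens = sentence.split()
    for size in range(1, len(words) + 1):
        for subset in combinations(words, size):
            # Swap in the synonym only for the words in this subset.
            new_tokens = [replacements[t] if t in subset else t
                          for t in tokens]
            yield subset, " ".join(new_tokens)


def best_candidate(sentence, replacements, hit_count):
    """Return the candidate sentence with the highest hit count.

    hit_count is a caller-supplied stand-in for the search-hits check.
    On a tie, the candidate generated first wins.
    """
    best = max(candidate_sentences(sentence, replacements),
               key=lambda cand: hit_count(cand[1]))
    return best[1]
```

Keeping `hit_count` as a parameter means the same loop works whether the counts come from a search engine, a cached table, or a test stub.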


Hope this bit fits in the large piece of cake!!...:P The coding part starts tomorrow, which we are supposed to do individually!
The above idea is to be implemented using Python, WordNet, the Google API etc., and this is not the end! The research continues..............
Hope for the best .....
Cya..
