Tables
February 2nd, 2006
The goal of my research is to be able to extract information from tables and lists on web pages. For example, from the table below, it is possible to understand that there is someone called Albert Einstein who was born in 1879, died in 1955, and whose major work was a certain special theory of relativity.
| First name | Last name | Born in | Died in | Most famous work |
|---|---|---|---|---|
| Albert | Einstein | 1879 | 1955 | Special theory of relativity |
| Isaac | Newton | 1643 | 1727 | Three laws of motion |
What we want is for a computer to be able to figure these relations out. That is not as easy as it sounds.
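To make the target concrete, here is a minimal sketch of the easy case: a clean table whose first row is a header. It uses Python's standard `html.parser`; the names `TableRows`, `table_to_records` and `EXAMPLE` are made up for this illustration, and the code assumes a single, well-formed, non-nested table. Real pages rarely behave this nicely, which is exactly why the problem is hard.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every cell, grouped by row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_records(html):
    """Pair each body row with the header row, assuming row 0 is the header."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


EXAMPLE = """<table>
<tr><th>First name</th><th>Last name</th><th>Born in</th><th>Died in</th><th>Most famous work</th></tr>
<tr><td>Albert</td><td>Einstein</td><td>1879</td><td>1955</td><td>Special theory of relativity</td></tr>
<tr><td>Isaac</td><td>Newton</td><td>1643</td><td>1727</td><td>Three laws of motion</td></tr>
</table>"""

records = table_to_records(EXAMPLE)
print(records[0]["Last name"], records[0]["Born in"])
```

Each record ties a value to its column name, which is the kind of relation we are after. The sketch gives up on anything less regular: header cells in a column instead of a row, merged cells, or tables nested inside tables.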
To complicate things further, a major problem is the large number of tables which do not contain any real information. As a matter of fact, the vast majority of tables on the internet today are used for formatting purposes, i.e. to position various elements of a web page properly on the screen, rather than displaying tabular data.
So the first thing I have to do is to write a program which can classify tables as being used for layout, or for containing data. Once I have that, I can start with the process of actually extracting information from the interesting tables. On the positive side, at least I get some use for the classification techniques I learned in the Artificial Intelligence course.
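As a rough idea of what such a classifier might look at, here is a hand-written rule over a few simple features. This is only a sketch and not the classifier I will end up with: the feature names, the threshold of 40 characters and the function names are all assumptions made for illustration, and a real version would learn its decision from labelled examples rather than hard-code it.

```python
def table_features(rows):
    """Compute a few simple features from a table given as rows of cell text."""
    widths = [len(row) for row in rows]
    cells = [cell for row in rows for cell in row]
    return {
        "n_rows": len(rows),
        "n_cols": max(widths, default=0),
        # True when every row has the same number of cells
        "regular": len(set(widths)) == 1,
        "mean_cell_len": sum(len(c) for c in cells) / len(cells) if cells else 0.0,
    }


def looks_like_data(rows, max_mean_cell_len=40):
    """Guess 'data table' when the grid is regular and the cells are short.

    The threshold is an arbitrary assumption, not a tuned value.
    """
    f = table_features(rows)
    return (
        f["n_rows"] >= 2
        and f["n_cols"] >= 2
        and f["regular"]
        and f["mean_cell_len"] <= max_mean_cell_len
    )


scientists = [
    ["First name", "Last name", "Born in", "Died in", "Most famous work"],
    ["Albert", "Einstein", "1879", "1955", "Special theory of relativity"],
    ["Isaac", "Newton", "1643", "1727", "Three laws of motion"],
]

# A typical layout table: one row holding a menu and a block of page text.
layout = [["Home | About | Contact", "Welcome to my page. " * 20]]

print(looks_like_data(scientists), looks_like_data(layout))
```

The intuition is that data tables tend to be regular grids of short values, while layout tables tend to hold a few large blocks of content. Plenty of real tables break both assumptions, so features like these are a starting point for a trained classifier, not a replacement for one.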
Entry Filed under: Studies