Tables

February 2nd, 2006


The goal of my research is to extract information from tables and lists on web pages. For example, the table below tells a human reader that there was someone called Albert Einstein, who was born in 1879, died in 1955, and whose most famous work was the special theory of relativity.

First name  Last name  Born in  Died in  Most famous work
Albert      Einstein   1879     1955     Special theory of relativity
Isaac       Newton     1643     1727     Three laws of motion

What we want is for a computer to be able to figure these relations out. That is not as easy as it sounds.
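As a rough illustration of the easy case only, here is a minimal Python sketch that turns a clean HTML table with a header row into one record per row. It is not the program from this research: the names `TableRows`, `table_to_records` and `EXAMPLE_HTML` are made up for the example, and it assumes the first row holds the column names and that every cell is properly closed. Real pages rarely cooperate like that, which is where the hard part begins.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every cell, row by row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(html):
    """Assume the first row is the header; pair it with each later row."""
    parser = TableRows()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# The example table from this post, written out as HTML (an assumption
# about how such a table might be marked up).
EXAMPLE_HTML = """
<table>
<tr><th>First name</th><th>Last name</th><th>Born in</th><th>Died in</th><th>Most famous work</th></tr>
<tr><td>Albert</td><td>Einstein</td><td>1879</td><td>1955</td><td>Special theory of relativity</td></tr>
<tr><td>Isaac</td><td>Newton</td><td>1643</td><td>1727</td><td>Three laws of motion</td></tr>
</table>
"""

records = table_to_records(EXAMPLE_HTML)
```

Each entry in `records` is a dictionary keyed by column name, so `records[0]["Born in"]` gives the birth year from the first data row. The sketch knows nothing about what the columns mean; it only pairs headers with values.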

To complicate things further, a large number of tables do not contain any real information. As a matter of fact, the vast majority of tables on the internet today are used for formatting purposes, i.e. to position the elements of a web page properly on the screen, rather than to display tabular data.

So the first thing I have to do is write a program that classifies each table as either a layout table or a data table. Once I have that, I can start actually extracting information from the interesting tables. On the positive side, at least I get some use out of the classification techniques I learned in the Artificial Intelligence course. :-)
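To give an idea of what such a classifier looks at, here is a small Python sketch built on hand-written rules. It is a stand-in, not a trained classifier and not the one this research will end up with. The features (nested tables, header cells, regular row widths, short cell text) and every threshold, including the 40-character limit, are assumptions picked for illustration, and the names `TableFeatures` and `looks_like_data_table` are hypothetical.

```python
from html.parser import HTMLParser


class TableFeatures(HTMLParser):
    """Gather a few crude features from one outermost <table>."""

    def __init__(self):
        super().__init__()
        self.depth = 0           # current <table> nesting depth
        self.nested_tables = 0   # tables found inside the outer one
        self.header_cells = 0    # number of <th> cells in the outer table
        self.row_widths = []     # cells per row in the outer table
        self.cell_lengths = []   # text length of each closed outer cell
        self._in_cell = False
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.depth += 1
            if self.depth > 1:
                self.nested_tables += 1
        elif self.depth == 1:
            if tag == "tr":
                self.row_widths.append(0)
            elif tag in ("td", "th") and self.row_widths:
                self.row_widths[-1] += 1
                self.header_cells += tag == "th"
                self._in_cell = True
                self._text = []

    def handle_data(self, data):
        # Only text directly inside an outer-table cell is counted.
        if self._in_cell and self.depth == 1:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "table":
            self.depth = max(0, self.depth - 1)
        elif self.depth == 1 and tag in ("td", "th") and self._in_cell:
            self.cell_lengths.append(len("".join(self._text).strip()))
            self._in_cell = False


def looks_like_data_table(html):
    """Guess whether a table holds data (True) or is layout (False)."""
    f = TableFeatures()
    f.feed(html)
    rows = [w for w in f.row_widths if w > 0]
    if len(rows) < 2 or max(rows) < 2 or not f.cell_lengths:
        return False   # too small to express any relation
    if f.nested_tables:
        return False   # assumed: nesting is a sign of layout
    regular = len(set(rows)) == 1                 # same width in every row
    short_cells = sum(f.cell_lengths) / len(f.cell_lengths) <= 40
    has_header = f.header_cells > 0
    # Assumed rule: at least two of the three signs must agree.
    return regular + short_cells + has_header >= 2


DATA_TABLE = """
<table>
<tr><th>First name</th><th>Last name</th><th>Born in</th></tr>
<tr><td>Albert</td><td>Einstein</td><td>1879</td></tr>
<tr><td>Isaac</td><td>Newton</td><td>1643</td></tr>
</table>
"""

LAYOUT_TABLE = """
<table><tr>
<td><table><tr><td>Home</td></tr><tr><td>About</td></tr></table></td>
<td>Welcome to my page. This cell holds the whole body text of the site.</td>
</tr></table>
"""
```

With these rules, `looks_like_data_table(DATA_TABLE)` returns `True` and `looks_like_data_table(LAYOUT_TABLE)` returns `False`. A learned classifier would replace the fixed two-out-of-three rule with weights fitted to labelled examples, but it could start from the same kind of features.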

Entry Filed under: Studies
