Arab Trade Zone.com Middle East B2B Marketplace  
Extracting Structured Data from Web Pages
Description

Many websites contain large sets of pages generated from a common template or layout. For example, Amazon lays out the author, title, comments, etc. in the same way on all its book pages. The values used to generate the pages (e.g., the author and title) typically come from a database. In this paper, we study the problem of automatically extracting these database values from the web pages without any training examples or other human input. We formally define the notion of a template and propose a model that describes how values are encoded into pages using a template. We present an extraction algorithm that constructs the template from sets of words that have similar occurrence patterns across the input pages. The constructed template is then used to extract values from the pages. We show experimentally that the extracted values make semantic sense in most cases.
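The idea of grouping words by their occurrence pattern can be illustrated with a small sketch. This is not the algorithm from the paper, only a simplified approximation under stated assumptions: pages are tokenized on whitespace, two tokens are considered "similar" only if they occur exactly the same number of times on every page, and the template is taken to be the tokens that appear on all pages. The function names `occurrence_classes` and `extract_values` and the three sample pages are hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict


def occurrence_classes(pages, min_size=2):
    """Group tokens that share the same per-page occurrence counts.

    Simplifying assumption: only tokens present on every page are kept,
    since those are the likeliest template tokens.
    """
    counts = [Counter(page.split()) for page in pages]
    vocabulary = set().union(*counts)
    classes = defaultdict(list)
    for token in vocabulary:
        vector = tuple(c[token] for c in counts)
        if all(n > 0 for n in vector):
            classes[vector].append(token)
    return {vec: sorted(toks) for vec, toks in classes.items()
            if len(toks) >= min_size}


def extract_values(page, template_tokens):
    """Return the runs of non-template tokens found between template tokens."""
    values, current = [], []
    for token in page.split():
        if token in template_tokens:
            if current:
                values.append(" ".join(current))
                current = []
        else:
            current.append(token)
    if current:
        values.append(" ".join(current))
    return values


# Hypothetical pages generated from one template with different values.
pages = [
    "<b> Title: </b> Dune <i> Author: </i> Frank Herbert",
    "<b> Title: </b> Emma <i> Author: </i> Jane Austen",
    "<b> Title: </b> Ulysses <i> Author: </i> James Joyce",
]

classes = occurrence_classes(pages)
template = set(classes[(1, 1, 1)])  # tokens occurring once on every page
rows = [extract_values(page, template) for page in pages]
for row in rows:
    print(row)
```

In this toy input the markup and field labels all occur exactly once per page, so they fall into one class and act as the template, while the tokens left between them are read off as the database values. Real pages would need a proper HTML tokenizer and a looser notion of similarity.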

Price 0.00

Location

Minimum Order 0

Quantity 100

Shipping Cost 0.00

Samples Available no

Product Status New

Posted On May 15th, 2008 12:00 AM

Posted By webmining [ Free Member ]

Payments Mode

Categories Home > Computer Hardware & Software > Internet Service

 

 

 