Welcome to ΔEiXTo!

Getting data from many unstructured web pages, often repetitively and with extensive copy-paste operations, is tedious and time-consuming. Wouldn't it be nice to specify the content you want from a web page once and then have an application do the laborious job for you? ΔEiXTo (or DEiXTo) is a powerful web data extraction tool based on the W3C Document Object Model (DOM). It allows users to create highly accurate extraction rules (wrappers) that describe which pieces of data to scrape from a web page. DEiXTo consists of two separate standalone components:
DEiXTo can handle a wide range of web sites with high precision and recall, since it gives the user an arsenal of features for constructing well-engineered extraction rules. Web content extracted with DEiXTo can be saved in RSS, XML, or tab-delimited text format. Wrappers built with DEiXTo can be scheduled to run automatically and thus provide automated access to resources of interest, saving users a lot of time, energy, and repetitive effort.
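To make the idea of a DOM-based extraction rule more concrete, here is a minimal sketch in Python. It is not DEiXTo's rule format or matching algorithm; it only illustrates the general principle of describing the wanted content once as a pattern over the DOM tree and then applying it to a page, writing the result as tab-delimited text. The sample page, the `extract` function, and the rule variables are all hypothetical, and the page is assumed to be well-formed markup so that the standard library parser can read it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (real pages usually need an HTML parser).
PAGE = """<html><body>
<div class="item"><h2>First title</h2><a href="/a">more</a></div>
<div class="item"><h2>Second title</h2><a href="/b">more</a></div>
<div class="ad"><h2>Sponsored</h2><a href="/x">more</a></div>
</body></html>"""

# A toy "wrapper": which DOM nodes are records, and which parts of each to keep.
RULE_ITEMS = ".//div[@class='item']"          # one match per record
RULE_FIELDS = [("h2", None), ("a", "href")]   # (child path, attribute or None for text)


def extract(page, item_path, fields):
    """Apply the toy rule to a page and return one list of values per record."""
    root = ET.fromstring(page)
    records = []
    for node in root.findall(item_path):
        record = []
        for path, attr in fields:
            target = node.find(path)
            if target is None:
                record.append("")  # field missing in this record
            elif attr is None:
                record.append((target.text or "").strip())
            else:
                record.append(target.get(attr, ""))
        records.append(record)
    return records


rows = extract(PAGE, RULE_ITEMS, RULE_FIELDS)

# Serialize as tab-delimited text, one record per line.
tsv = "\n".join("\t".join(row) for row in rows)
print(tsv)
```

The rule is defined once and skips the `ad` block because it does not match the record pattern; the same rule could then be reapplied to other pages that share the structure.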
ΔEiXTo was developed by Kostas Ntonas and Fotis Kokkoras under the supervision of Assistant Professor Nick Bassiliades in the Informatics Department (LPIS Group) of the Aristotle University of Thessaloniki, Greece.
Note: If you think that certain content on our site violates copyrights that you own