JOURNAL
Information

ARTICLE

TITLE

Deep Web Search Interface Identification: A Semi-Supervised Ensemble Approach

Hong Wang, Qingsong Xu and Lifeng Zhou

Abstract

To surface the Deep Web, one crucial task is to predict whether a given web page has a search interface (a searchable HyperText Markup Language (HTML) form). Previous studies have focused on supervised classification with labeled examples. However, labeled data are scarce, hard to obtain, and require tedious manual work, while unlabeled HTML forms are abundant and easy to obtain. In this research, we consider the plausibility of using both labeled and unlabeled data to train better models that identify search interfaces more effectively. We present a semi-supervised co-training ensemble learning approach that uses both neural networks and decision trees to address the search interface identification problem. We show that the proposed model outperforms previous methods that use only labeled data. We also show that adding unlabeled data improves the effectiveness of the proposed model.
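
The co-training idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give its details, so the loop below is a generic co-training-style procedure under stated assumptions. Two deliberately simple stand-in learners replace the paper's models (a one-level decision stump in place of decision trees, and a nearest-centroid rule in place of neural networks), and the two numeric form features, the toy data, and all function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Example = Tuple[Vector, int]        # (features, label): 1 = search interface, 0 = not
Scorer = Callable[[Vector], float]  # signed score: > 0 means "search interface"


def train_stump(data: List[Example]) -> Scorer:
    """Stand-in for a decision tree: best single-feature threshold by training error."""
    best = None  # (errors, feature index, threshold, polarity)
    for f in range(len(data[0][0])):
        values = sorted({x[f] for x, _ in data})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            for polarity in (1, -1):
                errors = sum((polarity * (x[f] - thr) > 0) != (y == 1) for x, y in data)
                if best is None or errors < best[0]:
                    best = (errors, f, thr, polarity)
    if best is None:
        raise ValueError("need at least one feature with two distinct values")
    _, f, thr, polarity = best
    return lambda x: polarity * (x[f] - thr)


def train_centroid(data: List[Example]) -> Scorer:
    """Stand-in for a neural network: positive score if closer to the positive centroid."""
    def centroid(label: int) -> List[float]:
        rows = [x for x, y in data if y == label]  # assumes both classes are present
        return [sum(col) / len(rows) for col in zip(*rows)]
    c0, c1 = centroid(0), centroid(1)
    return lambda x: math.dist(x, c0) - math.dist(x, c1)


def co_train(labeled: List[Example], unlabeled: List[Vector],
             rounds: int = 3, k: int = 1) -> Tuple[Scorer, Scorer]:
    """Each round, every learner pseudo-labels its k most confident
    unlabeled forms and hands them to the other learner's training set."""
    set_a, set_b = list(labeled), list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        model_a, model_b = train_stump(set_a), train_centroid(set_b)
        for model, target in ((model_a, set_b), (model_b, set_a)):
            pool.sort(key=lambda x: abs(model(x)), reverse=True)
            picked, pool = pool[:k], pool[k:]
            target.extend((x, int(model(x) > 0)) for x in picked)
    # Retrain both learners on their final, enlarged training sets.
    return train_stump(set_a), train_centroid(set_b)


def ensemble_predict(models: Sequence[Scorer], x: Vector) -> int:
    """Combine the learners by summing their signed scores."""
    return int(sum(m(x) for m in models) > 0)


# Hypothetical toy data: two form features scaled to [0, 1].
labeled = [([0.0, 0.1], 0), ([0.1, 0.0], 0), ([1.0, 0.9], 1), ([0.9, 1.0], 1)]
unlabeled = [[0.2, 0.2], [0.8, 0.8], [0.1, 0.3], [0.9, 0.7]]

models = co_train(labeled, unlabeled)
print(ensemble_predict(models, [0.15, 0.15]))
print(ensemble_predict(models, [0.85, 0.85]))
```

Summing signed scores is only reasonable here because both stand-in learners report distances in the same feature space. With real neural networks and decision trees, one would more likely combine calibrated probabilities or majority votes.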

Keywords

semi-supervised learning - Deep Web mining - search interface identification - ensemble learning

ISSUE

Volume: 5 Issue: 4 (2014)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

SIMILAR JOURNALS

Information
Applied Sciences
AI

DOI

https://doi.org/10.3390/info5040634

Similar articles

Using Machine Learning and Routing Protocols for Optimizing Distributed SPARQL Queries in Collaboration

Benjamin Warnke, Stefan Fischer and Sven Groppe

Due to increasing digitization, the amount of data in the Internet of Things (IoT) is constantly increasing. In order to be able to process queries efficiently, strategies must, therefore, be found to reduce the transmitted data as much as possible. SPAR...

Journal: Computers

Numerical Investigation on Effect of Opening Ratio on Structural Performance of Reinforced Concrete Deep Beam Reinforced with CFRP Enhancements

Yasar Ameer Ali, Lateef Najeh Assi, Hussein Abas, Hussein R. Taresh, Canh N. Dang and SeyedAli Ghahari

Reinforced concrete deep beams are a vital member of infrastructures such as bridges, shear walls, and foundation pile caps. Thousands of dollars and human lives are seriously threatened due to shear failure, which have developed in deep beams containing...

Journal: Infrastructures

Unbalanced Web Phishing Classification through Deep Reinforcement Learning

Antonio Maci, Alessandro Santorsola, Antonio Coscia and Andrea Iannacone

Web phishing is a form of cybercrime aimed at tricking people into visiting malicious URLs to exfiltrate sensitive data. Since the structure of a malicious URL evolves over time, phishing detection mechanisms that can adapt to such variations are paramou...

Journal: Computers

Deep Learning Approaches for Big Data-Driven Metadata Extraction in Online Job Postings

Panagiotis Skondras, Nikos Zotos, Dimitris Lagios, Panagiotis Zervas, Konstantinos C. Giotopoulos and Giannis Tzimas

This article presents a study on the multi-class classification of job postings using machine learning algorithms. With the growth of online job platforms, there has been an influx of labor market data. Machine learning, particularly NLP, is increasingly...

Journal: Information

Optimization of Computational Resources for Real-Time Product Quality Assessment Using Deep Learning and Multiple High Frame Rate Camera Sensors

Adi Wibowo, Joga Dharma Setiawan, Hadha Afrisal, Anak Agung Sagung Manik Mahachandra Jayanti Mertha, Sigit Puji Santosa, Kuncoro Budhi Wisnu, Ambar Mardiyoto, Henri Nurrakhman, Boyi Kartiwa and Wahyu Caesarendra

Human eyes generally perform product defect inspection in Indonesian industrial production lines, resulting in low efficiency and a high margin of error due to eye tiredness. Automated quality assessment systems for mass production can utilize deep learn...

Journal: Applied System Innovation
