Macquarie University ResearchOnline

Please use this identifier to cite or link to this item: http://hdl.handle.net/1959.14/174964

Title
A Framework for learning comprehensible theories in XML document classification
Related
IEEE Transactions on Knowledge and Data Engineering, Vol. 24, No. 1 (2012), pp. 1-14
DOI
10.1109/TKDE.2011.158
Publisher
IEEE
Date
2012
Author/Creator
Wu, Jemma
Description
XML has become the universal data format for a wide variety of information systems. The large number of XML documents on the web and in other information storage systems makes classification an important task. As a typical form of semistructured data, XML documents have both structure and content. Traditional text learning techniques are poorly suited to XML document classification because they do not take structure into account. This paper presents a novel, complete framework for XML document classification. We first present a knowledge representation method for XML documents based on a typed higher-order logic formalism. With this method, an XML document is represented as a higher-order logic term that captures both its content and its structure. We then present a decision-tree learning algorithm driven by the precision/recall breakeven point (PRDT) for the XML classification problem, which can produce comprehensible theories. Finally, we give a semi-supervised learning algorithm based on the PRDT algorithm and the co-training framework. Experimental results demonstrate that our framework achieves good performance in both supervised and semi-supervised learning, with the added benefit of producing comprehensible learned theories.
Description
14 page(s)
Subject Keyword
knowledge representation
Subject Keyword
machine learning
Subject Keyword
semi-supervised learning
Subject Keyword
XML document
Resource Type
journal article
Organisation
Macquarie University. Dept. of Environment and Geography

Identifier
http://hdl.handle.net/1959.14/174964
Identifier
ISSN:1041-4347
Identifier
mq-rm-2011007096
Identifier
mq_res-ext-2-s2.0-82155192311
Language
eng
Reviewed
Reviewed