Reading XML with Groovy and SAX
JJBug needs a glossary for its translation of the Web Beans reference guide. From DocBook
(Added 4/11)
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.helpers.DefaultHandler
import org.xml.sax.*

// Collects the text of <title> and <emphasis> elements as glossary candidates.
class RecordsHandler extends DefaultHandler {
    def titles = []
    def emphasises = []
    def currentMessage = ""   // start empty so that += never prepends "null"
    def titleFlag = false
    def emphasisFlag = false
    def literalFlag = false

    void startElement(String ns, String localName, String qName, Attributes atts) {
        switch (qName) {
            case 'title':
                titleFlag = true
                break
            case 'literal':
                literalFlag = true
                break
            case 'emphasis':
                emphasisFlag = true
                break
        }
    }

    // SAX may deliver one text node in several chunks, so accumulate them.
    void characters(char[] chars, int offset, int length) {
        if (titleFlag || emphasisFlag) {
            currentMessage += new String(chars, offset, length)
        }
    }

    void endElement(String ns, String localName, String qName) {
        switch (qName) {
            case 'title':
                currentMessage = removeLF(currentMessage)
                titles << currentMessage
                currentMessage = ""
                titleFlag = false
                break
            case 'literal':
                literalFlag = false
                break
            case 'emphasis':
                currentMessage = removeLF(currentMessage)
                emphasises << currentMessage
                currentMessage = ""
                emphasisFlag = false
                break
        }
    }

    String removeLF(str) {
        return str.trim().replaceAll("\n", "")
    }
}

def handler = new RecordsHandler()
def reader = SAXParserFactory.newInstance().newSAXParser().XMLReader
reader.setContentHandler(handler)

//def fromdir = "C:/svnwork/webbeans-reference-guide/test"
def fromdir = "C:/svnwork/webbeans-reference-guide/en-US"
def glossary = new File("webbeans-glossary.csv")
if (glossary.exists()) glossary.delete()

new File(fromdir).eachFile { file ->
    def filename = file.name
    // endsWith avoids the regex pitfall where "." matches any character
    if (filename.endsWith(".xml")) {
        println "File: " + filename
        // withInputStream closes the stream once parsing finishes
        file.withInputStream { inputStream ->
            reader.parse(new InputSource(inputStream))
        }
    }
}

println "*** title ***"
handler.titles.each { glossary.append("title, $it, \n") }
println "*** emphasis ***"
handler.emphasises.each { glossary.append("emphasis, $it, \n") }
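This is the generated file. It contains a lot of noise, so it will need considerable cleanup before it can serve as a glossary, but it looks usable as a starting point for the work.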
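Some of that cleanup could probably be automated before editing by hand. The following is only a minimal sketch, assuming the extracted terms have been read back into a list of strings. `cleanTerms` is a hypothetical helper that is not part of the script above; it collapses whitespace, drops empty entries, removes case-insensitive duplicates, and sorts the result.

```groovy
// Hypothetical helper (an assumption, not part of the original script).
List<String> cleanTerms(Collection<String> rawTerms) {
    // Collapse runs of whitespace (including line breaks) into one space
    def normalized = rawTerms.collect { it.replaceAll(/\s+/, ' ').trim() }
    // Groovy truth: an empty string is false, so blank entries are dropped
    def nonEmpty = normalized.findAll { it }
    // Keep the first spelling of each term, ignoring case
    def distinct = nonEmpty.unique { a, b -> a.compareToIgnoreCase(b) }
    // Sort alphabetically without regard to case
    return distinct.sort { it.toLowerCase() }
}

def sample = ['Web Beans', 'web  beans', '', 'Deployment type', ' Binding\ntype ']
println cleanTerms(sample)   // prints [Binding type, Deployment type, Web Beans]
```

Deciding which of the remaining entries are real glossary terms would still have to be done by hand.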