XPath and Clojure

9 11 2008

I’ve modified the previous script to use Java’s XPath API to extract the RSS titles.

(import '(javax.xml.parsers DocumentBuilderFactory DocumentBuilder)
        '(org.w3c.dom Document Node)
        '(javax.xml.xpath XPathFactory XPath XPathExpression XPathConstants))

(let [domFactory (doto (DocumentBuilderFactory/newInstance)
                   (.setNamespaceAware true))
      builder    (.newDocumentBuilder domFactory)
      doc        (.parse builder "/Users/jgoamakf/hotnews.rss")
      xpath      (.newXPath (XPathFactory/newInstance))
      ;; match the text node of every <title> element in the document
      expr       (.compile xpath "//title/text()")
      result     (.evaluate expr doc XPathConstants/NODESET)]
  ;; result is a NodeList; print the value of each matched text node
  (dotimes [index (.getLength result)]
    (println (.getNodeValue (.item result index)))))
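
The expression "//title/text()" matches the channel's own title as well as the title of every item. As a rough sketch of how the same XPath calls could be wrapped in a function and pointed at item titles only, the block below uses a hypothetical helper xpath-strings and a made-up in-memory feed, sample-feed; neither is part of the script above, and the feed content is invented for illustration.

```clojure
(import '(javax.xml.parsers DocumentBuilderFactory)
        '(javax.xml.xpath XPathFactory XPathConstants)
        '(org.w3c.dom NodeList)
        '(java.io ByteArrayInputStream))

;; Hypothetical helper, not part of the original script.
(defn xpath-strings
  "Evaluates the XPath expression expr against the XML text xml and
  returns the values of the matched nodes as a vector of strings."
  [^String xml ^String expr]
  (let [factory (doto (DocumentBuilderFactory/newInstance)
                  (.setNamespaceAware true))
        doc     (.parse (.newDocumentBuilder factory)
                        (ByteArrayInputStream. (.getBytes xml "UTF-8")))
        ^NodeList nodes (.evaluate (.newXPath (XPathFactory/newInstance))
                                   expr doc XPathConstants/NODESET)]
    (vec (for [i (range (.getLength nodes))]
           (.getNodeValue (.item nodes i))))))

;; Invented sample data standing in for a real feed.
(def sample-feed
  (str "<rss version=\"2.0\"><channel><title>Feed</title>"
       "<item><title>First</title></item>"
       "<item><title>Second</title></item>"
       "</channel></rss>"))

;; Only titles that sit directly inside an <item> element.
(println (xpath-strings sample-feed "//item/title/text()"))
```

Returning a vector instead of printing inside the loop keeps the XPath plumbing separate from whatever is done with the titles afterwards.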




RSS titles with Clojure

2 11 2008

This script prints every title that appears in an RSS 2.0 feed.

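(use 'clojure.xml)

;; The binding vector of this let was missing from the post. The one below is
;; an assumed reconstruction: parse the downloaded feed (file name assumed)
;; and take the children of its first element, <channel>.
(let [feed    (parse "hotnews.rss")
      content (get (get (get feed :content) 0) :content)]
  (doseq [entry content]
    ;; el is the first child of each entry under <channel>
    (let [el (get (get entry :content) 0)]
      (if (= (get el :tag) :title)
        (println (get (get el :content) 0))))))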
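
The script above only looks at the first child of each entry, so it finds a title only when <title> comes first inside an <item>. A minimal sketch of a variant that does not rely on that ordering is shown below. It walks the parsed tree with xml-seq; the function name item-titles and the in-memory sample-feed are hypothetical and exist only for illustration.

```clojure
(require '[clojure.xml :as xml])
(import '(java.io ByteArrayInputStream))

;; Hypothetical helper, not part of the original script.
(defn item-titles
  "Parses feed-text as XML and returns the text of every <title> that is a
  direct child of an <item>, wherever it appears inside the item."
  [^String feed-text]
  (let [feed (xml/parse (ByteArrayInputStream. (.getBytes feed-text "UTF-8")))]
    (vec (for [node  (xml-seq feed)            ; every node, depth first
               :when (= :item (:tag node))
               child (:content node)
               :when (= :title (:tag child))]
           (first (:content child))))))

;; Invented sample data; note that <link> precedes <title> in the second item.
(def sample-feed
  (str "<rss version=\"2.0\"><channel><title>Feed</title>"
       "<item><title>First</title></item>"
       "<item><link>http://example.com/</link><title>Second</title></item>"
       "</channel></rss>"))

(println (item-titles sample-feed))
```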

You need an RSS 2.0 XML file on your local hard drive before running the script.

% ftp http://images.apple.com/main/rss/hotnews/hotnews.rss
% java -cp clojure/clojure.jar clojure.lang.Script rsstitles.clj