How do I parse an HTML table with Nokogiri?

    #!/usr/bin/ruby1.8

    require 'nokogiri'
    require 'pp'

    html = <<-EOS
      (The HTML from the question goes here)
    EOS

    doc = Nokogiri::HTML(html)
    rows = doc.xpath('//table/tbody[@id="threadbits_forum_251"]/tr')
    details = rows.collect do |row|
      detail = {}
      [
        [:title, 'td[3]/div[1]/a/text()'],
        [:name, 'td[3]/div[2]/span/a/text()'],
        [:date, 'td[4]/text()'],
        [:time, 'td[4]/span/text()'],
        [:number, 'td[5]/a/text()'],
        [:views, 'td[6]/text()'],
      ].each do |name, xpath|
        detail[name] = row.at_xpath(xpath).to_s.strip
      end
      detail
    end
    pp details

…

nokogiri gem installation error

2020 April 6th Update:

macOS Catalina 10.15
    gem install nokogiri -- --use-system-libraries=true --with-xml2-include=/Applications/

macOS Mojave 10.14
    gem install nokogiri -- --use-system-libraries=true --with-xml2-include=/Applications/

macOS High Sierra 10.13
    gem install nokogiri -- --use-system-libraries=true --with-xml2-include=/Applications/

macOS Sierra 10.12
    gem install nokogiri -- --use-system-libraries=true --with-xml2-include=/Applications/

OS X El Capitan 10.11
    gem install nokogiri -- --use-system-libraries=true --with-xml2-include=/Applications/

Consider adding …

Nokogiri, open-uri, and Unicode Characters

Summary: When feeding UTF-8 to Nokogiri through open-uri, use open(…).read and pass the resulting string to Nokogiri.

Analysis: If I fetch the page using curl, the headers properly show Content-Type: text/html; charset=UTF-8 and the file content includes valid UTF-8, e.g. “Genealogía de Jesucristo”. But even with a magic comment on the Ruby file and setting …

Installing Nokogiri on OSX 10.10 Yosemite

I managed to install Nokogiri under Yosemite (OS X 10.10 Preview).

Step 1: Install Brew
Skip this if brew is already installed.
    ruby -e "$(curl -fsSL"

Step 2: Install brew libs
    brew tap homebrew/dupes
    brew install libxml2 libxslt
    brew install libiconv

Step 3: Download and install Apple Commandline Tools for 10.10
It’s important that you …

Error installing Nokogiri 1.5.0 with rails 3.1.0 and ubuntu

You need to have all the necessary libraries installed on your machine. When you installed RVM, it should have listed them for you. On the current version of RVM, you can run rvm requirements to see the exact list. Right now, that list is:

    sudo apt-get install build-essential openssl libreadline6 libreadline6-dev curl git-core zlib1g …

Mac user and getting WARNING: Nokogiri was built against LibXML version 2.7.8, but has dynamically loaded 2.7.3

If you installed Nokogiri with gem install nokogiri, you can resolve this warning by running gem pristine nokogiri to recompile the gem’s C extension. If you installed Nokogiri with bundle install, you can resolve this warning by running bundle exec gem pristine nokogiri to recompile the C extension of the gem wherever Bundler installed it.

Nokogiri/Xpath namespace query

All namespaces need to be registered when parsing. Nokogiri automatically registers namespaces on the root node. Any namespaces that are not on the root node you have to register yourself. This should work:

    puts doc.xpath('//dc:title', 'dc' => "URI")

Alternately, you can remove namespaces altogether. Only do this if you are certain there will be no …

Why does installing Nokogiri on Mac OS fail with libiconv is missing?

I had the same issue. Unfortunately, the “Installing Nokogiri” documentation doesn’t cover Iconv issues. Here’s how I resolved it. First install Homebrew; it’ll make your life easier. If you already have it installed, be sure to grab the latest formulae by updating like so:

    brew update

Note: In OS X 10.9+ you may need to install …