How to parse HTML with Nokogiri


Installation is easy: add the gem to your Gemfile and run bundle install.

gem "nokogiri"

Quick start to parsing HTML

Parsing HTML is easy, and you can take advantage of CSS selectors or XPath queries to find things in your document:

require 'open-uri'
require 'nokogiri'

# Perform a Google search
# (URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs)
doc = Nokogiri::HTML(URI.open(''))

# Print out each link using a CSS selector
doc.css('h3.r > a.l').each do |link|
  puts link.content
end

Here is an example parsing some HTML and searching it using a combination of CSS selectors and XPath selectors:

require 'nokogiri'

doc = Nokogiri::HTML.parse(<<-eohtml)
<html>
  <head><title>Hello World</title></head>
  <body>
    <h1>This is an awesome document</h1>
    <p>
      I am a paragraph
        <a href="">I am a link</a>
    </p>
  </body>
</html>
eohtml

# Search for nodes by css
doc.css('p > a').each do |a_tag|
  puts a_tag.content
end

# Search for nodes by xpath
doc.xpath('//p/a').each do |a_tag|
  puts a_tag.content
end

# Or mix and match.
doc.search('//p/a', 'p > a').each do |a_tag|
  puts a_tag.content
end

# Find attributes and their values
doc.search('a').first['href']
