Microformats2

A Microformats2 parser for Elixir.

Installation

This parser is available in Hex:

  1. Add microformats2 to your list of dependencies in mix.exs:

    def deps do
      [{:microformats2, "~> 0.2.0"}]
    end
    
  2. If you want to parse directly from URLs, also add httpotion to your list of dependencies in mix.exs:

    def deps do
      [{:microformats2, "~> 0.2.0"},
       {:httpotion, "~> 3.1"}]
    end
    

Usage

Give the parser an HTML string and the URL it was fetched from:

Microformats2.parse("""
<div class="h-card">
  <img class="u-photo" alt="photo of Mitchell"
       src="https://webfwd.org/content/about-experts/300.mitchellbaker/mentor_mbaker.jpg"/>
  <a class="p-name u-url"
     href="http://blog.lizardwrangler.com/">Mitchell Baker</a>
  (<a class="u-url" href="https://twitter.com/MitchellBaker">@MitchellBaker</a>)
  <span class="p-org">Mozilla Foundation</span>
  <p class="p-note">
    Mitchell is responsible for setting the direction and scope of the Mozilla Foundation and its activities.
  </p>
  <span class="p-category">Strategy</span>
  <span class="p-category">Leadership</span>
</div>
""", "http://localhost")

It parses the document into a structure like this:

%{rels: [],
  rel_urls: [],
  items: [%{type: ["h-card"],
            properties: %{photo: ["https://webfwd.org/content/about-experts/300.mitchellbaker/mentor_mbaker.jpg"],
                          name: ["Mitchell Baker"],
                          url: ["http://blog.lizardwrangler.com/",
                                "https://twitter.com/MitchellBaker"],
                          org: ["Mozilla Foundation"],
                          note: ["Mitchell is responsible for setting the direction and scope of the Mozilla Foundation and its activities."],
                          category: ["Strategy",
                                     "Leadership"]}}]}
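
As a minimal sketch of how such a result might be consumed, the snippet below reads a few values out of a map shaped like the one above. The `parsed` literal is a hand-written, shortened stand-in for the parser's return value (an assumption for illustration, so the example runs without the library); the variable names are hypothetical.

```elixir
# Hand-written stand-in mirroring the structure shown above; in practice
# this would be the value returned by Microformats2.parse/2.
parsed = %{
  rels: [],
  rel_urls: [],
  items: [
    %{
      type: ["h-card"],
      properties: %{
        name: ["Mitchell Baker"],
        url: ["http://blog.lizardwrangler.com/", "https://twitter.com/MitchellBaker"],
        category: ["Strategy", "Leadership"]
      }
    }
  ]
}

# Find the first item whose type list contains "h-card".
card = Enum.find(parsed.items, fn item -> "h-card" in item.type end)

# Every property value is a list, so take the first entry for
# fields that are expected to hold a single value.
name = card.properties |> Map.get(:name, []) |> List.first()

# Multi-valued properties can be used as plain lists.
urls = Map.get(card.properties, :url, [])

IO.puts(name)
IO.inspect(urls)
```

Using `Map.get/3` with an empty-list default keeps the lookups safe for items that lack a given property.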

You can also provide HTML trees already parsed with Floki:

Microformats2.parse(Floki.parse(~s(<div class="h-card">...</div>)), "http://localhost")

Or pass a URL directly, if you have HTTPotion installed:

Microformats2.parse("http://localhost")

Dependencies

The parser requires Floki for HTML parsing. HTTPotion is optional and only needed for fetching URLs.

Features

Implemented:

Not implemented:

License

This software is licensed under the MIT license.