# Microformats2

A [Microformats2](http://microformats.org/wiki/microformats-2) parser for Elixir.

## Installation

This parser is [available in Hex](https://hex.pm/packages/microformats2):

1. Add microformats2 to your list of dependencies in `mix.exs`:

        def deps do
          [{:microformats2, "~> 0.2.0"}]
        end

2. To `parse` directly from URLs, also add httpotion to your list of dependencies in `mix.exs`:

        def deps do
          [{:microformats2, "~> 0.2.0"},
           {:httpotion, "~> 3.1"}]
        end

## Usage

Give the parser an HTML string and the URL it was fetched from:

    Microformats2.parse("""
    <div class="h-card">
      <img class="u-photo" alt="photo of Mitchell"
           src="https://webfwd.org/content/about-experts/300.mitchellbaker/mentor_mbaker.jpg"/>
      <a class="p-name u-url"
         href="http://blog.lizardwrangler.com/">Mitchell Baker</a>
      (<a class="u-url" href="https://twitter.com/MitchellBaker">@MitchellBaker</a>)
      <span class="p-org">Mozilla Foundation</span>
      <p class="p-note">
        Mitchell is responsible for setting the direction and scope of the Mozilla Foundation and its activities.
      </p>
      <span class="p-category">Strategy</span>
      <span class="p-category">Leadership</span>
    </div>
    """, "http://localhost")

It parses the document into a structure like this:

    %{rels: [],
      rel_urls: [],
      items: [%{type: ["h-card"],
                properties: %{photo: ["https://webfwd.org/content/about-experts/300.mitchellbaker/mentor_mbaker.jpg"],
                              name: ["Mitchell Baker"],
                              url: ["http://blog.lizardwrangler.com/",
                                    "https://twitter.com/MitchellBaker"],
                              org: ["Mozilla Foundation"],
                              note: ["Mitchell is responsible for setting the direction and scope of the Mozilla Foundation and its activities."],
                              category: ["Strategy",
                                         "Leadership"]}}]}
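
Every property value in the result is a list, even when it holds a single entry, so reading one value means taking the first element. The following is a minimal sketch of pulling fields out of a result with this shape. The `result` map is an abbreviated, hand-written copy of the structure above rather than real parser output, and `first_property` is a hypothetical helper, not part of the library:

```elixir
# Hand-written, abbreviated copy of the parsed structure shown above,
# so this snippet runs without the library installed.
result = %{
  rels: [],
  rel_urls: [],
  items: [
    %{
      type: ["h-card"],
      properties: %{
        name: ["Mitchell Baker"],
        url: ["http://blog.lizardwrangler.com/", "https://twitter.com/MitchellBaker"],
        org: ["Mozilla Foundation"],
        category: ["Strategy", "Leadership"]
      }
    }
  ]
}

# Hypothetical helper: first value of a property, or nil if it is absent.
first_property = fn item, key ->
  item |> get_in([:properties, key]) |> List.wrap() |> List.first()
end

# Keep only the h-card items, then read individual properties.
cards = Enum.filter(result.items, fn item -> "h-card" in item.type end)
card = List.first(cards)

name = first_property.(card, :name)
urls = get_in(card, [:properties, :url])
photo = first_property.(card, :photo)

IO.inspect(name)   # "Mitchell Baker"
IO.inspect(urls)   # both URLs, in document order
IO.inspect(photo)  # nil, because :photo was left out of the abbreviated map
```

Going through `List.wrap/1` keeps the helper safe for missing properties, which return `nil` instead of raising.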

You can also provide HTML trees already parsed with Floki:

    Microformats2.parse(Floki.parse(~s(<div class="h-card">...</div>)), "http://localhost")

Or pass a URL if you have HTTPotion installed:

    Microformats2.parse("http://localhost")

## Dependencies

[Floki](https://github.com/philss/floki) is required for HTML parsing.
[HTTPotion](https://github.com/myfreeweb/httpotion) is optional and only needed for fetching URLs.

## Features

Implemented:

- [parsing depth first, doc order](http://microformats.org/wiki/microformats2-parsing#parse_a_document_for_microformats)
- [parsing a p- property](http://microformats.org/wiki/microformats2-parsing#parsing_a_p-_property)
- [parsing a u- property](http://microformats.org/wiki/microformats2-parsing#parsing_a_u-_property)
- [parsing a dt- property](http://microformats.org/wiki/microformats2-parsing#parsing_a_dt-_property)
- [parsing an e- property](http://microformats.org/wiki/microformats2-parsing#parsing_an_e-_property)
- [parsing implied properties](http://microformats.org/wiki/microformats-2-parsing#parsing_for_implied_properties)
- nested properties
- nested microformat with associated property
- dynamic creation of properties
- [rel](http://microformats.org/wiki/rel)
- nested microformat without associated property
- [normalize u-* property values](http://microformats.org/wiki/microformats2-parsing-faq#normalizing_u-.2A_property_values)

Not implemented:

- [value-class-pattern](http://microformats.org/wiki/value-class-pattern)
- [include-pattern](http://microformats.org/wiki/include-pattern)
- recognition of [vendor extensions](http://microformats.org/wiki/microformats2#VENDOR_EXTENSIONS)
- backwards compatible support for microformats v1
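
Under the microformats2 parsing rules, relative `u-*` values are resolved against the document's base URL, which is presumably why `parse` takes the URL as its second argument. As an illustration of the idea only, and not the library's actual implementation, the sketch below resolves references with Elixir's standard `URI.merge/2`. The `normalize_url` helper and the example paths are hypothetical:

```elixir
# Hypothetical helper, not the library's code: resolve a possibly relative
# URL against the base URL of the document it was found in.
normalize_url = fn base, value ->
  base |> URI.merge(value) |> URI.to_string()
end

base = "http://localhost/people/"

# Relative path: resolved against the directory of the base URL.
IO.puts(normalize_url.(base, "mitchell.jpg"))
# prints http://localhost/people/mitchell.jpg

# Absolute path: replaces the path of the base URL.
IO.puts(normalize_url.(base, "/img/mitchell.jpg"))
# prints http://localhost/img/mitchell.jpg

# Already absolute URL: returned unchanged.
IO.puts(normalize_url.(base, "https://twitter.com/MitchellBaker"))
# prints https://twitter.com/MitchellBaker
```

Note that `URI.merge/2` raises an `ArgumentError` when the base is not an absolute URL, so the base passed in needs a scheme and host.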

## License

This software is licensed under the [MIT license](https://choosealicense.com/licenses/mit/).