{{template "head.html" .}}
<header>
<h1>
<a href="/">
<span class="icon">
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512"><path d="M256 8C119 8 8 119 8 256s111 248 248 248 248-111 248-248S393 8 256 8zm0 448c-110.5 0-200-89.5-200-200S145.5 56 256 56s200 89.5 200 200-89.5 200-200 200z"></path></svg>
</span>
<span>searchhut</span>
</a>
</h1>
</header>
<main>
<h2>About searchhut</h2>
<p>
SearchHut is a curated
<abbr title="'Free' as in freedom, in that we provide public access to our software source code.">free software</abbr>
search engine developed and operated by
<a href="https://sourcehut.org">SourceHut</a>.
</p>
<h3>About the search engine</h3>
<p>
The search engine itself is fairly basic at the moment. In the future, it
will be expanded to support narrowing your search terms with applicable
tags (e.g. #docs #python), filtering for sites with or without JavaScript,
searching specific sites (e.g. @wikipedia.org), and other features. The
service does not (and never will) have advertising; it is directly
subsidized by SourceHut.
</p>
<h3>About the index</h3>
<p>
SearchHut indexes from a <a href="/about/domains">curated set of domains</a>.
This improves the quality of results, but it means the index covers only a
small subset of the web. The index prioritizes authoritative, high-quality,
and informative sources; websites engaging in SEO spam are rejected from
the index. This instance is maintained by free software developers and
biases towards indexing websites that serve their needs and interests. If
you would like a website added to the index, fill out the
<a href="/request">indexing request form</a>.
</p>
<h3>About the crawler</h3>
<p>
The SearchHut crawler is very simple. It crawls websites by queuing
first-party links only, and stores data in a simple Postgres
full-text-search index. The crawler respects the robots.txt Allow,
Disallow, and Crawl-delay directives. For full details on how the crawler
works, and for information for web admins of indexed sites, see
<a href="/docs/crawler.html">the documentation</a>.
</p>
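<p>
As an illustration of the Postgres full-text-search approach, the sketch
below uses standard Postgres primitives. The <code>page</code> table and its
columns are hypothetical assumptions for this example, not SearchHut's
actual schema; <code>to_tsvector</code>, <code>websearch_to_tsquery</code>,
and <code>ts_rank</code> are standard Postgres functions.
</p>
<pre>-- Illustrative sketch only: "page" and its columns are hypothetical.
CREATE TABLE page (
    url   text PRIMARY KEY,
    title text,
    body  text,
    fts   tsvector GENERATED ALWAYS AS
          (to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))) STORED
);
CREATE INDEX page_fts_idx ON page USING GIN (fts);

-- Rank matching pages by relevance for a user's search terms:
SELECT url, title, ts_rank(fts, query) AS rank
FROM page, websearch_to_tsquery('english', 'postgres full text search') AS query
WHERE fts @@ query
ORDER BY rank DESC
LIMIT 20;</pre>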
<p>
The crawler's User-Agent is:
</p>
<pre>SearchHut Bot 0.0 (GNU AGPL 3.0); https://sr.ht/~sircmpwn/searchhut &lt;sir@cmpwn.com&gt;</pre>
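<p>
Site owners can tune crawling with the robots.txt directives listed above.
A hypothetical entry might look like the following; the exact user-agent
token the crawler matches is an assumption here, so consult the crawler
documentation for the authoritative value.
</p>
<pre># Hypothetical robots.txt sketch; the "SearchHut" token is an assumption.
User-agent: SearchHut
Disallow: /private/
Allow: /private/public-notes/
Crawl-delay: 10</pre>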
<h3>About the API</h3>
<p>
The search engine provides a public GraphQL API for anonymous use, allowing
users to conduct web searches programmatically. For information about the
API, see
<a href="/docs/api.html">the documentation</a>.
</p>
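<p>
As a sketch only, a GraphQL search query might take the following shape.
The field and parameter names here are hypothetical illustrations of
GraphQL usage, not SearchHut's actual schema; see the API documentation for
the real schema.
</p>
<pre># Hypothetical query shape -- not SearchHut's actual schema.
query {
  search(terms: "postgres full text search") {
    results {
      url
      title
    }
  }
}</pre>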
<h3>About the software</h3>
<p>
SearchHut is an AGPL 3.0-licensed free software project hosted
<a href="https://sr.ht/~sircmpwn/searchhut">on SourceHut</a>, which provides
git repositories, a bug tracker, and mailing lists for development &amp;
discussion. Patches are welcome, and users are encouraged to set up their
own search engines crawling whatever subset of the web they like; it could
easily be repurposed to create an academic-focused search engine, for
instance. For information about deploying your own instance, see
<a href="/docs/deploy.html">the documentation</a>.
</p>
</main>
{{template "footer.html" .}}