{{template "head.html" .}}
<header>
<h1>
<a href="/">
<span class="icon">
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512"><path d="M256 8C119 8 8 119 8 256s111 248 248 248 248-111 248-248S393 8 256 8zm0 448c-110.5 0-200-89.5-200-200S145.5 56 256 56s200 89.5 200 200-89.5 200-200 200z"></path></svg>
</span>
<span>searchhut</span>
</a>
</h1>
</header>
<main>
<h2>About searchhut</h2>
<p>
SearchHut is a curated
<abbr title="'Free' as in freedom, in that we provide public access to our software source code.">free software</abbr>
search engine developed and operated by
<a href="https://sourcehut.org">SourceHut</a>.
<h3>About the search engine</h3>
<p>
The search engine itself is pretty basic at the moment. In the future, it
will be expanded to support narrowing down your search terms by applicable
tags (e.g. #docs #python), filtering for sites with and without JavaScript,
searching specific sites (e.g. @wikipedia.org), and other features. The
service does not (and never will) carry advertising; it is directly
subsidized by SourceHut.
<h3>About the index</h3>
<p>
SearchHut indexes from a <a href="/about/domains">curated set of domains</a>.
This curation yields higher-quality results, though the index covers only a
small subset of the web. The index prioritizes authoritative, high-quality, and
informative sources. Any websites engaging in SEO spam are rejected from
the index. This instance is maintained by free software developers and
biases towards indexing websites that serve their needs and interests. If
you would like a website added to the index, fill out the
<a href="/request">indexing request form</a>.
<h3>About the crawler</h3>
<p>
The SearchHut crawler is very simple. It crawls websites by queuing
first-party links only, and stores data in a simple Postgres
full-text-search index. The crawler respects robots.txt Allow, Disallow, and
Crawl-delay directives. For the full details on how the crawler works, and
for information for web admins of indexed sites, see
<a href="/docs/crawler.html">the documentation</a>.
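<p>
For example, a site admin could tune the crawler's behavior with standard
robots.txt directives along these lines (the paths shown are illustrative,
and this assumes the crawler matches the "SearchHut" user-agent token):
<pre>User-agent: SearchHut
Allow: /docs/
Disallow: /private/
Crawl-delay: 10</pre>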
<p>
The crawler's User-Agent is:
<pre>SearchHut Bot 0.0 (GNU AGPL 3.0); https://sr.ht/~sircmpwn/searchhut &lt;sir@cmpwn.com&gt;</pre>
<h3>About the API</h3>
<p>
The search engine provides a public GraphQL API for anonymous use, allowing
users to conduct web searches programmatically. For information about the
API, see
<a href="/docs/api.html">the documentation</a>.
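<p>
As a rough sketch, a programmatic search might look like the following
GraphQL query. The field names here are illustrative assumptions, not the
actual schema; consult the documentation above for the real API:
<pre>query {
  search(query: "postgres full text search") {
    results {
      url
      title
    }
  }
}</pre>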
<h3>About the software</h3>
<p>
SearchHut is an AGPL 3.0-licensed free software project hosted
<a href="https://sr.ht/~sircmpwn/searchhut">on SourceHut</a>, which provides
git repositories, a bug tracker, and mailing lists for development &amp;
discussion. Patches are welcome, and users are encouraged to set up their
own search engines crawling whatever subset of the web they like. It could
easily be repurposed to create an academic-focused search engine, for
instance. For information about deploying your own instance, see
<a href="/docs/deploy.html">the documentation</a>.
</main>
{{template "footer.html" .}}