Metahint — Frequently Asked Questions
Last updated: December 13th, 2010.
What is Metahint?
Metahint is a new search engine that searches within the contents of your website. It gives you an embeddable widget that you can put on your site to replace the existing search box (if there is one).
How is it different from existing site search?
Chances are that the current search box on your website (if there is one) lacks an autocomplete/suggest feature. With Metahint, your visitors get suggestions as soon as they start typing a search query. The suggestions are generated automatically from the contents of the website and ranked according to their importance within the website. This allows your visitors to find the content they are looking for even if they don't know the exact terms you used in the text.
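To make the idea concrete, here is a minimal sketch of ranked suggestions. It is not Metahint's actual algorithm: the `suggest` function, the `HINTS` table and its importance weights are all hypothetical, and real hints would be extracted from the site's content rather than typed in by hand.

```python
def suggest(prefix, hints, limit=5):
    """Return up to `limit` hints in which the typed text starts a word,
    ordered by descending importance (ties broken alphabetically)."""
    needle = prefix.strip().lower()
    if not needle:
        return []
    matches = []
    for text, weight in hints.items():
        lowered = text.lower()
        # Match at the start of the hint or at the start of any later word.
        if lowered.startswith(needle) or f" {needle}" in lowered:
            matches.append((text, weight))
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [text for text, _ in matches[:limit]]


# Hypothetical hints with made-up importance weights.
HINTS = {
    "site search": 0.9,
    "search widget": 0.7,
    "sitemap settings": 0.4,
    "beta program": 0.6,
}

print(suggest("se", HINTS, limit=2))
```

The visitor only needs to type the beginning of any word in a hint, which is what lets them find content without knowing the exact phrasing.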
Can I have it on my website?
Yes. If you take part in our beta program, you can already experiment with it on your website. Just copy and paste the widget code into your website. We are still testing and fine-tuning parts of Metahint. The beta test helps us in fixing errors and understanding how site search can be improved. If you would like to take part in this beta program, sign up on the front page.
Will it work on my website?
Hopefully, yes. For now we target blogs, because they are among the most popular website formats and, in our opinion, the ones most sorely lacking a usable search. In the future we plan to support other website formats as well.
There are some current limitations:
- Metahint correctly processes English-language content only (there is some English-specific magic for summarizing text).
- Our current capacity allows us to process blogs of moderate size only. That means approximately 100-150 posts.
- The world of Metahint is made of text: it cannot get useful information out of images, videos and dynamic content such as Flash animations.
The capacity constraints will probably change and will be made more precise in the future. We kindly ask you to submit only your own blog(s) or blogs that you administer.
How much does it cost?
Taking part in the beta test is completely free. We will do our best to keep Metahint free for small websites with common needs, even after the beta period is over. This service will be sufficient for blogs with a moderate number of pages and visitors. For larger blogs with special needs we will design a premium version with extra features. We are still working on our pricing plan.
Why do you limit the maximum number of blogs I can submit?
We use this limit as a precaution and to keep our growth under control. During beta testing, when there is a significant chance of errors creeping out of the woodwork, we'd like to keep our workload manageable. Once we are confident that the system is mature enough to handle the load and stress, we might raise this limit or eliminate it altogether.
Currently, each user can submit at most three blogs. If you want to use Metahint on more blogs, tell us about it and we will consider raising your limit.
What is the search widget?
The search widget is a rectangular box that you can place into your blog. Visitors can search and explore your blog by entering search queries into this box. You can place the widget into your blog by copying a few lines of code into the page source (see below). This can be done after Metahint has processed the contents of your blog.
What does the search widget look like?
Here's how the search widget looks while idle:
Note: the widget is only the bit within the dotted rectangle; the slightly larger crop is only for context.
Here's a snapshot of the widget in action, offering suggestions for the visitor's query:
How does the search widget work?
When you start typing a search query, it suggests continuations from the contents of the website. When you click on one of these suggestions, the widget takes you directly to the given page.
How can I embed the widget into my blog?
Click on the Copy to clipboard button, then paste the clipboard contents without modification into your website. If you have access to the HTML source of the website, you can paste the lines directly into the source code. Most probably you would want the widget to appear in the sidebar of your blog or somewhere near the header, but you can place it anywhere you like. On most blogging platforms there are even easier ways to achieve this:
Embedding Metahint on WordPress
- In the administrator screen choose Appearance » Widgets from the left menu.
- From the available widgets drag the one labelled TEXT (Arbitrary text or HTML) onto one of the sidebars.
- When the new widget is in place, click the down-arrow on its right. Paste the Metahint code into the text box that opens up, give it a title if you like, click Save, then Close. You can later drag the widget into the right position.
Just to be sure, try the above steps and see if the widget appears. If you can display other widgets on your blog but not the Metahint search box, let us know.
Embedding Metahint on Blogger (blogspot.com address)
- Log in and go to the blog settings page. Select the tab Design » Page Elements. Chances are that this is the one that opens up by default.
- In the sidebar panel click Add gadget.
Embedding Metahint on other platforms
We will try to add detailed instructions for various blog engines in the future. In the meantime, let us know if you have any difficulty embedding the widget in your website and we will try to help.
Will the search widget interfere with the design of my site?
Most likely not. Since only a fool or a Sith deals in absolutes, we cannot be certain that this won't be the case. We tried to make the widget as simple and unobtrusive as possible, thus reducing the possibility of conflict with other site elements to a minimum.
The reverse is more likely to happen: the widget might adapt to the design of your website. In most cases this is normal, but in rare cases it will render the widget in a weird way. Please let us know if the widget looks, for lack of a better word, funky.
Can I customize the appearance of the widget?
Will I have to host the widget?
No. The components which drive the widget are all hosted on the Metahint server and served from there every time the widget is displayed. By the same token, the queries typed into the widget are submitted to the Metahint server, not to the server hosting your site. This is how most embeddable widgets work.
Will the widget make my site load slower?
Most probably not. When it is first displayed, the widget downloads about 32 KB of data in total. Since most browsers can load the widget asynchronously, the main contents of the page will be displayed first without apparent slowdown. We have compressed the widget code to reduce bandwidth consumption and followed current best practices to the best of our knowledge. The bulk of the widget code is cached by the browser, so subsequent displays use even less bandwidth. We serve the widget from high-capacity servers located in the US, and have gone to great lengths to squeeze every microsecond out of its loading time.
What is the Widget Displays Panel?
The Widget Displays Panel lists when and on which pages the Metahint widget was displayed as visitors were browsing your site.
- Host: the visitor's IP address.
- Referrer: the URL of the page which contains the embedded widget. Depending on the user's browser settings, this information might not always be available.
- Time: the time of the visit.
What is the Widget Searches Panel?
The Widget Searches Panel displays a log of queries performed using the Metahint widget on your site. The most popular suggestions are ranked in a separate table. Partial queries are not shown, only events where the visitor selected a suggestion and was redirected to the associated page.
- Host: the visitor's IP address.
- Referrer: the URL of the page where the visitor performed the query. This field is present only if the visitor's browser passed on the HTTP referrer information at the time of the search; if this is the case, the Referrer will be shown in a tooltip, when you hover the cursor over the Host field.
- Term: the suggestion selected by the visitor.
- Redirection: the URL of the page where the visitor was redirected. This field appears in a tooltip if you hover the mouse over the Term field.
- Time: the moment when the search was performed.
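As an illustration of how a "most popular suggestions" table could be derived from such a log, here is a minimal sketch. The `SEARCH_LOG` records and the `popular_terms` function are hypothetical; only the field names (host, term, redirection, time) come from the panel description above.

```python
from collections import Counter


def popular_terms(search_log, top=3):
    """Count how often each suggestion was selected and return the
    `top` most frequent ones as (term, count) pairs."""
    counts = Counter(entry["term"] for entry in search_log)
    return counts.most_common(top)


# Made-up log entries, one per completed search.
SEARCH_LOG = [
    {"host": "192.0.2.10", "term": "search widget",
     "redirection": "http://blog.example.com/widget", "time": "2010-12-13 09:15"},
    {"host": "192.0.2.11", "term": "beta program",
     "redirection": "http://blog.example.com/beta", "time": "2010-12-13 09:20"},
    {"host": "192.0.2.12", "term": "search widget",
     "redirection": "http://blog.example.com/widget", "time": "2010-12-13 10:02"},
    {"host": "192.0.2.10", "term": "pricing",
     "redirection": "http://blog.example.com/pricing", "time": "2010-12-13 10:30"},
    {"host": "192.0.2.13", "term": "beta program",
     "redirection": "http://blog.example.com/beta", "time": "2010-12-13 11:45"},
    {"host": "192.0.2.14", "term": "search widget",
     "redirection": "http://blog.example.com/widget", "time": "2010-12-13 12:00"},
]

for term, count in popular_terms(SEARCH_LOG):
    print(f"{term}: {count}")
```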
What is the Progress Panel?
The Progress Panel lists the events which occurred as Metahint was processing your blog. The events are listed in reverse-chronological order.
- Event: simple label. Tooltip displays what the label represents.
- Comment: a more detailed explanation of what happened and the parties involved.
- Time: the moment when the event occurred.
What is the Files Panel?
The Files Panel displays the list of files which have been fetched and processed for your blog. The following information is provided about each file:
- URL: the WWW address of the file. Clicking on this link will display the contents of the file in a popup window.
- Title: the HTML title, if any, of the file. The title is hidden, but appears in a tooltip when the mouse is hovering over the URL field.
- Hints: the number of hints constructed for the file in question. These expressions originate from the content of the file and will be provided as suggestions for the user queries.
- Include: if you would like the hints constructed for this file to be excluded from the search results given to queries, you should untick this checkbox. This also means that search will never redirect to the file in question.
- Retrieved: the moment when the crawlers fetched the file from your blog. The timestamp in the table has a fuzzy format, but if you hover the mouse over it, a tooltip with a more exact format will appear.
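The difference between the fuzzy timestamp and the exact one in the tooltip can be sketched as follows. The wording of the fuzzy format ("3 hours ago") and the exact format are assumptions made for illustration; the panel may phrase them differently.

```python
from datetime import datetime, timedelta


def fuzzy_age(then, now):
    """Describe how long ago `then` was, relative to `now`, in rough units."""
    seconds = int((now - then).total_seconds())
    if seconds < 60:
        return "just now"
    for unit, size in (("day", 86400), ("hour", 3600), ("minute", 60)):
        count = seconds // size
        if count >= 1:
            plural = "" if count == 1 else "s"
            return f"{count} {unit}{plural} ago"


def exact_time(then):
    """The precise form that a tooltip could show."""
    return then.strftime("%Y-%m-%d %H:%M:%S")


now = datetime(2010, 12, 13, 12, 0, 0)
retrieved = now - timedelta(hours=3, minutes=5)
print(fuzzy_age(retrieved, now), "|", exact_time(retrieved))
```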
Why do certain files have so few hints?
This could mean that:
- the file has limited processable content (e.g. short article, index file, table of contents, etc.). In such cases, this is the normal behaviour. Metahint also ignores images, videos, Flash animations and similar non-textual content.
- the blog is larger than what we are currently able to process. It could be that we found a file but did not have the capacity to process it (see the current limitations listed above).
- the file has a format which is not understood by Metahint, so its contents cannot be extracted and processed. If the blog is not excessively large and the file in question has plenty of text, yet Metahint builds few hints for it, please bring it to our attention.
What is the Admin Panel?
In the Admin Panel you can tweak, for the time being, a limited set of values which influence how your blog is processed and displayed on Metahint.
What is this link containing the invite code?
During beta testing, this link gives you access to Metahint. Think of it as a one-step authentication, a username and password submission rolled into one. As such, consider bookmarking it and keeping it private.
What is the blog seed?
If your blog has a file containing links to all blog entries (e.g. an index file or a list of articles), you can submit the URL of this file using this editable field. If you specify a valid URL here, Metahint will attempt to fetch the content of your blog by following the links present in the supplied file. In other words, it will use the specified file as a seed, instead of the default recursive discovery performed on the root of your blog. In most cases you can simply leave this field empty.
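Seed-based discovery can be sketched roughly like this. The `links_from_seed` helper and the rule of keeping only links on the seed's own host are assumptions made for the example, not a description of the real crawler.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            self.links.append(urljoin(self.base_url, href))


def links_from_seed(seed_url, seed_html):
    """Return the unique same-host links found in the seed page, in order."""
    collector = LinkCollector(seed_url)
    collector.feed(seed_html)
    host = urlparse(seed_url).netloc
    unique = []
    for link in collector.links:
        if urlparse(link).netloc == host and link not in unique:
            unique.append(link)
    return unique


SEED_URL = "http://blog.example.com/archive.html"
SEED_HTML = """
<ul>
  <li><a href="/2010/12/first-post.html">First post</a></li>
  <li><a href="second-post.html">Second post</a></li>
  <li><a href="http://other.example.org/">Elsewhere</a></li>
  <li><a href="/2010/12/first-post.html">First post again</a></li>
</ul>
"""

for url in links_from_seed(SEED_URL, SEED_HTML):
    print(url)
```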
What is the blog RSS?
If your blog has an RSS feed, it can help the Metahint crawler find your posts more easily. We do our best to detect the RSS feed automatically, but if you know better, you can manually override this setting. The feed you provide will be used during subsequent crawls.
For those in doubt, this is the same address as the one you would enter in your favorite RSS reader.
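To show why a feed helps, here is a minimal sketch of reading post URLs from an RSS 2.0 document. The `post_links_from_rss` function and the sample feed are hypothetical, and the real crawler may handle other feed formats (such as Atom) that this sketch does not.

```python
import xml.etree.ElementTree as ET


def post_links_from_rss(rss_xml):
    """Return the <link> of every <item> in an RSS 2.0 feed, in feed order."""
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link.strip():
            links.append(link.strip())
    return links


SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <link>http://blog.example.com/</link>
    <item>
      <title>Hello world</title>
      <link>http://blog.example.com/2010/12/hello-world/</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://blog.example.com/2010/12/second-post/</link>
    </item>
  </channel>
</rss>"""

for url in post_links_from_rss(SAMPLE_RSS):
    print(url)
```

Note that the channel-level link (the blog's home page) is skipped; only the posts themselves are returned.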
What is the blog sitemap?
As with the RSS feed, if your website has an XML sitemap, it can help the crawler find the individual pages more quickly and efficiently.
If you have already submitted your site to a search engine, chances are that you already have a sitemap.xml file. Just enter its URL and Metahint will use it during the next crawl.
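A sitemap is an even more direct list of pages. The sketch below reads the page URLs from a sitemap that follows the sitemaps.org 0.9 schema; the `urls_from_sitemap` function and the sample document are made up for illustration, and sitemap index files are not handled.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(sitemap_xml):
    """Return the text of every <url>/<loc> element in the sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://blog.example.com/</loc></url>
  <url><loc>http://blog.example.com/2010/12/hello-world/</loc></url>
</urlset>"""

for url in urls_from_sitemap(SAMPLE_SITEMAP):
    print(url)
```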
What is the URL blacklist?
Metahint will skip URLs that match any of the patterns in this list, as they usually refer to some administrative page without useful content. We tried to include patterns covering most common blog engines. If Metahint still fetches irrelevant pages from your blog, or if it misses some page it shouldn't, you can customize this list by adding or removing patterns. Enter each pattern on a separate line. You can use * as a wildcard within patterns, as it will match any sequence of characters.
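The wildcard matching described above can be sketched as follows. Two things are assumptions here: that a pattern has to match the whole URL (so most patterns begin and end with *), and that * is the only special character. The sample patterns are illustrative and are not Metahint's default list.

```python
import re


def pattern_to_regex(pattern):
    """Translate a blacklist pattern into a regular expression in which
    * matches any sequence of characters and everything else is literal."""
    literal_parts = (re.escape(part) for part in pattern.split("*"))
    return re.compile(".*".join(literal_parts))


def is_blacklisted(url, blacklist_text):
    """Check a URL against a blacklist given as one pattern per line."""
    patterns = [line.strip() for line in blacklist_text.splitlines() if line.strip()]
    return any(pattern_to_regex(p).fullmatch(url) for p in patterns)


# Illustrative patterns only.
BLACKLIST = """
*/wp-admin/*
*/feed/
*?replytocom=*
"""

print(is_blacklisted("http://blog.example.com/wp-admin/options.php", BLACKLIST))
print(is_blacklisted("http://blog.example.com/2010/12/hello-world/", BLACKLIST))
```

Escaping everything except * means that characters such as ? and . in a pattern are matched literally, which is usually what you want for URLs.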
What is the URL graylist?
The URLs matched by these patterns will not be added to the search results, but they will still be fetched during processing, as they might contain links to pages with useful content. Typically, these URLs refer to table-of-contents-style pages or category listings.
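Put together, the two lists give each URL one of three outcomes. The sketch below is an assumption about how they combine (in particular, that the blacklist takes precedence over the graylist); the function names, return labels and sample patterns are all hypothetical.

```python
import re


def matches_any(url, patterns):
    """True if the whole URL matches any pattern, with * as a wildcard."""
    for pattern in patterns:
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        if re.fullmatch(regex, url):
            return True
    return False


def crawl_decision(url, blacklist, graylist):
    """Decide what to do with a URL:
    'skip'        - blacklisted: never fetched
    'follow_only' - graylisted: fetched for its links, kept out of results
    'index'       - fetched and eligible for search results
    """
    if matches_any(url, blacklist):
        return "skip"
    if matches_any(url, graylist):
        return "follow_only"
    return "index"


# Illustrative patterns only.
BLACKLIST = ["*/wp-admin/*"]
GRAYLIST = ["*/category/*", "*/archive*"]

for url in (
    "http://blog.example.com/wp-admin/",
    "http://blog.example.com/category/news/",
    "http://blog.example.com/2010/12/hello-world/",
):
    print(url, "->", crawl_decision(url, BLACKLIST, GRAYLIST))
```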
What is a re-crawl request?
By clicking on the Re-crawl button, you are explicitly instructing the crawlers to fetch and process content from your site. The crawlers will use the seed, RSS and sitemap parameters, if any, to perform their task.
This operation is entirely at your discretion, as the crawlers will periodically visit your blog for fresh material, with or without your intervention. If, however, you want the next crawl to happen sooner, use this feature.
What is the fallback search setting?
If Metahint can't provide any matching suggestions for a query, it humbly steps aside and gives you the opportunity to run the query through other search engines. This setting controls which search engine Metahint hands the task off to. The generic search engines are configured to search within the confines of your site. Alternatively, if your blog engine is WordPress or Blogspot, you might try selecting the corresponding options here.
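One plausible way to hand a query off is to build a search URL for the chosen engine. The sketch below is an assumption about how this could work, not the widget's actual code: the generic case uses Google's `site:` operator to restrict results to the blog, and the WordPress case uses WordPress's built-in `?s=` search parameter. The function name and the engine labels are hypothetical.

```python
from urllib.parse import urlencode, urlparse


def fallback_search_url(blog_url, query, engine="generic"):
    """Build the URL to which an unmatched query could be handed off."""
    if engine == "wordpress":
        # WordPress's own search takes the query in the `s` parameter.
        return blog_url.rstrip("/") + "/?" + urlencode({"s": query})
    # Generic engine: restrict a web search to the blog's host.
    host = urlparse(blog_url).netloc
    return "https://www.google.com/search?" + urlencode({"q": f"site:{host} {query}"})


print(fallback_search_url("http://blog.example.com/", "search widget"))
print(fallback_search_url("http://blog.example.com/", "search widget", engine="wordpress"))
```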
What is a featured blog?
The Metahint front-page can display a search-box for your blog if you choose to activate this setting. Notice the "can" in the previous sentence: there is no guarantee that there will always be a search-box for every featured blog since, for the sake of variety, Metahint displays only a handful of randomly selected featured blogs every time the front-page is loaded.
If your blog contains objectionable material, please disable this feature.
Is the search-box on the Metahint front-page identical to the embeddable search-widget?
No. Although they share the same roots, they are separate entities, differing in their appearance, functionality and purpose.
What is the blog title?
This is an editable field (click to edit it) which gives you the opportunity to define an optional title for your blog. This title will be displayed in the search box, above the URL of your blog (see the highlighted section in the picture below). It is by no means mandatory to define a title, but quite nice if you do. Should you agree, please also consider that “Brevity is the soul of wit”.
What is the blog description?
This editable field (click to edit it) allows you to define a description for your blog, which will be shown in the search box (see the highlighted section in the picture below). The same principle, “Less is more”, applies here as well. Try to be courteous and concise.
What are the implications of deleting a blog?
Alternate title: Oooooo, what does this button dooo?
The removal action is immediate and unrecoverable. Here's what happens:
- we erase all traces of your blog from our database: suggestions, files, statistics, processing logs, settings, the lot.
- we remove the http://www.metahint.com/your-blogs-url page.
- we deactivate every associated search widget, wherever it is embedded. It is up to you to remove the search widget from the source of your pages, otherwise the deactivated widget will appear blank (a zombie widget).
- we will be sorry.
If you have signed up for beta testing and would like us to remove your email address from the database, please contact us and we will comply.
How can I keep in touch with you?
Email is the best way to reach us. Our clocks tick in Central European Time, but we will try to answer all messages within one day.