Search engines work by crawling hundreds of billions of web pages, indexing them, and serving them to you.
When you type a query into a search engine, it searches the thousands, sometimes millions, of pages in its index, chooses which are most relevant (based on many factors), and serves you an answer. The crawling itself happens ahead of time, carried out by programs known as bots or spiders.
In this guide, we will give you a basic understanding of how search engines work through three steps: crawling, indexing, and ranking.
What is a Search Engine?
Search engines are searchable databases of web content. They are made up of two main parts: the index, a digital library of information about web pages, and the algorithm, the program that matches your search with results from the index.
How Does a Search Engine Work?
Crawling: Can the Search Engine Find Your Site?
For search engines to serve you web pages, they must first discover them. As of May 2020, there are an estimated 1.7 billion websites on the internet, containing billions of pages. There is no single place where all websites and pages live, so search engines must constantly look for new pages and add them to their index.
Search engines find web pages in many ways. One way is by following a link from a page that has already been found. Another is by reading a sitemap: a file that lists the pages, images, and videos on your site, organized so that search engine bots can understand it more easily.
Many CMS platforms (content management systems), such as WordPress or Squarespace, auto-generate sitemaps. If you are unsure about your sitemap, contact Seer’s Technical SEO team.
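If you are curious what a sitemap actually looks like, here is a tiny sketch that builds a two-page sitemap file using Python's standard library. The example.com URLs and dates are placeholders, not a real site:

```python
# Minimal sketch: build a two-URL sitemap.xml with Python's standard library.
# The URLs and dates below are hypothetical placeholders.
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
urlset = ET.Element("urlset", xmlns=SITEMAP_NS)

for loc, lastmod in [
    ("https://www.example.com/", "2020-05-01"),
    ("https://www.example.com/about", "2020-04-15"),
]:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = loc          # the page's address
    ET.SubElement(url, "lastmod").text = lastmod  # when it last changed

# Write the file so crawlers can discover every listed page in one place.
ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
```

Most site owners never write this file by hand: as noted above, your CMS will usually generate and update it for you.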
Once search engines find pages, they crawl them. Simply put, this means that their bots look at each page and work out what it is about, analyzing the written content, non-written content, visual appearance, and overall layout.
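Under the hood, the link-following idea described above can be sketched in a few lines of Python. This is a toy illustration only: the start URL is a placeholder, and real crawlers also honor robots.txt, crawl budgets, and many other rules:

```python
# Toy sketch of link-following discovery: fetch a page, collect its links,
# and queue any URLs we have not seen before.
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl_once(url):
    """Fetch one page and return the absolute URLs it links to."""
    html = urlopen(url, timeout=10).read().decode("utf-8", errors="ignore")
    parser = LinkCollector()
    parser.feed(html)
    return {urljoin(url, link) for link in parser.links}

frontier = ["https://www.example.com/"]   # pages waiting to be crawled
seen = set(frontier)                      # pages already discovered
while frontier and len(seen) < 25:        # small cap, since this is just a demo
    for link in crawl_once(frontier.pop(0)):
        if link not in seen:
            seen.add(link)
            frontier.append(link)
```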
Websites that search engines can find are typically crawled anywhere from every few days to every few weeks. Factors such as popularity, seasonality, and structure all play a role in how often your site is crawled.
If you are curious about how to fix crawl errors your site might be experiencing, check out Seer’s How To Fix Your Crawl Errors.
Indexing: How Search Engines Store and Organize URLs
Indexing is the process of analyzing a page and storing and cataloging it. After a page is found and crawled, the relevant information is indexed. However, not all crawled information is relevant - just because a page is found and crawled does not mean it will be indexed.
What is a Search Index?
All of the information that is indexed is kept in a search index. Search indexes are massive in size and scale. Google’s, for example, is well over 100,000,000 gigabytes and is spread across about 2.5 million servers around the world. Search indexes are designed to map search queries to URLs, making it possible to run a search against hundreds of billions of pages and get results back in under a second.
To map those URLs to search queries, the index also stores key information about the contents of each URL (illustrated with a toy sketch after this list), including:
- What keywords are used on the page to describe its content
- What type of content is on the page
- How recently the content was updated
- How users have previously interacted with the page
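To make that mapping concrete, here is the toy sketch promised above: a tiny inverted index that maps keywords to the URLs containing them. The pages and their text are invented for illustration:

```python
# Toy inverted index: map each keyword to the set of URLs that contain it.
# The pages and their text are invented for illustration.
from collections import defaultdict

pages = {
    "https://www.example.com/coffee": "how to brew coffee at home",
    "https://www.example.com/tea": "how to brew green tea at home",
}

index = defaultdict(set)
for url, text in pages.items():
    for keyword in text.split():   # real indexes also store freshness,
        index[keyword].add(url)    # content type, engagement signals, etc.

print(index["brew"])   # both pages mention "brew"
print(index["green"])  # only the tea page mentions "green"
```

Answering a query then becomes a fast lookup in this index rather than a fresh crawl of the web, which is why results can come back in well under a second.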
Ranking: How Search Engines Rank URLs
Once pages are crawled and indexed, they are eligible to be served on a search engine results page (SERP). SERPs are what you see right after you type a query into a search engine. The results listed on a SERP are ranked: #1 appears at the top of the page (often underneath ads), followed by the other pages in order of decreasing relevance.
Search engines determine rankings using many factors. Considerations include relevance, quality, location, authority, and device, to name a few. Decoding those ranking factors and determining which ones your website needs to improve is the basis of search engine optimization (SEO).
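As a very rough illustration of the idea (not how any real search engine actually weighs its signals), here is a sketch that scores a couple of indexed pages on a few factors and sorts them into a ranking. The factor values and weights are invented:

```python
# Rough sketch of ranking: score each indexed page on a few weighted factors
# and sort. The values and weights are invented; real systems use far more
# signals, often with machine-learned weights.
candidates = {
    "https://www.example.com/coffee": {"relevance": 0.9, "quality": 0.7, "authority": 0.4},
    "https://www.example.com/tea":    {"relevance": 0.5, "quality": 0.8, "authority": 0.9},
}

weights = {"relevance": 0.6, "quality": 0.25, "authority": 0.15}

def score(signals):
    """Combine a page's factor scores into a single ranking score."""
    return sum(weights[factor] * value for factor, value in signals.items())

ranked = sorted(candidates, key=lambda url: score(candidates[url]), reverse=True)
for position, url in enumerate(ranked, start=1):
    print(f"#{position}: {url}")
```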
Improve your rankings by subscribing to Seer’s newsletter for the latest search engine know-how.
Learn More
Only Got 5 Minutes?