signals of noise

How does a Search Engine work?

Anyone who has spent some time on the internet has used a search engine. But we hardly spare a thought for how a search engine works, or for that matter how a search works. It really doesn’t matter to most of us as long as the likes of Google and Bing can dish out relevant results for our queries.

But this question matters to those who are interested in search marketing, to those inquisitive kinds who would not shy away from some extra knowledge and to those who would like to appreciate the incredible technology that goes behind fetching those results. How does a search engine work? What happens when you enter a query? How does the search engine fetch the relevant results? And how are the results ranked? These are some of the questions that I attempt to answer in this article.

What is a search engine? A search engine is a program that automatically and methodically browses the world wide web, stores and indexes the browsed data, and then allows users to query that data, returning results that are as relevant as possible.

The entire search begins long before you have even thought of something to search for. It begins with creating an inventory of pages in the search index. The search index comprises all possible keywords mapped to the web pages which contain those keywords. However, to save space, the index does not store the web page URLs, but a unique document ID that identifies each URL in a separate database.
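As a rough illustration of that structure, here is a minimal sketch in Python. The pages, URLs and the `lookup` helper are made up for the example; a real index is vastly larger and more elaborate.

```python
from collections import defaultdict

# Hypothetical pages, for illustration only.
pages = {
    "http://example.com/cats": "cats chase the mouse",
    "http://example.com/dogs": "dogs chase cats",
}

doc_urls = {}             # document ID -> URL (the "separate database")
index = defaultdict(set)  # keyword -> set of document IDs

for doc_id, (url, text) in enumerate(pages.items(), start=1):
    doc_urls[doc_id] = url
    for word in text.lower().split():
        index[word].add(doc_id)


def lookup(keyword):
    """Return the URLs of all documents that contain the keyword."""
    doc_ids = sorted(index.get(keyword.lower(), ()))
    return [doc_urls[doc_id] for doc_id in doc_ids]
```

Calling `lookup("cats")` resolves the keyword to document IDs first and only then to URLs, which is the space-saving indirection described above.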

The construction of this search index begins with a spider (or crawler, or bot). The spider starts by examining web pages in a seed list but then discovers sites on its own by following links. The spider identifies links by checking the HTML code of the web pages it visits. Thus, theoretically, given enough time, a spider can find every page on the web (or at least every page that is linked from at least one other page). But that is purely theoretical. Studies of how much of the web is actually indexed throw up widely divergent numbers, from 0.03% to 16% of the web.
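The crawl itself can be sketched in a few lines of Python. To keep the example runnable without a network, the "web" here is a made-up in-memory dictionary; `WEB`, `LinkParser` and `crawl` are names invented for this sketch, not anything a real search engine exposes.

```python
from collections import deque
from html.parser import HTMLParser

# A tiny in-memory "web" (hypothetical), so the sketch runs without a network.
WEB = {
    "a.html": '<a href="b.html">B</a> <a href="c.html">C</a>',
    "b.html": '<a href="a.html">A</a>',
    "c.html": '<a href="d.html">D</a>',
    "d.html": "no links here",
    "orphan.html": '<a href="a.html">A</a>',  # nothing links to this page
}


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag in a page's HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href" and v)


def crawl(seeds):
    """Breadth-first crawl: visit the seeds, then every page reachable by links."""
    seen = set(seeds)
    queue = deque(seeds)
    visited = []
    while queue:
        url = queue.popleft()
        if url not in WEB:  # dead link: the page has ceased to exist
            continue
        visited.append(url)
        parser = LinkParser()
        parser.feed(WEB[url])
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Starting from the seed `a.html`, the crawl reaches `b.html`, `c.html` and `d.html`, but never `orphan.html`, because no page links to it.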

While crawling is probably the most efficient way of discovering web pages, it is definitely not the most efficient when it comes to discovering changes made to a web page. This is simply because there is no certainty about when the spider will return to a site. By then a web page could have changed dramatically or even ceased to exist. Once the spider has found a web page and added it to the index, it is time for the search engine to analyze that page.

That is just about the simplest description of what a search engine does to build the search index. Crawling, indexing and analysis could very well be the topic of a dedicated article. But that is not the point of this one. So let’s move ahead to find out what happens when you actually enter a query.

Once you have typed in your query and clicked on the search button (or pressed the enter key), the search engine starts by matching the search query to pages in the search index. The first step in the process is to analyze the query. The search engine examines each word in the query to find the best web pages in the search index that match. Analyses of search queries involve finding word variants, correcting spellings, detecting phrases and antiphrases (words such as ‘what’, ‘is’, ‘the’), examining word order and processing search operators.

Once the analysis is done, the next task for the search engine is to decide which results to present. With hundreds of thousands of possibilities this is a tough task. This is where the search index comes into use. The search engine uses this index to locate the matching pages based not only on the query as you entered it but also on any word variants (e.g. ‘mouse’ and ‘mice’) and words to ignore.
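A minimal Python sketch of these two steps, query analysis followed by an index lookup, might look like this. The antiphrase list, the variant table and the toy index are all assumptions made for the example.

```python
# Hypothetical lookup tables; a real engine would use far larger ones.
ANTIPHRASES = {"what", "is", "the", "a", "of"}
VARIANTS = {"mice": "mouse", "mouse": "mice"}

# Toy search index: keyword -> set of document IDs.
INDEX = {
    "mouse": {1, 2},
    "mice": {3},
    "trap": {2, 3},
}


def analyze(query):
    """Split a query into groups of acceptable words, one group per kept term."""
    groups = []
    for word in query.lower().split():
        if word in ANTIPHRASES:
            continue  # a word to ignore
        group = {word}
        if word in VARIANTS:
            group.add(VARIANTS[word])  # also accept the word variant
        groups.append(group)
    return groups


def match(query):
    """Return IDs of documents containing every kept term (or a variant of it)."""
    result = None
    for group in analyze(query):
        docs = set()
        for word in group:
            docs |= INDEX.get(word, set())
        result = docs if result is None else result & docs
    return sorted(result or [])
```

With this toy data, the query `what is the mouse trap` drops the first three words, treats ‘mouse’ and ‘mice’ as interchangeable, and returns only the documents that also contain ‘trap’.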

Now comes the most interesting and challenging phase of the search engine’s job: ranking the matching pages. This is where the ranking algorithm comes into play (the most famous of which is Google’s PageRank™ algorithm). Ranking, very simply put, is just sorting by relevance. A variety of factors are considered while ranking the matching pages. These include keyword density, keyword proximity, keyword prominence and link popularity. Link popularity has emerged as the most popular factor in ranking since it can act as a surrogate for quality and reliability.

Sounds simple? This is what Google has to say about their PageRank™ [1]: “We use more than 200 signals, including our patented PageRank™ algorithm, to examine the entire link structure of the web and determine which pages are most important. We then conduct hypertext-matching analysis to determine which pages are relevant to the specific search being conducted. By combining overall importance and query specific relevance, we’re able to put the most relevant and reliable results first”.

So now you know what happens in those milliseconds after you type in your query and hit the enter key and the search engine presents the results to you. This article is more of an attempt to enlighten as many as possible to the intricacies of a piece of technology that has become so ubiquitous in our lives.

PS: About Google’s PageRank™ – PageRank™ mainly relies on the ‘democratic nature’ of the web by using its vast link structure as an indicator of an individual page’s value. Important, high quality sites receive a higher PageRank™. So, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at a lot more than the sheer volume of links a page receives. For example, it also analyzes the page that casts the vote and votes by pages that are important weigh more heavily and help to make other pages important. A site’s rate of link acquisition, the longevity of a link, the text used for the link, whether it’s a ‘deep link’ or to the homepage and whether anyone clicks on the link seem also to count.
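The ‘links as votes’ idea can be sketched with the textbook iterative formulation of PageRank. This is only the published, simplified version with an assumed damping factor of 0.85, not whatever Google actually runs, and the four-page web below is made up.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its vote equally among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page web: every other page links to "b".
links = {
    "a": ["b"],
    "b": ["a"],
    "c": ["b"],
    "d": ["b", "c"],
}
ranks = pagerank(links)
```

Page `b` ends up with the highest rank because it collects the most votes, and `a` comes second even though only one page links to it, because that one vote comes from the important page `b`.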

That’s about all that we know about PageRank™. The rest of the mystery is safely secured in Mountain View.


How to Create PDF Files without a PDF Writer

And without an internet connection, or when you are behind a corporate firewall. Otherwise there are plenty of online options that you can fall back on. This works for Microsoft Office 2007 and above (I don’t think this option is available in Office 2003).

  • Open the Word, Excel or PowerPoint document.
  • Click on the Office Button and select Send. Here you will get an option to ‘Email as PDF Attachment’.

  • Select this option. This will open an Outlook window with the PDF file as an attachment. Simply right-click the attachment and save it to your hard drive.

You can make this process work with images as well. Just add the image to a PowerPoint slide or a Word document and follow the above steps.

This process may not be very valuable to all, but I found it very useful since I do not have a PDF writer installed. I also tend to avoid the online tools and services due to privacy concerns.

Facebook Questions: New Opportunities

Facebook launched their Questions service last week to a select few. I am still not part of it.

Since we like to develop products carefully over time with your help, Facebook Questions is available to a limited number of people right now, and we’ll be developing it rapidly based on their feedback. We’re aiming to bring this product to all of you as quickly as we can.

- Blake Ross, Director of Product

Even without having used it yet, I have some thoughts on it, mostly from reading the aforementioned blog post and other commentaries from around the web.

At first glance, Facebook Questions provides another consumer-marketer touch-point on the web. If you, as a consumer, post a query then any marketer can step forward to answer it. This is a nice way for a marketer to spread thought leadership and pull consumers towards their brands. Facebook pages are right now the only interaction medium on the network. A page is a great way to build community participation but it does not let a marketer attract new members to that community.

Facebook Questions could provide exactly that opportunity.

However, the marketers must approach this cautiously. This is not the place to push their messages. They should avoid selling their products and services via their answers. It is better to redirect the consumer to the brand page or website. Otherwise there is the risk of the answer being considered spam and being voted down as irrelevant.

Besides marketers, small businesses and start-ups might also greatly benefit from Facebook Questions. They could validate ideas, get recommendations or crowd-source problem solving.

Similar services already exist, from Quora to Yahoo! Answers and WikiAnswers. But the reason that Facebook Questions might work better is a number. And it’s not 42. It is 500 million.

7 Steps to Writing a New Product Point-of-View

As you start work on a new product you may need to create a point-of-view document for it. This is really a step before the full-fledged business plan and may or may not be required in every organization. But it is a good place to start since you can easily extend it into the business plan.

Here is what you might want to do:

  1. Start off with snippets of information highlighting the problems your product is trying to solve. Collect real data and spruce it up with appropriate imagery. For example, something like “people spend x% of their work time on social networks to gain knowledge about their work”.
  2. Create a summary slide/table to consolidate those problems.
  3. Highlight “what-if” scenarios to create awareness of the solution that your product will offer. For example, if you are building an enterprise social network you could highlight something like “what if your employees could interact with each other to share knowledge, best practices and information”. You get the drift, right?
  4. Introduce the product with its modules/functionality and the relationship between them. (Add a business architecture diagram for those who might want to know about your product in greater detail.)
  5. Add market data about your industry and the size of the addressable market. Revenue projections, costs and other financial details can go in the formal business plan.
  6. Do not forget to add references to the sources from where you got all the information in an appendix slide.
  7. Also add any other relevant information in greater detail in the appendix slides. This is in case you need to mail out the point-of-view to others and you don’t have the opportunity to talk them through it.

[image: Flickr/Richard Moross]

Location Based Services and Marketing

AdAge had an article a few days back on a Forrester Research report about location based apps/services like Foursquare, Loopt and Gowalla. Forrester has numbers to show that these services are too small for marketers to focus on right now.

Absolutely. Twitter was also small a couple of years back. And Facebook was smaller than MySpace. But location based services have too much potential to ignore. They increase the probability of conversion/purchase. And now may be the right time to begin testing them. First mover advantage, anyone?

And there is the opportunity to be prepared when location based services cross the tipping point. It may be too late to start then. It is also in the current service providers’ interests to build partnerships now, before Facebook, Google and Twitter make location a full part of their offerings.

Google Image Search: Binged

Google launched a revamped version of their image search page yesterday.

Here’s what’s new in this refreshed design of Google Images:

  • Dense tiled layout designed to make it easy to look at lots of images at once. We want to get the app out of the way so you can find what you’re really looking for.
  • Instant scrolling between pages, without letting you get lost in the images. You can now get up to 1,000 images, all in one scrolling page. And we’ll show small, unobtrusive page numbers so you don’t lose track of where you are.
  • Larger thumbnail previews on the results page, designed for modern browsers and high-res screens.
  • A hover pane that appears when you mouse over a given thumbnail image, giving you a larger preview, more info about the image and other image-specific features such as “Similar images.”
  • Once you click on an image, you’re taken to a new landing page that displays a large image in context, with the website it’s hosted on visible right behind it. Click anywhere outside the image, and you’re right in the original page where you can learn more about the source and context.
  • Optimized keyboard navigation for faster scrolling through many pages, taking advantage of standard web keyboard shortcuts such as Page Up / Page Down. It’s all about getting you to the info you need quickly, so you can get on with actually building that treehouse or buying those flowers.

Some updates for advertisers were also announced. (Check lower down in the blog post.)

But here is the surprise. The new interface looks very similar to the Bing image search interface.

Surprise. Surprise.

Not much to be surprised about though, since this is not the first time we are seeing the Google UI being Binged! The background images came first. See here and here.

Are people starting to realize that there is not much difference between the results of search engines? That for some queries Google is better and for others Bing? So it does not make a difference what you are using as long as you have a good user experience.

[image: Techcrunch]

Choose Bing, Donate $3

In their latest attempt to gain search market share, Bing will be giving you a $3 donation code to fund a classroom project. And there’s a long term vision behind it as well. Catch ‘em young they say.

From the Bing blog:

Every day students use Bing to learn and explore, so we have a vested interest in seeing them succeed.

This is how it works:

  • Make “Bing” your default search provider (by visiting here)
  • Submit your email address to receive the $3 donation code to
  • Check your email for the donation code
  • Then go to and apply your donation code to the project of your choice

Nice way to get some people to change their default search providers. It touches that soft nerve. But will it work?

PS: I am not questioning Microsoft’s intentions here.

PPS: I myself use Bing as my default search provider on all browsers, only occasionally switching to Google.

[image: Flickr/Manuel Iglesias]

Where is India’s Foursquare?

Location is a big thing today, be it Foursquare check-ins, hyperlocal content or the Twitter location API. Foursquare just completed a $20 million round from Andreessen Horowitz, valuing it at $95 million. And that is based just on the possibilities of location. No proven revenue sources yet. But the potential is there.

But I did not start writing this post to underscore the ‘value’ of Foursquare or Location Based Services (LBS). I was thinking more along the lines of: where is India’s Foursquare?

LBS like Foursquare and Gowalla are very focused on smartphone users. However, smartphone users form a very small fraction of Indian mobile users (just 5% according to some estimates). Add to that the relatively high data charges and an average monthly mobile bill of just $5.

There is very little traction a smartphone based location service can gain.

But there is something in which Indians excel. SMS. An average Indian sends 29 SMSs per month (TRAI data). So there is a huge opportunity to tap this channel to overcome the absence of smartphones in India.

The only form of location based mobile marketing in India is currently opt-in Bluetooth based promotions at some malls (e.g. Forum mall in Bangalore). But as far as SMS marketing goes, it is mostly spam.

According to the Mobile Market Report SMS Usage In Urban India, about 51% of mobile users received SMS marketing messages of some sort, with two-thirds of users taking no action, essentially deleting the message. Just 11% make a purchase.

But all this is primarily because of the amount of irrelevant messaging being pushed to our phones. The conversion rates can be easily increased if the Airtels and Vodafones of the world brought in the location angle.

It should not be too difficult to figure out the location of a subscriber based on the nearest mobile tower. And then if the providers can push out local offers, coupons, etc. from local businesses, my guess is that it would increase conversion. Even if the technology does not exist today, it is definitely in the interest of the service providers to work towards this end. It will open up a new source of revenue for them, which cannot be a bad thing considering the squeezed margins in the overcrowded space today.
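To make the idea concrete, here is a rough Python sketch of how a provider might pick offers for a subscriber once it knows which tower the phone is attached to. Everything in it is hypothetical: the tower ID, the coordinates, the businesses and the 1 km radius are invented for illustration, and I am assuming the provider already knows the serving tower.

```python
import math

# Hypothetical data, for illustration only.
TOWERS = {"cell_001": (12.9756, 77.6050)}  # tower ID -> (latitude, longitude)
BUSINESSES = [
    ("MG Road bookshop", (12.9750, 77.6060), "20% off all paperbacks"),
    ("Koramangala cafe", (12.9352, 77.6245), "Free coffee refill"),
]


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def local_offers(tower_id, radius_km=1.0):
    """Offers from businesses within radius_km of the subscriber's serving tower."""
    here = TOWERS[tower_id]  # the tower's position stands in for the subscriber's
    return [
        f"{name}: {offer}"
        for name, spot, offer in BUSINESSES
        if distance_km(here, spot) <= radius_km
    ]
```

For a subscriber on `cell_001`, only the bookshop offer qualifies; the cafe is several kilometres away and is filtered out, which is exactly the relevance filter that today’s SMS spam lacks.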

Does the technology to support this kind of marketing exist today? If not, how difficult is it to implement? More importantly, can this work? Let me know your thoughts.

Update: Just came across this article. There definitely seems to be a market for location based text marketing.

[image: Flickr/Dennis Crowley and Flickr/Katie Lips]

Google and Social-Networking

If we are to believe recent rumors, Google is, again, trying to take a bite out of Facebook, and of social networking at large. Pete Cashmore of Mashable thinks it is good for consumers as it gives us more choice, especially when privacy concerns hit. You can head over to the CNN post to read his thoughts.

Here’s why I think Google will be in social networking sooner rather than later whether ‘Google Me’ is a rumor or not.

Social networking is not new for Google. They have been at it since January 2004 when Orkut was launched. However, if you have been following Orkut over the past 5-6 years it is not difficult to notice the evident lack of innovation in the service vis-à-vis other Google products/services. The only innovation that I can remember was an attempt to copy the Facebook News Feed. All it achieved was to make the user interface even more awful.

Clearly Google wasn’t really interested in Orkut and let it remain a kind of hobby project.

Google Buzz was different. This time Google was really interested in tapping the social graph. This time they just succeeded in generating a lot of privacy backlash.

Algorithms may be the right way to approach the solution to search at the moment, but definitely not social networking.

But how long will algorithms be the only way to drive search as well? How long into the future anyways?

Especially with Facebook’s ‘Like’ button proliferating and 500 million users worldwide sharing links on the network directly, or by hooking it up to Twitter, there are a lot of user-recommended links out on the web. They come from users who you know and trust, backed up by others in your social graph.

And 500 million is more than a quarter (28%) of the entire internet population (1.8 billion – Source). That’s a lot of real recommendations. Not an algorithm counting links and deciding the right result for you. And Facebook is not slowing down.

So, at least theoretically, Facebook could churn out better search results than Google. And by extension, more targeted ads given they have access to personal data as well. Which essentially means that more people will head to Facebook for information instead of going to Google. Difficult to imagine? At least in the UK, social networks received more visits than search engines in May 2010.

This is a threat to Google’s primary source of revenue. Ads. So they cannot afford to stay away from the social networking business for too long. If not for getting users to switch, at least to slow down Facebook’s growth.

It isn’t any longer a matter of whether ‘Google Me’ is real. It is the simple matter of when Google enters the social networking business for good.

[image: Flickr/Butch Lebo]

Mobile Driven In-Store Retailing

I recently wrote about an idea for a mobile app that could improve the in-store experience of consumers.

eMarketer seems to agree. Not in so many words as suggesting a similar app, but suggesting that in-store m-commerce could be a potential game changer for retail.

The use of mobile in-store is still limited to actually calling and/or texting a friend or acquaintance to get product feedback. With the proliferation of smartphones, and retailers realizing the in-store potential of mobile, this ‘primitive’ usage will surely shift.

Mobile shopping from in-store is just beginning to achieve its vast potential. The promise for consumers is an interactive and personalized store experience like nothing before. Currently, in-store mobile shoppers can easily retrieve customer product reviews. In the future, they will receive promotions based on their past purchase history and what they are interested in at the moment.

Some of that – the real-time promotions part – I have talked about in my post. And the reasons for retailers to take mobile seriously remain the same. More conversions. More loyalty.

Generation Y consumers, who think of their mobile phone as an extension of themselves, will push development of this market by demanding that retailers offer the interactive experience they expect while shopping in a store.

I do not have paid access to eMarketer. So I haven’t read the report in its entirety. It would be interesting to know how mobile could affect the in-store shopping experience of consumers.

[image: Flickr/Johan Larsson and Wikipedia]

Copyright © 2014 signals of noise