
Will Google OCR The Whole Web?

The argument is that scanning all images into text could make Google far superior to other search engines. There are billions of images with important text in them, and Google cannot ignore scanning them in the near future if it wants to keep a big chunk of the search market.

fatstarr 4758d ago

Pretty interesting read. It should be possible, because any time I make things in HTML I get yelled at for not marking the picture with alt text describing what it is. But Google had better slow down before some conspirators file an unfair monopoly complaint and ruin it for the rest of us.
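The text description mentioned here is what HTML calls the `alt` attribute on `<img>` tags. As a rough sketch (the class and function names below are made up for illustration, and it only uses Python's standard `html.parser`), this is one way to list the images on a page that are missing it:

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> that has no non-empty alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        alt = attr_map.get("alt")
        # Treat an absent, valueless, or whitespace-only alt as missing.
        if alt is None or not alt.strip():
            self.missing.append(attr_map.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src values of images that lack alt text (hypothetical helper)."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


sample = (
    '<img src="logo.png" alt="Site logo">'
    '<img src="banner.png">'
    '<img src="title.png" alt="">'
)
print(images_missing_alt(sample))
```

An empty `alt` is valid for purely decorative images, so the result is better read as a list to review than a list of errors.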

Zeevious 4757d ago

Some search engines completely ignore the alt-text fields, so I can see an OCR scan coming in handy.

The problem is that it might make the web uglier, with designers dropping custom fonts and styles so that the text can be read in standard fonts.

Still, it's better than not being readable at all. I've seen great-looking sites that were a brick thrown through a window, overlooking a cliff, from a search-bot's point of view!

Remember: an unscannable site is a terrible thing to waste.

Stapler 4757d ago

I also like the fact that the site has an image that says:

"This text in image might be readable by Google in the future...."

And all I had to do was take a picture of it with Google Goggles (on an Android phone), and it recognized the text immediately. So they already have the tech for text recognition; they would only need to implement it in their indexing process.