In March 2007, Blaise Aguera y Arcas presented Seadragon & Photosynth at TED, a demo that created quite some buzz around the web. About a year later, in March 2008, Microsoft released Deep Zoom (formerly Seadragon) as a «killer feature» of its Silverlight 2 (Beta) launch at MIX08. Following this event, there was quite some back and forth in the blogosphere (damn, I hate that word) about the true innovation behind Microsoft's Deep Zoom.

Today, I don't want to get into the same kind of discussion but rather start a series that will give you a «behind the scenes» of Microsoft's Deep Zoom and similar technologies.

This first part of «Inside Deep Zoom» introduces the main ideas & concepts behind Deep Zoom. In part two, I'll talk about some of the mathematics involved and finally, part three will feature a discussion of the possibilities of this kind of technology and a demo of something you probably haven't seen yet.

Background

As part of my awesome internship at Zoomorama in Paris, I was working on some amazing things (of which you'll hopefully hear soon), and in my spare time I decided to have a closer look at Deep Zoom (formerly Seadragon). That's when I did a lot of research around the topic and had the idea for this series, in which I want to share what I learned.

Introduction

Let's begin with a quote from Blaise Aguera y Arcas' demo of Seadragon at the TED conference[1]:

«…the only thing that ought to limit the performance of a system like this one is the number of pixels on your screen at any given moment.»

What is this supposed to mean? See, I have a 24" screen with a maximum resolution of 1920 x 1200 pixels. Now let's take a photo from my digital camera, which shoots at 10 megapixels. The photo's size is typically 3872 x 2592 pixels. When I get the photo onto my computer, I roughly end up with something that looks like this:

No matter how I put it, I'll never be able to see the entire 10 megapixel photo at 100% magnification on my 2.3 megapixel screen. Although this might seem obvious, let's take the time to look at it from another angle: with this in mind, we don't care anymore whether an image has 10 megapixels (that is, 10'000'000 pixels) or 10 gigapixels (10'000'000'000 pixels), since the number of pixels we can see at any moment is limited by the resolution of our screen. This in turn means that looking at a 10 megapixel image and a 10 gigapixel image on the same computer screen should have the same performance. The same should hold for looking at the same two images on a mobile device such as the iPhone. However, it's important to note that, with reference to the quote above, we might experience a performance difference between the two devices, since they differ in the number of pixels they can display.
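To make the arithmetic concrete, here's a small Python sketch. The screen and photo sizes are the ones from above; the 10-gigapixel dimensions and the fit-to-screen logic are my own illustration, not anything specific to Deep Zoom:

```python
def displayed_pixels(image_w, image_h, screen_w=1920, screen_h=1200):
    """Pixels actually drawn when an image is scaled down to fit the screen."""
    scale = min(screen_w / image_w, screen_h / image_h, 1.0)  # never upscale
    return round(image_w * scale) * round(image_h * scale)

# The 10-megapixel photo vs. a made-up 10-gigapixel panorama:
photo = displayed_pixels(3872, 2592)
giga = displayed_pixels(400_000, 25_000)

# Both are capped by the 2.3-megapixel (1920 x 1200) screen:
print(photo, giga)
```

Whatever the source resolution, the number of pixels the screen ends up showing never exceeds 1920 x 1200 = 2'304'000.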

So how do we manage to make the performance of displaying image data independent of its resolution? This is where the concept of an image pyramid steps in.

The Image Pyramid

Deep Zoom, or for that matter any other similar technology such as Zoomorama, Zoomify, Google Maps, etc., uses something called an image pyramid as a basic building block for displaying large images in an efficient way:

The picture above illustrates the layout of such an image pyramid. A typical image pyramid serves two purposes: it stores an image of any size at many different resolutions (hence the term multi-scale), and it stores each of these resolutions sliced up into many parts, referred to as tiles.

Because the pyramid stores the original image (redundantly) at different resolutions, we can display the resolution that is closest to the one we need, and in case the entire image doesn't fit on our screen, only the parts of the image (tiles) that are actually visible. Choosing the parameter values of our pyramid, such as the number of levels and the tile size, allows us to control the required data transfer.
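As a sketch of what building such a pyramid involves: each level halves the previous one until we reach 1 x 1, and each level is cut into a grid of tiles. The halving scheme is the common one; the 256-pixel tile size here is just an illustrative choice, not necessarily Deep Zoom's (part two will go into the exact numbers):

```python
import math

def pyramid_levels(width, height, tile_size=256):
    """Return (width, height, columns, rows) per level, from full
    resolution down to 1 x 1. Each level halves the previous one
    (rounding up); edge tiles may be smaller than tile_size."""
    w, h = width, height
    levels = []
    while True:
        cols = math.ceil(w / tile_size)
        rows = math.ceil(h / tile_size)
        levels.append((w, h, cols, rows))
        if w == 1 and h == 1:
            break
        w = max(1, math.ceil(w / 2))
        h = max(1, math.ceil(h / 2))
    return levels

for w, h, cols, rows in pyramid_levels(3872, 2592):
    print(f"{w} x {h}: {cols * rows} tiles ({cols} x {rows})")
```

For the 3872 x 2592 photo from above, the full-resolution level alone is a 16 x 11 grid of tiles, while the smallest levels fit in a single tile.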

Image pyramids are the result of a space vs. bandwidth trade-off, often found in computer science. The image pyramid obviously has a bigger file size than its single-image counterpart (to find out exactly how much, be sure to come back for part two), but as you see in the illustration below, bandwidth-wise it's much more efficient at displaying high-resolution images, where most parts of the image are typically not visible anyway (grey area):
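How much bigger, roughly? Counting raw pixels, the sub-levels of a halving pyramid form a geometric series (1 + 1/4 + 1/16 + …) that sums to about 4/3 of the original, so the redundancy costs around a third extra. A quick sanity check, sticking to the pixel dimensions from above (raw pixel counts only; actual file sizes depend on compression):

```python
import math

def pyramid_pixel_count(width, height):
    """Total raw pixels stored across all levels of a halving pyramid."""
    total, w, h = 0, width, height
    while True:
        total += w * h
        if w == 1 and h == 1:
            break
        w = max(1, math.ceil(w / 2))
        h = max(1, math.ceil(h / 2))
    return total

original = 3872 * 2592
overhead = pyramid_pixel_count(3872, 2592) / original
print(f"{overhead:.3f}x the pixels of the single image")  # ~1.333x
```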

As you can see in the picture above, there is still more data loaded (colored area) than absolutely necessary to display everything that is visible on the screen. This is where the image pyramid parameters I mentioned before come into play: tile size and number of levels determine the relationship between the amount of storage, the number of network connections and the bandwidth required for displaying high-resolution images.
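A minimal sketch of that tile-selection step, assuming square tiles and a viewport given in pixel coordinates of the level being displayed (my own illustration, not Deep Zoom's actual API):

```python
import math

def visible_tiles(view_x, view_y, view_w, view_h, tile_size=256):
    """Tile (column, row) indices that intersect the viewport rectangle."""
    first_col = view_x // tile_size
    first_row = view_y // tile_size
    last_col = math.ceil((view_x + view_w) / tile_size) - 1
    last_row = math.ceil((view_y + view_h) / tile_size) - 1
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]

# A 1920 x 1200 viewport positioned at (1000, 500) on some pyramid level:
print(len(visible_tiles(1000, 500, 1920, 1200)))  # 9 x 6 = 54 tiles
```

Note the over-fetch the illustration hints at: the viewport clips partway into the border tiles, but each of those tiles is loaded in full, which is exactly why the colored area is larger than the visible one.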

Next

Well, this was it for part one of Inside Deep Zoom. I hope you enjoyed this short introduction to image pyramids & multi-scale imaging. If you want to find out more, as usual, I've collected some links in the Further Reading section. Other than that, be sure to come back, as the next part of this series – part two – will discuss the characteristics of the Deep Zoom image pyramid and I'll show you some of the mathematics behind it.

Further Reading

References

  1. Blaise Aguera y Arcas, Seadragon and Photosynth demo, TED Conference, March 2007.
